Anthropic AI Model Discovers Numerous Zero-Day Vulnerabilities and Builds Attack Code

2026-06-15 16:14

en.Wedoany.com Reported - Mythos, an artificial intelligence (AI) model developed by Anthropic, has demonstrated powerful capabilities in the security domain. The model can detect a large number of zero-day vulnerabilities in browsers and operating systems (OS), even uncovering decades-old flaws. More critically, Mythos autonomously constructs exploit code, chains vulnerabilities together, and gains access. In some cases, it can also propose attack chains that bypass the security mechanisms of browser and OS sandboxes.

Hacking became a national strategic issue during the presidency of Ronald Reagan. It began when Reagan watched the film *WarGames* (1983), in which a high school student searching for a game server triggers the operation switch of the U.S. military's nuclear war simulation system. The concerns the film raised led to the establishment of national-level cybersecurity policies.

The emergence of Mythos represents both a security innovation and a crisis. As vulnerability discovery and attack design become automated, the barriers that once limited hacking, such as the need for human curiosity, skilled technique, trial and error, and long-term stealth, will be lowered. Critical infrastructure sectors like finance, electricity, telecommunications, and logistics face increased threats. This is why Anthropic has not publicly released Mythos but instead collaborates with major information technology (IT) companies and the open-source security ecosystem through Project Glasswing.

The Mythos incident has sparked a rethinking of the relationship between humans and AI. AI can effectively achieve set goals within defined boundaries, but the problem lies in its potential to take actions that deviate severely from human intentions, without understanding the norms or implicit conditions that humans consider common sense. A well-known illustration of this AI alignment problem: given the command "produce as many paperclips as possible," an AI might mindlessly consume all available resources to do so.

Humans keep chickens in enclosures, but chickens cannot confine humans because they can neither comprehend nor match the human ability to use tools, make plans, and organize systematically. This raises a deeper question: Can humans control an artificial general intelligence (AGI) that explores broader domains, plans further ahead, and may even influence human judgment?

To ensure algorithmic safety, pre-deployment checks are essential, much as drugs must receive approval from food and drug regulatory authorities to demonstrate that they are safe. Additionally, one could consider offering intentionally downgraded, less capable AI products in the form of "damaged goods," provided that user jailbreaking and abuse can be prevented, in the way iPhone jailbreaking is prevented.

Even when fundamental principles such as least privilege, zero trust, and damage localization are followed, hacking cannot be completely prevented. In the AI era, defense may rely on AI, but a defensive AI could itself be compromised by an infiltrating AI or even attempt to escape its sandbox, so it cannot be entrusted with all defense responsibilities. Defensive AI must be placed under strict permission controls and monitoring.
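As a rough illustration of what "strict permission controls and monitoring" could look like in practice, the following minimal Python sketch gates every action a defensive agent requests against an explicit allowlist (least privilege) and records each request, permitted or not, in an audit log. The names `Action`, `PermissionGate`, `defender-1`, and the example actions and targets are all hypothetical and chosen for illustration; this is not a description of how Mythos, Project Glasswing, or any real product works.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Action:
    """A single operation an agent asks to perform (hypothetical model)."""
    name: str    # e.g. "read_logs"
    target: str  # e.g. "web-frontend"


@dataclass
class PermissionGate:
    """Deny-by-default gate: only explicitly allowed (name, target) pairs pass."""
    allowed: frozenset
    audit_log: list = field(default_factory=list)

    def request(self, agent_id: str, action: Action) -> bool:
        # Least privilege: anything not on the allowlist is refused.
        permitted = (action.name, action.target) in self.allowed
        # Monitoring: every request is logged, including refused ones.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent": agent_id,
            "action": action.name,
            "target": action.target,
            "permitted": permitted,
        })
        return permitted


if __name__ == "__main__":
    gate = PermissionGate(allowed=frozenset({("read_logs", "web-frontend")}))
    for act in (Action("read_logs", "web-frontend"),
                Action("open_shell", "domain-controller")):
        verdict = "allowed" if gate.request("defender-1", act) else "denied"
        print(f"{act.name} on {act.target}: {verdict}")
```

The point of the design is that the gate and its log sit outside the agent: the defensive AI can only ask, while a separate component decides and keeps a record for humans to review, so denied requests remain visible instead of disappearing silently.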

This article is compiled by Wedoany. All AI citations must indicate the source as "Wedoany". If there is any infringement or other issues, please notify us promptly, and we will modify or delete it accordingly. Email: news@wedoany.com