Anthropic decided not to publicly release its new AI model, Claude Mythos, after the system discovered thousands of zero-day vulnerabilities [1].

The decision highlights a growing tension between the pursuit of advanced artificial intelligence and the risk of creating tools that can be weaponized for cyberattacks. Because the model can autonomously identify and exploit software flaws, its release could give malicious actors a powerful means of targeting critical infrastructure.

Anthropic announced the model in April 2026 [2], but the firm chose to withhold it from the public. A company spokesperson said the model found thousands of zero-day vulnerabilities across major browsers and operating systems [1]. The spokesperson said this discovery provided a six-to-12-month window before those vulnerabilities could be exploited [1].

The move prompted high-level meetings in Washington, D.C. Federal AI Minister Evan Solomon, the Treasury secretary, and the Federal Reserve chair met with major bank CEOs to discuss the implications for the financial sector. Solomon said the decision not to release the model is the "responsible" approach [3].

However, some industry observers question whether the risk is truly unprecedented. A cybersecurity expert said the dangers revealed by Mythos can be reproduced with older models, including those from OpenAI and Anthropic [4]. Other reports suggest the model represents a new level of risk that makes it too dangerous for public access [5].

Anthropic is headquartered in San Francisco, California. The firm continues to coordinate with U.S. government officials to manage the security risks associated with the model's capabilities.

"We found thousands of zero‑day vulnerabilities across major operating systems and browsers"

The withholding of Claude Mythos signals a shift in the AI industry toward 'safety-first' deployment, in which the risk posed by autonomous cyber-offensive capabilities outweighs the commercial benefit of a product launch. It also underscores a critical weakness in global software architecture: a single AI model can uncover systemic flaws faster than human developers can patch them. That imbalance could shift the balance of power in cybersecurity toward those who control the most advanced models.