
Anthropic keeps new AI model private after it finds thousands of external vulnerabilities


Anthropic’s most capable AI model has already discovered thousands of cybersecurity vulnerabilities across every major operating system and web browser. The company’s response was not to release it, but to quietly hand it to the organisations responsible for keeping the internet running.

That model is Claude Mythos Preview, and the initiative is known as Project Glasswing.

The launch partners include Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks.

Beyond that core group, Anthropic has extended access to over 40 additional organisations that build or maintain critical software infrastructure. Anthropic is committing up to US$100 million in usage credits for Mythos Preview across the effort, along with US$4 million in direct donations to open-source security organisations.

A model that outgrew its own benchmarks

Mythos Preview was not specifically trained for cybersecurity work. Anthropic said the capabilities “emerged as a downstream consequence of general improvements in code, reasoning, and autonomy”, and that the same improvements making the model better at patching vulnerabilities also make it better at exploiting them.

That last part matters. Mythos Preview has improved to the extent that it largely saturates existing security benchmarks, forcing Anthropic to shift its focus to novel real-world tasks, particularly zero-day vulnerabilities: flaws previously unknown to the software’s developers.

Among the findings: a 27-year-old bug in OpenBSD, an operating system known for its strong security posture. In another case, the model fully autonomously identified and exploited a 17-year-old remote code execution vulnerability in FreeBSD, CVE-2026-4747, that allows an unauthenticated user anywhere on the internet to gain full control of a server running NFS. No human was involved in the discovery or exploitation after the initial prompt to find the bug.

Nicholas Carlini from Anthropic’s research team described the model’s ability to chain vulnerabilities together: “This model can create exploits out of three, four, or sometimes five vulnerabilities that in sequence give you some kind of very sophisticated end result. I’ve found more bugs in the last couple of weeks than I found in the rest of my life combined.”

Why is it not being released?

“We don’t plan to make Claude Mythos Preview generally available on account of its cybersecurity capabilities,” Newton Cheng, Frontier Red Team Cyber Lead at Anthropic, said. “Given the pace of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout, for economies, public safety, and national security, could be severe.”

This isn’t hypothetical. Anthropic had previously disclosed what it described as the first documented case of a cyberattack largely executed by AI: a Chinese state-sponsored group that used AI agents to autonomously infiltrate roughly 30 global targets, with AI handling the majority of tactical operations independently.

The company has also privately briefed senior US government officials on Mythos Preview’s full capabilities. The intelligence community is now actively weighing how the model could reshape both offensive and defensive hacking operations.

The open-source problem

One dimension of Project Glasswing goes beyond the headline coalition: open-source software. Jim Zemlin, CEO of the Linux Foundation, put it plainly: “In the past, security expertise has been a luxury reserved for organisations with large security teams. Open-source maintainers, whose software underpins much of the world’s critical infrastructure, have historically been left to figure out security on their own.”

Anthropic has donated US$2.5 million to Alpha-Omega and OpenSSF through the Linux Foundation, and US$1.5 million to the Apache Software Foundation, giving maintainers of critical open-source codebases access to AI vulnerability scanning at a scale that was previously out of reach.

What comes next

Anthropic says its eventual goal is to deploy Mythos-class models at scale, but only once new safeguards are in place. The company plans to launch those safeguards with an upcoming Claude Opus model first, allowing it to refine them on a model that doesn’t pose the same level of risk as Mythos Preview.

The competitive picture is already shifting around it. When OpenAI launched GPT-5.3-Codex in February, the company called it the first model it had classified as high-capability for cybersecurity tasks under its Preparedness Framework. Anthropic’s move with Glasswing signals that the frontier labs see controlled deployment, not open release, as the emerging standard for models at this capability level.

Whether that standard holds as these capabilities spread further is, at this point, an open question that no single initiative can answer.

See Also: Anthropic’s refusal to arm AI is exactly why the UK wants it


Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo, taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and is co-located with other leading technology events, including the Cyber Security & Cloud Expo. Click here for more information.

AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.

The post Anthropic keeps new AI model private after it finds thousands of external vulnerabilities appeared first on AI News.
