AI Made

AI agents, automation, and tech journalism

Claude Mythos Leak: Anthropic’s Most Powerful Model Accidentally Exposed

In one of the more significant AI security incidents of the year, Anthropic accidentally exposed nearly 3,000 internal files, including a draft blog post describing Claude Mythos, a model internally positioned above Opus as the company’s most capable system ever built. The files sat in a misconfigured data store that required no authentication to access.

A security researcher discovered the leak on March 26, 2026. Anthropic confirmed the exposure shortly after and acknowledged that the files included structured product launch documents for a model it describes as a “new tier” above Opus. Anthropic did not deny the authenticity of the documents.

What the Leaked Documents Reveal

The draft blog post describes Claude Mythos — codenamed Capybara internally — as “by far the most powerful AI model we have ever developed.” According to the documents, internal testing shows dramatic improvements over Claude Opus 4.6 on programming tasks and reasoning use cases. The leaked post also includes a striking warning: the model is “currently far ahead of any other AI model in cyber capabilities” and “presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.”

Anthropic confirmed it is rolling out access in deliberate, phased stages — a sign the company is treating the cyber capability risk seriously rather than rushing to market.

Why This Matters

Beyond the embarrassment of an accidental exposure, the leak raises serious questions about how frontier AI labs handle sensitive capability information. Anthropic acknowledged that the leaked documents describing the model’s cyber capabilities were authentic, meaning that detailed internal claims about advanced offensive capabilities were sitting in a data store anyone could read.

The timing is notable. This is not a case of a model being released and then evaluated by external researchers. Instead, a lab appears to have developed — and internally documented — a model with advanced offensive cyber capabilities before any public evaluation or safety benchmark could be applied.

What We Do Not Know

Anthropic has not released technical specifications, evaluation benchmarks, or a safety assessment for Claude Mythos. The company has not confirmed a public release date. The phased rollout described in the leaked documents suggests limited access may already be available to select partners, but no public access path has been announced.

The incident also highlights how much information about frontier AI development sits in internal documents that could be exposed through misconfigured infrastructure — not through model weights being stolen or leaked, but through ordinary operational security failures.

Bottom line: Anthropic confirmed that a model called Claude Mythos exists, that it sits above Opus in capability, and that internal documents describing its significant cyber capabilities were accidentally exposed through a misconfigured data store. The public release timeline remains unclear.