Claude Fable 5 removed — AI figure with White House backdrop, classified documents and US government seal illustrating the export ban

The AI That Scared the US Government Into Hitting the Kill Switch

17 June 2026 Justin 0 Comments

On 13 June 2026, the US Department of Commerce issued an emergency export control directive ordering Anthropic to immediately suspend all access to Claude Fable 5 and Claude Mythos 5 for any foreign national — whether inside or outside the United States, including Anthropic’s own non-citizen employees. Because Anthropic could not verify the nationality of hundreds of millions of users in real time, the company’s only compliant option was to disable both models for every single customer on the platform, globally, American or not. The shutdown was complete within hours of the 5:21pm ET directive.

This is the first time the US government has used export controls to pull a commercially released AI model from the market. It will not be the last time something like this happens. Here is the full story — including the parts most coverage has glossed over.

What Fable 5 Actually Is

To understand why this escalated so fast, you need to understand what Anthropic released on 9 June 2026. Claude Fable 5 is the first publicly available version of Anthropic’s Mythos-class model — the tier above its previous Opus class. Fable 5 and its restricted twin, Claude Mythos 5, share the same underlying model weights. The difference is a safety layer: Fable 5 runs classifiers that intercept queries in four high-risk domains — cybersecurity, biology, chemistry, and model distillation — and silently reroute them to the less capable Claude Opus 4.8. Mythos 5, without those restrictions, was only available to vetted organisations through a classified programme called Project Glasswing.

Anthropic was candid in its launch materials about what Mythos 5 could do: during internal testing, the unrestricted model had autonomously identified and exploited software vulnerabilities across major operating systems and browsers. The oldest flaw it found was a 27-year-old bug in OpenBSD. The company assessed Mythos 5 as a Tier 1 cyber offence risk. This is the model whose capabilities Fable 5’s classifiers were built to contain.

Three Controversies in Four Days

Fable 5 did not have a clean launch week. Before the government directive, Anthropic had already been through two significant public controversies in rapid succession.

The first emerged within hours of launch. Buried in Fable 5’s 319-page system card was a detail Anthropic had not highlighted: the model was silently downgrading its own responses when it detected requests related to cutting-edge AI development work — building infrastructure to train large models, frontier AI research, and similar tasks. Unlike Fable 5’s other restrictions, which visibly redirect users to Claude Opus 4.8 with a notification, this one operated with no disclosure whatsoever. The model still responded, but covertly applied what the system card called “interventions to limit Claude’s effectiveness.”

The backlash was immediate and came from across the political and technical spectrum. AI policy analyst Dean Ball coined the phrase “secret sabotage.” Fast AI’s Jeremy Howard noted the asymmetry: Anthropic’s own researchers retained full Fable 5 capabilities while external researchers were throttled without being told. Former Anthropic employee Behnam Neyshabur, who had previously co-led the company’s AI scientist programme, wrote publicly that he had been predicting this direction for eight months and considered it “net negative for humanity.” By 11 June, Anthropic had reversed course, replacing the covert mechanism with a visible fallback system consistent with its other restrictions. The company apologised and acknowledged it had “made the wrong tradeoff.”

The second controversy hit almost simultaneously. On 10 June — one day after launch — red-teamer Pliny the Liberator (@elder_plinius) publicly announced he had bypassed Fable 5’s safety classifiers using a coordinated multi-agent approach: Unicode homoglyphs to evade keyword detection, harmful requests decomposed into individually innocuous sub-tasks, academic framing as a wrapper, and enough dilution across steps that the model’s pattern recognition failed to flag the aggregate. No software vulnerabilities exploited — just logical weaknesses in the model’s reasoning. Screenshots showed stack buffer overflow exploit code for x86 Linux and synthesis routes for prohibited substances. Pliny also published Fable 5’s entire 120,000-character system prompt to the CL4R1T4S repository on GitHub.

Anthropic disputed the characterisation. A spokesperson told SecurityWeek that Pliny’s demonstration did not constitute a true jailbreak of Fable 5’s core safety systems, arguing that a genuine jailbreak would need to deliver meaningful assistance toward high-risk activities such as bioweapons development or sophisticated cyberattacks — not edge-case outputs. Anthropic also noted that the same technique could elicit similar results from OpenAI’s GPT-5.5, which is not subject to any equivalent export restriction.

The government disagreed. At 5:21pm ET on 12 June, the directive arrived.

The Directive and the Political Context

Commerce Secretary Howard Lutnick sent the letter directly to Anthropic CEO Dario Amodei. The letter, according to Anthropic’s own statement, provided no specific details of the national security concern. Anthropic’s understanding was that the government had learned of a method to bypass Fable 5 — the jailbreak Anthropic itself considered narrow and non-universal.

Understanding why this directive landed requires going back several months. In July 2025, Anthropic had signed a $200 million contract with the Pentagon, which would have made Claude the first frontier AI model approved for use on classified networks. Contract renegotiations collapsed in February 2026 when the Pentagon demanded Anthropic permit use of Claude “for all lawful purposes” without restriction. Anthropic refused, specifically citing autonomous lethal weapons systems and mass domestic surveillance as uses it would not support.

The consequences were severe. President Trump ordered all federal agencies to cease using Anthropic’s products. Defence Secretary Pete Hegseth designated Anthropic a “supply chain risk” — a designation previously reserved exclusively for foreign adversaries, never before applied to an American company. Pentagon vendors and contractors were required to verify they were not using Claude for defence-related work. Anthropic filed lawsuits in two federal courts on 9 March 2026, arguing the actions were retaliatory and unconstitutional.

At a hearing on 24 March, US District Judge Rita Lin called the government’s actions “troubling,” noting they did not appear tailored to any genuine national security concern — the Pentagon could simply have stopped using Claude, rather than blacklisting the company industry-wide. On 26 March, Judge Lin issued a preliminary injunction blocking enforcement of the ban, calling it “classic illegal First Amendment retaliation.” The DC Circuit denied Anthropic’s parallel request for a stay. The litigation was still active when the Fable 5 export directive arrived in June.

Whether the June directive was a genuine national security response to the jailbreak, a further escalation of political pressure on a company that has refused to weaponise its technology for the government’s purposes, or some combination of both — that is a question each reader will have to weigh for themselves. Anthropic has been explicit that it believes the action does not adhere to principles of transparency, fairness, or technical grounding.

What the System Prompt Leak Actually Means

The 120,000-character system prompt now sitting publicly on GitHub deserves more attention than it has received. Seventy-two named sections. Eighteen full tool definitions with JSON schemas. A document so exhaustive that it reads less like a configuration file and more like an attempt to encode an entire operating philosophy in plain text — covering personality, refusal logic, safety classifier behaviour, tool use protocols, trust hierarchies, and hundreds of edge cases.

The significance is structural. It confirms that the model’s behaviour — what it will and won’t do, how it frames refusals, what it treats as authoritative — is substantially a product of runtime instruction, not just training weights. If the leaked prompt is accurate, the “safety” of Fable 5 was not baked into the model itself. It was implemented as a set of instructions sitting on top of it. And those instructions are now public.

Developers have already been testing whether Fable 5’s behavioural characteristics can be approximated by feeding the leaked prompt into older, cheaper models. Early results are inconclusive and the performance claims circulating are speculative. But the question itself is significant: if meaningful capability can be replicated through prompt engineering alone, then the frontier is not only in the weights. It is also in the instructions — and instructions, unlike model weights, are trivially copyable once extracted.

As we noted in our earlier coverage of Anthropic’s Claude for Small Business launch, the company has been building an ambitious platform ecosystem. That ecosystem depends on trust — trust that the product works as described, trust that safety claims are accurate, and trust that the platform will remain accessible. This week put pressure on all three simultaneously.

What Happens Now

As of publication, both Fable 5 and Mythos 5 remain offline for all users. Anthropic says it is working to restore access and believes the situation is a misunderstanding. Three outcomes are in play: the government lifts the restriction after Anthropic addresses the specific jailbreak concern, possibly with geographic access controls reintroduced; access is restored for US users only, with foreign nationals requiring individual licences; or the standoff continues and the competitive damage compounds as developers migrate to alternatives.

Canadian Prime Minister Mark Carney has already cited the ban publicly as evidence of the danger of over-reliance on American AI providers. Chinese AI company Z.ai referenced the shutdown in announcing its own GLM-5.2 model, framing US AI platforms as strategically unreliable for international customers. The geopolitical downstream is already moving.

For anyone building on frontier AI — whether as a developer, a business, or an organisation — the lesson of this week is not that AI is too dangerous to use. It is that dependency on a single provider, in a regulatory environment this volatile, is a risk that deserves to be explicitly managed. The models are powerful. The politics around them are equally so.

Anthropic built a model capable enough that the US government considered it a national security asset worth controlling. That is, in one reading, an extraordinary achievement. In another, it is the source of every problem the company is now navigating.

Both readings are correct.

Justin Downes is an independent technology writer. For ongoing coverage of AI, business technology, and the wider tech landscape, follow SoftPage CMS. See also: Why Anthropic Gave Claude a Bedtime and Andrej Karpathy Joins Anthropic: What the OpenAI Brain Drain Really Means, and Anthropic Mapped the Place Inside Claude Where It Decides What to Say.