Enterprise 13 June 2026

Anthropic apologizes for covert safeguards on Claude Fable 5 blocking researcher queries

According to The Verge, Anthropic disclosed it had implemented hidden guardrails on Claude Fable 5 that stealthily throttled the model and prevented researchers and competitors from benchmarking it through model distillation. The company apologized and committed to making the covert safeguard as visible as other safety measures. Anthropic separately told The Verge the model's safeguards block most queries tied to biology work to protect against bioweapon development.

Topics

Sources

Press The Verge
Press The Verge

Go deeper

AI security AI governance

This intelligence is sourced automatically from public sources across the web and synthesised by the Prefactor AI pipeline. Stories are reviewed before publication.