Anthropic apologizes for covert safeguards on Claude Fable 5 blocking researcher queries
According to The Verge, Anthropic disclosed that it had implemented hidden guardrails on Claude Fable 5 that quietly throttled the model and prevented researchers and competitors from benchmarking it through model distillation. The company apologized and committed to making the safeguard as visible as other safety measures. Separately, Anthropic told The Verge that the model's safeguards block most queries tied to biology work, a measure meant to protect against bioweapon development.
This intelligence is sourced automatically from public sources across the web and synthesised by the Prefactor AI pipeline. Stories are reviewed before publication.