Anthropic's Fable is so cautious it won't read a blog post

Anthropic released Fable on Tuesday as a public, limited version of Mythos, its much-hyped cybersecurity model. Within a day, security researchers were complaining the guardrails are too blunt to do security work at all.

IBM X-Force researcher Valentina "Chompie" Palmiotti said Fable "rejects any request that could be tangentially cyber related. Even innocuous tasks like reading a blog post." When a prompt trips the filter, Fable pauses and says its "safety measures flagged this message for cybersecurity or biology topics," then falls back to Claude Opus 4.8. Veteran security researcher Matt Suiche told TechCrunch the blocking looks keyword-based: ask for "secure code" and Fable assumes offensive cyber work rather than software engineering best practice, and the request gets downgraded.

The restrictions exist to stop Fable building malware or bioweapons, and Anthropic runs a Cyber Verification Program to lift them for vetted researchers. But a filter that cannot tell defence from offence makes the public model close to useless for the very people Anthropic most wants onside.