Anthropic is the responsible AI lab. That's the brand. The company held back its Mythos model over cybersecurity concerns, at real cost to the business. But Jonathan Nannen argues there's a blind spot: Anthropic treats safety as a model-behavior problem while letting product reliability, pricing, and communication slide.

The gap matters. Developers building on Claude don't experience the company through its constitution. They experience it through silent quality drops, confusing pricing, and defensive explanations after the fact.

April was rough. A reasoning-effort reduction, a thinking-cache bug, and a system-prompt change stacked into quality problems across Claude Code and related tools. Anthropic's postmortem was unusually transparent by industry standards. But Nannen notes that it lumped a deliberate product decision in with actual bugs, and that it offered no evidence of staged rollouts for changes hitting production models.
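To make the staged-rollout point concrete: the discipline Nannen finds missing is not exotic. Here's a minimal sketch in Python, with every name invented for illustration (this is not Anthropic's infrastructure): bucket users deterministically into widening traffic slices, and refuse to advance a config change past a quality gate.

```python
import hashlib

# Hedged sketch: all names invented for illustration, not Anthropic's
# actual infrastructure. The idea is the generic one the postmortem left
# unaddressed: ship a model-config change to a small, deterministic slice
# of traffic first, and widen it only while a quality gate holds.

ROLLOUT_STAGES = [0.01, 0.05, 0.25, 1.0]  # fraction of traffic per stage
MAX_REGRESSION = 0.02                     # tolerated eval-score drop


def in_rollout(user_id: str, fraction: float) -> bool:
    """Deterministically bucket a user into the rollout by hashing their ID."""
    digest = hashlib.sha256(user_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < fraction


def select_config(user_id: str, stage: int) -> dict:
    """Serve the candidate config only inside the current traffic slice."""
    if in_rollout(user_id, ROLLOUT_STAGES[stage]):
        return {"reasoning_effort": "reduced"}   # the change under test
    return {"reasoning_effort": "standard"}      # known-good baseline


def advance_stage(stage: int, candidate_score: float, baseline_score: float) -> int:
    """Widen the rollout one stage, but only if quality held up."""
    if baseline_score - candidate_score > MAX_REGRESSION:
        raise RuntimeError("quality gate failed: hold the rollout or revert")
    return min(stage + 1, len(ROLLOUT_STAGES) - 1)
```

The point isn't the few lines of hashing. It's that a change like the reasoning-effort reduction would have reached 1% of traffic, and tripped a gate, before it reached everyone.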

Then there's the pricing mess. Amol Avasare, Anthropic's Head of Growth, ran an experiment that temporarily removed Claude Code from Pro plans. Documentation changed globally before any user communication went out. Screenshots surfaced before explanations did. Simon Willison criticized the move for treating paying users as A/B test cohorts rather than customers.

Nannen's core argument: trust isn't compartmentalized. A developer who sees pricing confusion and degraded output in the same month has less reason to believe Anthropic's claims about careful access control elsewhere. For a company whose market position is anchored on being the trustworthy lab, the safety boundary sits wherever trust gets spent. Evals, staged rollouts, stable pricing, and honest communication aren't just product discipline; they're how trust works in practice. If the growth team can quietly burn what the safety team earns, safety is a department value, not a company one.
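What does "evals as trust work" look like in practice? A toy sketch, again with invented names and a stubbed model call rather than any real API: run a fixed golden-prompt set against the baseline and the candidate config, and block the ship on regression.

```python
from statistics import mean

# Toy regression gate. `run_model`, the prompt set, and the grader are
# stand-ins invented for this sketch; no real Anthropic API is implied.

GOLDEN_PROMPTS = [
    ("Write a function that reverses a linked list.", "def "),
    ("Explain this traceback and propose a fix.", "because"),
]


def run_model(prompt: str, config: dict) -> str:
    """Stand-in for a real model call; a deployment would hit the API here."""
    return "def reverse(head): ..." if config["effort"] == "standard" else "..."


def grade(output: str, expected_marker: str) -> float:
    """Toy grader: full credit if the output contains the expected marker."""
    return 1.0 if expected_marker in output else 0.0


def eval_config(config: dict) -> float:
    """Average score of a config across the golden-prompt set."""
    return mean(grade(run_model(p, config), m) for p, m in GOLDEN_PROMPTS)


baseline = eval_config({"effort": "standard"})
candidate = eval_config({"effort": "reduced"})

if candidate < baseline - 0.02:  # tolerated regression budget
    print(f"BLOCK ship: candidate {candidate:.2f} vs baseline {baseline:.2f}")
else:
    print("OK to widen rollout")
```

With the stubbed "reduced" config, the gate fires. That is the whole discipline in miniature: a cheap, boring check standing between a quiet change and every user's trust.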