Claude Fable 5 and Mythos 5: The System Card

According to Zvi on LessWrong, Anthropic has released Claude Fable 5, a new frontier language model that marks a meaningful step forward in AI capabilities. Fable outperforms the previous flagship, Opus 4.8, on many tasks, though at higher cost and latency. To manage safety risks, Anthropic implemented broad classifier-based safeguards that restrict the model's use in biological research, cybersecurity work, and frontier model development. These classifiers are intentionally over-cautious, triggering on roughly five percent of ordinary queries, and visibly downgrade the user to Opus 4.8 when they trigger. The visible downgrade followed backlash against Anthropic's announced plan to silently modify certain queries without disclosure; that undisclosed steering was reversed within 48 hours. The tradeoff is stark: genuine capability gains in exchange for friction and data retention requirements.

Source: https://www.lesswrong.com/posts/ixJDkQBncJBshcvwj/claude-...
