When capabilities work is the *safe* bet
A post on LessWrong challenges a common assumption in AI safety: if you genuinely believe large language models are a safer path to superintelligence than alternative AI regimes, then working on LLM capabilities, rather than LLM safety, might be the more risk-reducing move. According to Robin Haselhurst, the logic is probabilistic: helping LLMs reach superintelligence first, in a regime you trust, could lower existential risk more than improving safety in a regime you trust less. He acknowledges that the math is illustrative and that different research paths vary vastly in difficulty. The punchline: if you accept the premise, capabilities work becomes a perfectly rational choice for someone focused on safety.
Source: https://www.lesswrong.com/posts/NgPfJ7ATYqMFQr7zu/when-ca...