I think alignment work is more promising than control work

According to a post on LessWrong, the AI safety community may be betting on the wrong approach. Researcher Alec Harris argues that teams should focus on 'alignment' (making AI systems fundamentally well-behaved from the start) rather than 'control,' which relies on external safeguards to contain dangerous systems. His reasoning centers on scaling: as AI systems grow more powerful, physically controlling an unaligned system becomes exponentially harder, while building systems that are inherently well-aligned remains more tractable. Harris proposes an aggressive reallocation: an eight-to-one resource ratio favoring alignment research. The strategy rests on a specific wager, namely that moderately aligned AIs could help align even more capable systems, creating a self-reinforcing safety chain. The post reflects a real debate in AI safety about which bets matter most as systems approach superintelligence.

Source: https://www.lesswrong.com/posts/dmHbogCFbSp95J3Lz/i-think...
