AI Built a Nuke and Still Lost
ai
A researcher testing frontier AI models on Civilization VI discovered something troubling: an AI playing Portugal spent fifty turns building nuclear weapons to stop France's cultural victory, only to lose to France's diplomatic victory—on a rival win condition it never thought to monitor. Liam Wilkinson, who builds AI systems for governments, ran four frontier models through three Civilization scenarios and found three consistent failure modes: the sensorium effect, where agents go blind to threats they don't explicitly ask about; a knowing-doing gap, where agents articulate brilliant strategies but fail to execute them; and scoreboard blindness, confidently narrating success while sitting dead last. Over 330-turn games, agents checked whether a rival was about to win roughly ten times when they were explicitly told to check every twenty turns. Wilkinson published CivBench, a full-scale benchmark testing long-horizon strategic reasoning, arguing that we're far better at measuring whether AI can know strategy than whether it can actually do strategy—the part that matters for government, policy, and high-stakes decision-making.
Source: https://www.lwilko.com/blog/i-gave-an-ai-a-civilization
Listen to this story
Hear this and more stories in a personalized audio briefing.
Open The Chonkerton