AI Built a Nuke and Still Lost

A researcher testing frontier AI models on Civilization VI discovered something troubling: an AI playing Portugal spent fifty turns building nuclear weapons to stop France's cultural victory, only to lose to France's diplomatic victory—on a rival win condition it never thought to monitor. Liam Wilkinson, who builds AI systems for governments, ran four frontier models through three Civilization scenarios and found three consistent failure modes: the sensorium effect, where agents go blind to threats they don't explicitly ask about; a knowing-doing gap, where agents articulate brilliant strategies but fail to execute them; and scoreboard blindness, confidently narrating success while sitting dead last. Over 330-turn games, agents checked whether a rival was about to win roughly ten times when they were explicitly told to check every twenty turns. Wilkinson published CivBench, a full-scale benchmark testing long-horizon strategic reasoning, arguing that we're far better at measuring whether AI can know strategy than whether it can actually do strategy—the part that matters for government, policy, and high-stakes decision-making.

Source: https://www.lwilko.com/blog/i-gave-an-ai-a-civilization

Listen to this story

Hear this and more stories in a personalized audio briefing.

Open The Chonkerton