When Gemma Thinks About Resources - it Fails: a Behavioral Experiment

According to a behavioral experiment published on LessWrong, Google's Gemma model exhibits a paradoxical failure when solving cybersecurity challenges under step constraints. In testing whether awareness of remaining steps would improve performance on capture-the-flag problems, the solve rates barely shifted: baseline sixty-seven point six percent versus step-aware sixty-five point five percent—statistically insignificant. But the logs reveal something striking. Whenever Gemma explicitly reasoned about its remaining steps—"I have two steps left," "running out of time"—it almost invariably failed. Rather than adapt its strategy, the model would acknowledge the pressure and then form new hypotheses it had no time to test, wasting precious moves. The finding raises a critical question for AI deployment: if models can articulate resource constraints but don't actually change their behavior to respect them, what happens when these systems operate at scale?

Source: https://www.lesswrong.com/posts/9Gj9anZi95RgHQPv6/when-ge...

Listen to this story

Hear this and more stories in a personalized audio briefing.

Open The Chonkerton