Just a Wrapper? How Much Do Scaffolds Matter?

According to a new research paper on LessWrong by Hans Gundlach and colleagues, the software scaffolding surrounding an AI model might matter more than the model itself. Scaffolding is the software environment and context that wraps an AI system: tool access, prompting strategies, memory management, and the way the model loops through thinking and acting. The researchers analyzed data from the Holistic Agent Leaderboard and found that scaffolds can produce large performance swings: the same model running under different scaffolds showed up to a hundredfold difference in task performance. More striking still, the data suggests that scaffolds explain more of the variation in AI price-performance than the underlying models do. The researchers note that this has major implications for how AI systems are evaluated and how progress in the field is understood. They point to earlier work showing that some scaffolding techniques, such as the LATS agent framework, can yield performance improvements equivalent to a tenfold to hundredfold increase in training compute. The catch is that scaffolds are not universal: the same framework might dramatically help one model while barely affecting another. That variation, the researchers speculate, could be driving consolidation in the AI industry as companies work out which scaffolds pair best with which models.

Source: https://www.lesswrong.com/posts/jXLi3dhSpSMd7B6z8/just-a-...
