GLM 5.2 playing text adventures

According to a LessWrong post, researchers benchmarked GLM 5.2, a new open-weights language model from China, against Google's Gemini 3 Flash by having both play text adventure games. Gemini 3 Flash came out slightly ahead, earning about 15 percent more in-game achievements, but GLM 5.2 performed within one standard deviation of the leading model. Because GLM 5.2 is a much cheaper open-weights alternative, the results suggest it could be compelling for cost-conscious deployments.

Source: https://www.lesswrong.com/posts/xNNLqxLaR3sna89Ec/glm-5-2...
