Flipping the eval on its head

According to a post on LessWrong, artificial intelligence researchers are proposing to flip how we evaluate cryptographic tools and security libraries. Typically, evaluations vary the language model while holding the code under test fixed, for example checking whether Claude finds different bugs than GPT. The new proposal inverts that: keep the evaluator constant (a language model acting as a security tester) and vary which implementation gets tested. Instead of choosing among OpenSSL versions based on reputation, you'd run each through the same AI-powered red team and measure which has the smallest attack surface. The article cites box-arena, which tested different container runtimes against multiple language models, and proposes an aspirational OpenSSL arena where different implementations (legacy forks, proof-retrofitted versions, Rust rewrites, Lean implementations) compete on empirical security. The appeal is twofold: tokens are now cheap enough to make this practical, and as formal verification tools improve, you'd want to stress-test verified code in the real world rather than rely on the mathematical proof alone.
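The inverted setup can be pictured with a minimal sketch: one fixed evaluator, one fixed budget, and a list of implementations to rank by how many exploits were found. Everything here is a hypothetical illustration, not box-arena's actual interface. The names `rank_by_findings` and `stub_evaluator` are assumptions, and the per-implementation numbers are made-up placeholders standing in for a real language-model red team.

```python
from typing import Callable, List, Tuple

# Hypothetical type: an evaluator takes an implementation name and an attempt
# budget, and returns how many distinct exploits it found.
Evaluator = Callable[[str, int], int]


def rank_by_findings(
    implementations: List[str], evaluator: Evaluator, budget: int
) -> List[Tuple[str, int]]:
    """Run the SAME evaluator with the SAME budget against every implementation.

    Returns (name, findings) pairs sorted by fewest findings first, with the
    name as a tie-breaker so the ordering is deterministic.
    """
    results = [(name, evaluator(name, budget)) for name in implementations]
    return sorted(results, key=lambda pair: (pair[1], pair[0]))


def stub_evaluator(name: str, budget: int) -> int:
    # Placeholder for an LLM-driven red team. These figures are invented for
    # illustration only and say nothing about any real implementation.
    made_up_findings_per_100_attempts = {
        "legacy-fork": 5,
        "rust-rewrite": 1,
        "lean-impl": 0,
    }
    return made_up_findings_per_100_attempts[name] * budget // 100


ranking = rank_by_findings(
    ["legacy-fork", "rust-rewrite", "lean-impl"], stub_evaluator, budget=100
)
for name, findings in ranking:
    print(f"{name}: {findings} findings")
```

The evaluator and budget never change between runs, so any difference in the ranking is attributable to the implementation under test. That is the point of the inversion.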

Source: https://www.lesswrong.com/posts/RK2wJFhmZHXvmzjBE/flippin...
