Reinforcement Learning in a Cached Internet Will Give Us a Superhuman Forecaster

According to a post on LessWrong, researcher Amit Levy developed a method for training AI models into superhuman forecasters with reinforcement learning. The key innovation is a 'cached internet' environment in which models access historical data through tools like the Wayback Machine and time-masked APIs, simulating the information that was available when a question was originally asked. Levy trained a moderately sized open-weights model and found that it outperformed much larger closed-source models at predicting real-world outcomes. In actual Metaculus forecasting competitions, the RL-trained model won hundreds of dollars. The post suggests that superhuman forecasting might be more valuable than superhuman coding, since it would help organizations and individuals make better decisions by predicting outcomes with near-oracle accuracy.
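To make the time-masking idea concrete, here is a minimal sketch of a search tool that hides anything published after a question's date. It is an illustration only, not the method from the post: the `Document` type, the `time_masked_search` function, the in-memory corpus, and the example URLs are all hypothetical, and a real environment would presumably sit in front of archived web pages rather than a Python list.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    """Hypothetical archived page with a known publication date."""
    url: str
    published: date
    text: str


def time_masked_search(corpus, query, as_of):
    """Return documents that match `query` and were published on or before `as_of`.

    Anything dated after `as_of` is hidden, so the caller only sees
    information that existed when the question was asked.
    """
    needle = query.lower()
    return [
        doc for doc in corpus
        if doc.published <= as_of and needle in doc.text.lower()
    ]


# Toy corpus (invented data): one page before the question date, one after.
corpus = [
    Document("https://example.com/a", date(2024, 1, 10), "Poll shows candidate A ahead"),
    Document("https://example.com/b", date(2024, 3, 5), "Candidate A wins the election"),
]

# A question asked on 2024-02-01 must not see the page that reveals the outcome.
visible = time_masked_search(corpus, "candidate a", as_of=date(2024, 2, 1))
```

With the cutoff set to the question date, only the earlier page is returned. This is what would allow a model to be rewarded against outcomes that are already known without leaking them into its context.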

Source: https://www.lesswrong.com/posts/pQLQ5GMjQP7qKb7HS/reinfor...
