Approaching Human-Level Forecasting with Language Models


Details
Can AI forecast geopolitical events and other major events that could have a huge impact on the world? Researchers from UC Berkeley set out to answer this question.
They built a language model (LM) pipeline for automated forecasting. Given a question about a future event, the system retrieves and summarizes relevant articles, reasons about them, and predicts the probability that the event will occur. In some cases, its forecasts even surpassed human predictions.
In this talk, our guest, Danny Halawi, will share the technical details behind this work.
He and his collaborators developed a retrieval-augmented LM system designed to automatically search for relevant information, generate forecasts, and aggregate predictions.
They collected a large dataset of questions from competitive forecasting platforms. On a test set of questions published after the knowledge cut-offs of the LMs, they evaluated the end-to-end performance of the system against aggregates of human forecasts.
On average, the system nears the crowd aggregate of competitive forecasters, and in some settings surpasses it.
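For readers who want a concrete picture of what comparing a system against a crowd aggregate can look like, here is a minimal sketch. This description does not say how predictions are aggregated or scored, so the plain mean used for aggregation and the Brier score used for evaluation are assumptions made for illustration, and every number below is made up.

```python
from statistics import mean

def aggregate(probabilities):
    """Combine several forecasts for one question into one probability.
    A plain mean is an assumption; other aggregation rules are possible."""
    return mean(probabilities)

def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and binary
    outcomes (1 = event happened, 0 = it did not). Lower is better."""
    return mean((p - o) ** 2 for p, o in zip(forecasts, outcomes))

# Hypothetical data: three questions, three sampled system forecasts each.
system_samples = [[0.7, 0.8, 0.6], [0.2, 0.1, 0.3], [0.5, 0.6, 0.4]]
crowd = [0.8, 0.2, 0.4]   # hypothetical crowd aggregate per question
outcomes = [1, 0, 1]      # hypothetical resolutions

system = [aggregate(samples) for samples in system_samples]
print("system Brier score:", brier_score(system, outcomes))
print("crowd Brier score: ", brier_score(crowd, outcomes))
```

Scoring both sets of forecasts on the same resolved questions is what makes the comparison meaningful; restricting those questions to ones published after the models' knowledge cut-offs keeps the answers out of the training data.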
