LLMs Can Teach Themselves to Better Predict the Future
New research demonstrates how large language models can improve their forecasting abilities through self-play and outcome-driven fine-tuning, achieving 7-10% better prediction accuracy without relying on human-curated reasoning samples. The approach brings smaller models (Phi-4 14B and DeepSeek-R1 14B) to forecasting performance comparable to the much larger GPT-4o.
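To make the self-play and outcome-driven idea concrete, here is a minimal Python sketch of one plausible data-construction step: sample several independent reasoning traces and probability forecasts per question, score them against the resolved outcome with a Brier score, and pair better traces against worse ones as preference data for a DPO-style fine-tune. The helper names (`self_play`, `build_preference_pairs`), the stand-in sampler, and the pairing heuristic are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch, assuming a hypothetical sampler in place of a real LLM call;
# illustrates outcome-ranked preference pairs, not the authors' code.
from dataclasses import dataclass
from typing import Callable
import random


@dataclass
class Forecast:
    question: str
    reasoning: str
    probability: float  # predicted probability that the event resolves "yes"


def brier_score(probability: float, outcome: int) -> float:
    """Squared error between the predicted probability and the 0/1 outcome (lower is better)."""
    return (probability - outcome) ** 2


def self_play(question: str, sample_fn: Callable[[str], Forecast], n: int = 8) -> list[Forecast]:
    """Sample several independent reasoning traces and forecasts for one question."""
    return [sample_fn(question) for _ in range(n)]


def build_preference_pairs(forecasts: list[Forecast], outcome: int) -> list[tuple[Forecast, Forecast]]:
    """Rank self-play forecasts by Brier score against the resolved outcome and
    pair the best half against the worst half as (chosen, rejected) examples."""
    ranked = sorted(forecasts, key=lambda f: brier_score(f.probability, outcome))
    half = len(ranked) // 2
    return list(zip(ranked[:half], ranked[::-1][:half]))


# Toy usage: the fake sampler stands in for sampling from the model being fine-tuned.
def fake_sampler(question: str) -> Forecast:
    p = random.random()
    return Forecast(question, reasoning=f"(sampled reasoning, p={p:.2f})", probability=p)


pairs = build_preference_pairs(self_play("Will X happen by 2026?", fake_sampler), outcome=1)
for chosen, rejected in pairs:
    print(f"chosen p={chosen.probability:.2f}  rejected p={rejected.probability:.2f}")
```

The resulting (chosen, rejected) pairs would then feed a standard preference-optimization fine-tune; the key point is that the preference signal comes from resolved real-world outcomes rather than human-labeled reasoning.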