Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

A language model architecture scales test-time computation by reasoning in latent space: a recurrent block is iterated to an arbitrary depth at inference time. The resulting model improves on reasoning benchmarks up to a compute load equivalent to a 50B-parameter model, without specialized training data or large context windows.
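To make the recurrent-depth idea concrete, here is a minimal toy sketch in plain Python. It is not the paper's implementation: the class name `RecurrentDepthToy`, the `prelude`/`coda` split, the tanh update, the zero initial state, and the random weights are all assumptions chosen for illustration. It only shows the general pattern of embedding the input once, applying the same block a variable number of times to a latent state, and then reading out a result.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng, scale):
    """Build a rows x cols matrix with entries in [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class RecurrentDepthToy:
    """Hypothetical toy: prelude -> shared block (iterated) -> coda."""

    def __init__(self, d_in, d_latent, d_out, seed=0):
        rng = random.Random(seed)
        self.d_latent = d_latent
        self.prelude = random_matrix(d_latent, d_in, rng, 0.5)
        # One set of weights, reused at every iteration (the "recurrent block").
        self.W_state = random_matrix(d_latent, d_latent, rng, 0.3)
        self.W_input = random_matrix(d_latent, d_latent, rng, 0.3)
        self.coda = random_matrix(d_out, d_latent, rng, 0.5)

    def forward(self, x, num_iterations):
        e = matvec(self.prelude, x)   # embed the input once
        s = [0.0] * self.d_latent     # initial latent state (assumption: zeros)
        for _ in range(num_iterations):
            # Same weights every step; the embedded input is re-injected each time.
            a = matvec(self.W_state, s)
            b = matvec(self.W_input, e)
            s = [math.tanh(ai + bi) for ai, bi in zip(a, b)]
        return matvec(self.coda, s)   # read out from the final latent state


model = RecurrentDepthToy(d_in=3, d_latent=4, d_out=2)
x = [1.0, -0.5, 0.25]
shallow = model.forward(x, num_iterations=1)
deep = model.forward(x, num_iterations=8)
```

The parameter count is the same for `shallow` and `deep`; only `num_iterations` changes, which is the sense in which compute can be scaled at test time without adding parameters.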

Understanding Reasoning LLMs

An overview of reasoning LLMs organized around four main approaches: inference-time scaling, pure reinforcement learning, supervised finetuning combined with RL, and pure supervised finetuning with distillation. The article analyzes DeepSeek R1's development pipeline and compares it with OpenAI's o1, highlighting how reasoning capabilities can emerge from different training methodologies. It also offers practical advice for developing reasoning models on a limited budget, including alternatives such as journey learning and small-scale implementations.
