2025-02-05

LIMO: Less is More for Reasoning

LIMO challenges conventional wisdom by achieving superior mathematical reasoning capabilities using only 817 training samples, outperforming models trained on 100x more data. The research introduces the Less-Is-More Reasoning Hypothesis, suggesting that complex reasoning can emerge through minimal but precise demonstrations when domain knowledge is well-encoded during pre-training.

read comments on news aggregators:

https://news.ycombinator.com/item?id=42991676

GitHub - salesforce/Merlion: Merlion: A Machine Learning Framework for Time Series Intelligence

Merlion is a comprehensive Python library for time series intelligence, offering end-to-end machine learning capabilities for forecasting, anomaly detection, and change point detection. The library features standardized data loading, diverse models, AutoML capabilities, and practical post-processing rules, while supporting both univariate and multivariate analysis with distributed computation via PySpark.

GitHub - deepseek-ai/DualPipe: A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

DualPipe is a bidirectional pipeline parallelism algorithm that optimizes computation-communication overlap in neural networks by achieving full overlap of forward and backward phases. The solution, presented in the DeepSeek-V3 Technical Report, reduces pipeline bubbles and requires implementation of custom overlapped forward-backward methods for specific modules.

The FFT Strikes Back: An Efficient Alternative to Self-Attention

FFTNet introduces a novel approach to sequence processing using Fast Fourier Transform, achieving O(n log n) complexity compared to traditional self-attention's quadratic complexity. The framework employs spectral filtering and modReLU activation to efficiently capture long-range dependencies, demonstrating superior performance on Long Range Arena and ImageNet benchmarks.
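As a rough illustration of the idea summarized above, the sketch below mixes a sequence in the frequency domain: transform with an FFT, multiply by a per-frequency filter, apply modReLU (which thresholds the magnitude of each complex value while keeping its phase), and transform back. It is a minimal stdlib-only sketch, not the paper's implementation. The function names (`fft`, `ifft`, `modrelu`, `spectral_mix`), the fixed filter, the scalar bias, and the single-channel, power-of-two-length input are all assumptions made for illustration.

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(spectrum):
    """Inverse FFT via the conjugation identity: ifft(X) = conj(fft(conj(X))) / n."""
    n = len(spectrum)
    transformed = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in transformed]


def modrelu(z, bias):
    """modReLU: ReLU(|z| + bias) * z / |z|, so the phase of z is preserved."""
    magnitude = abs(z)
    if magnitude == 0.0:
        return 0j
    return z * (max(magnitude + bias, 0.0) / magnitude)


def spectral_mix(tokens, freq_filter, bias):
    """Hypothetical token-mixing step: FFT -> filter -> modReLU -> inverse FFT.

    tokens: real-valued sequence (one feature channel), power-of-two length.
    freq_filter: one multiplier per frequency bin (assumed fixed here; a real
        model would presumably learn or compute it).
    Only the real part of the result is returned.
    """
    spectrum = fft([complex(t) for t in tokens])
    filtered = [modrelu(s * w, bias) for s, w in zip(spectrum, freq_filter)]
    return [v.real for v in ifft(filtered)]
```

Each FFT costs O(n log n) and the filter and activation are elementwise, which is where the complexity advantage over pairwise attention comes from. With an all-ones filter and a bias of zero, `spectral_mix` returns its input unchanged up to floating-point error, which makes a convenient sanity check.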

Introducing DeepSearcher: A Local Open Source Deep Research

DeepSearcher is an open-source research agent that builds upon previous work by adding features like conditional execution flow, query routing, and improved interfaces. The system leverages SambaNova's custom hardware for faster inference with the DeepSeek-R1 model, demonstrating advanced concepts in AI research automation through a four-step process of question definition, research, analysis, and synthesis.

Google Co-Scientist AI cracks superbug problem in two days! — because it had been fed the team’s previous paper with the answer in it

Google's Co-Scientist AI tool, powered by the Gemini LLM, made headlines for supposedly solving a superbug problem in 48 hours, but it was later revealed that the solution came from the research team's own previously published paper, which had been included in the tool's input. The article argues that Google's other AI research claims, including those in drug discovery and materials synthesis, show a similar pattern of overstated achievements.

Claude 3.7 Sonnet and Claude Code

Anthropic introduces Claude 3.7 Sonnet, a groundbreaking hybrid reasoning model featuring instant responses and extended thinking capabilities, alongside Claude Code for agentic coding tasks. The model demonstrates superior performance in coding and web development, with significant improvements in handling complex codebases and advanced tool usage. Available across multiple platforms, it maintains the same pricing while offering enhanced reasoning capabilities and GitHub integration.

Please Commit More Blatant Academic Fraud

A critical analysis of academic fraud in AI research argues that explicit fraud could paradoxically improve scientific standards by forcing greater scrutiny and skepticism. The author suggests that prevalent subtle fraud has become normalized in academia, leading to widespread publication of papers without scientific merit. The piece advocates for intentional academic misconduct as a way to expose and ultimately reform the field's compromised research practices.

The most underreported and important story in AI right now is that pure scaling has failed to produce AGI

Recent developments suggest that the scaling hypothesis in AI, the idea that investing massive resources in data and GPUs will lead to artificial general intelligence, is running into significant limits. Major tech companies and investors are acknowledging diminishing returns from pure scaling, while persistent problems such as hallucinations and unreliability remain unsolved. The author expects a market correction as the industry grapples with sustainability concerns and the need for new approaches.

Introduction to CUDA Programming for Python Developers

GPU architecture enables massively parallel processing through thousands of CUDA cores, in contrast to a CPU's largely sequential processing. CUDA gives developers a platform for harnessing that parallelism through kernel functions and thread management. The document covers memory management, shared memory optimization, and practical applications in LLM workloads such as FlashAttention.
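To make the kernel and thread-indexing idea concrete, here is a minimal pure-Python sketch that simulates how a CUDA launch maps threads to array elements. It runs sequentially on the CPU and uses no GPU library; `launch`, `vector_add`, and `add_on_simulated_gpu` are hypothetical names chosen for illustration, and a real kernel would be written in CUDA C++ or with a Python GPU library instead.

```python
def launch(kernel, grid_dim, block_dim, *args):
    """Simulate a 1-D kernel launch: call the kernel once per (block, thread) pair.

    On a real GPU these calls run in parallel; here they run one after another.
    """
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


def vector_add(block_idx, block_dim, thread_idx, a, b, out):
    """Kernel body: each simulated thread handles at most one element."""
    # Same formula as blockIdx.x * blockDim.x + threadIdx.x in CUDA.
    i = block_idx * block_dim + thread_idx
    # Bounds guard: the last block may contain threads past the end of the data.
    if i < len(out):
        out[i] = a[i] + b[i]


def add_on_simulated_gpu(a, b, block_dim=4):
    """Pick enough blocks to cover every element, then launch the kernel."""
    n = len(a)
    out = [0] * n
    grid_dim = (n + block_dim - 1) // block_dim  # ceiling division
    launch(vector_add, grid_dim, block_dim, a, b, out)
    return out
```

Calling `add_on_simulated_gpu([1, 2, 3, 4, 5], [10, 20, 30, 40, 50])` uses two blocks of four threads; the three surplus threads in the second block fail the bounds check and do nothing, which is the same pattern a real kernel relies on when the data size is not a multiple of the block size.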

Open Euro LLM

OpenEuroLLM represents a collaborative European initiative to develop transparent, compliant foundation models for AI, focusing on EU languages and cultural diversity. The project aims to create accessible, open-source language models while ensuring compliance with EU regulations and AI standards.
