2025-02-09

Les enjeux de l’IA : mon interview sur France 2 et Firstpost.

Transformers' extraordinary learning capabilities allow them to master skills through simple observation of related tasks, showcasing the potential of emergent behavior in AI. Recent studies demonstrate that transformer models can learn complex skills without explicit training, revealing profound implications for future AI development and understanding.

Original archive.is archive.ph web.archive.org

Andrew Barto and Richard Sutton are the recipients of the 2024 ACM A.M. Turing Award for developing the conceptual and algorithmic foundations of reinforcement learning.

Andrew Barto and Richard Sutton received the 2024 ACM A.M. Turing Award for their pioneering work in reinforcement learning, which has become fundamental to modern AI systems. Their contributions include developing key algorithms and mathematical foundations that enabled breakthroughs like AlphaGo and ChatGPT. The award, often called the Nobel Prize in Computing, carries a $1 million prize sponsored by Google.

Writing an LLM from scratch, part 8 -- trainable self-attention

A detailed explanation of implementing trainable self-attention in LLMs, focusing on scaled dot product attention and matrix projections. The article breaks down how attention scores are calculated through query, key, and value matrices, demonstrating how five matrix multiplications can efficiently process token relationships.

Launch HN: Enhanced Radar (YC W25) – A safety net for air traffic control

Two pilots have developed Yeager, an AI-powered system that monitors air traffic control communications to enhance aviation safety by detecting potential human errors. The system achieves a 1.1% Word Error Rate in transcribing ATC audio and operates independently of existing infrastructure, providing an additional safety layer without requiring integration.

ARC-AGI Without Pretraining

A novel approach demonstrates that lossless information compression during inference time can produce intelligent behavior, achieving 34.75% accuracy on ARC-AGI training set without pretraining or extensive datasets. The method, CompressARC, processes each puzzle in 20 minutes using only compression objectives and efficient inference-time computation, challenging conventional reliance on extensive pretraining and data.

GitHub - takara-ai/go-attention: A full attention mechanism and transformer in pure go.

Frontier Research Team at takara.ai introduces a pure Go implementation of attention mechanisms and transformer layers, featuring high performance and zero dependencies. The library offers efficient dot-product attention, multi-head attention support, and complete transformer layer implementation, making it ideal for edge computing and real-time processing.

Generative AI with Stochastic Differential Equations - IAP 2025

A comprehensive MIT course on flow matching and diffusion models in generative AI, covering mathematical frameworks and practical implementations across various data modalities. Students learn to build image diffusion models from scratch while gaining expertise in stochastic differential equations, with hands-on experience through three practical labs.

Gödel's theorem debunks the most important AI myth. AI will not be conscious | Roger Penrose (Nobel)

Roger Penrose argues that AI systems are computational tools lacking true consciousness, which he believes requires non-computable physics beyond current quantum mechanics. He emphasizes that intelligence inherently involves consciousness, making the term 'artificial intelligence' misleading, and suggests that current AI systems process data without genuine understanding.

Crossing the uncanny valley of conversational voice

Sesame introduces Conversational Speech Model (CSM), advancing voice AI beyond traditional text-to-speech limitations by incorporating contextual awareness and emotional intelligence. The model operates as a single-stage system using transformers to produce more natural and coherent speech, achieving near-human performance in audio quality while still working to improve conversational dynamics.

GitHub - salesforce/Merlion: Merlion: A Machine Learning Framework for Time Series Intelligence

Merlion is a comprehensive Python library for time series intelligence, offering end-to-end machine learning capabilities for forecasting, anomaly detection, and change point detection. The library features standardized data loading, diverse models, AutoML capabilities, and practical post-processing rules, while supporting both univariate and multivariate analysis with distributed computation via PySpark.

RoboPianist: Dexterous Piano Playing with Deep Reinforcement Learning

Researchers developed a deep reinforcement learning system that trains anthropomorphic robot hands to play piano, using MuJoCo physics engine and MIDI files for simulation. The system achieves high performance by incorporating human fingering patterns and energy optimization, demonstrating significant improvements over baseline methods with an average F1 score of 0.79 across test pieces.

Related articles