LM2: Large Memory Models
A novel Large Memory Model (LM2) architecture enhances Transformers with an auxiliary memory module, significantly outperforming existing models in multi-hop inference and numerical reasoning tasks. The model demonstrates a 37.1% improvement over RMT and 86.3% over Llama-3.2 on the BABILong benchmark while maintaining strong performance on general tasks.