NVIDIA emulation journey, part 1: RIVA 128 / NV3 architecture history and basic overview

NVIDIA's RIVA 128 (NV3), released in 1997, was the company's first commercially successful GPU. It supported DirectX 5 and competed with 3dfx's Voodoo Graphics. The architecture introduced key innovations in graphics processing and marked NVIDIA's shift from proprietary APIs to standard ones such as Direct3D, which helped establish the company in the GPU market.

Introduction to CUDA Programming for Python Developers

GPUs achieve massive parallelism by spreading work across thousands of CUDA cores, in contrast to CPUs, which are built for largely sequential processing. CUDA gives developers a platform for harnessing that parallelism through kernel functions and thread management. The article covers memory management, shared-memory optimization, and practical applications in LLM workloads such as FlashAttention.

Intel’s Battlemage Architecture

Intel's new Battlemage architecture powers the Arc B580 GPU, which outperforms its Alchemist predecessor despite having fewer cores and a narrower memory bus. It targets the midrange market at $250 with 12 GB of VRAM. The architecture brings significant improvements in compute utilization, cache latency, and memory handling, while keeping Intel's approach to GPU design distinct from AMD's and NVIDIA's.
