Distributed Systems

GitHub - deepseek-ai/3FS: A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

Fire-Flyer File System (3FS) is a high-performance distributed storage system optimized for AI workloads, with strong consistency and a disaggregated architecture. In a read stress test on a cluster of 180 storage nodes, it reached an aggregate read throughput of about 6.6 TiB/s. It supports workloads ranging from data preparation to inference caching.

Distributed Systems Programming Has Stalled

An analysis of distributed systems programming models identifies three current paradigms (external-distribution, static-location, and arbitrary-location) and the limitations of each. Although distributed systems themselves have advanced over the last decade, the programming models have not fundamentally improved, so developers still face the same challenges with concurrency, fault tolerance, and versioning.

Alex's blog

A technical analysis of Kafka's limitations as a job queue shows that jobs can be distributed unfairly among workers, especially at low volumes. A worst-case formula shows how unevenly jobs can be spread, which leads to inefficient resource utilization. Until Kafka implements KIP-932 (Queues for Kafka), a traditional message broker may be better suited to low-volume job queuing.
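
To make the unfairness in the summary above concrete, here is a minimal Python sketch. It is not the article's formula and it does not use any Kafka client. It assumes a simplified model: jobs are produced round-robin across partitions, and each worker owns a contiguous block of partitions (a range-style assignment). The function names `range_assign` and `jobs_per_worker` are hypothetical, chosen only for illustration.

```python
def range_assign(num_partitions, num_workers):
    """Return a list mapping partition index -> worker index.

    Assumption: partitions are handed out in contiguous blocks,
    with any remainder going to the lowest-numbered workers.
    """
    base, extra = divmod(num_partitions, num_workers)
    owner = []
    for worker in range(num_workers):
        block_size = base + (1 if worker < extra else 0)
        owner.extend([worker] * block_size)
    return owner


def jobs_per_worker(num_jobs, num_partitions, num_workers):
    """Count how many jobs each worker receives.

    Assumption: job i is written to partition i % num_partitions
    (round-robin produce), and a job can only be processed by the
    worker that owns its partition.
    """
    owner = range_assign(num_partitions, num_workers)
    counts = [0] * num_workers
    for job in range(num_jobs):
        partition = job % num_partitions
        counts[owner[partition]] += 1
    return counts


if __name__ == "__main__":
    # Low volume: 3 jobs, 12 partitions, 4 workers.
    print(jobs_per_worker(3, 12, 4))
    # Higher volume: 12 jobs over the same topology.
    print(jobs_per_worker(12, 12, 4))
```

Under these assumptions, three jobs on a 12-partition topic with four workers all land on the first worker while the other three sit idle; the load only evens out once there are at least as many jobs as partitions. A broker that hands each job to the next free worker does not have this coupling between partition ownership and job placement.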

A Descent Into the Vᴏ̈ʀᴛᴇx | TigerBeetle Blog

TigerBeetle has introduced Vörtex, a non-deterministic testing harness that complements their existing Deterministic Simulation Testing by exercising compiled binaries and client libraries under real-world conditions. A supervisor process manages the system under test, injecting network faults and simulating process failures. In its first four months of operation, Vörtex has already uncovered several significant bugs.