Performance

The First Wasm_of_ocaml Release is Out!

Wasm_of_ocaml, a fork of the Js_of_ocaml compiler that translates OCaml bytecode to WebAssembly, has released its first feature-complete version, 6.0.1. The compiler outperforms Js_of_ocaml while remaining compatible with it, showing 2x-8x improvements in benchmarks and leveraging WasmGC for better JavaScript interoperability.

When Imperfect Systems are Good, Actually: Bluesky’s Lossy Timelines

Bluesky implemented a 'Lossy Timelines' system to improve performance by intentionally dropping some timeline updates for users who follow many accounts. This solution reduced fanout latency by 96% and eliminated hot shard issues in their database clusters. The approach demonstrates how embracing imperfection in system design can lead to better scalability and performance.
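The idea of dropping a share of timeline writes for heavy followers can be sketched as a probabilistic fanout. This is a minimal illustration, not Bluesky's implementation: the function names, the `reasonable_limit` default of 2000, and the keep-probability rule `min(1, reasonable_limit / num_follows)` are all assumptions made for the example.

```python
import random

def keep_probability(num_follows: int, reasonable_limit: int = 2000) -> float:
    """Chance that one fanout write is kept for a user following num_follows accounts.

    Hypothetical rule: users at or below the limit keep every write; above it,
    the kept share shrinks in proportion to how far over the limit they are.
    """
    if num_follows <= 0:
        return 1.0
    return min(1.0, reasonable_limit / num_follows)

def fanout(post_id, followers, follow_counts, timelines, rng=random.random):
    """Write post_id to each follower's timeline, dropping some writes at random."""
    for user in followers:
        if rng() < keep_probability(follow_counts[user]):
            timelines.setdefault(user, []).append(post_id)
    return timelines
```

Under this rule a user following 4,000 accounts would receive roughly half of the writes, which caps the work any single heavy follower adds to a fanout.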

Alex's blog

A technical analysis reveals Kafka's limitations as a job queue, highlighting potential unfairness in job distribution among workers, especially at low volumes. The worst-case scenario formula shows how jobs can be unevenly distributed, leading to inefficient resource utilization. Traditional message brokers may be more suitable for low-volume job queuing until Kafka implements KIP-932.
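The unfairness comes from partitions being bound to specific consumers: a job can only be picked up by the worker that owns its partition, even when other workers are idle. The sketch below is a simplified model with hypothetical helper names and a plain round-robin assignment; real Kafka assignors behave differently in detail, and the article's worst-case formula is not reproduced here.

```python
def assign_partitions(num_partitions, workers):
    """Simplified static assignment: partition p goes to workers[p % len(workers)]."""
    return {p: workers[p % len(workers)] for p in range(num_partitions)}

def jobs_per_worker(job_partitions, assignment, workers):
    """Count how many queued jobs each worker is responsible for."""
    counts = {w: 0 for w in workers}
    for p in job_partitions:
        counts[assignment[p]] += 1
    return counts

workers = ["a", "b"]
assignment = assign_partitions(4, workers)

# Four jobs that all happen to land on partitions owned by worker "a".
unlucky = jobs_per_worker([0, 2, 0, 2], assignment, workers)

# The same number of jobs spread evenly across partitions.
even = jobs_per_worker([0, 1, 2, 3], assignment, workers)
```

In the unlucky case worker "a" holds all four jobs while "b" sits idle; at high volume the per-partition load evens out, which is why the effect matters mostly at low volumes.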

Go 1.24 Release Notes

Go 1.24 introduces runtime performance improvements, including a new Swiss Tables-based map implementation and more efficient memory allocation, that reduce CPU overhead by 2-3% on average. The release adds support for ML-KEM post-quantum cryptography, FIPS 140-3 compliance mechanisms, and new testing tools for concurrent code.

0+0 > 0: C++ thread-local storage performance

An in-depth analysis of thread-local storage (TLS) performance in C++, examining how different implementations and contexts affect access speed. Core findings show that TLS access is fastest in executables without constructors, while shared libraries and constructors significantly degrade performance due to complex initialization and addressing mechanisms.

searchcode.com’s SQLite database is probably 6 terabytes bigger than yours 2025/02/16 (1949 words)

A developer details the migration of searchcode.com's database from MySQL to SQLite, resulting in what might be the world's largest SQLite database at 6.4TB. The migration involved implementing BTRFS compression, upgrading to a powerful server with an Intel Xeon CPU, and successfully maintaining performance across all operations.

Caddy - The Ultimate Server with Automatic HTTPS

Caddy is an advanced HTTPS server featuring automatic TLS certificate management, a RESTful config API, and compliance with PCI, HIPAA, and NIST standards. The server offers robust PKI capabilities, dynamic backend support, and extensive PHP optimization through FrankenPHP, making it a comprehensive solution for modern web hosting needs.

Rust is Eating JavaScript | Lee Robinson

Rust, the programming language created at Mozilla, is increasingly being adopted to optimize JavaScript tooling, offering significant performance improvements in areas like minification, transpilation, and bundling. Major tech companies and open-source projects are leveraging Rust's memory efficiency and speed to enhance developer tools, with projects like SWC showing 3-5x performance gains.

Debugging Our New Linux Kernel

An investigation revealed performance issues on Ubuntu web servers caused by the Linux kernel's cgroups v2 implementation, specifically inode switching between cgroups after file operations. The problem manifested as elevated system CPU usage and listen overflows, degrading web server performance during the first few minutes after host deployment.

Searching for the cause of hung tasks in the Linux kernel

A detailed exploration of the Linux kernel's hung task warnings, explaining how the system identifies processes stuck in uninterruptible states and their potential impact on system performance. Through three practical examples involving the XFS filesystem, coredump processes, and RTNL mutex issues, the article demonstrates debugging approaches for various hung task scenarios.

Tiny JITs for a Faster FFI

An exploration of speeding up Ruby's Foreign Function Interface (FFI) through JIT compilation. Using a proof-of-concept called FJIT, the author achieves performance comparable to C extensions while maintaining Ruby-centric development practices, with benchmarks indicating more than a 2x speed improvement over conventional FFI calls.

GitHub - nexsol-technologies/pgassistant: PgAssistant is an open-source tool designed to help developers understand and optimize their PostgreSQL database performance.

PgAssistant is an open-source tool that helps developers analyze and optimize PostgreSQL database performance through features like schema optimization, query management, and AI-powered assistance. The tool integrates with OpenAI and local LLMs for query optimization while offering practical features like SQL linting, DDL generation, and PGTune integration.

The Go Programming Language

Go 1.24 introduces generic type aliases, performance optimizations that reduce CPU overhead by 2-3%, and enhanced WebAssembly support. The release features a new Swiss Tables-based map implementation, improved tracking of tool dependencies, and new mechanisms for FIPS 140-3 compliance.

Intel’s Battlemage Architecture

Intel's new Battlemage architecture powers the Arc B580 GPU, offering improved performance over its Alchemist predecessor despite fewer cores and a narrower memory bus, targeting the midrange market at $250 with 12GB VRAM. The architecture features significant improvements in compute utilization, cache latency, and memory handling, while maintaining Intel's unique approach to GPU design distinct from AMD and Nvidia.