Debugging

3,200% CPU Utilization

An in-depth analysis of a critical Java performance issue where unprotected concurrent TreeMap modifications led to 3,200% CPU utilization. The investigation revealed how thread interleaving can create infinite loops in red-black trees, with experiments across multiple programming languages demonstrating similar vulnerabilities.
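As a minimal sketch of the usual remedies (not taken from the article itself), the example below has several writer threads insert into a sorted map that is safe for concurrent use: either a `ConcurrentSkipListMap`, or a `TreeMap` wrapped with `Collections.synchronizedSortedMap`. The class name `SafeSortedMapDemo`, the helper `fill`, and the thread and key counts are hypothetical names and values chosen for illustration.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SafeSortedMapDemo {
    // Each of `threads` writers inserts `perThread` distinct keys into `map`.
    // Returns the final map size once every writer has finished.
    public static int fill(SortedMap<Integer, Integer> map, int threads, int perThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread; // disjoint key range per writer
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                pool.shutdownNow();
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        // Option 1: a sorted map designed for concurrent access.
        System.out.println(fill(new ConcurrentSkipListMap<Integer, Integer>(), 8, 10_000));
        // Option 2: a plain TreeMap with every call serialized behind one lock.
        System.out.println(fill(
                Collections.synchronizedSortedMap(new TreeMap<Integer, Integer>()), 8, 10_000));
    }
}
```

Passing a bare `new TreeMap<>()` to `fill` instead is the unprotected pattern the article describes: the writers can corrupt the tree's internal links, and the run may lose entries, throw, or never terminate.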

Tokio + prctl = nasty bug

A detailed analysis of a bug in HyperQueue where tasks were unexpectedly terminated after 10 seconds due to an interaction between Tokio's thread management, PR_SET_PDEATHSIG, and a process-spawning optimization. The bug emerged when process spawning was moved to a worker thread: the parent-death signal is tied to the thread that created the child, not to the parent process as a whole, so the spawned processes received SIGTERM when Tokio cleaned up its idle threads.

Debugging Our New Linux Kernel

An investigation into performance issues on Ubuntu web servers, which were traced to the Linux kernel's cgroups v2 implementation and specifically to inodes being switched between cgroups after file operations. The problem manifested as elevated system CPU usage and listen overflows, degrading web server performance during the first few minutes after host deployment.

Searching for the cause of hung tasks in the Linux kernel

A detailed exploration of the Linux kernel's hung task warnings, explaining how the system identifies processes stuck in uninterruptible states and their potential impact on system performance. Through three practical examples involving the XFS filesystem, coredump processes, and RTNL mutex issues, the article demonstrates debugging approaches for various hung task scenarios.
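For a rough user-space view of the same condition, the sketch below lists processes whose state in `/proc/<pid>/stat` is currently `D` (uninterruptible sleep). It is an illustration, not the kernel's detector: it takes a single snapshot, looks only at each process's main thread, and does not measure how long a task has been blocked. The class name `DStateScan` and the helper `parseState` are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DStateScan {
    // Returns the state character from one /proc/<pid>/stat line.
    // The command name is wrapped in parentheses and may itself contain
    // spaces or ')', so the state is read just after the LAST ')'.
    public static char parseState(String statLine) {
        int close = statLine.lastIndexOf(')');
        if (close < 0 || close + 2 >= statLine.length()) {
            throw new IllegalArgumentException("unexpected stat format");
        }
        return statLine.charAt(close + 2);
    }

    public static void main(String[] args) throws IOException {
        // Numeric directory names under /proc are process IDs (Linux only).
        try (DirectoryStream<Path> procs =
                     Files.newDirectoryStream(Paths.get("/proc"), "[0-9]*")) {
            for (Path p : procs) {
                try {
                    String line = new String(
                            Files.readAllBytes(p.resolve("stat")), StandardCharsets.UTF_8);
                    if (parseState(line) == 'D') {
                        String comm = line.substring(
                                line.indexOf('('), line.lastIndexOf(')') + 1);
                        System.out.println(p.getFileName() + " " + comm);
                    }
                } catch (IOException | IllegalArgumentException e) {
                    // The process exited between listing and reading; skip it.
                }
            }
        }
    }
}
```

A task that shows up in `D` state only briefly is normal, since ordinary disk I/O passes through it. The kernel warning fires only when a task stays blocked beyond the configured timeout, so repeated samples of the same PID are a closer approximation than a single run.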