Optimizing Linux Pipe Throughput

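A Linux pipe is a kernel ring buffer of 4 KiB pages (16 of them by default, 64 KiB in total). With the standard read/write syscalls, every byte is copied twice, from user space into the kernel on write and back out on read, and the two ends contend for the pipe's lock.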
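As a rough illustration of that baseline, the sketch below queries a pipe's capacity and pushes a small message through it with plain write() and read(). The helper names pipe_capacity and copy_through_pipe are made up for this example, and the 4096-byte limit is an assumption chosen so that a single-threaded caller can never block.

```c
#define _GNU_SOURCE /* for F_GETPIPE_SZ */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: capacity in bytes of a freshly created pipe,
 * or -1 on error. Typically 65536 (16 pages of 4 KiB). */
long pipe_capacity(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    long cap = fcntl(fds[0], F_GETPIPE_SZ);
    close(fds[0]);
    close(fds[1]);
    return cap;
}

/* Hypothetical helper: sends len bytes through a pipe the ordinary way.
 * write() copies user -> kernel, read() copies kernel -> user, so the
 * payload is copied twice. Returns bytes received, or -1 on error. */
ssize_t copy_through_pipe(const void *src, void *dst, size_t len)
{
    /* Assumption: keep the payload within one page so the write
     * cannot block when nobody is reading concurrently. */
    assert(len > 0 && len <= 4096);

    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    ssize_t got = -1;
    if (write(fds[1], src, len) == (ssize_t)len) /* copy 1 */
        got = read(fds[0], dst, len);            /* copy 2 */

    close(fds[0]);
    close(fds[1]);
    return got;
}
```

A caller might print pipe_capacity() to confirm the 64 KiB default on its system; the value can differ if the capacity was changed with F_SETPIPE_SZ or if per-user pipe limits apply.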

Profiling with perf shows that most of the time is spent in pipe_write and the work beneath it: copying pages, allocating them, and synchronizing access to the pipe.

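The splice and vmsplice syscalls enable zero-copy transfers: vmsplice inserts existing user pages directly into the pipe buffer, and splice moves pages between a pipe and another file descriptor without copying them through user space.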
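A minimal sketch of the two calls working together, assuming /dev/null as the destination and a hypothetical helper name vmsplice_to_devnull: the writer side hands its buffer to the pipe with vmsplice, and the reader side discards the pipe's contents with splice, so neither side goes through read() or write().

```c
#define _GNU_SOURCE /* for vmsplice and splice */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: queues up to len bytes of a user buffer into a
 * pipe with vmsplice, then drains the pipe into /dev/null with splice.
 * Returns the number of bytes drained, or -1 on error. */
ssize_t vmsplice_to_devnull(size_t len)
{
    assert(len > 0);

    int fds[2];
    void *buf = NULL;
    ssize_t drained = -1;

    if (pipe(fds) != 0)
        return -1;

    int null_fd = open("/dev/null", O_WRONLY);
    if (null_fd >= 0 && posix_memalign(&buf, 4096, len) == 0) {
        memset(buf, 'x', len);

        /* One call only: it queues whatever fits in the pipe and
         * returns that count, so this single-threaded demo never
         * waits for a reader. */
        struct iovec iov = { .iov_base = buf, .iov_len = len };
        ssize_t queued = vmsplice(fds[1], &iov, 1, 0);

        if (queued > 0) {
            drained = 0;
            while (drained < queued) {
                ssize_t n = splice(fds[0], NULL, null_fd, NULL,
                                   (size_t)(queued - drained), 0);
                if (n <= 0) {
                    drained = -1;
                    break;
                }
                drained += n;
            }
        }
    }

    /* The pipe references the buffer's pages until they are drained,
     * so the buffer is released only after the splice loop. */
    free(buf);
    if (null_fd >= 0)
        close(null_fd);
    close(fds[0]);
    close(fds[1]);
    return drained;
}
```

The point to note is the buffer lifetime: because the pipe holds references to the user pages rather than a copy, the writer must not reuse or free the memory until the reader has consumed it.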

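Backing the buffer with 2 MiB huge pages, requested from the kernel via madvise, reduces get_user_pages overhead: one huge page covers the same memory as 512 separate 4 KiB pages, so the kernel has far fewer pages to look up when mapping the buffer into the pipe.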
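One way this could look, assuming transparent huge pages are available and using a hypothetical helper alloc_huge_buffer: the buffer is aligned to 2 MiB and the kernel is asked, as a hint only, to back it with huge pages.

```c
#define _GNU_SOURCE /* for MADV_HUGEPAGE and posix_memalign */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

#define HUGE_PAGE_SIZE (2UL << 20) /* 2 MiB = 512 pages of 4 KiB */

/* Hypothetical helper: returns a 2 MiB-aligned buffer of len bytes and
 * asks the kernel to back it with transparent huge pages.
 * Returns NULL if the allocation fails. Release with free(). */
void *alloc_huge_buffer(size_t len)
{
    /* Assumption: callers request whole huge pages. */
    assert(len > 0 && len % HUGE_PAGE_SIZE == 0);

    void *buf = NULL;
    if (posix_memalign(&buf, HUGE_PAGE_SIZE, len) != 0)
        return NULL;

#ifdef MADV_HUGEPAGE
    /* Best-effort hint: the call can fail or be ignored when
     * transparent huge pages are disabled, and the buffer is still
     * usable with ordinary 4 KiB pages in that case. */
    (void)madvise(buf, len, MADV_HUGEPAGE);
#endif

    /* Touch the memory so pages are faulted in before the hot loop. */
    memset(buf, 0, len);
    return buf;
}
```

Whether the kernel actually used huge pages can be checked on a running system through the AnonHugePages field in /proc/self/smaps; the sketch itself does not verify it.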

Using non-blocking splice/vmsplice in a busy loop avoids blocking and waking overhead at the cost of higher CPU usage.
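The busy loop might be sketched as follows for the writer side, with vmsplice_spin as a hypothetical helper name. SPLICE_F_NONBLOCK makes vmsplice return EAGAIN instead of sleeping when the pipe is full, and the loop simply retries.

```c
#define _GNU_SOURCE /* for vmsplice and SPLICE_F_NONBLOCK */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: pushes all len bytes of buf into the pipe's
 * write end without ever sleeping in the kernel. Returns len on
 * success or -1 on a real error. It spins forever if the pipe stays
 * full, so a concurrent reader is assumed for large payloads. */
ssize_t vmsplice_spin(int pipe_wr, void *buf, size_t len)
{
    assert(buf != NULL);

    char *pos = buf;
    size_t left = len;

    while (left > 0) {
        struct iovec iov = { .iov_base = pos, .iov_len = left };
        ssize_t n = vmsplice(pipe_wr, &iov, 1, SPLICE_F_NONBLOCK);
        if (n < 0) {
            if (errno == EAGAIN || errno == EINTR)
                continue; /* pipe full or interrupted: retry at once */
            return -1;
        }
        pos += n; /* partial transfer: advance and keep going */
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

The trade-off is visible in the continue branch: the thread burns a full core while waiting, which only pays off when the other end of the pipe is keeping pace.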

Combining these optimizations increases throughput from about 3.5 GiB/s to over 60 GiB/s.

This improvement path illustrates profiling-driven optimization, zero-copy IO, paging concepts, and trade-offs in high-performance code.
