Bibliography

The four reference books, and where each one extends this tutorial.

This tutorial is positioned as a runnable companion to a small set of excellent books, not a replacement for any of them. Each book has its own angle, its own depth on a specific axis, and its own place in a reading order. Below: what each book covers, when to reach for it, and a section-by-section cross-reference showing which parts of this tutorial draw on which sources.

Andrist & Sehr — C++ High Performance, 2nd Edition

Björn Andrist and Viktor Sehr, Packt, 2020. The closest analogue to this tutorial in book form. Deep on language machinery, lighter on container context — which is exactly the gap this tutorial fills.

What it covers

Performance from the language side: cache-conscious data structures, move semantics and copy elision, allocator design, parallel STL, async I/O patterns, and the “why” behind each modern C++ feature when performance is the goal. Chapters 6 (CPU and memory architecture), 7 (memory management with custom allocators), and 11 (concurrency) are particularly close to material this tutorial introduces and measures.
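As a concrete anchor for the allocator material mentioned above, here is a minimal sketch of a stack-backed `std::pmr` arena. It is not taken from the book; the function name `sum_with_arena` and the 4096-byte buffer size are assumptions chosen for illustration, and the code needs C++17.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <new>      // std::bad_alloc, thrown when the arena is exhausted
#include <vector>

// Hypothetical helper: sums 0..n-1 through a vector whose storage comes
// from a fixed stack buffer instead of the general-purpose heap.
int sum_with_arena(int n) {
    assert(n >= 0);
    std::array<std::byte, 4096> buffer;  // assumed arena size

    // null_memory_resource() as the upstream means that exhausting the
    // buffer throws std::bad_alloc rather than silently using the heap.
    std::pmr::monotonic_buffer_resource arena(
        buffer.data(), buffer.size(), std::pmr::null_memory_resource());

    std::pmr::vector<int> values(&arena);
    values.reserve(static_cast<std::size_t>(n));  // single up-front allocation
    for (int i = 0; i < n; ++i) values.push_back(i);

    int sum = 0;
    for (int v : values) sum += v;
    return sum;
}
```

With these sizes, `sum_with_arena(100)` fits comfortably in the buffer (400 bytes, assuming a 4-byte `int`). A request for 2000 elements does not fit, and `reserve` throws `std::bad_alloc` instead of reaching for the heap, which makes an undersized arena visible immediately.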

When to read it

Before this tutorial if you want the deep language story on each technique before seeing it measured in a container. After this tutorial if you want to extend any pattern (PMR, lock-free queues, parallel algorithms) into territory this tutorial only points at. The two work in either order — they answer adjacent questions.

What this tutorial points at it for

§5 (Compile-time wins) — the LTO/PGO mechanics
§6 (STL & layout) — cache-conscious container choice
§7 (Memory management) — PMR design rationale, custom allocators, the whole chapter 7
§8 (I/O latency) — async patterns, plus the coroutines this tutorial does not cover
§11 (Noisy neighbors) — the parallel STL and thread-pool material
§14 (Pitfalls) — the over-abstraction and silent-overhead patterns

Iglberger — C++ Software Design

Klaus Iglberger, O'Reilly, 2022. Design principles and patterns for high-quality software. Less about raw speed, more about what survives at scale and what doesn't.

What it covers

The classical design patterns rebuilt for modern C++ — Strategy, Visitor, Command, Bridge, Adapter — with hard-won opinions about which ones are still worth carrying and which were artifacts of pre-template C++. The book's central argument is that loose coupling through value-based polymorphism (std::variant, std::function, type erasure) beats inheritance-based polymorphism for most modern designs. The Bridge / PIMPL chapter discusses lifetime ownership in a way that connects directly to PMR allocator design.
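The value-based polymorphism argument can be sketched in a few lines. This is an illustrative example under stated assumptions, not code from the book: the `Circle` and `Square` types, the `Overloaded` helper, and the function names are all hypothetical, and the code needs C++17.

```cpp
#include <cassert>
#include <variant>
#include <vector>

// Hypothetical shape types, used only for illustration.
struct Circle { double radius; };
struct Square { double side; };

// A closed set of alternatives stored by value: no base class, no virtual
// functions, and no per-object heap allocation.
using Shape = std::variant<Circle, Square>;

// Helper that merges several lambdas into one overload set for std::visit.
template <class... Fs> struct Overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> Overloaded(Fs...) -> Overloaded<Fs...>;

constexpr double kPi = 3.141592653589793;

double area(const Shape& shape) {
    // std::visit calls the lambda that matches the active alternative.
    return std::visit(Overloaded{
        [](const Circle& c) { return kPi * c.radius * c.radius; },
        [](const Square& s) { return s.side * s.side; }
    }, shape);
}

double total_area(const std::vector<Shape>& shapes) {
    double sum = 0.0;
    for (const Shape& shape : shapes) sum += area(shape);
    return sum;
}

// Small self-check that can be called from main().
void variant_example() {
    const std::vector<Shape> shapes{Square{2.0}, Square{3.0}};
    assert(total_area(shapes) == 13.0);  // 4 + 9, exact in binary floating point
}
```

The trade-off is that adding a new operation is a new free function, while adding a new shape means touching every visitor. Forgetting a case is a compile error rather than a silent fallthrough.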

When to read it

After you've got a fast service. The patterns matter when you've shipped, the requirements changed, and you need to extend without breaking. Iglberger is what helps you avoid painting yourself into ABI corners.

What this tutorial points at it for

§2 (Introduction) — the four-layer mental model loosely tracks Iglberger's separation-of-concerns argument
§6 (STL & layout) — value-based polymorphism vs virtual dispatch trade-offs
§12 (Static analysis & debugging) — what a clean type design enables for the analyzers
§13 (Reproducibility & ABI) — ABI stability follows from Iglberger's loose-coupling argument
§14 (Pitfalls) — over-abstraction is the dual problem to under-abstraction
The Statelessness reference section draws on Iglberger throughout, especially Doc 02 (RAII), Doc 03 (PMR), and Doc 06 (12-factor), where lifetime ownership intersects design patterns

Enberg — Latency: Reduce delay in software systems

Pekka Enberg, Manning, 2024. The systems-side complement to Andrist & Sehr. Treats latency as the problem and walks through the techniques.

What it covers

What latency actually is and where it accumulates: syscalls, context switches, cache misses, network hops, lock contention. Detailed treatment of the alternatives to traditional blocking I/O (io_uring, kernel bypass, busy-spin vs futex), what allocator strategies look like under sustained load (mimalloc/jemalloc/tcmalloc compared using the Helsinki perf group's measurements), and the “general-purpose allocator tax” thesis that informs much of §7. Less language-specific than Andrist & Sehr; most of it applies whether you're writing C++, Rust, or Go.

When to read it

Before or alongside this tutorial. Where this tutorial says “use io_uring,” Enberg explains why the syscall model has been the bottleneck and what alternatives have looked like over the past decade. The mental model carries through every section that talks about scheduling, I/O, or kernel-side behavior.

What this tutorial points at it for

§7 (Memory management) — the allocator-tax thesis; chapter 3 is the canonical reference
§8 (I/O latency) — io_uring motivation, the syscall-cost model
§9 (Networking & kernel) — kernel-bypass alternatives and their trade-offs
§10 (Observability) — where to look in the kernel for latency that doesn't show up in application telemetry
§11 (Noisy neighbors) — CPU scheduling and the priority-inheritance discussion
§14 (Pitfalls) — the patterns of why low-level performance work goes wrong

Ghosh — Building Low Latency Applications with C++

Sourav Ghosh, Packt, 2023. The full worked example. Where the other three books cover individual techniques, Ghosh builds an entire low-latency trading ecosystem from scratch in modern C++ — order book, matching engine, market data feed, network layer, the lot.

What it covers

The trading-system framing is incidental; the value is seeing every pattern this tutorial introduces composed into one running system. Lock-free queues for the matching engine, custom memory pools for per-message allocation, busy-spin vs futex for the producer-consumer boundary, NUMA-aware thread placement, kernel-bypass networking with Solarflare/Onload: all in one body of code. The book is also a case study in the discipline of latency-sensitive C++ design: nothing dynamic on the hot path, no exceptions in the matcher, all allocations pre-sized at startup.
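To make the "lock-free queue, nothing allocated on the hot path" combination concrete, here is a minimal single-producer/single-consumer ring buffer. It is a sketch under stated assumptions, not Ghosh's implementation: the `SpscQueue` name and interface are hypothetical, the 64-byte alignment is an assumed cache-line size, and `T` is assumed to be default-constructible and cheap to copy. It needs C++17.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical SPSC ring buffer. Capacity is fixed at compile time, so no
// memory is allocated after construction.
template <class T, std::size_t N>
class SpscQueue {
    static_assert(N >= 2 && (N & (N - 1)) == 0, "N must be a power of two");

public:
    // Producer thread only. Returns false when the queue is full.
    bool try_push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == N) return false;                 // full
        slots_[head & (N - 1)] = value;
        head_.store(head + 1, std::memory_order_release);   // publish the slot
        return true;
    }

    // Consumer thread only. Returns std::nullopt when the queue is empty.
    std::optional<T> try_pop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (head == tail) return std::nullopt;              // empty
        T value = slots_[tail & (N - 1)];
        tail_.store(tail + 1, std::memory_order_release);   // hand the slot back
        return value;
    }

private:
    std::array<T, N> slots_{};
    // Separate cache lines (64 bytes assumed) to avoid false sharing.
    alignas(64) std::atomic<std::size_t> head_{0};  // written by the producer
    alignas(64) std::atomic<std::size_t> tail_{0};  // written by the consumer
};

// Single-threaded walk-through of the full and FIFO behaviour.
void spsc_example() {
    SpscQueue<int, 4> queue;
    for (int i = 1; i <= 4; ++i) {
        const bool pushed = queue.try_push(i);
        assert(pushed);
        (void)pushed;
    }
    const bool overflow = queue.try_push(5);  // fifth element does not fit
    assert(!overflow);
    (void)overflow;
    const std::optional<int> first = queue.try_pop();
    assert(first.has_value() && *first == 1);  // oldest element comes out first
    (void)first;
}
```

In real use the producer and consumer each run on their own thread. What to do when `try_push` or `try_pop` reports full or empty (spin, yield, or block) is exactly the busy-spin vs futex decision mentioned above, and this sketch deliberately leaves it to the caller.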

When to read it

Read it after this tutorial if you want to see the patterns at full scale. Read it before if you want a worked example to compare ours against — Ghosh's design choices are an excellent anchor for “wait, why didn't this tutorial do it that way?” conversations.

What this tutorial points at it for

§7 (Memory management) — pre-allocated pools and the “no allocation on hot path” discipline
§8 (I/O latency) — the kernel-bypass walkthroughs and busy-spin vs futex decision
§11 (Noisy neighbors) — NUMA-aware design at the architectural level
§12 (Static analysis & debugging) — what an “all sanitizers, all the time” CI looks like for latency-sensitive code
§15 (Where to go next) — Ghosh is the “build a complete system” recommendation for any reader who wants to extend this tutorial's patterns into a full architecture

Cross-reference matrix

The full mapping of section and reference doc to books. A check mark means the section explicitly cites the book; subdued cells mean the book's material is relevant but not directly cited in the prose.

Section / Doc	Andrist & Sehr	Iglberger	Enberg	Ghosh
Tutorial body
§2 Introduction	✓	✓	✓	✓
§3 RAII	✓	✓
§4 Image strategy	✓	✓
§5 Compile-time wins	✓	✓
§6 STL & layout	✓	✓
§7 Memory management	✓		✓	✓
§8 I/O latency	✓		✓	✓
§9 Networking & kernel			✓
§10 Observability	✓		✓
§11 Noisy neighbors	✓		✓	✓
§12 Static analysis & debugging		✓		✓
§13 Reproducibility & ABI		✓
§14 Pitfalls	✓	✓	✓
§15 Where to go next	✓	✓	✓	✓
Statelessness reference
00 Index	✓	✓	✓	✓
01 Deployment posture		✓	✓
02 RAII		✓	✓
03 PMR		✓	✓
04 Process-scoped state		✓	✓
05 Threading		✓	✓
06 Twelve-factor		✓
07 State externalization			✓
Demos
Demo 06 (memory & allocators)	✓	✓	✓

One more — the gotcha book

A fifth book worth mentioning separately, since the statelessness reference section cites it heavily:

Yonts, 100 C++ Mistakes and How to Avoid Them. A pattern-language treatment of the specific mistakes that show up over and over in production C++. Allocator type-erasure traps, container/allocator mismatches, dangling references across ownership boundaries, lifetime confusion in the presence of exceptions — the cluster of mistakes that PMR in particular makes easy to commit. Less foundational than the four reference books above, but the highest-density-per-page treatment of “here's exactly the thing you're about to do wrong” that we found.

Citation discipline

Per the project's editorial constraint: every citation in this tutorial is a pointer (“see Iglberger ch. 4 for the full pattern”), not a summary that displaces the source. The books exist; if this tutorial replaced reading them, we'd be doing the authors and the reader a disservice. This bibliography is meant to make the pointers easier to follow, not to compress the books into bullet points.

Reading paths

A few suggested orders, depending on your starting point:

If you already write C++ daily and want to learn the container side: this tutorial → Ghosh (for the full-system perspective) → Iglberger (for the design follow-up).
If you came from systems / SRE work and the C++ feels rusty: Andrist & Sehr (for a language refresh) → this tutorial → Enberg (for the latency mental model).
If you're preparing for a perf-focused interview: this tutorial → Enberg (theory of latency) → Andrist & Sehr (language deep-dive). The questions you'll get are mostly in the intersection.
If you've never measured anything you've optimized: this tutorial first — every claim has a runnable demo with a number on it. Then Enberg for the framework that makes measurement a habit.

Whichever order you choose, the goal is the same: build a coherent mental model that you can defend with measurements on your own hardware.