dkduckkit.dev

Headroom (latency)

Latency & SRE

Headroom is the gap between measured latency (typically a tail percentile such as p99) and the SLA ceiling: the buffer that absorbs unexpected spikes without breaching the service level agreement. It is typically expressed as a percentage of the SLA or as an absolute time value. Headroom accounts for variability in production that cannot be eliminated: garbage collection pauses, network jitter, cache warm-up effects, and temporary resource contention.

Formula

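headroom = SLA_target - measured_p99_latency. As a percentage of the SLA: headroom_pct = (SLA_target - measured_p99_latency) / SLA_target × 100. Recommended: maintain at least 20% headroom under normal load; for a 200 ms SLA, that means keeping p99 at or below 160 ms.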
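The formula can be sketched in a few lines of Python. The function names and the millisecond units are assumptions made for illustration; the percentage form simply divides the absolute headroom by the SLA target, as described in the definition above.

```python
def headroom_ms(sla_target_ms: float, p99_ms: float) -> float:
    """Absolute headroom: SLA target minus measured p99 latency."""
    return sla_target_ms - p99_ms


def headroom_pct(sla_target_ms: float, p99_ms: float) -> float:
    """Headroom as a percentage of the SLA target (illustrative helper)."""
    if sla_target_ms <= 0:
        raise ValueError("sla_target_ms must be positive")
    return 100.0 * headroom_ms(sla_target_ms, p99_ms) / sla_target_ms


# Example: a 200 ms SLA with p99 measured at 150 ms.
print(headroom_ms(200, 150))   # 50 ms of absolute headroom
print(headroom_pct(200, 150))  # 25.0, above the 20% recommendation
```

A negative result means p99 already exceeds the SLA target.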

Why it matters in practice

Without headroom, any unusual load pattern or infrastructure jitter can immediately cause SLA breaches. A 200 ms SLA with p99 at 195 ms has only 5 ms of headroom (2.5% of the SLA), so a single GC pause or network retransmission can be enough to breach it. Headroom creates operational resilience: it gives SREs time to respond to alerts before customers are impacted, and it allows for normal performance variations without triggering emergency procedures.
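As a rough illustration of how headroom can drive alerting, the sketch below computes p99 from raw latency samples and classifies the result. The nearest-rank percentile method, the status labels, and the function names are all assumptions chosen for this example, not a prescribed implementation; the 20% threshold is the recommendation from the formula section.

```python
def p99_nearest_rank(samples_ms):
    """p99 by the nearest-rank method (one of several percentile definitions)."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    # 1-based rank = ceil(0.99 * n), computed with integer arithmetic.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def headroom_status(samples_ms, sla_target_ms, min_headroom_pct=20.0):
    """Return 'ok', 'low', or 'breach' (hypothetical labels)."""
    p99 = p99_nearest_rank(samples_ms)
    pct = 100.0 * (sla_target_ms - p99) / sla_target_ms
    if pct < 0:
        return "breach"  # p99 is already above the SLA target
    if pct < min_headroom_pct:
        return "low"     # within the SLA, but the buffer is thinner than recommended
    return "ok"


# The scenario from the text: p99 at 195 ms against a 200 ms SLA.
samples = [100] * 98 + [195, 400]
print(headroom_status(samples, sla_target_ms=200))
```

Alerting on the "low" state rather than only on "breach" is what gives responders time to act before the SLA is violated.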

Common mistakes

  • Using headroom to hide performance problems — consistently high p99 that consumes most of the headroom indicates the system is under-provisioned, not well-buffered.
  • Not accounting for headroom in capacity planning: sizing capacity so that p99 lands at the SLA target, rather than at the SLA target minus headroom, leaves no buffer and leads to chronic breaches.
  • Treating headroom as "wasted capacity" — headroom is intentional buffer space, not inefficiency.
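For the capacity-planning point in the list above, one way to make headroom explicit is to derive the p99 value to provision for from the SLA target and the desired headroom. This is a minimal sketch; the function name and the default of 20% (taken from the recommendation in the formula section) are assumptions.

```python
def p99_planning_target_ms(sla_target_ms: float, headroom_pct: float = 20.0) -> float:
    """Latency to provision for: the SLA target minus the reserved headroom."""
    if not 0 <= headroom_pct < 100:
        raise ValueError("headroom_pct must be in [0, 100)")
    return sla_target_ms * (100.0 - headroom_pct) / 100.0


# With a 200 ms SLA and 20% headroom, plan capacity so p99 stays near 160 ms,
# not 200 ms.
print(p99_planning_target_ms(200))
```

Load tests and capacity models would then be judged against this lower figure rather than against the SLA target itself.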