p50 latency (median)
p50 latency is the 50th percentile of request durations: half of all requests complete faster than this value, half slower. It represents the experience of the median user under normal load. Because the median is inherently resistant to outliers, it is a poor choice as a sole SLO metric — a system can have an excellent p50 while still subjecting 1% of users to severe latency.
Formula
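For a set of $n$ request durations sorted as $x_1 \le \dots \le x_n$, the nearest-rank definition (one of several common conventions) gives:

```latex
p_{50} = x_{\lceil 0.5\,n \rceil}
```

More generally, $p_k = x_{\lceil (k/100)\,n \rceil}$. Interpolating variants, as used by many monitoring systems, differ slightly for small $n$.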
Example: p50 = 12 ms means half your requests complete in under 12 ms. If p99 = 800 ms, one in a hundred requests takes roughly 67× longer — that user is having a very different experience.

Why it matters in practice
p50 is useful as a baseline and for capacity planning, but it must always be paired with a tail percentile (p99 or p99.9) for any user-facing SLO. In microservice architectures, tail latency compounds: if a request fans out to 10 downstream dependencies, each of which independently exceeds its p99 with 1% probability, then roughly 10% of composite requests (1 − 0.99¹⁰ ≈ 0.096) hit at least one dependency's p99. At a fan-out of 100, that figure rises to about 63%, so the median of the composite call approaches the p99 of its dependencies.
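The fan-out effect can be checked with a small Monte Carlo sketch (hypothetical numbers; it assumes each dependency independently exceeds its own p99 with probability 0.01):

```python
import random

def fanout_hits_p99(fanout: int, trials: int = 20_000, seed: int = 42) -> float:
    """Estimate the fraction of composite requests in which at least one of
    `fanout` independent dependencies lands in its own slowest 1% (its p99)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Each dependency independently exceeds its p99 with probability 0.01.
        if any(rng.random() < 0.01 for _ in range(fanout)):
            hits += 1
    return hits / trials

# Analytically: 1 - 0.99**10 ≈ 0.096 and 1 - 0.99**100 ≈ 0.634.
print(f"fan-out 10:  {fanout_hits_p99(10):.3f}")
print(f"fan-out 100: {fanout_hits_p99(100):.3f}")
```

The simulation matches the closed-form values: a tenfold fan-out already turns a per-dependency 1% tail into a roughly 10% composite tail.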
Common mistakes
- Using mean (average) instead of median — a single 10-second request in a batch of 100 ms requests will inflate the mean but leave the median untouched.
- Reporting p50 to stakeholders as "typical latency" without noting that tail latencies exist and are experienced by real users.
- Optimising p50 at the expense of p99 — caching strategies that reduce median latency often increase tail latency due to cache misses.
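The first mistake above is easy to demonstrate with made-up numbers: 99 requests at 100 ms plus a single 10-second straggler.

```python
from statistics import mean, median

# 99 fast requests at 100 ms plus one 10-second outlier (illustrative data)
latencies_ms = [100.0] * 99 + [10_000.0]

print(f"mean:   {mean(latencies_ms):.0f} ms")    # inflated to 199 ms by one outlier
print(f"median: {median(latencies_ms):.0f} ms")  # stays at 100 ms
```

One outlier nearly doubles the mean while the median does not move, which is exactly why the mean is a misleading "typical latency".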
Related Terms
p99 latency
99th percentile of request durations: the experience of the slowest 1% of requests.
Tail latency
High-percentile latency values (p99, p99.9) representing slowest requests.
Percentile latency (p50 / p99 / p99.9)
Statistical measure of request duration distribution.
Latency budget
Total time allocated for a complete user-facing request across all architectural hops.