p50 latency (median)
p50 latency is the 50th percentile of request durations: half of all requests complete faster than this value, half slower. It represents the experience of the median user under normal load. Because the median is inherently resistant to outliers, it is a poor choice as a sole SLO metric — a system can have an excellent p50 while still subjecting 1% of users to severe latency.
Formula
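For a set of $n$ request durations sorted as $x_1 \le \dots \le x_n$, the nearest-rank definition (one of several common conventions) gives:

```latex
p_{50} = x_{\lceil 0.5\,n \rceil}
```

More generally, $p_k = x_{\lceil (k/100)\,n \rceil}$. Interpolating variants, as used by many monitoring systems, differ slightly for small $n$.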
Example: p50 = 12 ms means half your requests complete in under 12 ms. If p99 = 800 ms, one in a hundred requests takes roughly 67× longer — that user is having a very different experience.

Why it matters in practice
p50 is useful as a baseline and for capacity planning, but it must always be paired with a tail percentile (p99 or p99.9) for any user-facing SLO. In microservice architectures, tail latency compounds: if a request fans out to 10 downstream dependencies, each of which independently exceeds its p99 with 1% probability, then roughly 10% of composite requests (1 − 0.99¹⁰ ≈ 0.096) hit at least one dependency's p99. At a fan-out of 100, that figure rises to about 63%, so the median of the composite call approaches the p99 of its dependencies.
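The fan-out effect can be checked with a small Monte Carlo sketch (hypothetical numbers; it assumes each dependency independently exceeds its own p99 with probability 0.01):

```python
import random

def fanout_hits_p99(fanout: int, trials: int = 20_000, seed: int = 42) -> float:
    """Estimate the fraction of composite requests in which at least one of
    `fanout` independent dependencies lands in its own slowest 1% (its p99)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Each dependency independently exceeds its p99 with probability 0.01.
        if any(rng.random() < 0.01 for _ in range(fanout)):
            hits += 1
    return hits / trials

# Analytically: 1 - 0.99**10 ≈ 0.096 and 1 - 0.99**100 ≈ 0.634.
print(f"fan-out 10:  {fanout_hits_p99(10):.3f}")
print(f"fan-out 100: {fanout_hits_p99(100):.3f}")
```

The simulation matches the closed-form values: a tenfold fan-out already turns a per-dependency 1% tail into a roughly 10% composite tail.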
Common mistakes
- Using mean (average) instead of median — a single 10-second request in a batch of 100 ms requests will inflate the mean but leave the median untouched.
- Reporting p50 to stakeholders as "typical latency" without noting that tail latencies exist and are experienced by real users.
- Optimising p50 at the expense of p99 — caching strategies that reduce median latency often increase tail latency due to cache misses.
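The first mistake above is easy to demonstrate with made-up numbers: 99 requests at 100 ms plus a single 10-second straggler.

```python
from statistics import mean, median

# 99 fast requests at 100 ms plus one 10-second outlier (illustrative data)
latencies_ms = [100.0] * 99 + [10_000.0]

print(f"mean:   {mean(latencies_ms):.0f} ms")    # inflated to 199 ms by one outlier
print(f"median: {median(latencies_ms):.0f} ms")  # stays at 100 ms
```

One outlier nearly doubles the mean while the median does not move, which is exactly why the mean is a misleading "typical latency".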
Related Terms
p99 latency
99th percentile of request durations: the experience of the slowest 1% of requests.
Tail latency
High-percentile latency values (p99, p99.9) representing slowest requests.
Percentile latency (p50 / p99 / p99.9)
Statistical measure of request duration distribution.
Latency budget
Total time allocated for a complete user-facing request across all architectural hops.