Thundering herd
A thundering herd is a failure pattern in which a large number of clients simultaneously attempt to access a resource that has just become available after a period of unavailability. Common triggers:
- A popular cache key expires, causing thousands of requests to hit the database simultaneously.
- A server restarts, causing all reconnecting clients to hit the new instance at once.
- A rate limit window resets, causing all throttled clients to retry at the same moment.
Why it matters in practice
Thundering herds turn a recovery event into a second failure. A database that just recovered from a brief outage receives 10,000 simultaneous reconnection requests and crashes again. A cache that expires a popular key causes thousands of cache misses to hit the database in parallel, overwhelming it. The irony is that the protective mechanism (rate limiting, caching) becomes the cause of the next incident if thundering herds are not explicitly managed.
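The cache-miss stampede described above is commonly mitigated by combining two techniques: jittering cache TTLs so coordinated expiry cannot happen, and coalescing concurrent misses for the same key into a single upstream request ("single-flight"). A minimal in-process sketch, assuming a threaded server; the names `jittered_ttl` and `CoalescingCache` are illustrative, not a real library API:

```python
import random
import threading
import time

def jittered_ttl(base_ttl: float, spread: float = 0.1) -> float:
    """Return base_ttl plus up to +/-spread (10%) random jitter, so keys
    written at the same time do not all expire in the same instant."""
    return base_ttl * (1 + random.uniform(-spread, spread))

class CoalescingCache:
    """Single-flight cache: N concurrent misses for the same key result
    in one upstream call; the other N-1 callers wait and reuse it."""
    def __init__(self):
        self._data = {}       # key -> (value, expires_at)
        self._inflight = {}   # key -> Event signalling the fetch in progress
        self._lock = threading.Lock()

    def get(self, key, loader, base_ttl=60.0):
        while True:
            with self._lock:
                entry = self._data.get(key)
                if entry and entry[1] > time.monotonic():
                    return entry[0]              # fresh hit
                event = self._inflight.get(key)
                if event is None:
                    # No fetch in progress: this caller becomes the leader.
                    event = threading.Event()
                    self._inflight[key] = event
                    leader = True
                else:
                    leader = False
            if leader:
                try:
                    value = loader(key)          # the one upstream request
                    with self._lock:
                        self._data[key] = (
                            value,
                            time.monotonic() + jittered_ttl(base_ttl),
                        )
                finally:
                    with self._lock:
                        del self._inflight[key]
                    event.set()                  # wake the waiting followers
                return value
            event.wait()   # follower: wait for the leader, then re-check cache
```

With this structure, a burst of simultaneous misses for one key produces exactly one `loader` call; the followers block briefly and read the freshly cached value instead of stampeding the database.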
Common mistakes
- Not adding jitter to cache TTLs — all cached objects set at the same time expire at the same time, causing a coordinated stampede.
- Not using request coalescing (single-flight) for cache misses — multiple simultaneous misses for the same key should result in one upstream request, not N.
- Implementing retry logic without jitter — all clients retry at the same backoff interval, creating a new thundering herd with each retry wave.
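The retry mistake above is fixed by randomizing the wait, not just growing it. A minimal sketch of exponential backoff with so-called "full jitter" — each attempt sleeps a uniformly random amount up to the capped exponential bound; `call_with_retry` and its parameters are illustrative names, not a real library API:

```python
import random
import time

def call_with_retry(op, attempts=6, base=0.1, cap=30.0):
    """Call op(), retrying on failure with exponential backoff and full
    jitter: attempt k sleeps a random duration in [0, min(cap, base * 2**k)],
    so clients that failed together spread their retries apart."""
    for attempt in range(attempts):
        try:
            return op()
        except Exception:   # in real code, catch only retryable errors
            if attempt == attempts - 1:
                raise       # out of attempts: surface the last error
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

Without the `random.uniform`, every client that failed at the same moment would sleep the same fixed interval and return in lockstep, recreating the herd on each retry wave; with full jitter, the waves flatten into a spread-out trickle.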
Related terms
Jitter (retries)
Adds randomness to retry wait times to prevent synchronized retries.
Exponential backoff
Retry wait times grow multiplicatively after each failure.
Tail latency
High-percentile latency values (p99, p99.9) representing the slowest requests.
Retry storm
Failure amplification where synchronized retries recreate overload.