dkduckkit.dev

Leaky bucket

Rate Limiting

The leaky bucket algorithm processes requests at a constant output rate. Incoming requests that arrive faster than that rate are placed into a bounded buffer (the "bucket"), which drains at a fixed rate, smoothing bursty traffic into a uniform output stream. Requests are dropped only when the queue is full. This queuing behaviour is what distinguishes leaky bucket as a traffic shaper from a plain rate limiter, which drops or rejects excess requests immediately without buffering. The analogy is a bucket with a hole in the bottom: water drains at a constant rate regardless of how fast you pour it in.

**Terminology note:** There are two distinct interpretations of "leaky bucket" in the literature. The first, described above, is the *leaky bucket as a queue* (traffic shaper) — the more useful model for SREs and API designers. The second, the *leaky bucket as a meter*, is mathematically equivalent to a token bucket and is sometimes used interchangeably in cloud provider documentation (AWS, GCP). When a vendor claims to use "leaky bucket", verify whether they mean traffic shaping (constant output rate) or simply token bucket with a different name.
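The meter interpretation can be sketched as a counter that leaks at a fixed rate, with a request admitted only if pouring it in would not overflow the bucket. A minimal illustration (the class and parameter names are invented for this example; real vendor implementations differ):

```python
import time

class LeakyBucketMeter:
    """Leaky bucket *as a meter*: a counter that drains at a fixed rate.
    A request is admitted if adding one unit keeps the level under
    capacity -- mathematically equivalent to a token bucket."""

    def __init__(self, drain_rate_per_sec: float, capacity: float):
        self.rate = drain_rate_per_sec   # leak rate (sustained request rate)
        self.capacity = capacity         # burst allowance
        self.level = 0.0                 # current "water" in the bucket
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drain for the elapsed time, clamping at an empty bucket.
        self.level = max(0.0, self.level - (now - self.last) * self.rate)
        self.last = now
        if self.level + 1 <= self.capacity:
            self.level += 1              # admit: pour one unit in
            return True
        return False                     # would overflow: reject, no queuing
```

Note the rejection path: a meter rejects immediately rather than buffering, which is exactly why it behaves like a token bucket rather than a shaper.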

Formula

    output_rate = constant         (configured)
    queue_size  = bucket_capacity  (configured)

    if queue_full:
        drop_request
    else:
        queue_request and process_at_output_rate
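The formula above can be sketched as a bounded queue drained in fixed-size batches. This is a discrete-time illustration, not a production limiter; the class name, `offer`/`tick` methods, and per-tick granularity are assumptions made for the example:

```python
from collections import deque

class LeakyBucketQueue:
    """Leaky bucket *as a queue* (traffic shaper): arrivals are buffered
    up to bucket_capacity and drained at a constant per-tick rate."""

    def __init__(self, bucket_capacity: int, output_per_tick: int):
        self.capacity = bucket_capacity        # queue_size
        self.output_per_tick = output_per_tick # output_rate, per tick
        self.queue = deque()

    def offer(self, request) -> bool:
        # if queue_full: drop_request
        if len(self.queue) >= self.capacity:
            return False
        # else: queue_request
        self.queue.append(request)
        return True

    def tick(self) -> list:
        # process_at_output_rate: drain a fixed number each tick,
        # regardless of how fast requests arrived.
        drained = []
        for _ in range(min(self.output_per_tick, len(self.queue))):
            drained.append(self.queue.popleft())
        return drained
```

A burst of five arrivals into a capacity-3 bucket accepts three and drops two; the survivors then leave at the configured rate of two per tick, which is the smoothing the shaper exists to provide.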

Why it matters in practice

Leaky bucket is used when a downstream system cannot tolerate any burst, even a brief one — for example, a legacy backend with fixed threading that becomes unstable under load spikes. It is also used in traffic shaping to enforce a maximum average rate for billing purposes. The trade-off is that bursty clients experience queuing latency even when the system is underloaded, which can make p99 latency worse than with a token bucket.

Common mistakes

  • Using leaky bucket for interactive APIs where clients legitimately need burst support — the resulting queuing latency makes the API feel sluggish.
  • Setting queue size too small — overflow drops requests immediately, which is equivalent to a hard rate limit with no burst tolerance.
  • Confusing output rate with input rate — leaky bucket smooths output, not input.