Methodology
The tool runs calculateAll: the limits step derives a per-window request budget from peak RPS, window length, consumer count, and global versus per-key mode (with a built-in headroom factor), then adjusts burst by strategy (fixed window, sliding window, token bucket). The capacity step compares those limits to your targets; the configs are template snippets, not validated against a live gateway. The model is steady-state planning, not adaptive abuse detection or exact vendor rate-limit semantics.
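The budget derivation above can be sketched in a few lines. This is an illustrative reconstruction, not the tool's actual code: the function name, the even per-key split, and the 0.8 headroom factor are all assumptions for the example.

```python
def window_budget(peak_rps: float, window_s: int, consumers: int,
                  per_key: bool, headroom: float = 0.8) -> int:
    """Requests allowed per window, leaving headroom below peak capacity."""
    total = peak_rps * window_s * headroom      # global budget for one window
    if per_key and consumers > 0:
        total /= consumers                      # split evenly across consumers
    return max(1, int(total))

# e.g. 500 RPS peak, 60 s windows, 200 consumers, per-key limits
print(window_budget(500, 60, 200, per_key=True))  # 120
```

A burst-adjustment step would then scale this budget per strategy, e.g. granting token-bucket modes extra burst capacity on top of the steady rate.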
What is API rate limiting and why it matters
An API rate limit caps how many requests a client can make in a time window. Think of it as a merge lane: unlimited inbound traffic causes collisions, so your service protects itself and fair-shares capacity. Platform teams use an API rate limit calculator like this one to translate real peak RPS, payload size, and consumer count into limits that curb abuse without blocking legitimate partners.
Rate limiting strategies compared
Fixed window resets a counter every N seconds: simple, but clients can double their effective rate at window boundaries. Sliding window smooths counts across adjacent intervals. Token bucket allows controlled bursts while refilling steadily, and is what most edge gateways implement. Leaky bucket drains traffic at a constant rate, shaping noisy clients. Use this tool to see how each behaves before you paste configs into nginx or Kong.
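The token bucket described above can be sketched in a few lines. This is a minimal single-threaded illustration, not any gateway's implementation: tokens refill continuously at `rate` per second up to `burst`, and each request spends one token.

```python
import time

class TokenBucket:
    """Toy token bucket: allows `burst` immediately, then `rate` req/s."""

    def __init__(self, rate: float, burst: int):
        self.rate, self.burst = rate, burst
        self.tokens = float(burst)          # start full: burst is available
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A leaky bucket differs only in the output side: instead of letting bursts through when tokens are available, it queues requests and releases them at a fixed drain rate.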
The hidden danger: retry storms
When thousands of clients get HTTP 429 at once, naive retries can synchronize and amplify load, turning an API rate limit incident into a self-sustaining one. Exponential backoff alone is not enough without jitter; otherwise pods or mobile clients retry on the same tick. This calculator surfaces retry-storm risk so you can tune Retry-After relative to your p99 latency before you commit limits to production.
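A common fix is "full jitter" backoff, popularized by the AWS Architecture Blog: instead of sleeping the exact exponential delay, each client sleeps a uniform random amount up to that delay, so retries spread out rather than landing on the same tick. A minimal sketch (the base and cap values are illustrative):

```python
import random

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter delay: uniform in [0, min(cap, base * 2^attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

In practice, clients should also honor a Retry-After header when the server sends one, and only fall back to computed jittered delays when it is absent.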
Not sure what p99 latency to assume? Build your critical path in our System Latency Budget Calculator to estimate realistic p50/p99 from reference hops before you plug numbers into retry and rate-limit math.
X-RateLimit and IETF RateLimit headers
Legacy X-RateLimit-* headers are widely supported; draft-07 of the IETF RateLimit headers specification introduces RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset (seconds until reset, not necessarily a Unix timestamp; check your implementation), and RateLimit-Policy for machine-readable discovery. Toggle the option here to generate the style your API contract requires. Whether you deploy with nginx, Kong, or AWS API Gateway, consistent headers reduce support tickets from confused integrators.
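The two header styles can be produced from the same policy values. A hedged sketch, assuming a 60 s window for the RateLimit-Policy field and the draft-07 field names; adapt to whatever draft revision your gateway actually implements:

```python
def ratelimit_headers(limit: int, remaining: int, reset_s: int,
                      ietf: bool = False) -> dict:
    """Build response headers in either the legacy or IETF draft style."""
    if ietf:
        return {
            "RateLimit-Limit": str(limit),
            "RateLimit-Remaining": str(remaining),
            "RateLimit-Reset": str(reset_s),       # seconds until the window resets
            "RateLimit-Policy": f"{limit};w=60",   # assumes a 60 s window
        }
    return {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(remaining),
        "X-RateLimit-Reset": str(reset_s),
    }
```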
AWS API Gateway exposes throttling (rate + burst) as a token bucket; the daily quota is a separate billing/monetization control, configured outside this throttle-focused snippet.
Token bucket vs fixed window
- Token bucket — smooth average rate with controlled bursts; common in gateways and AWS throttles.
- Fixed window — resets every interval; can allow 2× spikes at window edges unless you add jitter or sliding logic.
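The 2× edge effect above is easy to demonstrate with a toy fixed-window counter: a client that spends its full allowance just before a boundary and again just after it pushes double the limit through in about one second.

```python
def fixed_window_allows(timestamps, limit, window_s):
    """Return the timestamps a fixed-window counter would admit."""
    allowed, counts = [], {}
    for t in sorted(timestamps):
        w = int(t // window_s)                  # which window this request hits
        if counts.get(w, 0) < limit:
            counts[w] = counts.get(w, 0) + 1
            allowed.append(t)
    return allowed

# limit = 10 per 60 s window; 10 requests at t ~ 59 s and 10 at t ~ 60 s
reqs = [59.0 + i * 0.01 for i in range(10)] + [60.0 + i * 0.01 for i in range(10)]
print(len(fixed_window_allows(reqs, 10, 60)))  # 20 admitted within ~1 second
```

A sliding-window or token-bucket limiter would reject roughly half of these, since it accounts for the requests still "in flight" from the previous interval.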
IETF guidance on communicating limits lives in RFC 6585 (status 429) and rate-limit header drafts your gateway may implement.
Copy-paste solution
# 10 req/s per client IP; "api" zone holds counters in 10 MB of shared memory
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

server {
    location /api/ {
        # Allow bursts of up to 20 extra requests; nodelay serves them
        # immediately instead of queueing (excess beyond burst gets 503/429)
        limit_req zone=api burst=20 nodelay;
        proxy_pass http://upstream;
    }
}