duckkit.dev

API Rate Limit Calculator

Derive optimal rate limits from traffic patterns and consumer counts, with Nginx, Kong, and AWS snippets plus IETF rate-limit headers.

Last updated: March 2026

TL;DR

Derive steady RPS and burst headroom from concurrent clients and per-client call patterns, then map them to token-bucket or leaky-bucket style limits.

Formula: Required RPS ≈ clients × requests per client per second; burst sized to absorb coordinated spikes.
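A quick sketch of that arithmetic in TypeScript (the client count, per-client rate, and spike fraction below are made-up planning inputs, not defaults from this tool):

```typescript
// Sketch of the TL;DR arithmetic with example planning inputs.
const clients = 200;                // concurrent consumers (assumption)
const callsPerClientPerSec = 0.05;  // each client polls every 20 s (assumption)

// Steady-state requirement: clients × requests per client per second.
const requiredRPS = clients * callsPerClientPerSec;  // 200 × 0.05 = 10 req/s

// Size burst to absorb a coordinated spike, e.g. half the clients
// firing within one second (the spike fraction is a planning assumption).
const spikeFraction = 0.5;
const burst = Math.ceil(clients * spikeFraction);    // 100 requests

console.log(requiredRPS, burst);  // 10 100
```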

When to use this

  • Setting gateway limits that protect upstreams without false positives.
  • Translating product SLOs into enforceable rate and burst parameters.

How the math works

LaTeX model and TypeScript reference — same logic as the calculator on this page.

This describes the implementation behind the numbers as of 2026-03-26. It is engineering documentation, not legal or compliance advice.

Specification citation

The logic reflects our proprietary implementation of the following public specifications: IETF RFC 6585 and HTTP 429 Too Many Requests (MDN).

The snippet below captures the core logic of the calculation engine, verified against RFC 6585 and widely used token-bucket / fixed-window rate-limit patterns.

Model (LaTeX source)
Sustainable requests per window (duckkit.dev model)

Let R_peak be peak RPS, W window seconds, N consumers, global flag G.
Effective consumer count: N_eff = G ? 1 : N

Raw allowance (20% headroom in implementation):
N_raw = floor((R_peak · W · 0.8) / N_eff)
N_window = max(1, N_raw)

Effective RPS shown: R_eff = N_window / W

Burst depends on strategy (fixed-window, sliding-window, token-bucket multiplier).
Reference implementation (TypeScript, excerpt from shipped modules)
// lib/rate-limit-calculator/limits.ts
export function calculateLimits(inputs: RateLimitInputs): RecommendedLimits {
  const {
    peakRPS,
    windowSeconds,
    consumerCount,
    globalMode,
    strategy,
    burstMultiplier,
  } = inputs

  const effectiveConsumers = globalMode ? 1 : consumerCount

  const rawLimit = Math.floor(
    (peakRPS * windowSeconds * 0.8) / effectiveConsumers,
  )

  const requestsPerWindow = Math.max(1, rawLimit)

  const burstLimit =
    strategy === 'fixed-window'
      ? requestsPerWindow
      : strategy === 'sliding-window'
        ? Math.floor(requestsPerWindow * 1.5)
        : Math.floor(requestsPerWindow * burstMultiplier)

  const rpsEffective = parseFloat(
    (requestsPerWindow / windowSeconds).toFixed(2),
  )
  return { requestsPerWindow, burstLimit, /* … */ }
}
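Running the same arithmetic standalone for inputs that reproduce this page's headline numbers (the inputs are inferred from the displayed results, so treat this as an illustration rather than a recorded run):

```typescript
// Standalone re-derivation of the page's headline numbers.
// Inputs are inferred from the displayed results, not taken from real traffic.
const peakRPS = 10;
const windowSeconds = 60;
const globalMode = true;    // one shared bucket
const consumerCount = 10;
const burstMultiplier = 3;  // token-bucket strategy

const effectiveConsumers = globalMode ? 1 : consumerCount;
const requestsPerWindow = Math.max(
  1,
  Math.floor((peakRPS * windowSeconds * 0.8) / effectiveConsumers),
);                                                                  // 480
const burstLimit = Math.floor(requestsPerWindow * burstMultiplier); // 1440
const rpsEffective = requestsPerWindow / windowSeconds;             // 8

console.log(requestsPerWindow, burstLimit, rpsEffective); // 480 1440 8
```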


At a glance

Throttling risk: 20%
Effective RPS: 8.00
Req / window: 480

Configuration

Quick presets

Traffic spike simulator (1.0× peak) — stress multiplier presets: Normal (1×), Black Friday (5×), DDoS-like (10×)
Traffic profile
Rate limit strategy
Context — environment, auth, retry behavior

Results

Requests per window: 480 req/window
Effective RPS: 8.00 req/s
Burst limit: 1440
Per consumer / window: 480 req

How your strategy handles traffic

[Chart: incoming traffic vs. allowed vs. throttled across two 60 s windows (0–120 s), with window boundaries and the limit marked]
Max throughput: 0.078 MB/s
Safe consumer count: 10 consumers
Throttling risk (peak): 20%
Utilization at peak: 100%
Retry storm risk: MEDIUM
Nginx — limit_req
# Nginx rate limiting — generated by API Rate Limit Calculator
limit_req_zone $http_x_api_key zone=api_limit:10m rate=8r/s;

server {
    location /api/ {
        limit_req zone=api_limit burst=1440 nodelay;
        limit_req_status 429;
        add_header Retry-After 5 always;
    }
}
Kong — rate-limiting plugin
# Kong rate-limiting plugin — generated by API Rate Limit Calculator
plugins:
  - name: rate-limiting
    config:
      second: 8
      minute: 480
      policy: consumer
      fault_tolerant: true
      hide_client_headers: false
      error_code: 429
      error_message: "API rate limit exceeded"
AWS API Gateway — throttle (JSON)
{
  "_comment": "AWS API Gateway Usage Plan — throttle settings only. Quota (daily/monthly limits) is a billing concern — configure separately.",
  "throttle": {
    "rateLimit": 8,
    "burstLimit": 1440
  }
}
AWS API Gateway — Terraform
# Terraform — AWS API Gateway Usage Plan
# Generated by API Rate Limit Calculator

resource "aws_api_gateway_usage_plan" "rate_limit" {
  name = "api-rate-limit"

  throttle_settings {
    rate_limit  = 8    # req/s steady state
    burst_limit = 1440 # token bucket size
  }
}

resource "aws_api_gateway_usage_plan_key" "rate_limit_key" {
  key_id        = aws_api_gateway_api_key.consumer.id
  key_type      = "API_KEY"
  usage_plan_id = aws_api_gateway_usage_plan.rate_limit.id
}
Rate limit response headers
# X-RateLimit Headers (legacy — widely supported)
X-RateLimit-Limit: 480        # requests allowed per window
X-RateLimit-Remaining: <current_count>         # requests remaining
X-RateLimit-Reset: <unix_timestamp>            # when window resets (unix epoch)
X-RateLimit-Window: 60    # window duration in seconds

# On 429 Too Many Requests:
Retry-After: 5              # seconds before client should retry
You're running at 100% of capacity at peak. Add 20% headroom or your next traffic spike will cause throttling.

Methodology

The tool runs calculateAll: the limits step derives a per-window request budget from peak RPS, window length, consumer count, and global vs. per-key mode (with a built-in 20% headroom factor), then adjusts burst by strategy (fixed window, sliding window, token bucket). The capacity step compares those limits to your targets; the config snippets are templates, not validated against a live gateway. The model is steady-state planning, not adaptive abuse detection or exact vendor rate-limit semantics.

What is API rate limiting and why it matters

An API rate limit caps how many requests a client can make in a time window. Think of it as a merge lane: unlimited inbound traffic causes collisions, so your service protects itself and fair-shares capacity. Platform teams use an API rate limit calculator like this one to translate real peak RPS, payload size, and consumer count into limits that curb abuse without blocking legitimate partners.

Rate limiting strategies compared

Fixed window resets a counter every N seconds—simple but clients can double their effective rate at window boundaries. Sliding window smooths counts across time. Token bucket allows controlled bursts while refilling steadily—what most edge gateways implement. Leaky bucket outputs traffic at a constant rate, shaping noisy clients. Use this tool to see how each behaves before you paste configs into nginx or Kong.
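A minimal token-bucket sketch makes the "controlled bursts with steady refill" behavior concrete (illustrative only; the `TokenBucket` class and its parameters are invented for this example, not gateway code):

```typescript
// Minimal token-bucket sketch (illustrative, not a production limiter).
// rate = tokens refilled per second; capacity = maximum burst size.
class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(private rate: number, private capacity: number, now = 0) {
    this.tokens = capacity; // start full so an initial burst is allowed
    this.last = now;
  }

  // Returns true if a request arriving at time `now` (seconds) is allowed.
  allow(now: number): boolean {
    const elapsed = now - this.last;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.rate);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// A burst of 3 is absorbed; the 4th simultaneous request must wait for refill.
const bucket = new TokenBucket(1 /* req/s */, 3 /* burst */);
console.log([0, 0, 0, 0].map((t) => bucket.allow(t))); // [true, true, true, false]
```

Time is passed in explicitly to keep the sketch deterministic; a real limiter would read a monotonic clock and usually store bucket state in Redis or gateway memory.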

The hidden danger: retry storms

When thousands of clients get HTTP 429 at once, naive retries can synchronize and amplify load, turning a rate-limit incident into a self-sustaining one. Exponential backoff alone is not enough without jitter; otherwise pods or mobile clients retry on the same tick. This calculator surfaces retry-storm risk so you can tune Retry-After relative to your p99 latency before you commit limits to production.
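A common client-side mitigation is exponential backoff with full jitter; the sketch below is an assumed pattern, not part of the calculator (`backoffWithJitter` and `retryDelayMs` are hypothetical helpers):

```typescript
// Exponential backoff with "full jitter": without the random factor,
// every client that saw the same 429 retries on the same schedule and
// the storm re-synchronizes.
function backoffWithJitter(attempt: number, baseMs = 500, capMs = 30_000): number {
  const exp = Math.min(capMs, baseMs * 2 ** attempt); // exponential ceiling
  return Math.random() * exp;                         // full jitter: [0, exp)
}

// Respect Retry-After when the server sends one; otherwise fall back
// to jittered backoff keyed by the attempt number.
function retryDelayMs(attempt: number, retryAfterSeconds?: number): number {
  if (retryAfterSeconds !== undefined) return retryAfterSeconds * 1000;
  return backoffWithJitter(attempt);
}
```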

Not sure what p99 latency to assume? Build your critical path in our System Latency Budget Calculator to estimate realistic p50/p99 from reference hops before you plug numbers into retry and rate-limit math.

X-RateLimit and IETF RateLimit headers

Legacy X-RateLimit-* headers are widely supported; IETF Draft-07 introduces RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset (seconds to reset, not always a Unix timestamp—check your implementation), and RateLimit-Policy for machine-readable discovery. Toggle the option here to generate the style your API contract requires. Whether you deploy with nginx, Kong, or AWS API Gateway, consistent headers reduce support tickets from confused integrators.
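A small sketch of emitting both header styles side by side (the function and its parameters are invented for illustration; field names follow the legacy convention and the IETF draft, and Reset semantics vary by implementation, so verify against your gateway):

```typescript
// Build both legacy X-RateLimit-* and IETF draft RateLimit-* headers
// for one fixed window.
function rateLimitHeaders(
  limit: number,
  remaining: number,
  windowSeconds: number,
  secondsToReset: number,
  nowEpoch: number,
): Record<string, string> {
  return {
    // Legacy style: Reset is commonly a Unix timestamp.
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(remaining),
    'X-RateLimit-Reset': String(nowEpoch + secondsToReset),
    // IETF draft style: Reset is delta-seconds, not an epoch.
    'RateLimit-Limit': String(limit),
    'RateLimit-Remaining': String(remaining),
    'RateLimit-Reset': String(secondsToReset),
    'RateLimit-Policy': `${limit};w=${windowSeconds}`,
  };
}

const h = rateLimitHeaders(480, 123, 60, 17, 1_700_000_000);
console.log(h['RateLimit-Policy']); // "480;w=60"
```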

AWS API Gateway exposes throttle (rate + burst) as a token bucket; daily quota is separate billing/monetization—configure it outside this throttle-focused snippet.

Token bucket vs fixed window

  • Token bucket — smooth average rate with controlled bursts; common in gateways and AWS throttles.
  • Fixed window — resets every interval; can allow 2× spikes at window edges unless you add jitter or sliding logic.
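The boundary-spike caveat in the second bullet can be demonstrated with a toy fixed-window counter (the numbers reuse this page's 480-per-60 s example; the counter is illustrative, not any specific gateway):

```typescript
// Fixed-window counter: limit 480 per 60 s window, keyed by floor(t / 60).
const limit = 480;
const windowOf = (t: number) => Math.floor(t / 60);

// A client sends 480 requests just before a window boundary and 480 just
// after: 960 requests land within ~2 seconds, yet both windows admit them.
const counts = new Map<number, number>();
let allowed = 0;
for (const t of [59, 60]) {
  for (let i = 0; i < 480; i++) {
    const w = windowOf(t);
    const c = counts.get(w) ?? 0;
    if (c < limit) {
      counts.set(w, c + 1);
      allowed++;
    }
  }
}
console.log(allowed); // 960, i.e. 2× the intended per-window rate in a short burst
```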

IETF guidance on communicating limits lives in RFC 6585 (status 429) and rate-limit header drafts your gateway may implement.

Copy-paste solution

limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
server {
  location /api/ {
    limit_req zone=api burst=20 nodelay;
    proxy_pass http://upstream;
  }
}

Related tools