Rate limit window
A rate limit window is the time period over which requests are counted for enforcement. Common values are 1 second (per-second limits), 60 seconds (per-minute), 3600 seconds (per-hour), and 86400 seconds (per-day). The window size determines how evenly clients must spread their traffic and how long a client that has exhausted its limit waits before the count resets.
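As a minimal sketch of how a window partitions time, the hypothetical helpers below map a timestamp to the window that contains it and to the seconds remaining until that window resets. They assume fixed windows aligned to the Unix epoch, which is one common convention rather than the only one; the names `WINDOW_SIZES`, `window_bounds`, and `seconds_until_reset` are illustrative.

```python
# Hypothetical helpers; assumes fixed windows aligned to the Unix epoch.
WINDOW_SIZES = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def window_bounds(timestamp: float, window_seconds: int) -> tuple[int, int]:
    """Return (window_start, window_end) for the window containing timestamp."""
    start = int(timestamp // window_seconds) * window_seconds
    return start, start + window_seconds


def seconds_until_reset(timestamp: float, window_seconds: int) -> float:
    """Return how long after timestamp the current window ends."""
    _, end = window_bounds(timestamp, window_seconds)
    return end - timestamp


# A request at t = 3725 s falls in the hourly window [3600, 7200).
print(window_bounds(3725, WINDOW_SIZES["hour"]))        # (3600, 7200)
print(seconds_until_reset(3725, WINDOW_SIZES["hour"]))  # 3475
```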
Why it matters in practice
Short windows (1 second) enforce fine-grained smoothness, because clients must spread requests evenly, but they are sensitive to clock skew and require high-resolution counters. Long windows (1 hour or 1 day) allow large short-term bursts as long as the total stays within quota, which suits batch-processing clients. A mismatch between window granularity and client behaviour is a common source of unexpected 429 (Too Many Requests) responses: with a fixed one-hour window, a client that spends its entire hourly quota in the first minute is rejected for the remaining 59 minutes.
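The burst scenario above can be reproduced with a small simulation. This is a sketch only: it assumes a fixed (non-sliding) window, a single client, and in-memory state, and the class name `FixedWindowLimiter` and the quota of 1000 requests per hour are made up for illustration. Time is passed in explicitly so the example runs without waiting.

```python
class FixedWindowLimiter:
    """Illustrative in-memory fixed-window counter for a single client."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = None  # start of the window currently being counted
        self.count = 0

    def allow(self, now: float) -> bool:
        start = int(now // self.window_seconds) * self.window_seconds
        if start != self.window_start:
            # A new window has begun, so the counter starts over.
            self.window_start = start
            self.count = 0
        if self.count >= self.limit:
            return False  # a server would answer 429 here
        self.count += 1
        return True


limiter = FixedWindowLimiter(limit=1000, window_seconds=3600)

# The client spends its whole hourly quota within the first minute...
burst = [limiter.allow(now=i * 0.06) for i in range(1000)]
# ...and is then rejected until the window resets at t = 3600 s.
mid_hour = limiter.allow(now=1800)
after_reset = limiter.allow(now=3600)
print(all(burst), mid_hour, after_reset)  # True False True
```

A client that instead paced its 1000 requests across the hour would never see a rejection under the same limit.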
Common mistakes
- Using only a single window size. Most APIs benefit from combining a per-second limit (burst control) with a per-hour limit (quota control).
- Not surfacing the window reset time in response headers. Clients cannot implement intelligent retry logic if they do not know when the window resets.
- Setting very short windows (< 1 second) for public APIs. Across multiple servers, sub-second windows need a shared atomic counter (for example, in Redis) that is consulted on every request, which adds latency.
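The first two points can be sketched together: check each request against a per-second and a per-hour window, and tell the client when it may retry. This is an illustrative in-memory example, not a production design. `WindowCounter` and `check` are hypothetical names, the limits are arbitrary, and the `X-RateLimit-*` headers are a widely used convention whose exact semantics vary between APIs (here `X-RateLimit-Reset` is assumed to mean seconds until reset). `Retry-After` is a standard HTTP header.

```python
import math


class WindowCounter:
    """Illustrative fixed-window counter; one instance per window size."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = None
        self.count = 0

    def roll(self, now: float) -> None:
        # Start a fresh count whenever `now` falls in a new window.
        start = int(now // self.window_seconds) * self.window_seconds
        if start != self.window_start:
            self.window_start, self.count = start, 0

    def reset_in(self, now: float) -> float:
        return self.window_start + self.window_seconds - now


def check(counters, now):
    """Return (status, headers) for one request checked against every window."""
    for c in counters:
        c.roll(now)
    exhausted = [c for c in counters if c.count >= c.limit]
    if exhausted:
        # The client has to wait for the slowest exhausted window to reset.
        wait = max(c.reset_in(now) for c in exhausted)
        return 429, {"Retry-After": str(math.ceil(wait))}
    for c in counters:
        c.count += 1
    # Report the window with the least remaining capacity.
    tightest = min(counters, key=lambda c: c.limit - c.count)
    return 200, {
        "X-RateLimit-Remaining": str(tightest.limit - tightest.count),
        "X-RateLimit-Reset": str(math.ceil(tightest.reset_in(now))),
    }


counters = [
    WindowCounter(limit=5, window_seconds=1),       # burst control
    WindowCounter(limit=100, window_seconds=3600),  # quota control
]

# Six requests in the same second: the sixth exceeds the per-second limit.
results = [check(counters, now=10.5) for _ in range(6)]
print(results[4])  # (200, {'X-RateLimit-Remaining': '0', 'X-RateLimit-Reset': '1'})
print(results[5])  # (429, {'Retry-After': '1'})
```

Only requests that pass every window are counted, so a request rejected by the per-second limit does not consume hourly quota. That is a design choice for this sketch; some limiters count rejected requests as well.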
Related terms
Fixed window rate limiting
Counts requests in discrete, non-overlapping time buckets.
Sliding window rate limiting
Maintains a rolling count of requests over the most recent N seconds.
Token bucket
Accumulates tokens at a fixed rate; each request consumes tokens.
API throttling vs rate limiting
Rate limiting rejects excess requests; throttling delays them to preserve availability.