Global vs per-consumer rate limit

A global rate limit applies a single counter shared across all clients: the API allows N total requests per window, no matter who sends them. It is typically implemented as a system-wide throttle that protects backend capacity from aggregate traffic. A per-consumer rate limit, by contrast, keeps a separate counter for each client. The two are often combined to provide both fairness and overall system protection.
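The combination described above can be sketched as a fixed-window limiter that keeps one shared counter plus one counter per consumer. This is a minimal in-memory illustration, not a production design: the class name `TwoTierLimiter`, its parameters, and the injectable `clock` argument are all hypothetical, and a real deployment would usually keep the counters in a shared store so that every API instance sees the same totals.

```python
import time
from collections import defaultdict


class TwoTierLimiter:
    """Fixed-window limiter: one shared counter plus one counter per consumer."""

    def __init__(self, global_limit, per_consumer_limit,
                 window_seconds=1.0, clock=time.monotonic):
        self.global_limit = global_limit              # N total requests per window
        self.per_consumer_limit = per_consumer_limit  # requests per consumer per window
        self.window_seconds = window_seconds
        self.clock = clock                            # injectable so tests can control time
        self.window_start = clock()
        self.global_count = 0
        self.consumer_counts = defaultdict(int)

    def _roll_window(self):
        # Start a fresh window (and reset every counter) once the current one expires.
        now = self.clock()
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.global_count = 0
            self.consumer_counts.clear()

    def allow(self, consumer_id):
        self._roll_window()
        # Per-consumer check first, so a consumer that is over its own quota
        # never uses up any of the shared budget.
        if self.consumer_counts[consumer_id] >= self.per_consumer_limit:
            return False
        # Global check: reject once aggregate traffic reaches the shared limit.
        if self.global_count >= self.global_limit:
            return False
        self.consumer_counts[consumer_id] += 1
        self.global_count += 1
        return True
```

A request is admitted only if both counters have room. The per-consumer counter provides fairness and the shared counter provides overall protection.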

Why it matters in practice

Global rate limits prevent the system from being overwhelmed even when every client stays within its individual quota. Without a global limit, 100 clients each allowed 100 requests per second could collectively send 10,000 requests per second, which may overwhelm the backend even though each client respects its per-consumer limit. Global limits are essential for protecting shared resources such as databases, third-party APIs, or expensive computation.
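The arithmetic above can be written out as a quick capacity check. The helper names and the backend capacity of 2,000 requests per second are assumptions chosen for illustration; only the 100 clients and 100 requests per second come from the example in the text.

```python
def worst_case_load(clients, per_consumer_limit):
    """Aggregate requests per second if every client uses its full quota."""
    return clients * per_consumer_limit


def clients_at_full_quota(backend_capacity, per_consumer_limit):
    """How many clients can run at full quota before the backend is saturated."""
    return backend_capacity // per_consumer_limit


# Figures from the text: 100 clients, each allowed 100 requests per second.
load = worst_case_load(clients=100, per_consumer_limit=100)

# Hypothetical backend capacity, assumed only for this example.
backend_capacity = 2_000

print(f"worst-case aggregate load: {load} req/s")
print(f"backend capacity:          {backend_capacity} req/s")
print(f"clients at full quota the backend can absorb: "
      f"{clients_at_full_quota(backend_capacity, 100)}")
```

Under these assumed numbers the per-consumer limits alone would admit five times what the backend can serve. A global limit closes that gap.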

Common mistakes

  • Setting global limits too low, which throttles traffic artificially even when the system has spare capacity. Base global limits on measured system capacity, not arbitrary quotas.
  • Not accounting for burst capacity in global limits. Global limits need burst tolerance, just like per-consumer limits, to handle legitimate traffic spikes.
  • Using only global limits without per-consumer limits. This lets a single misbehaving client consume the entire global quota and block other legitimate users.
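One way to address the last two mistakes together is a token bucket at each tier: the bucket size sets the burst tolerance, and the refill rate sets the sustained limit. The sketch below is a hedged, in-memory illustration. `TokenBucket`, `BurstAwareLimiter`, and all parameter names are hypothetical, and the rates used in the usage note are arbitrary.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, holds at most `burst` tokens."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.last = clock()

    def available(self):
        # Refill lazily based on elapsed time, capped at the burst size.
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        return self.tokens >= 1

    def take(self):
        self.tokens -= 1


class BurstAwareLimiter:
    """Global token bucket combined with one token bucket per consumer."""

    def __init__(self, global_rate, global_burst,
                 consumer_rate, consumer_burst, clock=time.monotonic):
        self.clock = clock
        self.global_bucket = TokenBucket(global_rate, global_burst, clock)
        self.consumer_rate = consumer_rate
        self.consumer_burst = consumer_burst
        self.consumer_buckets = {}

    def allow(self, consumer_id):
        bucket = self.consumer_buckets.get(consumer_id)
        if bucket is None:
            bucket = TokenBucket(self.consumer_rate, self.consumer_burst, self.clock)
            self.consumer_buckets[consumer_id] = bucket
        # Consume only when BOTH tiers have a token, so a request rejected by
        # one tier does not waste a token from the other.
        if not (bucket.available() and self.global_bucket.available()):
            return False
        bucket.take()
        self.global_bucket.take()
        return True
```

For example, `BurstAwareLimiter(global_rate=10, global_burst=20, consumer_rate=2, consumer_burst=5)` lets any single consumer burst to 5 requests but never beyond, so one noisy client cannot drain the 20-token global bucket by itself.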