dkduckkit.dev

N+1 query problem

Latency & SRE

The N+1 query problem occurs when code executes one query to fetch a list of N items, then executes N additional queries to fetch associated data for each item individually. The result is N+1 database round-trips instead of the optimal 1 or 2. At small N the problem is invisible in development; at production scale with N=100 and 1 ms per query, it consumes 100 ms of latency budget.

Formula

Example: Fetching 50 orders then querying each order's customer in a loop: SELECT * FROM orders (1 query) + SELECT * FROM customers WHERE id = ? × 50 = 51 queries, 51 ms at 1 ms each — vs. SELECT orders.*, customers.* FROM orders JOIN customers = 1 query, ~2 ms.

Why it matters in practice

N+1 is one of the most common latency killers in web services, particularly in applications using ORM frameworks (Hibernate, ActiveRecord, Prisma) where lazy loading hides the query count. It is invisible in unit tests and development environments (small datasets, fast local databases) but catastrophic in production with hundreds of records and a remote database. The fix is always to batch: use JOINs, WHERE id IN (...) queries, or DataLoader-style batching.

Common mistakes

  • Not enabling query logging in development — N+1 problems are invisible without seeing the actual SQL executed per request.
  • Using lazy loading as the default ORM strategy — eager loading or explicit batching should be the default for list endpoints.
  • Fixing N+1 by adding a cache in front of the loop — caching the inner query still makes N cache calls; the correct fix is to eliminate the N calls.