
Latency Numbers Every Platform Engineer Should Know in 2026

Updated Jeff Dean latency numbers for cloud-native 2026: NVMe, cross-AZ Kafka, Lambda cold starts, and why N+1 beats network as the top latency killer.

In 2010, Jeff Dean's "Latency Numbers Every Programmer Should Know" gave the industry a shared mental model — the vast performance gulf between L1 cache hits and disk seeks made visceral by raw nanosecond comparisons. Colin Scott's interactive version later showed how these numbers evolved over time.

Those numbers assumed a single rack. In 2026 we build in the cloud: abstracted services, cross-AZ replication, serverless runtimes. The latency numbers every programmer should know need a cloud-native update.

The updated table for 2026

Values distilled from 2025–2026 AWS and GCP performance documentation, AnandTech CPU benchmarks, and Cloudflare network data. These are the speed limits of our profession.

| Operation | Latency | Context |
| --- | --- | --- |
| L1 cache hit | 1 ns | Modern ARM/x86 performance cores |
| L2 cache hit | 4 ns | 4× slower than L1 |
| RAM access | 100 ns | DDR5/DDR6 typical load-to-use |
| NVMe SSD random read | 15–25 μs | Gen5 enterprise NVMe |
| Redis GET (same AZ) | 150–300 μs | Over TCP/TLS within VPC |
| PostgreSQL indexed query | 1–2 ms | Single row, hot cache |
| Kafka produce (same DC) | 3–5 ms | p99, acks=all |
| Kafka produce (cross-AZ) | 10–15 ms | p99 with synchronous replication |
| HTTP (same AZ) | 1 ms | Internal microservice call |
| HTTP (cross-region EU) | 25–40 ms | e.g. London → Frankfurt |
| HTTP (transatlantic) | 70–90 ms | e.g. NYC → London |
| Lambda cold start (Node.js) | 150–400 ms | VPC-attached, standard init |
| Lambda cold start (JVM) | 800–1500 ms | Without SnapStart |
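These numbers become actionable once you sum them along a request path and compare against an SLA. A minimal sketch of that arithmetic, using reference figures from the table above (the hop names and `fitsBudget` helper are illustrative, not from any library):

```typescript
// Reference hop latencies in milliseconds, taken from the table above.
const hops: Record<string, number> = {
  httpSameAz: 1,
  redisGet: 0.3,           // 300 μs, upper end of the Redis range
  postgresIndexed: 2,      // upper end of 1–2 ms
  kafkaProduceCrossAz: 15, // p99 upper end with synchronous replication
};

// Sum the hops on a request path and check it against an SLA budget.
function fitsBudget(path: string[], slaMs: number): boolean {
  const totalMs = path.reduce((sum, hop) => sum + (hops[hop] ?? 0), 0);
  return totalMs <= slaMs;
}

// An API call that touches Redis, Postgres, and Kafka behind one HTTP hop:
const requestPath = ["httpSameAz", "redisGet", "postgresIndexed", "kafkaProduceCrossAz"];
// Total: 1 + 0.3 + 2 + 15 = 18.3 ms — tight against a 20 ms SLA.
```

Notice how a single cross-AZ Kafka produce dominates the budget: one hop choice can cost more than every cache and database access combined.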

N+1 beats network as the #1 latency killer

The most common SLA breach in 2026 is not a slow network hop — it's the N+1 query pattern. In a distributed system, chatty interfaces are compounded by the speed of light.

50 serial database calls at 1ms each = 50ms floor before any business logic runs. That's slower than a single cross-region network hop.

The problem:

```typescript
async function getOrderDetails(orderId: string) {
  const order = await db.orders.findById(orderId);   // 1ms

  const items = [];
  for (const itemId of order.itemIds) {
    const item = await db.items.findById(itemId);    // 1ms × N = N ms
    items.push(item);
  }
  return { order, items };
}
```

The fix:

```typescript
async function getOrderDetailsOptimized(orderId: string) {
  const order = await db.orders.findById(orderId);   // 1ms

  // Single round-trip regardless of order size
  const items = await db.items.findMany({
    where: { id: { in: order.itemIds } },
  });                                                 // ~2ms

  return { order, items };
}
```

The optimized version cuts database round-trips from O(N) to O(1), so latency no longer scales with order size. Understanding the latency numbers every programmer should know helps you spot these patterns during code review, before they breach production SLAs.
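When the call sites can't all be rewritten, the same win is available from a batching layer in the style of Facebook's DataLoader: per-item lookups issued within one microtask tick are collapsed into a single batched query. A rough sketch of the idea, assuming a Node-like runtime (this is not the `dataloader` package's actual API):

```typescript
// Collect ids requested in the same microtask tick, then issue ONE batched fetch.
function createBatcher<T>(batchFetch: (ids: string[]) => Promise<Map<string, T>>) {
  let pending: Array<{ id: string; resolve: (value: T | undefined) => void }> = [];

  return function load(id: string): Promise<T | undefined> {
    return new Promise((resolve) => {
      pending.push({ id, resolve });
      if (pending.length === 1) {
        // Flush after every caller in the current tick has enqueued its id.
        queueMicrotask(async () => {
          const batch = pending;
          pending = [];
          const results = await batchFetch(batch.map((p) => p.id));
          for (const p of batch) p.resolve(results.get(p.id));
        });
      }
    });
  };
}
```

Wrapping `db.items.findById` behind a batcher like this turns N concurrent 1 ms lookups into one ~2 ms round-trip without touching the calling code.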

The physics floor: why cross-continent will never be sub-50ms

Speed of light in a vacuum: 299,792 km/s. In standard single-mode optical fibre (refractive index ≈ 1.467):

v = c / n = 299,792 / 1.467 ≈ 204,357 km/s

New York → Tokyo great-circle distance: ~10,850 km.

One-way: 10,850 / 204,357 ≈ 53ms
Minimum RTT: 53ms × 2 = 106ms

This is the physics floor — a perfectly straight fibre cable with zero routers. In practice, cable routing and signal regenerators push real-world RTT to 140–160ms. No HTTP/3 or QUIC optimisation breaks this limit. It's physics, not engineering.
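The same arithmetic generalises to any route. A small sketch, using the constants from the text (the function name and city distance are illustrative):

```typescript
const SPEED_OF_LIGHT_KM_S = 299_792;
const FIBRE_REFRACTIVE_INDEX = 1.467; // standard single-mode fibre

// Minimum round-trip time in ms over a perfectly straight fibre of `km` kilometres.
function physicsFloorRttMs(km: number): number {
  const fibreSpeedKmPerMs = SPEED_OF_LIGHT_KM_S / FIBRE_REFRACTIVE_INDEX / 1000;
  return (2 * km) / fibreSpeedKmPerMs;
}

const nycToTokyoKm = 10_850;
const floorMs = physicsFloorRttMs(nycToTokyoKm); // ≈ 106 ms, before routing and regeneration
```

Run this against your own region pairs when setting cross-region SLAs: any target below the physics floor is unachievable by definition.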

Lambda cold starts in your latency budget

Serverless is the default for event-driven architectures in 2026. Cold starts remain the tail-latency problem: when a function hasn't been invoked recently, the provider must provision a Firecracker microVM, download the code, and initialise the runtime before handling the request.

For Node.js or Python this is usually acceptable (< 400ms). For the JVM it can exceed 1.5 seconds — deal-breaking for synchronous UI paths.

AWS Lambda SnapStart mitigates JVM cold starts by snapshotting the initialised Firecracker VM. AWS documentation reports up to 10× reduction in startup latency for Java functions.

When building a latency budget, account for p99.9. If 0.1% of requests trigger a 1.5s cold start, that tail is visible to users when the Lambda sits on a synchronous UI path.
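The back-of-envelope version: if a fraction f of invocations hit a cold start, every percentile above 1 − f is dominated by cold-start latency. A crude two-point sketch of that arithmetic (an illustrative model, not a real latency distribution):

```typescript
// Latency observed at percentile p, given a cold-start rate and the
// warm/cold latencies. Two-point model: requests are either warm or cold.
function latencyAtPercentile(
  p: number,         // e.g. 0.999 for p99.9
  coldRate: number,  // fraction of invocations that hit a cold start
  warmMs: number,
  coldMs: number,
): number {
  return p > 1 - coldRate ? coldMs : warmMs;
}

// JVM Lambda, 0.1% cold starts, 20 ms warm, 1500 ms cold:
latencyAtPercentile(0.9995, 0.001, 20, 1500); // p99.95 sees the full cold start
latencyAtPercentile(0.99, 0.001, 20, 1500);   // p99 still looks warm
```

This is why a dashboard showing a healthy p99 can hide a cold-start problem that only surfaces at p99.9 and beyond.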

For an interactive architecture calculator that shows where your latency budget is actually going, use the System Latency Budget Calculator.
