
Latency Numbers Every Platform Engineer Should Know in 2026

Updated Jeff Dean latency numbers for cloud-native 2026: NVMe, cross-AZ Kafka, Lambda cold starts, and why N+1 beats network as the top latency killer.

In 2010, Jeff Dean's "Latency Numbers Every Programmer Should Know" gave the industry a shared mental model — the vast performance gulf between L1 cache hits and disk seeks made visceral by raw nanosecond comparisons. Colin Scott's interactive version later showed how these numbers evolved over time.

Those numbers assumed a single rack. In 2026 we build in the cloud: abstracted services, cross-AZ replication, serverless runtimes. The latency numbers every programmer should know need a cloud-native update.

The updated table for 2026

Values distilled from 2025–2026 AWS and GCP performance documentation, AnandTech CPU benchmarks, and Cloudflare network data. These are the speed limits of our profession.

| Operation | Latency | Context |
| --- | --- | --- |
| L1 cache hit | 1 ns | Modern ARM/x86 performance cores |
| L2 cache hit | 4 ns | 4× slower than L1 |
| RAM access | 100 ns | DDR5/DDR6 typical load-to-use |
| NVMe SSD random read | 15–25 μs | Gen5 enterprise NVMe |
| Redis GET (same AZ) | 150–300 μs | Over TCP/TLS within VPC |
| PostgreSQL indexed query | 1–2 ms | Single row, hot cache |
| Kafka produce (same DC) | 3–5 ms | p99, acks=all |
| Kafka produce (cross-AZ) | 10–15 ms | p99 with synchronous replication |
| HTTP (same AZ) | 1 ms | Internal microservice call |
| HTTP (cross-region EU) | 25–40 ms | e.g. London → Frankfurt |
| HTTP (transatlantic) | 70–90 ms | e.g. NYC → London |
| Lambda cold start (Node.js) | 150–400 ms | VPC-attached, standard init |
| Lambda cold start (JVM) | 800–1500 ms | Without SnapStart |
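These numbers become actionable once you sum them along a request path and compare against an SLA. A minimal sketch of that arithmetic, using reference figures from the table above (the hop names and `fitsBudget` helper are illustrative, not from any library):

```typescript
// Reference hop latencies in milliseconds, taken from the table above.
const hops: Record<string, number> = {
  httpSameAz: 1,
  redisGet: 0.3,           // 300 μs, upper end of the Redis range
  postgresIndexed: 2,      // upper end of 1–2 ms
  kafkaProduceCrossAz: 15, // p99 upper end with synchronous replication
};

// Sum the hops on a request path and check it against an SLA budget.
function fitsBudget(path: string[], slaMs: number): boolean {
  const totalMs = path.reduce((sum, hop) => sum + (hops[hop] ?? 0), 0);
  return totalMs <= slaMs;
}

// An API call that touches Redis, Postgres, and Kafka behind one HTTP hop:
const requestPath = ["httpSameAz", "redisGet", "postgresIndexed", "kafkaProduceCrossAz"];
// Total: 1 + 0.3 + 2 + 15 = 18.3 ms — tight against a 20 ms SLA.
```

Notice how a single cross-AZ Kafka produce dominates the budget: one hop choice can cost more than every cache and database access combined.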

N+1 beats network as the #1 latency killer

The most common SLA breach in 2026 is not a slow network hop — it's the N+1 query pattern. In a distributed system, chatty interfaces are compounded by the speed of light.

50 serial database calls at 1ms each = 50ms floor before any business logic runs. That's slower than a single cross-region network hop.

The problem:

```typescript
async function getOrderDetails(orderId: string) {
  const order = await db.orders.findById(orderId);   // 1ms

  const items = [];
  for (const itemId of order.itemIds) {
    const item = await db.items.findById(itemId);    // 1ms × N = N ms
    items.push(item);
  }
  return { order, items };
}
```

The fix:

```typescript
async function getOrderDetailsOptimized(orderId: string) {
  const order = await db.orders.findById(orderId);   // 1ms

  // Single round-trip regardless of order size
  const items = await db.items.findMany({
    where: { id: { in: order.itemIds } },
  });                                                 // ~2ms

  return { order, items };
}
```

The optimized version cuts database round-trips from O(N) to O(1), so latency no longer scales with order size. Understanding the latency numbers every programmer should know helps you spot these patterns during code review, before they breach production SLAs.
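When the call sites can't all be rewritten, the same win is available from a batching layer in the style of Facebook's DataLoader: per-item lookups issued within one microtask tick are collapsed into a single batched query. A rough sketch of the idea, assuming a Node-like runtime (this is not the `dataloader` package's actual API):

```typescript
// Collect ids requested in the same microtask tick, then issue ONE batched fetch.
function createBatcher<T>(batchFetch: (ids: string[]) => Promise<Map<string, T>>) {
  let pending: Array<{ id: string; resolve: (value: T | undefined) => void }> = [];

  return function load(id: string): Promise<T | undefined> {
    return new Promise((resolve) => {
      pending.push({ id, resolve });
      if (pending.length === 1) {
        // Flush after every caller in the current tick has enqueued its id.
        queueMicrotask(async () => {
          const batch = pending;
          pending = [];
          const results = await batchFetch(batch.map((p) => p.id));
          for (const p of batch) p.resolve(results.get(p.id));
        });
      }
    });
  };
}
```

Wrapping `db.items.findById` behind a batcher like this turns N concurrent 1 ms lookups into one ~2 ms round-trip without touching the calling code.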

The physics floor: why cross-continent will never be sub-50ms

Speed of light in a vacuum: 299,792 km/s. In standard single-mode optical fibre (refractive index ≈ 1.467):

v = c / n = 299,792 / 1.467 ≈ 204,357 km/s

New York → Tokyo great-circle distance: ~10,850 km.

One-way: 10,850 / 204,357 ≈ 53ms
Minimum RTT: 53ms × 2 = 106ms

This is the physics floor — a perfectly straight fibre cable with zero routers. In practice, cable routing and signal regenerators push real-world RTT to 140–160ms. No HTTP/3 or QUIC optimisation breaks this limit. It's physics, not engineering.
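The same arithmetic generalises to any route. A small sketch, using the constants from the text (the function name and city distance are illustrative):

```typescript
const SPEED_OF_LIGHT_KM_S = 299_792;
const FIBRE_REFRACTIVE_INDEX = 1.467; // standard single-mode fibre

// Minimum round-trip time in ms over a perfectly straight fibre of `km` kilometres.
function physicsFloorRttMs(km: number): number {
  const fibreSpeedKmPerMs = SPEED_OF_LIGHT_KM_S / FIBRE_REFRACTIVE_INDEX / 1000;
  return (2 * km) / fibreSpeedKmPerMs;
}

const nycToTokyoKm = 10_850;
const floorMs = physicsFloorRttMs(nycToTokyoKm); // ≈ 106 ms, before routing and regeneration
```

Run this against your own region pairs when setting cross-region SLAs: any target below the physics floor is unachievable by definition.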

Lambda cold starts in your latency budget

Serverless is the default for event-driven architectures in 2026. Cold starts remain the tail-latency problem: when a function hasn't been invoked recently, the provider must provision a Firecracker microVM, download the code, and initialise the runtime before handling the request.

For Node.js or Python this is usually acceptable (< 400ms). For the JVM it can exceed 1.5 seconds — deal-breaking for synchronous UI paths.

AWS Lambda SnapStart mitigates JVM cold starts by snapshotting the initialised Firecracker VM. AWS documentation reports up to 10× reduction in startup latency for Java functions.

When building a latency budget, account for p99.9. If 0.1% of requests trigger a 1.5s cold start, that tail is visible to users when the Lambda sits on a synchronous UI path.
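The back-of-envelope version: if a fraction f of invocations hit a cold start, every percentile above 1 − f is dominated by cold-start latency. A crude two-point sketch of that arithmetic (an illustrative model, not a real latency distribution):

```typescript
// Latency observed at percentile p, given a cold-start rate and the
// warm/cold latencies. Two-point model: requests are either warm or cold.
function latencyAtPercentile(
  p: number,         // e.g. 0.999 for p99.9
  coldRate: number,  // fraction of invocations that hit a cold start
  warmMs: number,
  coldMs: number,
): number {
  return p > 1 - coldRate ? coldMs : warmMs;
}

// JVM Lambda, 0.1% cold starts, 20 ms warm, 1500 ms cold:
latencyAtPercentile(0.9995, 0.001, 20, 1500); // p99.95 sees the full cold start
latencyAtPercentile(0.99, 0.001, 20, 1500);   // p99 still looks warm
```

This is why a dashboard showing a healthy p99 can hide a cold-start problem that only surfaces at p99.9 and beyond.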

For an interactive architecture calculator that shows where your latency budget is actually going, use the System Latency Budget Calculator.
