Latency Numbers Every Platform Engineer Should Know in 2026
Updated Jeff Dean latency numbers for cloud-native 2026: NVMe, cross-AZ Kafka, Lambda cold starts, and why N+1 beats network as the top latency killer.
In 2010, Jeff Dean's "Latency Numbers Every Programmer Should Know" gave the industry a shared mental model — the vast performance gulf between L1 cache hits and disk seeks made visceral by raw nanosecond comparisons. Colin Scott's interactive version later showed how these numbers evolved over time.
Those numbers assumed a single rack. In 2026 we build in the cloud: abstracted services, cross-AZ replication, serverless runtimes. The latency numbers every programmer should know need a cloud-native update.
The updated table for 2026
Values distilled from 2025–2026 AWS and GCP performance documentation, AnandTech CPU benchmarks, and Cloudflare network data. These are the speed limits of our profession.
| Operation | Latency | Context |
|---|---|---|
| L1 cache hit | 1 ns | Modern ARM/x86 performance cores |
| L2 cache hit | 4 ns | 4× slower than L1 |
| RAM access | 100 ns | DDR5/DDR6 typical load-to-use |
| NVMe SSD random read | 15–25 μs | Gen5 enterprise NVMe |
| Redis GET (same AZ) | 150–300 μs | Over TCP/TLS within VPC |
| PostgreSQL indexed query | 1–2 ms | Single row, hot cache |
| Kafka produce (same DC) | 3–5 ms | p99, acks=all |
| Kafka produce (cross-AZ) | 10–15 ms | p99 with synchronous replication |
| HTTP (same AZ) | 1 ms | Internal microservice call |
| HTTP (cross-region EU) | 25–40 ms | e.g. London → Frankfurt |
| HTTP (transatlantic) | 70–90 ms | e.g. NYC → London |
| Lambda cold start (Node.js) | 150–400 ms | VPC-attached, standard init |
| Lambda cold start (JVM) | 800–1500 ms | Without SnapStart |
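For back-of-envelope budgeting, the reference numbers above can be encoded and summed. A minimal sketch — the hop names and the midpoint values are my own, derived from the table, not a published API:

```typescript
// Reference latencies from the table above, in microseconds
// (midpoints where the table gives a range).
const LATENCY_US: Record<string, number> = {
  redisGet: 225,              // Redis GET, same AZ (150–300 μs)
  pgIndexedQuery: 1500,       // PostgreSQL indexed query (1–2 ms)
  kafkaProduceCrossAz: 12500, // Kafka produce, cross-AZ (10–15 ms)
  httpSameAz: 1000,           // internal microservice call (1 ms)
};

// Naive serial budget: sum every hop the request traverses, in ms.
function serialBudgetMs(hops: string[]): number {
  return hops.reduce((sum, h) => sum + LATENCY_US[h], 0) / 1000;
}

// e.g. API call -> cache miss -> DB query -> Kafka publish
const budget = serialBudgetMs([
  "httpSameAz",
  "redisGet",
  "pgIndexedQuery",
  "kafkaProduceCrossAz",
]);
```

Serial addition is the pessimistic case; hops that run concurrently contribute only their maximum, which is exactly what a latency budget review should make explicit.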
N+1 beats network as the #1 latency killer
The most common SLA breach in 2026 is not a slow network hop — it's the N+1 query pattern. In a distributed system, chatty interfaces are compounded by the speed of light.
50 serial database calls at 1ms each = 50ms floor before any business logic runs. That's slower than a single cross-region network hop.
The problem:
```typescript
async function getOrderDetails(orderId: string) {
  const order = await db.orders.findById(orderId); // 1 ms
  const items = [];
  for (const itemId of order.itemIds) {
    const item = await db.items.findById(itemId); // 1 ms × N = N ms
    items.push(item);
  }
  return { order, items };
}
```

The fix:
```typescript
async function getOrderDetailsOptimized(orderId: string) {
  const order = await db.orders.findById(orderId); // 1 ms
  // Single round-trip regardless of order size
  const items = await db.items.findMany({
    where: { id: { in: order.itemIds } },
  }); // ~2 ms
  return { order, items };
}
```

The optimized version cuts database round-trips from O(N) to O(1). Understanding the latency numbers every programmer should know helps you spot these patterns during code review before they breach production SLAs.
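When the data layer offers no batched lookup, a fallback is to issue the N reads concurrently. A minimal sketch — the in-memory `db` stub below is hypothetical, standing in for the client used in the examples above:

```typescript
// Minimal in-memory stand-in for the db client in the examples above
// (hypothetical; real clients such as Prisma expose similar methods).
const db = {
  orders: {
    findById: async (id: string) => ({ id, itemIds: ["a", "b", "c"] }),
  },
  items: {
    findById: async (id: string) => ({ id }),
  },
};

// Issuing the N reads concurrently caps wall-clock latency at roughly
// one round-trip, though the database still services N queries.
async function getOrderDetailsParallel(orderId: string) {
  const order = await db.orders.findById(orderId);
  const items = await Promise.all(
    order.itemIds.map((id) => db.items.findById(id)),
  );
  return { order, items };
}
```

Prefer the single batched query where available: parallel fan-out hides latency from the caller but still multiplies load on the database.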
The physics floor: why cross-continent will never be sub-50ms
Speed of light in a vacuum: 299,792 km/s. In standard single-mode optical fibre (refractive index ≈ 1.467):
v = c / n = 299,792 / 1.467 ≈ 204,357 km/s
New York → Tokyo great-circle distance: ~10,850 km.
One-way: 10,850 / 204,357 ≈ 53 ms
Minimum RTT: 53 ms × 2 = 106 ms
This is the physics floor — a perfectly straight fibre cable with zero routers. In practice, cable routing and signal regenerators push real-world RTT to 140–160ms. No HTTP/3 or QUIC optimisation breaks this limit. It's physics, not engineering.
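The calculation above generalises to any city pair. A small sketch using the constants from this section:

```typescript
const C_KM_PER_S = 299_792;  // speed of light in a vacuum
const FIBRE_INDEX = 1.467;   // refractive index of single-mode fibre

// Minimum round-trip time in ms over a perfectly straight fibre path
// with zero routers — the physics floor, not a real-world estimate.
function fibreFloorRttMs(distanceKm: number): number {
  const vKmPerS = C_KM_PER_S / FIBRE_INDEX; // ≈ 204,357 km/s
  const oneWayMs = (distanceKm / vKmPerS) * 1000;
  return 2 * oneWayMs;
}

const nycTokyoFloor = fibreFloorRttMs(10_850); // ≈ 106 ms
```

Multiply the floor by roughly 1.3–1.5 for cable routing and regeneration to get a realistic lower bound on observed RTT.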
Lambda cold starts in your latency budget
Serverless is the default for event-driven architectures in 2026. Cold starts remain the tail latency problem — when a Lambda hasn't been used, the provider provisions a Firecracker microVM, downloads the code, and initialises the runtime.
For Node.js or Python this is usually acceptable (< 400ms). For the JVM it can exceed 1.5 seconds — deal-breaking for synchronous UI paths.
AWS Lambda SnapStart mitigates JVM cold starts by snapshotting the initialised Firecracker VM. AWS documentation reports up to 10× reduction in startup latency for Java functions.
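SnapStart is a deployment-time setting. A sketch of the relevant fragment in an AWS SAM template — the resource name and handler are hypothetical:

```yaml
# Hypothetical SAM resource enabling SnapStart for a Java function.
MyJvmFunction:
  Type: AWS::Serverless::Function
  Properties:
    Runtime: java21
    Handler: com.example.Handler::handleRequest
    SnapStart:
      ApplyOn: PublishedVersions  # snapshot taken when a version is published
```

Note that SnapStart applies to published versions, not `$LATEST`, so invocations must target a version or alias to benefit.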
When building a latency budget, account for p99.9. If 0.1% of requests trigger a 1.5s cold start, that tail is visible to users when the Lambda sits on a synchronous UI path.
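To make that concrete, a small simulation — the 5 ms warm-invocation figure is a hypothetical placeholder; the 1.5 s cold start is the table's JVM number:

```typescript
// Synthetic sample: per 1,000 requests, 999 warm JVM invocations
// (assumed 5 ms) and one 1.5 s cold start, matching a 0.1% cold rate.
const samples: number[] = [
  ...Array.from({ length: 999 }, () => 5),
  1500,
];

// Simple floor-index percentile — adequate for a sketch.
function percentile(xs: number[], p: number): number {
  const sorted = [...xs].sort((a, b) => a - b);
  const idx = Math.min(sorted.length - 1, Math.floor((p / 100) * sorted.length));
  return sorted[idx];
}

const p50 = percentile(samples, 50);    // warm path
const p999 = percentile(samples, 99.9); // the cold start surfaces here
```

Median monitoring never sees the cold path; only a p99.9 (or stricter) SLO does, which is why the tail percentile belongs in the budget.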
For an interactive architecture calculator that shows where your latency budget is actually going, use the System Latency Budget Calculator.
Related tool
System Latency Budget Calculator → Build your architecture from reference hop latencies and see instantly whether you fit your SLA.