Kafka Consumer Lag: How to Calculate Time-to-Overflow Before the Incident
A practical guide to consumer lag math: partition bottlenecks, max.poll.interval.ms misconfiguration, and the alerts you need in place before a production incident.
Practical guides for platform engineers — Kafka, APIs, rate limiting, latency.
Updated Jeff Dean latency numbers for cloud-native 2026: NVMe, cross-AZ Kafka, Lambda cold starts, and why N+1 beats network as the top latency killer.
What counts as a breaking change, why request and response schemas follow different rules, and how to version safely without breaking clients.
Fixed window boundary exploit, token bucket burst math, retry storm prevention, and why banking APIs need different configuration than public APIs.
RecordBatch vs Record overhead, why linger.ms=0 kills compression, zstd vs gzip benchmarks, and the replica.fetch.max.bytes silent killer.