Kafka Message Size: How Record V2 Overhead, Compression, and Batching Really Work
RecordBatch vs Record overhead, why linger.ms=0 kills compression, zstd vs gzip benchmarks, and the replica.fetch.max.bytes silent killer.
The most common mistake in Kafka capacity planning is assuming a 1KB JSON payload produces 1KB of disk pressure. The true Kafka message size is a moving target shaped by per-record overhead, batching efficiency, and compression codecs. Getting it wrong doesn't just cause under-provisioned storage; it can stall replication silently and, after a broker failure, lose data permanently.
The overhead you're not accounting for
Since KIP-98 (Kafka 0.11.0), Kafka uses the Record V2 format. Every individual record carries roughly 7–21 bytes of overhead: a length prefix, attributes, timestamp delta, offset delta, key length, value length, and a header count followed by any optional headers. Most of these fields use varint encoding, so the actual size depends on key/value lengths and delta magnitudes.
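To make the varint effect concrete, here is a minimal sketch that estimates the per-record overhead from the field list above. The function names are illustrative, and it assumes zigzag varints for every length and delta field and non-null header values; treat it as an estimate, not a byte-exact reimplementation of the Kafka serializer.

```python
def varint_size(n: int) -> int:
    """Bytes needed for a zigzag-encoded varint (signed 64-bit range assumed)."""
    z = (n << 1) ^ (n >> 63)  # zigzag: small negatives map to small positives
    size = 1
    while z >= 0x80:
        z >>= 7
        size += 1
    return size


def record_overhead(value_len, key_len=None, timestamp_delta=0,
                    offset_delta=0, headers=()):
    """Bytes one record adds on top of its key and value.

    headers is a sequence of (key_bytes, value_bytes) pairs.
    """
    body = 1                                   # attributes (single byte)
    body += varint_size(timestamp_delta)
    body += varint_size(offset_delta)
    body += varint_size(-1 if key_len is None else key_len)  # -1 marks a null key
    body += varint_size(value_len)
    body += varint_size(len(headers))          # header count
    for h_key, h_val in headers:
        body += varint_size(len(h_key)) + len(h_key)
        body += varint_size(len(h_val)) + len(h_val)
    payload = (key_len or 0) + value_len
    length_prefix = varint_size(body + payload)  # record length field
    return length_prefix + body


print(record_overhead(value_len=10))                      # 7
print(record_overhead(value_len=1000, key_len=16,
                      timestamp_delta=5000, offset_delta=500))  # 11
```

A tiny keyless record sits at the 7-byte floor, while larger values, bigger deltas, and headers push the figure upward.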
Records are never sent in isolation. They are wrapped in a RecordBatch with its own 61-byte header containing the base offset, CRC, and producer ID (per KIP-98).
Effective on-disk size per message, before compression and using the 21-byte upper bound for record overhead:
effectiveSize = (payload + 21) + (61 / messagesPerBatch)
For a 100-byte payload in a single-message batch: (100 + 21) + 61 = 182 bytes — 82% overhead. As batch size grows, the 61-byte RecordBatch header is amortised across thousands of records and becomes negligible.
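The amortisation is easy to see numerically. The sketch below simply evaluates the formula above for a few batch sizes; the constant and function names are illustrative, and the 21-byte figure is the upper bound quoted earlier rather than a measured value.

```python
RECORD_OVERHEAD_MAX = 21   # bytes, upper bound for per-record overhead
BATCH_HEADER = 61          # bytes per RecordBatch


def effective_size(payload_bytes, messages_per_batch,
                   record_overhead=RECORD_OVERHEAD_MAX):
    """Uncompressed bytes per message, with the batch header amortised."""
    return (payload_bytes + record_overhead) + BATCH_HEADER / messages_per_batch


def overhead_pct(payload_bytes, messages_per_batch):
    """Overhead as a percentage of the raw payload."""
    size = effective_size(payload_bytes, messages_per_batch)
    return 100 * (size - payload_bytes) / payload_bytes


for n in (1, 10, 100, 1000):
    print(f"{n:>5} msgs/batch: {effective_size(100, n):7.2f} bytes, "
          f"{overhead_pct(100, n):5.1f}% overhead")
```

The single-message row reproduces the 182-byte, 82% figure; by 100 messages per batch the batch header contributes well under one byte per message and the per-record overhead dominates.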
Why compression needs big batches
A common complaint in Kafka performance work is "compression isn't helping." The root cause is almost always insufficient batching. Kafka compression is a batch-level operation, not message-level — the codec needs a large sample to identify repeated patterns.
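A quick experiment illustrates why the codec needs a batch to work with. The sketch below uses Python's standard-library zlib (the DEFLATE algorithm behind gzip) as a stand-in for Kafka's codecs, and the event shape is made up for illustration, so the exact percentages will not match zstd or LZ4 on real traffic. The gap between the two cases is the point.

```python
import json
import zlib


def make_event(i):
    """A hypothetical JSON event with repeated keys and varying values."""
    return json.dumps({
        "user_id": 1000 + i,
        "event": "page_view",
        "url": f"/products/{i % 50}",
        "ts": 1700000000 + i,
    }).encode("utf-8")


def savings(raw: bytes) -> float:
    """Fraction of bytes saved by compressing raw as one unit."""
    compressed = zlib.compress(raw, 6)
    return 1 - len(compressed) / len(raw)


single = make_event(0)                               # a single-message "batch"
batch = b"".join(make_event(i) for i in range(500))  # a 500-message "batch"

single_savings = savings(single)
batch_savings = savings(batch)
print(f"single message:    {single_savings:.0%} saved")
print(f"500-message batch: {batch_savings:.0%} saved")
```

In the single-message case the codec sees each key name only once, so there is little to deduplicate; in the batch the same keys recur hundreds of times.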
The critical lever is linger.ms. At linger.ms=0 (the default before Kafka 4.0, which raised it to 5ms), the producer sends as soon as the sender thread is available. In low-throughput environments this often produces single-message batches where compression saves less than 5% while still incurring full CPU cost. In high-concurrency environments, natural batching still occurs even at linger.ms=0 as records accumulate while prior sends complete, but batch sizes remain unpredictable.
Increase linger.ms to 20ms and, given enough throughput and a batch.size large enough to hold them, batches can reliably accumulate 100–1,000 messages. Zstd can then achieve 65–70% savings on standard JSON payloads by compressing repeated keys across the entire batch. The trade-off is up to 20ms of added producer latency, which is usually worthwhile for storage and network savings but unsuitable for real-time systems where every millisecond of publish latency matters.
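As a rough sketch of what such a producer configuration might look like, the dictionary below uses standard Kafka producer property names. The 128 KiB batch.size is an assumption chosen for illustration (the Java producer's default is 16384 bytes), and the helper is a hypothetical back-of-envelope bound that ignores compression.

```python
# Hypothetical starting point; tune against your own latency budget.
producer_config = {
    "compression.type": "zstd",  # needs Kafka 2.1.0+ on brokers and clients
    "linger.ms": 20,             # wait up to 20ms for a batch to fill
    "batch.size": 131072,        # assumed 128 KiB; the default is 16384
}

RECORD_OVERHEAD_MAX = 21  # bytes, upper bound for per-record overhead


def max_messages_per_batch(batch_size_bytes, payload_bytes):
    """Rough ceiling on records per batch, ignoring compression."""
    return batch_size_bytes // (payload_bytes + RECORD_OVERHEAD_MAX)


print(max_messages_per_batch(16384, 1024))                          # 15
print(max_messages_per_batch(producer_config["batch.size"], 1024))  # 125
```

With 1KB payloads, the default batch.size fills after roughly 15 records no matter how long the producer lingers, so raising linger.ms alone may not be enough to reach the batch sizes where compression pays off.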
The replica.fetch.max.bytes silent killer
This is one of the most dangerous Kafka misconfigurations, especially in older versions. The sequence:
- You increase message.max.bytes on the broker and max.request.size on the producer to handle a large payload (say, 5MB)
- Producer sends the message and receives a success ACK
- replica.fetch.max.bytes is still at its default of 1MB
- Follower brokers attempt to replicate but are capped; replication stalls
- Replication lag for that partition grows indefinitely
- Leader broker fails; Kafka triggers an election
- Under-replicated data may be lost if no in-sync replica has the message
Modern Kafka versions allow oversized batches to proceed in fetch requests to ensure forward progress, but the misconfiguration still degrades replication performance, increases inter-broker latency, and risks data loss during broker failures. Always align your fetch limits with your produce limits.
Following the guidance in the Apache Kafka documentation, fetch limits should always be greater than or equal to produce limits. The full parameter chain:
max.request.size (Producer)
≤ message.max.bytes (Broker/Topic)
≤ replica.fetch.max.bytes (Broker Replication)
≤ max.partition.fetch.bytes (Consumer)
If any link is smaller than the ones before it, you risk either producer rejections (catchable) or the silent replication failure described above (not catchable at write time).
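A check like this is cheap to automate before a config change ships. The sketch below is a hypothetical helper, not part of any Kafka tooling: it takes the four settings as a plain dictionary and reports every link that is smaller than the largest limit before it in the chain.

```python
CHAIN = [
    "max.request.size",           # producer
    "message.max.bytes",          # broker/topic
    "replica.fetch.max.bytes",    # broker replication
    "max.partition.fetch.bytes",  # consumer
]


def check_size_chain(settings):
    """Return a description of each link that is smaller than an earlier one."""
    problems = []
    largest = CHAIN[0]
    for name in CHAIN[1:]:
        if settings[name] < settings[largest]:
            problems.append(
                f"{name}={settings[name]} is below {largest}={settings[largest]}"
            )
        else:
            largest = name
    return problems


MB = 1024 * 1024
# The scenario described above: produce limits raised, fetch limits left at 1MB.
risky = {
    "max.request.size": 5 * MB,
    "message.max.bytes": 5 * MB,
    "replica.fetch.max.bytes": 1 * MB,
    "max.partition.fetch.bytes": 1 * MB,
}
for problem in check_size_chain(risky):
    print(problem)
```

For the risky configuration this flags both the replication and consumer fetch limits; once all four values are aligned it returns an empty list.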
Choosing the right compression codec
The figures below are approximate benchmarks from LinkedIn Engineering and Confluent's internal testing on structured JSON workloads. Your actual savings depend heavily on payload structure: highly variable JSON with unique values will see lower compression ratios.
| Codec | Savings (JSON) | CPU cost | Min Kafka version |
|---|---|---|---|
| None | 0% | None | N/A |
| Snappy | 33–50% | Very low | 0.8+ |
| LZ4 | 33–44% | Lowest | 0.8.2+ |
| Gzip | 60–67% | High | 0.8+ |
| Zstd | 67–75% | Moderate | 2.1.0+ |
For modern workloads: LZ4 is the performance default (highest throughput per CPU cycle); Zstd is best for storage-constrained environments. Avoid Gzip at high throughput — it often becomes a producer CPU bottleneck before the network does.
Note: these savings only materialise with adequate batch sizes. When batches shrink to a single message, as often happens at linger.ms=0 under low throughput, every codec's effective savings fall toward 0%.
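To turn the table into a per-message storage estimate, the sketch below combines the savings ranges with the 21-byte record overhead and 61-byte batch header from earlier. It makes two simplifying assumptions: the midpoint of each range is used, and the savings are applied to the whole record (payload plus overhead) while the batch header stays uncompressed. The names are illustrative.

```python
SAVINGS_RANGE = {  # (low, high) fractions taken from the table above
    "none": (0.0, 0.0),
    "snappy": (0.33, 0.50),
    "lz4": (0.33, 0.44),
    "gzip": (0.60, 0.67),
    "zstd": (0.67, 0.75),
}
RECORD_OVERHEAD_MAX = 21  # bytes per record, upper bound
BATCH_HEADER = 61         # bytes per RecordBatch, not compressed


def stored_bytes_per_message(payload_bytes, messages_per_batch, codec):
    """Estimated on-disk bytes per message at the midpoint savings."""
    low, high = SAVINGS_RANGE[codec]
    savings = (low + high) / 2
    records = (payload_bytes + RECORD_OVERHEAD_MAX) * (1 - savings)
    return records + BATCH_HEADER / messages_per_batch


for codec in SAVINGS_RANGE:
    size = stored_bytes_per_message(1000, 500, codec)
    print(f"{codec:>6}: ~{size:.0f} bytes per 1000-byte message")
```

Multiply the result by message rate, retention period, and replication factor to get a first-pass storage figure.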
When to use Claim Check
Kafka is a log, not a file system. While it can technically handle large messages, payloads above ~1MB increase JVM heap pressure, bloat segment files, and slow replication.
The Claim Check pattern:
- Producer uploads large binary to object storage (S3, GCS)
- Producer sends a Kafka message containing only the URI and metadata
- Consumer fetches the payload directly from object storage
This keeps Kafka segments lean and the cluster responsive for low-latency coordination. Use it for payloads above 1MB, binary assets (images, ML weights), and data that doesn't need Kafka's retention semantics.
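The three steps above can be sketched without any real infrastructure. In the example below an in-memory dictionary stands in for object storage and a list stands in for the Kafka topic; the bucket name, message fields, and threshold constant are all hypothetical, and a real implementation would swap in an object-store client and a Kafka producer and consumer.

```python
import hashlib

CLAIM_CHECK_THRESHOLD = 1_000_000  # bytes; the ~1MB guideline above

object_store = {}  # stand-in for S3/GCS: URI -> bytes
topic = []         # stand-in for a Kafka topic


def produce(payload: bytes, content_type: str) -> dict:
    """Send small payloads inline; park large ones in object storage."""
    if len(payload) <= CLAIM_CHECK_THRESHOLD:
        message = {"inline": payload, "content_type": content_type}
    else:
        digest = hashlib.sha256(payload).hexdigest()
        uri = f"s3://example-bucket/payloads/{digest}"  # hypothetical location
        object_store[uri] = payload
        message = {
            "uri": uri,
            "size": len(payload),
            "sha256": digest,
            "content_type": content_type,
        }
    topic.append(message)
    return message


def consume(message: dict) -> bytes:
    """Resolve a message to its payload, fetching from storage if needed."""
    if "inline" in message:
        return message["inline"]
    payload = object_store[message["uri"]]
    if hashlib.sha256(payload).hexdigest() != message["sha256"]:
        raise ValueError("payload does not match claim check digest")
    return payload


large = bytes(2_000_000)
claim = produce(large, "application/octet-stream")
print(sorted(claim))              # only the URI and metadata travel via Kafka
print(len(consume(claim)))
```

Carrying a digest in the claim check is one design choice among several; it lets the consumer detect an object that was overwritten or truncated after the message was published.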
For a precise calculation of your cluster's storage, bandwidth, and replica.fetch.max.bytes requirements based on your specific replication factor and retention settings, use the Kafka Message Size Calculator.