Kafka message size: Record V2 & compression

The most common mistake in Kafka capacity planning is assuming that a 1KB JSON payload produces 1KB of disk pressure. The true Kafka message size is a moving target shaped by record headers, batching efficiency, and compression codecs. Getting it wrong doesn't just cause under-provisioned storage; it can also cause silent replication failure and permanent data loss.

The overhead you're not accounting for

Since Kafka 0.11 (KIP-98), Kafka uses the Record V2 format. Every individual record carries a typical overhead of approximately 7–21 bytes: record length, attributes, timestamp delta, offset delta, key length, value length, and a header count, plus any optional headers. Most of these fields use varint encoding, so the actual size depends on key/value lengths and delta magnitudes.
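To make the varint effect concrete, here is a minimal sketch that estimates per-record overhead for a record without headers. The field order and the zigzag varint encoding follow my reading of the Kafka record format documentation; the function names are hypothetical and this is not code from the Kafka client.

```python
def zigzag(n: int) -> int:
    """Map a signed integer to an unsigned one so small magnitudes stay small."""
    return (n << 1) ^ (n >> 63)


def varint_size(n: int) -> int:
    """Bytes needed to store a signed integer as a zigzag varint (7 bits per byte)."""
    value = zigzag(n)
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def record_overhead(key_len: int, value_len: int,
                    offset_delta: int = 0, timestamp_delta: int = 0) -> int:
    """Estimate the non-payload bytes of one record with no headers.

    key_len is -1 for a null key (assumption: same convention as the wire format).
    """
    fields = (
        1                               # attributes (fixed int8)
        + varint_size(timestamp_delta)  # timestamp delta
        + varint_size(offset_delta)     # offset delta
        + varint_size(key_len)          # key length
        + varint_size(value_len)        # value length
        + varint_size(0)                # header count (zero headers)
    )
    body = fields + max(key_len, 0) + value_len
    return fields + varint_size(body)   # plus the leading record-length varint


small = record_overhead(key_len=-1, value_len=10)
larger = record_overhead(key_len=-1, value_len=100)
```

Under these assumptions a tiny null-key record sits at the low end of the range, and the overhead grows by a byte each time a length or delta crosses a varint boundary.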

Records are never sent in isolation. They are wrapped in a RecordBatch with its own 61-byte header containing the base offset, CRC, and producer ID (per KIP-98).

Effective on-disk size per message:

effectiveSize = (payload + 21) + (61 / messagesPerBatch)

For a 100-byte payload in a single-message batch: (100 + 21) + 61 = 182 bytes, or 82% overhead. As the number of records per batch grows, the 61-byte RecordBatch header is amortised across them and becomes negligible, leaving only the per-record overhead.
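The formula above can be sketched as a small helper. It uses the article's upper-bound figure of 21 bytes per record and the 61-byte batch header; the function names are illustrative only.

```python
RECORD_OVERHEAD_BYTES = 21   # upper end of the per-record range used in the formula
BATCH_HEADER_BYTES = 61      # RecordBatch header


def effective_size(payload_bytes: int, messages_per_batch: int) -> float:
    """Approximate uncompressed on-disk bytes attributable to one message."""
    if messages_per_batch < 1:
        raise ValueError("a batch holds at least one message")
    return (payload_bytes + RECORD_OVERHEAD_BYTES) + BATCH_HEADER_BYTES / messages_per_batch


def overhead_pct(payload_bytes: int, messages_per_batch: int) -> float:
    """Overhead as a percentage of the raw payload size."""
    extra = effective_size(payload_bytes, messages_per_batch) - payload_bytes
    return extra / payload_bytes * 100


single = effective_size(100, 1)       # the worked example: one message per batch
batched = effective_size(100, 1000)   # the header share shrinks to a fraction of a byte
```

Note that batching only amortises the batch header; the per-record bytes are paid on every message regardless of batch size.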

Why compression needs big batches

A common complaint in Kafka performance work is "compression isn't helping." The root cause is almost always insufficient batching. Kafka compression is a batch-level operation, not message-level — the codec needs a large sample to identify repeated patterns.
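The difference between batch-level and message-level compression can be illustrated outside Kafka. The sketch below uses Python's standard zlib module as a stand-in codec (an assumption for portability; it is not the code path Kafka uses) and a made-up JSON event shape, then compares compressing 500 messages one at a time against compressing them as a single concatenated batch.

```python
import json
import zlib


def make_message(i: int) -> bytes:
    """Build a small JSON event; the field names are hypothetical."""
    event = {
        "user_id": i,
        "event_type": "page_view",
        "country": "DE",
        "timestamp": 1700000000 + i,
    }
    return json.dumps(event).encode("utf-8")


def per_message_savings(messages: list) -> float:
    """Fraction saved when every message is compressed on its own."""
    raw = sum(len(m) for m in messages)
    compressed = sum(len(zlib.compress(m)) for m in messages)
    return 1 - compressed / raw


def batch_savings(messages: list) -> float:
    """Fraction saved when all messages are compressed together."""
    raw = b"".join(messages)
    return 1 - len(zlib.compress(raw)) / len(raw)


messages = [make_message(i) for i in range(500)]
print(f"per-message: {per_message_savings(messages):.1%}")
print(f"batched:     {batch_savings(messages):.1%}")
```

The repeated keys only become visible to the codec once many messages share one compression window, which is why the batched figure comes out far ahead of the per-message one.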

The critical lever is linger.ms. At linger.ms=0 (the default in Kafka clients before 4.0, which raised it to 5ms), the producer sends as soon as the sender thread is available. In low-throughput environments this often produces single-message batches where compression saves less than 5% while still incurring full CPU cost. In high-concurrency environments, natural batching still occurs even at linger.ms=0 as records accumulate while prior sends complete, but batch sizes remain unpredictable.

Increase linger.ms to 20ms and, given enough throughput and a batch.size large enough to hold them (the default is 16KB), batches can reliably accumulate 100–1,000 messages. Zstd can then achieve 65–70% savings on standard JSON payloads by compressing repeated keys across the entire batch. The linger.ms trade-off is up to 20ms of added producer latency, which is usually worthwhile for storage and network savings but unsuitable for real-time systems where every millisecond of publish latency matters.
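linger.ms is not the only cap on batch size: the producer's batch.size setting (default 16384 bytes) also limits how many records fit in one batch. The sketch below estimates that limit from uncompressed sizes, reusing the 21-byte and 61-byte overhead figures from earlier. The 256KB batch.size in the config is an illustrative value, not a recommendation, and the real producer's accounting differs in detail.

```python
BATCH_HEADER_BYTES = 61
RECORD_OVERHEAD_BYTES = 21
DEFAULT_BATCH_SIZE = 16384   # producer default for batch.size, in bytes


def max_records_per_batch(payload_bytes: int,
                          batch_size_bytes: int = DEFAULT_BATCH_SIZE) -> int:
    """Rough upper bound on records per batch, ignoring compression."""
    per_record = payload_bytes + RECORD_OVERHEAD_BYTES
    # A record larger than batch.size is still sent, in a batch of its own.
    return max(1, (batch_size_bytes - BATCH_HEADER_BYTES) // per_record)


# Producer settings matching the paragraph above; batch.size is a hypothetical choice.
producer_config = {
    "linger.ms": 20,
    "batch.size": 262144,
    "compression.type": "zstd",
}

at_default = max_records_per_batch(1024)
at_tuned = max_records_per_batch(1024, producer_config["batch.size"])
```

With 1KB payloads the default batch.size fills after roughly a dozen records, so a longer linger.ms alone would not produce batches of hundreds of messages.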

The replica.fetch.max.bytes silent killer

This is one of the most dangerous Kafka misconfigurations, especially in older versions. The sequence:

1. You increase message.max.bytes on the broker and max.request.size on the producer to handle a large payload (say, 5MB)
2. Producer sends the message and receives a success ACK
3. replica.fetch.max.bytes is still at its default of 1MB
4. Follower brokers attempt to replicate but are capped, so replication stalls
5. Replication lag for that partition grows indefinitely
6. Leader broker fails; Kafka triggers an election
7. Under-replicated data may be lost if no in-sync replica has the message

Modern Kafka versions allow oversized batches to proceed in fetch requests to ensure forward progress, but the misconfiguration still degrades replication performance, increases inter-broker latency, and risks data loss during broker failures. Always align your fetch limits with your produce limits.

As a rule, consistent with the Apache Kafka configuration documentation, fetch limits should always be greater than or equal to produce limits. The full parameter chain:

max.request.size (Producer)
  ≤ message.max.bytes (Broker/Topic)
    ≤ replica.fetch.max.bytes (Broker Replication)
      ≤ max.partition.fetch.bytes (Consumer)

If any link is smaller than the ones before it, you risk either producer rejections (catchable) or the silent replication failure described above (not catchable at write time).
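A check of this chain is easy to automate before a config change ships. The following is a minimal sketch, assuming the four values have already been collected into a dictionary keyed by property name; the function name and the way the settings are gathered are hypothetical.

```python
# Links in the order given by the parameter chain above.
CHAIN = [
    "max.request.size",           # producer
    "message.max.bytes",          # broker/topic
    "replica.fetch.max.bytes",    # broker replication
    "max.partition.fetch.bytes",  # consumer
]


def check_size_chain(settings: dict) -> list:
    """Return one description per link that is smaller than an earlier link."""
    problems = []
    largest_name = CHAIN[0]
    largest = settings[largest_name]
    for name in CHAIN[1:]:
        value = settings[name]
        if value < largest:
            problems.append(
                f"{name}={value} is smaller than {largest_name}={largest}"
            )
        else:
            largest_name, largest = name, value
    return problems


# The failure scenario from this section: produce limits raised to 5MB,
# fetch limits left at 1MB.
misconfigured = {
    "max.request.size": 5 * 1024 * 1024,
    "message.max.bytes": 5 * 1024 * 1024,
    "replica.fetch.max.bytes": 1024 * 1024,
    "max.partition.fetch.bytes": 1024 * 1024,
}
for problem in check_size_chain(misconfigured):
    print(problem)
```

Comparing each link against the largest earlier value, rather than only its neighbour, reports every undersized link instead of just the first break in the chain.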

Choosing the right compression codec

Approximate benchmarks from LinkedIn Engineering and Confluent's internal testing on structured JSON workloads. Your actual savings depend heavily on payload structure — highly variable JSON with unique values will see lower compression ratios:

Codec	Savings (JSON)	CPU cost	Min Kafka version
None	0%	None	N/A
Snappy	33–50%	Very low	0.8+
LZ4	33–44%	Lowest	0.8.2+
Gzip	60–67%	High	0.8+
Zstd	67–75%	Moderate	2.1.0+

For modern workloads: LZ4 is the performance default (highest throughput per CPU cycle); Zstd is best for storage-constrained environments. Avoid Gzip at high throughput — it often becomes a producer CPU bottleneck before the network does.

Note: these savings only materialise with adequate batch sizes. With the single-message batches that linger.ms=0 tends to produce at low throughput, every codec's effective savings shrink toward zero.
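To see how a codec's savings figure feeds into capacity planning, the sketch below combines it with the overhead figures from earlier. It is a rough estimate under stated assumptions: the savings ratio applies to record bytes only (the 61-byte batch header is not compressed), traffic is constant, and index files and segment rounding are ignored. All parameter names are hypothetical.

```python
RECORD_OVERHEAD_BYTES = 21
BATCH_HEADER_BYTES = 61


def retained_bytes(messages_per_sec: float, payload_bytes: int,
                   messages_per_batch: int, savings: float,
                   retention_hours: float, replication_factor: int) -> float:
    """Estimate the bytes held across the cluster for one topic."""
    # Compression shrinks the records; the batch header stays uncompressed.
    record_bytes = (payload_bytes + RECORD_OVERHEAD_BYTES) * (1 - savings)
    per_message = record_bytes + BATCH_HEADER_BYTES / messages_per_batch
    seconds = retention_hours * 3600
    return per_message * messages_per_sec * seconds * replication_factor


# Example inputs: 1,000 msg/s of 1KB JSON, 500-message batches,
# 70% savings, 7 days retention, replication factor 3.
estimate = retained_bytes(1000, 1024, 500, 0.70, 7 * 24, 3)
print(f"{estimate / 1e9:.1f} GB")
```

Plugging in a savings figure from the table only makes sense if the batch size assumption is realistic for the producer in question.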

When to use Claim Check

Kafka is a log, not a file system. While it can technically handle large messages, payloads above ~1MB increase JVM heap pressure, bloat segment files, and slow replication.

The Claim Check pattern:

1. Producer uploads large binary to object storage (S3, GCS)
2. Producer sends a Kafka message containing only the URI and metadata
3. Consumer fetches the payload directly from object storage

This keeps Kafka segments lean and the cluster responsive for low-latency coordination. Use it for payloads above 1MB, binary assets (images, ML weights), and data that doesn't need Kafka's retention semantics.
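The three steps of the pattern can be sketched without any infrastructure. In the example below an in-memory dictionary stands in for the object store and a plain list stands in for the Kafka topic; both are placeholders, and the envelope field names (payload_uri, size_bytes) and the mem:// URI scheme are made up for illustration.

```python
import json
import uuid


class InMemoryObjectStore:
    """Stand-in for S3/GCS: stores blobs under generated URIs."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        uri = f"mem://payloads/{uuid.uuid4()}"
        self._blobs[uri] = data
        return uri

    def get(self, uri: str) -> bytes:
        return self._blobs[uri]


def produce_with_claim_check(payload: bytes, store: InMemoryObjectStore, topic: list) -> None:
    """Steps 1 and 2: upload the payload, then publish only a reference to it."""
    uri = store.put(payload)
    envelope = {"payload_uri": uri, "size_bytes": len(payload)}
    topic.append(json.dumps(envelope).encode("utf-8"))


def consume_with_claim_check(message: bytes, store: InMemoryObjectStore) -> bytes:
    """Step 3: resolve the reference and fetch the payload from the store."""
    envelope = json.loads(message)
    return store.get(envelope["payload_uri"])


store = InMemoryObjectStore()
topic = []
large_payload = b"x" * 2_000_000          # roughly 2MB, above the ~1MB guideline
produce_with_claim_check(large_payload, store, topic)
restored = consume_with_claim_check(topic[0], store)
```

The message on the topic stays at a few dozen bytes regardless of payload size. A real implementation would also need to handle upload failures before publishing and object lifecycle relative to topic retention.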

For a precise calculation of your cluster's storage, bandwidth, and replica.fetch.max.bytes requirements based on your specific replication factor and retention settings, use the Kafka Message Size Calculator.