duckkit.dev

Kafka batch.size

Kafka

batch.size is a Kafka producer configuration that caps the size (in bytes) of each per-partition record batch before it is sent to the broker. The default is 16,384 bytes (16 KB). The producer sends a batch when either batch.size is reached or linger.ms expires, whichever comes first. Larger batch sizes improve throughput and compression efficiency but increase producer memory usage and can add latency for low-volume producers.
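A throughput-oriented configuration might look like the sketch below. This assumes the confluent-kafka Python client (which accepts these librdkafka-style property names); the broker address and the specific values are placeholders to tune for your own traffic.

```python
# Producer settings biased toward throughput: a larger per-partition batch
# limit plus a short linger window. Values here are illustrative defaults,
# not recommendations for every workload.
producer_config = {
    "bootstrap.servers": "localhost:9092",  # placeholder address
    "batch.size": 65536,        # 64 KB batch cap (broker default config is 16384)
    "linger.ms": 10,            # wait up to 10 ms to fill a batch
    "compression.type": "lz4",  # larger batches give the codec more to work with
}

# With confluent-kafka this dict would be passed straight to the client:
# from confluent_kafka import Producer
# producer = Producer(producer_config)
```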

Formula

send when: batch_bytes ≥ batch.size OR elapsed ≥ linger.ms
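The trigger above can be mirrored in a few lines. This is a simplified model of the producer's decision, not the actual client code; note how the default linger.ms=0 makes the time condition true immediately, so every message ships on its own.

```python
def should_send(batch_bytes: int, elapsed_ms: float,
                batch_size: int = 16384, linger_ms: float = 0.0) -> bool:
    """Send when the batch is full OR the linger window has expired,
    whichever happens first (simplified model of the producer's check)."""
    return batch_bytes >= batch_size or elapsed_ms >= linger_ms

# With the default linger.ms=0, a batch is always "ready" immediately:
should_send(1, 0)  # True
```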

Why it matters in practice

batch.size and linger.ms work together to control the throughput-latency trade-off. With linger.ms=0 and small messages, each message is sent as its own single-record batch: the lowest producer-side latency, but also the lowest throughput and the poorest compression. With batch.size=1MB and linger.ms=20ms, the producer accumulates messages for up to 20 ms or until 1 MB is collected, reducing round trips and improving compression at the cost of up to 20 ms of added producer-side latency. For high-throughput workloads that use compression, batch.size=64KB–1MB with linger.ms=5–20ms is a standard starting point.
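A rough back-of-the-envelope model makes the trade-off concrete. The function below estimates records per batch for a steady message stream; the arrival rate and record size are made-up illustration numbers, and real producers see bursty traffic, so treat this as a sketch of the arithmetic only.

```python
def records_per_batch(record_bytes: int, interarrival_ms: float,
                      batch_size: int, linger_ms: float) -> int:
    """Rough model: records arrive at a steady rate; a batch ships when it
    reaches batch_size bytes or the linger window closes. Returns the
    estimated number of records per batch."""
    if linger_ms <= 0:
        return 1  # each record ships immediately in its own batch
    by_time = max(1, int(linger_ms // interarrival_ms))  # records within the window
    by_size = max(1, batch_size // record_bytes)         # records that fit by bytes
    return min(by_time, by_size)

# 1 KB records arriving every 1 ms:
records_per_batch(1024, 1.0, 16384, 0)        # linger.ms=0 -> 1 record per batch
records_per_batch(1024, 1.0, 1024 * 1024, 20) # 1 MB cap, 20 ms linger -> ~20 records
records_per_batch(1024, 1.0, 4096, 20)        # small 4 KB cap fills first -> 4 records
```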

Common mistakes

  • Using default batch.size=16KB with compression — batches this small compress inefficiently because compression algorithms need repeated patterns to find savings.
  • Not monitoring batch-size-avg JMX metric — if the average batch size is much smaller than the configured maximum, linger.ms may need to be increased.
  • Keeping linger.ms=0 while expecting meaningful compression — single-record batches give codecs almost no context, so savings collapse even when batch.size is large.
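The compression point in the first and last bullets is easy to demonstrate with a general-purpose codec. The sketch below uses Python's zlib rather than a Kafka codec, and the sample records are synthetic, but the effect is the same one batching exploits: similar records in one buffer compress far better than a record compressed alone.

```python
import json
import zlib

# One synthetic event record, roughly the shape of a small JSON payload.
record = json.dumps(
    {"user_id": 12000, "event": "page_view", "url": "/products/0"}
).encode()

# 100 similar (not identical) records concatenated, as a batch would be.
batch = b"".join(
    json.dumps(
        {"user_id": 12000 + i, "event": "page_view", "url": f"/products/{i}"}
    ).encode()
    for i in range(100)
)

# Compressed-size / original-size: lower is better.
ratio_single = len(zlib.compress(record)) / len(record)
ratio_batch = len(zlib.compress(batch)) / len(batch)
```

Compressing each tiny record on its own saves almost nothing (the codec has no repeated patterns to exploit, and header overhead can even make it larger), while the 100-record batch compresses to a small fraction of its original size.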