
max.poll.records

Kafka

max.poll.records is a Kafka consumer configuration that limits the maximum number of records returned by a single poll() call. The default is 500. It is the primary knob for controlling the trade-off between throughput (larger batches) and poll loop latency (smaller batches that complete faster and stay within max.poll.interval.ms).
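A minimal consumer configuration showing the setting in context (the two values shown are the Kafka defaults; `bootstrap.servers` and `group.id` are placeholders):

```properties
# Defaults shown; bootstrap.servers and group.id are placeholders
bootstrap.servers=localhost:9092
group.id=example-group
# Maximum records returned by a single poll()
max.poll.records=500
# Maximum time between poll() calls before the consumer is evicted
max.poll.interval.ms=300000
```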

Formula

Relationship: max_batch_processing_time = processing_time_per_record × max.poll.records. For the consumer to stay within the default max.poll.interval.ms=300000 (5 minutes), each batch must finish in under 300 seconds:

  • 1 ms per record: 500 × 1 ms = 500 ms — comfortably fine.
  • 100 ms per record: 500 × 100 ms = 50 s — safe, but approaching the limit once retries or pauses add overhead.
  • 1000 ms per record: 500 × 1000 ms = 500 s — exceeds the default interval; the consumer is evicted from the group and a rebalance is triggered.
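The arithmetic above can be sketched as a couple of helper functions (a sketch, not Kafka client code; the defaults of 500 records and 300000 ms are taken from the text):

```python
MAX_POLL_INTERVAL_MS = 300_000  # Kafka default max.poll.interval.ms (5 minutes)

def max_batch_processing_ms(per_record_ms, max_poll_records=500):
    # Worst case: every record in the batch takes per_record_ms to process.
    return per_record_ms * max_poll_records

def within_interval(per_record_ms, max_poll_records=500,
                    interval_ms=MAX_POLL_INTERVAL_MS):
    # The consumer must call poll() again before the interval expires.
    return max_batch_processing_ms(per_record_ms, max_poll_records) < interval_ms

print(within_interval(1))     # 500 ms batch  -> True
print(within_interval(100))   # 50 s batch    -> True
print(within_interval(1000))  # 500 s batch   -> False: consumer is evicted
```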

Why it matters in practice

Reducing max.poll.records is the safest fix for max.poll.interval.ms violations, because it avoids changing the timeout itself. Each poll loop iteration gets a smaller batch, so it completes well within the interval. The trade-off is more iterations of the poll loop (and more poll() calls) for the same volume of records, with slightly lower throughput. For workloads with slow per-message processing (database writes, external API calls), reducing max.poll.records to 10–50 is a safer starting point than the default 500.
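One way to pick a value is to invert the formula: divide a fraction of the interval by the measured per-record processing time. This is a sketch, not an official Kafka heuristic; the `headroom` fraction and the `cap` at the default 500 are assumptions:

```python
MAX_POLL_INTERVAL_MS = 300_000  # Kafka default max.poll.interval.ms

def recommend_max_poll_records(measured_ms_per_record,
                               interval_ms=MAX_POLL_INTERVAL_MS,
                               headroom=0.2, cap=500):
    # Use only a fraction (headroom) of the interval so GC pauses,
    # retries, and slow outliers don't push the batch over the limit.
    bound = int(interval_ms * headroom // measured_ms_per_record)
    return max(1, min(bound, cap))

print(recommend_max_poll_records(1))     # fast processing: the default 500 is fine
print(recommend_max_poll_records(1000))  # 1 s per record: 60 records per batch
```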

Common mistakes

  • Not changing max.poll.records when increasing per-message processing complexity — the default 500 is calibrated for fast, in-memory processing.
  • Setting max.poll.records=1 as a workaround for ordering issues — this dramatically reduces throughput without providing actual ordering guarantees.
  • Assuming a smaller max.poll.records shrinks network fetches — it only caps how many records poll() hands to the application from the consumer's prefetch buffer; the size of broker fetches is governed by fetch.min.bytes, fetch.max.bytes, and max.partition.fetch.bytes, and must be tuned separately.
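Putting the remedies above together, a configuration for a slow-processing consumer might look like this (a sketch; the exact values are assumptions to be tuned against measured per-record latency, not recommended defaults):

```properties
# Slow per-record processing (e.g. database writes): shrink the poll batch
max.poll.records=50
# Keep the default 5-minute interval rather than raising it
max.poll.interval.ms=300000
# Fetch sizing is decoupled from max.poll.records; tune it separately
fetch.min.bytes=1
fetch.max.wait.ms=500
```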