dkduckkit.dev

Consumer poll loop (Kafka)

The Kafka consumer poll loop is the central control loop of a consumer application: call consumer.poll(duration) to fetch a batch of records, process them, then call poll() again. Besides fetching records from the broker, each poll() call proves that the application is still making progress. Heartbeats to the group coordinator are a separate mechanism: in the Java client since Kafka 0.10.1 (KIP-62), a background thread sends them, and session.timeout.ms governs how quickly a crashed process is detected. If poll() is not called again within max.poll.interval.ms (default: 300000 ms, i.e. 300 seconds), the consumer is treated as stuck: it leaves the group and the coordinator triggers a rebalance.
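The shape of the loop, and the gap that max.poll.interval.ms bounds, can be sketched without a broker. The following is a minimal illustration, not Kafka client code: poll, process and clock are hypothetical stand-ins passed in by the caller, and the function only measures the time between consecutive poll calls. A real client enforces the limit itself.

```python
import time
from typing import Callable, List, Optional, Sequence

# Mirrors the max.poll.interval.ms default of 300000 ms.
MAX_POLL_INTERVAL_S = 300.0


def run_poll_loop(
    poll: Callable[[float], Sequence[str]],  # hypothetical: returns a batch, possibly empty
    process: Callable[[str], None],          # hypothetical per-record handler
    iterations: int,
    clock: Callable[[], float] = time.monotonic,
    max_poll_interval_s: float = MAX_POLL_INTERVAL_S,
) -> List[float]:
    """Poll, process, repeat. Return every gap between polls that exceeded the limit."""
    overruns: List[float] = []
    last_poll: Optional[float] = None
    for _ in range(iterations):
        now = clock()
        # The interval that matters runs from one poll call to the next,
        # so it includes all processing done in between.
        if last_poll is not None and now - last_poll > max_poll_interval_s:
            overruns.append(now - last_poll)
        last_poll = now
        for record in poll(1.0):
            process(record)
    return overruns
```

Passing the clock as a parameter keeps the sketch deterministic under test. In a live application the same measurement could feed a metric or a warning log well before the limit is reached.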

Why it matters in practice

The poll loop constraint means slow message processing directly threatens consumer group membership. If a single batch takes longer than max.poll.interval.ms to process, the consumer is evicted and a rebalance occurs. Typical causes are a downstream API call that times out, a database transaction that deadlocks, or plain arithmetic: a batch of 500 messages at 1 s each takes 500 s, well over the 300 s default. This is a common cause of rebalance loops in production: the consumer rejoins, receives the same partitions, typically re-reads the same slow batch because its offsets were never committed, gets evicted again, and the cycle repeats.
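The arithmetic above can be turned into a quick sizing check. This is a rough sketch under stated assumptions: per-record processing time is treated as a known worst case, and safety_factor is a hypothetical margin chosen for illustration, not a Kafka setting.

```python
import math


def worst_case_batch_seconds(max_poll_records: int, per_record_seconds: float) -> float:
    """Upper bound on the time between two polls if every record hits the worst case."""
    return max_poll_records * per_record_seconds


def largest_safe_batch(
    max_poll_interval_ms: int,
    per_record_seconds: float,
    safety_factor: float = 0.5,  # hypothetical margin: use only half the interval
) -> int:
    """Largest max.poll.records whose worst-case batch fits inside the budget."""
    budget_s = max_poll_interval_ms / 1000 * safety_factor
    return max(1, math.floor(budget_s / per_record_seconds))


# 500 records at 1 s each exceed the 300 s default interval.
print(worst_case_batch_seconds(500, 1.0))   # 500.0
print(largest_safe_batch(300_000, 1.0))     # 150
```

With these inputs, lowering max.poll.records from 500 to 150 would keep the worst-case batch at half the default interval.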

Common mistakes

  • Processing messages synchronously in the poll loop with slow I/O: offload slow work to asynchronous workers, or lower max.poll.records so each batch is smaller.
  • Setting max.poll.interval.ms to a very high value as a workaround: this delays detection of consumers that are stuck rather than crashed, and because the same value serves as the rebalance timeout, a rebalance can stall for up to that long while the coordinator waits for slow members to rejoin.
  • Catching all exceptions silently in the poll loop: a swallowed exception hides the root cause, so if the failure keeps poll() from being called, the consumer is evicted with nothing in the application logs to explain why.
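As one possible alternative to swallowing exceptions, a batch handler can log each failure with its stack trace and hand the record to a side channel, so the loop returns to poll() and the cause stays visible. This is a sketch only: handle and dead_letter are hypothetical callables supplied by the application, not part of any Kafka client.

```python
import logging
from typing import Callable, Iterable

log = logging.getLogger("consumer")


def process_batch(
    records: Iterable[str],
    handle: Callable[[str], None],       # hypothetical business logic
    dead_letter: Callable[[str], None],  # hypothetical sink for failed records
) -> int:
    """Process every record; log and dead-letter failures. Return the failure count."""
    failures = 0
    for record in records:
        try:
            handle(record)
        except Exception:
            # log.exception records the stack trace, so the failure is
            # never silent even though the loop keeps going.
            log.exception("failed to process record %r", record)
            dead_letter(record)
            failures += 1
    return failures
```

Catching Exception rather than using a bare except lets KeyboardInterrupt and SystemExit propagate, so shutdown signals still stop the loop.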