Consumer poll loop (Kafka)
The Kafka consumer poll loop is the central control loop of a consumer application: call consumer.poll(duration) to fetch a batch of records, process them, then call poll() again. Since KIP-62 (Kafka 0.10.1+), heartbeats are sent by a separate background thread, not by poll() itself. However, poll() drives the max.poll.interval.ms liveness check: if poll() is not called within that window (default: 300000 ms, or 5 minutes), the consumer is treated as failed. In the Java client, the background thread then stops heartbeating and leaves the group, and the group coordinator triggers a rebalance, even though the process is still alive and heartbeats were being sent normally up to that point.
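The relationship between the heartbeat thread and the poll timer can be modeled in a few lines. This is a simplified toy model, not the client's actual implementation: the class name PollLivenessModel, its methods, and the event strings are hypothetical, and time is passed in explicitly so the behavior is easy to follow. It illustrates that heartbeating alone does not keep a consumer in the group once the gap between poll() calls exceeds max.poll.interval.ms.

```python
class PollLivenessModel:
    """Toy model of the max.poll.interval.ms check (hypothetical, for illustration)."""

    def __init__(self, max_poll_interval_ms=300_000):
        self.max_poll_interval_ms = max_poll_interval_ms
        self.last_poll_ms = 0
        self.in_group = True

    def poll(self, now_ms):
        # Each poll() call resets the poll-interval timer.
        self.last_poll_ms = now_ms

    def heartbeat_tick(self, now_ms):
        # Stands in for the background thread, which runs independently of poll().
        # Returns "heartbeat" while the poll timer is healthy and "leave-group"
        # the first time the poll interval is found to be exceeded.
        if not self.in_group:
            return "not-in-group"
        if now_ms - self.last_poll_ms > self.max_poll_interval_ms:
            self.in_group = False
            return "leave-group"
        return "heartbeat"


model = PollLivenessModel()
model.poll(now_ms=0)
# The application is stuck processing; only the background thread keeps ticking.
events = [model.heartbeat_tick(now_ms=t) for t in (3_000, 299_000, 301_000)]
print(events)
```

In this model the consumer keeps heartbeating for as long as the last poll() is recent enough, and drops out on the first tick after the 300000 ms window has passed.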
Why it matters in practice
The poll loop constraint means slow message processing directly threatens consumer group membership. If a single batch takes longer than max.poll.interval.ms to process, the consumer is evicted and a rebalance occurs. Typical causes are a downstream API call that times out, a database transaction that deadlocks, or simply a large batch: 500 messages at 1 s each take 500 s, well over the 300 s default. This is a common cause of rebalance loops in production. Because the offsets for the slow batch were never committed, the consumer rejoins, is often assigned the same partitions, fetches the same records, exceeds the interval again, and is evicted again.
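The batch arithmetic above generalizes to a quick sizing check. The sketch below is illustrative only: the function names and the 50% safety margin are assumptions rather than Kafka settings, while 500 records and 300 seconds correspond to the defaults for max.poll.records and max.poll.interval.ms. It assumes records are processed one after another at a roughly constant cost.

```python
import math


def worst_case_batch_seconds(max_poll_records, seconds_per_record):
    """Upper bound on the time between two poll() calls for sequential processing."""
    return max_poll_records * seconds_per_record


def max_safe_records(max_poll_interval_s, seconds_per_record, safety_margin=0.5):
    """Largest batch size that fits in a fraction of the poll interval.

    safety_margin is a hypothetical tuning knob: 0.5 means a full batch
    should use at most half of max.poll.interval.ms.
    """
    budget_s = max_poll_interval_s * safety_margin
    return max(1, math.floor(budget_s / seconds_per_record))


batch_s = worst_case_batch_seconds(max_poll_records=500, seconds_per_record=1.0)
print(batch_s, batch_s > 300)  # a 500-record batch overruns the 300 s window
print(max_safe_records(max_poll_interval_s=300, seconds_per_record=1.0))
```

With these inputs, a batch of at most 150 records keeps the worst-case processing time within half of the poll interval.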
Common mistakes
- Processing messages synchronously in the poll loop with slow I/O. Offload slow processing to async workers, or reduce max.poll.records so that each batch is smaller.
- Setting max.poll.interval.ms to a very high value as a workaround. This delays detection of consumers that are genuinely stuck, so their partitions stay unprocessed for longer when a real failure occurs.
- Catching all exceptions silently in the poll loop. If a swallowed error leaves the loop stalled or stops it from reaching the next poll(), the consumer is evicted and the application's own logs give no indication of the cause. Log every caught exception.
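One way to avoid the last mistake is to log each processing failure and let the loop continue to the next poll(). The following is a minimal sketch under stated assumptions: consumer is any object with a poll(timeout_s) method that returns a list of records, which is a hypothetical interface and not the API of a specific client library. FakeConsumer, handle, and max_polls exist only to make the example self-contained. Whether a failed record should be retried or skipped is application-specific and not shown.

```python
import logging

logger = logging.getLogger("poll_loop")


class FakeConsumer:
    """Stand-in for a real client: hands out pre-made batches, then empty lists."""

    def __init__(self, batches):
        self._batches = list(batches)

    def poll(self, timeout_s):
        return self._batches.pop(0) if self._batches else []


def run_poll_loop(consumer, handle, max_polls):
    """Poll, process each record, and log failures instead of swallowing them."""
    processed, failed = [], []
    for _ in range(max_polls):
        for record in consumer.poll(timeout_s=1.0):
            try:
                handle(record)
                processed.append(record)
            except Exception:
                # Log with traceback so the cause is visible in application logs,
                # then continue so the loop still reaches the next poll().
                logger.exception("failed to process record %r", record)
                failed.append(record)
    return processed, failed


def handle(record):
    # Hypothetical handler: one record is malformed and raises.
    if record == "bad":
        raise ValueError("cannot parse record")


processed, failed = run_poll_loop(
    FakeConsumer([["a", "bad"], ["b"]]), handle, max_polls=3
)
print(processed, failed)
```

The try/except sits around a single record rather than around the whole loop, so one bad record cannot prevent the remaining records, or the next poll() call, from being reached.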