Kafka partition

Kafka

A Kafka partition is the fundamental unit of parallelism and ordering within a topic. Each partition is an ordered, append-only log of messages. Producers write to partitions (determined by message key hash or round-robin for null keys). Consumers in a group each own one or more partitions exclusively, so partition count caps consumer group parallelism. Partitions are replicated across brokers according to the replication factor for durability.
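The producer-side routing described above can be sketched as follows. This is a simplified stand-in: it uses Python's `zlib.crc32` rather than Kafka's actual murmur2 hash, and plain round-robin for null keys (modern Kafka producers use sticky partitioning instead), but the shape of the logic is the same.

```python
import zlib
from itertools import count

_rr = count()  # round-robin counter for keyless messages

def choose_partition(key, num_partitions):
    """Pick a partition the way a Kafka producer conceptually does:
    hash the key if present, otherwise rotate across partitions.
    (Real Kafka uses murmur2, and sticky partitioning for null keys;
    crc32 and round-robin here are illustrative stand-ins.)"""
    if key is None:
        return next(_rr) % num_partitions
    return zlib.crc32(key) % num_partitions

# The same key always lands on the same partition, which is what
# gives Kafka its per-key ordering guarantee:
assert choose_partition(b"user-42", 6) == choose_partition(b"user-42", 6)
```

Because routing is a pure function of the key and the partition count, ordering holds per key, not per topic.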

Formula

Max parallel consumers per group ≤ partition count per subscribed topic.
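The cap can be expressed directly: any consumers beyond the partition count receive no assignments and sit idle. A minimal sketch (function name is illustrative):

```python
def active_consumers(partitions, consumers_in_group):
    """Return (consumers doing work, consumers sitting idle) for one
    consumer group subscribed to a topic with `partitions` partitions."""
    active = min(partitions, consumers_in_group)
    idle = consumers_in_group - active
    return active, idle

# 3 partitions, 5 consumers: only 3 get partition assignments, 2 idle.
assert active_consumers(3, 5) == (3, 2)
```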

Why it matters in practice

Partition count is the single most impactful configuration decision for a Kafka topic — and one of the few that cannot be fully reversed (you can increase partitions but not decrease them). Too few partitions limit consumer parallelism: 3 partitions means a maximum of 3 concurrent consumers, no matter how many you start. Too many partitions increase ZooKeeper/KRaft metadata overhead, rebalance duration, and end-to-end latency for `acks=all` producers. A common rule of thumb: provision 2–3× your expected peak consumer count, leaving room to scale horizontally.
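The 2–3× headroom rule above amounts to simple arithmetic; a one-line sketch (the function name and default factor are assumptions to illustrate the rule, not a Kafka API):

```python
import math

def suggested_partitions(peak_consumers, headroom=3.0):
    """Apply the 2-3x headroom rule of thumb: size the topic for
    several times the expected peak consumer count, so the group
    can scale out without recreating the topic."""
    return math.ceil(peak_consumers * headroom)

# Expecting to peak at 4 consumers? Provision 12 partitions.
assert suggested_partitions(4) == 12
```

Throughput targets, broker count, and retention also feed into real sizing; this captures only the parallelism-headroom part of the decision.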

Common mistakes

  • Setting partition count equal to current consumer count — this leaves no room for scaling without topic recreation.
  • Choosing a message key without considering its cardinality — a high-cardinality key (user ID) distributes evenly across partitions; a low-cardinality key (country code) concentrates traffic on a few partitions, causing skew.
  • Increasing partitions on a topic with key-based partitioning — adding partitions changes the key-to-partition mapping, breaking ordering guarantees for existing keys.
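The last mistake is easy to demonstrate: because a key's partition is just a hash modulo the partition count, growing the count remaps most keys. The sketch below uses `zlib.crc32` as a stand-in for Kafka's murmur2; the remapping effect is identical.

```python
import zlib

def partition_for(key, num_partitions):
    # Simplified stand-in for Kafka's default partitioner
    # (Kafka actually uses murmur2, not crc32).
    return zlib.crc32(key) % num_partitions

keys = [f"user-{i}".encode() for i in range(1000)]

before = {k: partition_for(k, 6) for k in keys}  # topic at 6 partitions
after = {k: partition_for(k, 8) for k in keys}   # after expanding to 8

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved}/1000 keys changed partition after expanding 6 -> 8")
# Most keys move: new records for a moved key land on a different
# partition than its old records, so per-key ordering is broken.
```

This is why key-partitioned topics are usually sized with headroom up front rather than expanded later.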