Published: 2026-06-01 • Updated: 2026-06-02

Working with Kafka Topics and Partitions

Last Updated: May 28, 2026

Learn how Apache Kafka topics and partitions work internally, how Kafka stores data, how producers distribute messages, how partitions enable scalability, and how to manage Kafka topics using real-world architecture concepts and Kafka CLI commands.

Understanding Kafka Topics

A Kafka Topic is a logical category or stream where producers publish events and consumers subscribe to read those events.

A topic acts as a continuously growing append-only event log.

Examples of Kafka topics:

  • customer-orders
  • payment-events
  • inventory-updates
  • email-notifications
  • user-login-events

Topics decouple producers from consumers.

This means producers do not need to know:

  • Who consumes the data
  • How many consumers exist
  • When consumers process messages

Similarly, consumers do not need direct communication with producers.

This decoupling is one of the most important architectural advantages of Kafka in event-driven systems.

A single Kafka topic can have:

  • Multiple producers
  • Multiple consumer groups
  • Millions of messages
  • Continuous real-time event streams

Understanding Kafka Partitions

Kafka topics are divided into smaller units called partitions.

Partitions are the actual storage mechanism Kafka uses internally.

Instead of storing all topic data inside one massive file, Kafka distributes the topic into multiple partitions for better scalability and parallel processing.

Each partition is:

  • Immutable (records are never modified once written)
  • Append-only
  • Sequentially ordered
  • Stored independently

Messages are always appended at the end of the partition log.

Topic: customer-orders

+------------------------------------------------------+
| Partition 0                                          |
| Offset 0 -> Offset 1 -> Offset 2 -> Offset 3         |
+------------------------------------------------------+

+------------------------------------------------------+
| Partition 1                                          |
| Offset 0 -> Offset 1 -> Offset 2                     |
+------------------------------------------------------+

+------------------------------------------------------+
| Partition 2                                          |
| Offset 0 -> Offset 1 -> Offset 2 -> Offset 3         |
+------------------------------------------------------+

Each partition can reside on a different Kafka broker.

This allows Kafka clusters to scale horizontally across multiple servers.

Kafka guarantees strict ordering only inside an individual partition.

Ordering is not guaranteed across multiple partitions.

If ordering is important for your business workflow, related messages must always go to the same partition using a partition key.
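The structure described above can be modeled in a few lines. This is a minimal in-memory sketch, not Kafka's storage engine: the `PartitionLog` and `Topic` classes are hypothetical names introduced only for illustration. It shows that each partition is its own append-only log and that an offset is simply a record's position within that one partition.

```python
from dataclasses import dataclass, field


@dataclass
class PartitionLog:
    """One partition: an append-only list whose index is the offset."""
    records: list = field(default_factory=list)

    def append(self, value):
        # New records always go to the end; the offset is the position in the log.
        self.records.append(value)
        return len(self.records) - 1


@dataclass
class Topic:
    """Hypothetical toy topic holding a fixed number of partition logs."""
    name: str
    num_partitions: int
    partitions: list = field(init=False)

    def __post_init__(self):
        self.partitions = [PartitionLog() for _ in range(self.num_partitions)]

    def append(self, partition, value):
        """Append to a specific partition and return (partition, offset)."""
        return partition, self.partitions[partition].append(value)


topic = Topic("customer-orders", num_partitions=3)

# Events for one order go to the same partition, so their relative order is kept.
for event in ["OrderCreated", "PaymentProcessed", "OrderShipped"]:
    topic.append(0, event)

# Another order lands in a different partition with its own offset sequence.
topic.append(1, "OrderCreated")

print(topic.partitions[0].records)
print(topic.partitions[1].records)
```

Because each partition keeps its own sequence, nothing in this model (or in Kafka) says whether the record at offset 0 of Partition 1 was written before or after offset 2 of Partition 0.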

Understanding Kafka Offsets

Every message inside a partition receives a unique sequential identifier called an offset.

Offsets are extremely important because they allow Kafka consumers to track message processing progress.

Consumers use offsets to:

  • Resume processing after crashes
  • Replay historical messages
  • Track processed events
  • Handle failures safely
  • Implement retry mechanisms

Partition 0

Offset 0 -> OrderCreated
Offset 1 -> PaymentProcessed
Offset 2 -> OrderPacked
Offset 3 -> OrderShipped

Offsets are unique only within a partition.

Offset 5 in Partition 0 is different from Offset 5 in Partition 1.

Kafka stores each consumer group's committed offsets separately from the message data.

Consumers in a group commit their offsets periodically to Kafka's internal offsets topic, __consumer_offsets.
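To make the commit-and-resume cycle concrete, here is a small sketch under simplifying assumptions. `OffsetStore` and `consume` are hypothetical helpers, not Kafka client APIs; the store plays the role of the internal `__consumer_offsets` topic, keyed by group, topic, and partition. The group names are made up for the example.

```python
class OffsetStore:
    """Toy stand-in for committed offsets, keyed by (group, topic, partition)."""

    def __init__(self):
        self._committed = {}

    def commit(self, group, topic, partition, next_offset):
        # The committed value is the offset of the NEXT record to read.
        self._committed[(group, topic, partition)] = next_offset

    def fetch(self, group, topic, partition):
        # Assumption for this sketch: a group with no commit starts at offset 0.
        # Real consumers decide this through the auto.offset.reset setting.
        return self._committed.get((group, topic, partition), 0)


def consume(log, store, group, topic, partition, max_records):
    """Read up to max_records from the committed position, then commit."""
    start = store.fetch(group, topic, partition)
    batch = log[start:start + max_records]
    store.commit(group, topic, partition, start + len(batch))
    return batch


partition_0 = ["OrderCreated", "PaymentProcessed", "OrderPacked", "OrderShipped"]
store = OffsetStore()

first = consume(partition_0, store, "billing", "customer-orders", 0, max_records=2)

# Simulate a restart: a new consumer in the same group resumes at offset 2.
resumed = consume(partition_0, store, "billing", "customer-orders", 0, max_records=2)

# A different group has its own position and replays from the beginning.
other = consume(partition_0, store, "analytics", "customer-orders", 0, max_records=4)

print(first)
print(resumed)
print(other)
```

The messages themselves are never changed by consumption; only the stored position moves, which is what makes replay and independent consumer groups possible.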

Why Kafka Uses Partitions

Partitions are the foundation of Kafka scalability and performance.

1. Horizontal Scalability

A single server has limitations in:

  • Disk storage
  • CPU power
  • Network bandwidth
  • Read/write throughput

Partitions allow Kafka to distribute data across multiple brokers.

Partition 0 -> Broker A
Partition 1 -> Broker B
Partition 2 -> Broker C

This architecture enables Kafka clusters to handle:

  • Millions of messages per second
  • Petabytes of storage
  • Large-scale distributed processing

2. Parallel Processing

Partitions allow multiple consumers to process messages simultaneously.

This is one of the most powerful features of Kafka consumer groups.

Topic Partitions = 4
Consumer Instances = 4

Consumer 1 -> Partition 0
Consumer 2 -> Partition 1
Consumer 3 -> Partition 2
Consumer 4 -> Partition 3

Each consumer processes a different partition independently.

This improves:

  • Processing speed
  • Throughput
  • Scalability
  • Concurrency

3. Fault Tolerance

Partitions are replicated across multiple Kafka brokers.

If a broker crashes, Kafka automatically promotes an in-sync replica on another broker to leader for each partition that broker was leading.

This ensures:

  • High availability
  • Minimal downtime
  • Data durability
  • Automatic recovery

How Producers Route Messages

When a producer sends a message to Kafka, the producer client must first determine which partition will store that message.

This decision is made by the partitioner, which runs inside the producer.

With a Partition Key

If a producer sends a key with the message:

customer_id = 1001

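The default partitioner hashes the serialized key bytes (using murmur2) and calculates:

hash(key) % number_of_partitions

As long as the partition count does not change, this guarantees:

  • Same key always maps to same partition
  • Ordering remains consistent
  • Related events stay together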

Example:

OrderCreated
PaymentProcessed
OrderShipped

All events for the same order remain in sequence.
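The key-to-partition mapping can be sketched as follows. One loud assumption: `zlib.crc32` is used here only as a convenient deterministic stand-in hash. Kafka's default partitioner uses murmur2 on the serialized key, so the partition numbers printed by this sketch will not match a real cluster. `choose_partition` is a hypothetical helper name.

```python
import zlib


def choose_partition(key: str, num_partitions: int) -> int:
    """Map a key to a partition with hash(key) % num_partitions.

    zlib.crc32 is a stand-in hash for illustration only; Kafka's default
    partitioner uses murmur2, so real partition numbers will differ.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


events = [
    ("1001", "OrderCreated"),
    ("2002", "OrderCreated"),
    ("1001", "PaymentProcessed"),
    ("1001", "OrderShipped"),
]

# Route every event to the partition chosen for its key.
partitions = {p: [] for p in range(3)}
for key, event in events:
    partitions[choose_partition(key, 3)].append((key, event))

for p, records in partitions.items():
    print(p, records)
```

Whatever partition key 1001 lands on, all three of its events sit there in the order they were sent, which is the property the text relies on.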

Without a Partition Key

If no key is provided, Kafka producers (version 2.4 and later) use the Sticky Partitioner.

The sticky partitioner temporarily selects one partition and sends records there until the current batch is full or the linger time (linger.ms) expires.

Then it switches to another partition.

Benefits:

  • Better batching efficiency
  • Lower network overhead
  • Higher throughput
  • Reduced latency

Sticky Partitioner Explained

Kafka versions before 2.4 used round-robin partitioning for messages without keys.

Modern Kafka producers use sticky partitioning because it produces fewer, fuller batches.

Instead of constantly switching partitions for every message, sticky partitioning temporarily "sticks" to one partition.

This increases batch sizes and improves compression efficiency.

This optimization significantly improves producer throughput in high-volume systems.
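The batching difference can be illustrated with a rough simulation. It is a simplified model, not the producer's actual algorithm: `window` is a hypothetical parameter standing in for the number of keyless records that arrive before the producer sends what it has accumulated (roughly, before linger.ms expires), the sketch assumes a full window fits in one batch, and the real sticky partitioner does not pick partitions in strict rotation.

```python
from itertools import cycle


def round_robin_batches(num_records, num_partitions, window):
    """Return (partition, size) batches when each record goes to the next partition."""
    batches = []
    next_partition = cycle(range(num_partitions))
    for start in range(0, num_records, window):
        open_batches = {}
        for _ in range(min(window, num_records - start)):
            p = next(next_partition)
            open_batches[p] = open_batches.get(p, 0) + 1
        # When the window closes, every partially filled batch is sent.
        batches.extend(open_batches.items())
    return batches


def sticky_batches(num_records, num_partitions, window):
    """Return (partition, size) batches when a whole window sticks to one partition."""
    batches = []
    next_partition = cycle(range(num_partitions))
    for start in range(0, num_records, window):
        # Switch partition only after the previous batch has been sent.
        p = next(next_partition)
        batches.append((p, min(window, num_records - start)))
    return batches


rr = round_robin_batches(num_records=120, num_partitions=6, window=12)
sticky = sticky_batches(num_records=120, num_partitions=6, window=12)

print(len(rr), max(size for _, size in rr))          # 60 batches, 2 records each
print(len(sticky), max(size for _, size in sticky))  # 10 batches, 12 records each
```

Both strategies deliver the same 120 records, but the sticky version sends one sixth as many requests in this model, and larger batches also give compression more data to work with.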

Replication and Fault Tolerance

Each partition can have multiple replicas distributed across brokers.

Partition 0

Leader Replica -> Broker 1
Follower Replica -> Broker 2
Follower Replica -> Broker 3

By default, the leader handles:

  • Read requests
  • Write requests
  • Producer acknowledgments

Followers continuously replicate the leader's data.

If the leader broker fails:

  • Kafka elects a new leader
  • Consumers reconnect automatically
  • Producers continue sending data

This failover process happens automatically.
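The failover steps above can be sketched as a small state model. `PartitionState` is a hypothetical class, and the election rule shown (take the first assigned replica that is still in sync) is a simplification of what the Kafka controller does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PartitionState:
    replicas: list          # broker ids in assignment order
    isr: list               # brokers currently in sync with the leader
    leader: Optional[int]   # None means the partition is offline

    def broker_failed(self, broker_id):
        """Drop the failed broker from the ISR and elect a new leader if needed."""
        self.isr = [b for b in self.isr if b != broker_id]
        if self.leader == broker_id:
            # Simplified rule: pick the first assigned replica still in the ISR.
            candidates = [b for b in self.replicas if b in self.isr]
            self.leader = candidates[0] if candidates else None
        return self.leader


p0 = PartitionState(replicas=[1, 2, 3], isr=[1, 2, 3], leader=1)

print(p0.broker_failed(1))  # leader moves to broker 2
print(p0.isr)               # [2, 3]
```

If a follower fails instead of the leader, only the in-sync set shrinks and the leader stays where it is. If every in-sync replica is lost, this model leaves the partition without a leader.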

Partitions and Consumer Groups

Consumer groups allow Kafka consumers to process data collaboratively.

Within a consumer group, each partition is consumed by only one consumer instance at a time.

Example:

Topic Partitions = 3
Consumer Instances = 5

Result:

  • 3 consumers become active
  • 2 consumers remain idle

Maximum consumer parallelism equals the number of partitions.

This is why partition planning is extremely important in Kafka architecture design.
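The 3-partition, 5-consumer example can be reproduced with a simplified assignment function. `assign_partitions` is a hypothetical helper that deals partitions out in rotation; Kafka's real assignors differ in detail, but the one-owner-per-partition rule is the same.

```python
def assign_partitions(num_partitions, consumers):
    """Spread partitions over consumers so each partition has exactly one owner.

    A simplified round-robin assignment; real assignors differ in detail.
    """
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


consumers = [f"consumer-{i}" for i in range(1, 6)]
assignment = assign_partitions(3, consumers)
idle = [c for c, parts in assignment.items() if not parts]

print(assignment)
print(idle)  # ['consumer-4', 'consumer-5']
```

Calling `assign_partitions(4, ["a", "b"])` shows the opposite case: with fewer consumers than partitions, each consumer owns several partitions and nobody is idle.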

Kafka Topic CLI Commands

The following examples assume Kafka is running locally on port 9092.

Create a Topic

kafka-topics.sh \
--create \
--bootstrap-server localhost:9092 \
--replication-factor 1 \
--partitions 3 \
--topic ecommerce-transactions

Describe a Topic

kafka-topics.sh \
--describe \
--bootstrap-server localhost:9092 \
--topic ecommerce-transactions

This displays:

  • Partition count
  • Leader brokers
  • Replica brokers
  • ISR replicas
  • Topic configuration

List Topics

kafka-topics.sh \
--list \
--bootstrap-server localhost:9092

Increase Partitions

kafka-topics.sh \
--alter \
--bootstrap-server localhost:9092 \
--topic ecommerce-transactions \
--partitions 6

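Important: Increasing the partition count changes how keys are hashed to partitions. Existing records are not moved.

New records for a key may therefore land in a different partition than that key's older records, which can break per-key ordering.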

Delete a Topic

kafka-topics.sh \
--delete \
--bootstrap-server localhost:9092 \
--topic ecommerce-transactions

Partition Scaling Strategy

Choosing the correct partition count is one of the most important Kafka architecture decisions.

Too few partitions create bottlenecks.

Too many partitions increase:

  • Metadata overhead
  • Memory consumption
  • Controller load
  • Broker recovery time

Partition sizing should consider:

  • Expected throughput
  • Future traffic growth
  • Consumer scaling needs
  • Storage requirements
  • Replication factor
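One commonly cited rule of thumb turns these factors into a starting number: divide the target throughput by the throughput a single partition sustains on the producer side and on the consumer side, and take the larger result. The sketch below assumes that rule; `estimate_partitions` is a hypothetical helper, and every number in it is invented for illustration rather than measured.

```python
import math


def estimate_partitions(target_mb_s, producer_mb_s_per_partition,
                        consumer_mb_s_per_partition, growth_factor=1.0):
    """Rule-of-thumb partition count from measured per-partition throughput."""
    needed_for_producers = target_mb_s / producer_mb_s_per_partition
    needed_for_consumers = target_mb_s / consumer_mb_s_per_partition
    # The slower side determines how many partitions are needed.
    needed = max(needed_for_producers, needed_for_consumers)
    return math.ceil(needed * growth_factor)


# Hypothetical figures: 100 MB/s target, 20 MB/s per partition when producing,
# 10 MB/s per partition when consuming.
print(estimate_partitions(100, 20, 10))                     # 10
print(estimate_partitions(100, 20, 10, growth_factor=1.5))  # 15
```

Treat the result as a first estimate to validate with load tests, since per-partition throughput depends on message size, replication settings, and consumer logic.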

Real-World Use Cases

E-Commerce Order Processing

Using order_id as partition key guarantees ordering of order lifecycle events.

This ensures:

  • Payment events are read before shipping events (when produced in that order)
  • Order status remains consistent
  • Consumers process events sequentially

Banking Systems

Bank transactions require strict ordering.

Using account_id as the partition key ensures transaction consistency.

Log Aggregation Platforms

Logs often prioritize throughput over strict ordering.

Messages are distributed evenly across partitions for maximum performance.

IoT Sensor Processing

Millions of sensor events are distributed across partitions for scalable ingestion and analytics processing.

Common Mistakes to Avoid

Over-Partitioning

Too many partitions increase:

  • Open file handles
  • Broker memory usage
  • Metadata synchronization overhead
  • Leader election time

Under-Partitioning

Too few partitions limit:

  • Parallel processing
  • Consumer scalability
  • System throughput

Increasing Partitions on Keyed Topics

Changing partition count changes hash distribution.

Messages with the same key may later go to different partitions.

This can break ordering guarantees.
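A quick experiment shows how many keys are affected when a topic grows from 3 to 6 partitions. As a labeled assumption, `zlib.crc32` stands in for Kafka's murmur2 hash, so the exact count differs from a real cluster; the keys and the `choose_partition` helper are hypothetical.

```python
import zlib


def choose_partition(key: str, num_partitions: int) -> int:
    # zlib.crc32 is a stand-in; Kafka's default partitioner uses murmur2.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


keys = [f"order-{i}" for i in range(1000)]

# Keys whose target partition changes once the topic has 6 partitions.
moved = [k for k in keys if choose_partition(k, 3) != choose_partition(k, 6)]

print(f"{len(moved)} of {len(keys)} keys map to a different partition")
```

For any moved key, older events stay in the original partition while new events go to the new one, so a consumer can no longer rely on seeing that key's events in order.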

Using One Partition for Large Systems

Single-partition topics severely limit throughput and scalability.

Partition Design Best Practices

Use Stable Partition Keys

Good partition keys:

  • customer_id
  • account_id
  • order_id

Plan for Future Growth

Always estimate:

  • Future traffic
  • Consumer scaling
  • Storage growth
  • Peak throughput

Monitor Partition Distribution

Uneven traffic distribution creates hot partitions.

Monitor:

  • Broker CPU
  • Disk usage
  • Partition traffic
  • Consumer lag
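Two of these signals are simple arithmetic once the numbers are collected. The sketch below assumes the per-partition figures have already been gathered by a monitoring tool; `consumer_lag`, `hot_partitions`, the 2x threshold, and all sample values are hypothetical.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Lag per partition = log end offset minus the group's committed offset."""
    return {p: log_end_offsets[p] - committed_offsets.get(p, 0)
            for p in log_end_offsets}


def hot_partitions(messages_per_minute, threshold=2.0):
    """Partitions whose traffic exceeds `threshold` times the mean."""
    mean = sum(messages_per_minute.values()) / len(messages_per_minute)
    return [p for p, count in messages_per_minute.items()
            if count > threshold * mean]


# Hypothetical sample values for a three-partition topic.
log_end = {0: 1500, 1: 1480, 2: 9000}
committed = {0: 1490, 1: 1480, 2: 4000}
traffic = {0: 1500, 1: 1480, 2: 9000}

print(consumer_lag(log_end, committed))  # {0: 10, 1: 0, 2: 5000}
print(hot_partitions(traffic))           # [2]
```

Here Partition 2 receives far more traffic than the others and its consumer is falling behind, which is the typical signature of a skewed partition key.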

Avoid Frequent Partition Changes

Partition changes affect message routing consistency.

Design topics properly from the beginning.

Interview Questions

Does Kafka guarantee ordering across partitions?

No. Kafka guarantees ordering only within a single partition.

What determines maximum consumer parallelism?

Maximum parallelism equals the number of partitions.

What is ISR in Kafka?

ISR stands for In-Sync Replicas.

These are the replicas (including the leader) that have kept up with the leader within the configured lag limit (replica.lag.time.max.ms).

What happens when a broker fails?

Kafka automatically elects a new leader from ISR replicas.

Why are partitions important?

Partitions enable:

  • Scalability
  • Parallelism
  • Replication
  • Fault tolerance
  • High throughput

Frequently Asked Questions

Can Kafka partitions be decreased?

No. Kafka supports increasing a topic's partition count but not decreasing it. To end up with fewer partitions, create a new topic and migrate the data.

How many partitions should a topic have?

It depends on:

  • Throughput requirements
  • Consumer scaling
  • Future growth
  • Broker capacity

Can multiple consumers read the same partition?

Within the same consumer group: Only one consumer reads a given partition at a time.

Across different consumer groups: Multiple groups can read independently.

Why are offsets important?

Offsets allow:

  • Recovery
  • Replay
  • Fault tolerance
  • Consumer tracking

Next Step

Now continue learning Kafka producers and message publishing internals:

Understanding Kafka Producers and Sending Messages


Summary

Kafka topics and partitions are the foundation of scalable event-driven architecture.

Partitions enable:

  • Horizontal scalability
  • Parallel processing
  • High throughput
  • Fault tolerance
  • Distributed storage

Understanding partition design is critical for building reliable enterprise Kafka systems.

Always carefully plan:

  • Partition count
  • Partition keys
  • Replication factor
  • Consumer scaling
  • Ordering requirements

In the next lesson, you will learn how Kafka producers internally batch, compress, acknowledge, and publish events efficiently to Kafka brokers.

About the Author

Naresh Kumar

Senior Java Backend Engineer experienced in Banking, Payments, ISO 20022, Spring Boot, Microservices, Kafka, Docker, Kubernetes, AWS and Cloud Native Systems.

Built enterprise payment solutions, transaction processing systems, API platforms and scalable microservices used in production.