This repository has been archived by the owner on Nov 20, 2020. It is now read-only.

Kafka Crash Course

Lev Gorodinski edited this page Dec 11, 2017 · 2 revisions

Kafka is a messaging system in which brokers store messages. Each message belongs to a partition within a topic, and each partition contains a sequence of messages ordered by a numeric offset, as follows:

Topic A

p1 p2 ... pN
1 1 1
2 2 2
3 3

Topic B

p1 p2 ... pM
1 1 1
2 2
3
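
The layout above can be sketched as a small in-memory model. This is only an illustration of the topic/partition/offset relationship, not how a broker stores data; the `Partition` and `Topic` classes and their methods are hypothetical names. Offsets start at 1 here to match the diagram, whereas real Kafka offsets are zero-based.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """An append-only sequence of messages; the offset is the position in it."""
    messages: list = field(default_factory=list)

    def append(self, payload):
        # Offsets start at 1 to match the diagram (an illustrative choice).
        self.messages.append(payload)
        return len(self.messages)

    def fetch(self, offset, max_messages=10):
        # Return (offset, payload) pairs starting at `offset`, in offset order.
        start = max(offset, 1) - 1
        batch = self.messages[start:start + max_messages]
        return [(start + 1 + i, m) for i, m in enumerate(batch)]


@dataclass
class Topic:
    name: str
    partitions: dict  # partition id -> Partition


topic_a = Topic("Topic A", {p: Partition() for p in ("p1", "p2")})
first = topic_a.partitions["p1"].append("hello")
second = topic_a.partitions["p1"].append("world")
other = topic_a.partitions["p2"].append("independent")
fetched = topic_a.partitions["p1"].fetch(offset=2)
```

Each partition numbers its messages independently, so ordering is only defined within a partition and not across the topic as a whole.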

Partitions are allocated to brokers with a configurable replication factor:

Broker  Partitions
1       (Topic A,p1), (Topic A,p3), ...
2       (Topic B,p2), (Topic A,p2), ...
3       (Topic B,p1), (Topic B,p3), ...
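
One way to picture allocation with a replication factor is a round-robin placement, sketched below. This is an assumption made for illustration and is not Kafka's actual assignment algorithm; `assign_replicas` is a hypothetical helper, and treating the first replica in each list as the leader is a convention of this sketch.

```python
def assign_replicas(partitions, brokers, replication_factor):
    """Place each partition on `replication_factor` distinct brokers, round-robin."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for i, partition in enumerate(partitions):
        # Start at a different broker for each partition to spread the load.
        assignment[partition] = [
            brokers[(i + r) % len(brokers)] for r in range(replication_factor)
        ]
    return assignment


partitions = [("Topic A", "p1"), ("Topic A", "p2"), ("Topic A", "p3")]
assignment = assign_replicas(partitions, brokers=[1, 2, 3], replication_factor=2)
# In this sketch the first broker in each list acts as the partition's leader.
leaders = {p: replicas[0] for p, replicas in assignment.items()}
```

With a replication factor of 2, every partition appears on two brokers, so losing any single broker still leaves one copy of each partition.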

Glossary

  • Broker: a Kafka server node.
  • Cluster: a collection of brokers which operate in unison to provide redundancy and load-balancing for a set of topics.
  • Leader: the broker selected to receive produced messages and return fetched messages for a given partition of a topic.
  • Replica: a broker which contains a copy of messages in a partition. The leader is a replica, but there may also be non-leader replicas.
  • ISR: in-sync replicas - the set of replicas of a partition which are in sync with the leader.
  • Client: a program which communicates with brokers, such as a producer or a consumer.
  • Message: a unit of messaging, belonging to a particular partition within a topic. Messages are stored by brokers.
  • MessageSet: a collection of contiguous messages. Messages are produced and consumed in message sets as an optimization.
  • Topic: a named collection of messages, allocated across a number of partitions.
  • Offset: a numeric position of a message within a partition of a topic.
  • Partition: a sequence of messages ordered by offset.
  • Consumer Group: a group of client instances consuming a topic as coordinated by a group coordinator.
  • Group Coordinator: a broker designated as the coordinator for a consumer group.
  • Retention Policy: a policy configurable by topic which defines the maximum age of messages within a topic, and the maximum size of a partition within a topic. Messages beyond the threshold are removed.
  • Producer: a client which writes messages to partitions of a topic by communicating with the leader for each topic/partition combination.
  • Consumer: a client which fetches messages from partitions of a topic.
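
As an illustration of the Retention Policy entry, the sketch below trims a single partition by maximum age and then by maximum size. It is a simplification under stated assumptions: the `Message` record and `apply_retention` function are hypothetical, and the trimming is done per message, whereas real brokers apply retention to whole log segments.

```python
from dataclasses import dataclass


@dataclass
class Message:
    offset: int
    timestamp: float  # seconds since some epoch
    size: int         # bytes


def apply_retention(messages, now, max_age_seconds, max_partition_bytes):
    """Return the messages that survive the age and size thresholds."""
    # Drop messages older than the maximum age.
    kept = [m for m in messages if now - m.timestamp <= max_age_seconds]
    # Then drop the oldest remaining messages until the partition fits.
    total = sum(m.size for m in kept)
    while kept and total > max_partition_bytes:
        total -= kept[0].size
        kept = kept[1:]
    return kept


log = [
    Message(1, 0.0, 40),
    Message(2, 50.0, 40),
    Message(3, 90.0, 40),
    Message(4, 95.0, 40),
]
remaining = apply_retention(
    log, now=100.0, max_age_seconds=60.0, max_partition_bytes=100
)
remaining_offsets = [m.offset for m in remaining]
```

In this run, offset 1 is removed for exceeding the age limit and offset 2 is removed to bring the partition under the size limit. The surviving messages keep their original offsets, so a consumer asking for a removed offset would have to resume from the earliest one still available.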