Reliable Data Delivery in Kafka

Reliability is an important criterion everyone considers before designing an application. More importantly, reliability is a property of a system, not of a single component, so even when talking about the reliability guarantees of Kafka, we need to keep the entire system and its use cases in mind.

Kafka is very flexible about reliable data delivery. Some use cases require utmost reliability (e.g., bank transactions), while others may prioritize speed and simplicity over reliability (e.g., tracking clicks on a website). All of this is achieved by configuring Kafka's client APIs, which are flexible enough to allow all kinds of reliability trade-offs.

In this article, I am going to discuss the reliability guarantees Kafka offers and what a developer or administrator should consider while designing an application around it.

We usually talk about the reliability of a system in terms of guarantees: the behavior a system promises to preserve under different circumstances. Understanding the guarantees that Kafka provides is critical for building reliable applications, because it lets developers reason about how the system will behave under different failure scenarios.

What Kafka Guarantees:

  • Kafka guarantees message ordering within a partition. If message B was written after message A by the same producer to the same partition, Kafka guarantees that the offset of message B will be higher than that of message A, and that consumers will read message A before message B.
  • Produced messages are considered committed when they are written to the partition on all of its in-sync replicas (though not necessarily flushed to disk). Producers can choose to receive acknowledgments when the message is fully committed, when it is written to the leader, or as soon as it is sent over the network.
  • Committed messages won't be lost as long as at least one in-sync replica remains alive.
  • Consumers can only read messages that are committed.
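The second guarantee above is controlled on the producer side by the `acks` configuration. As a minimal sketch (the dicts below mirror real Kafka producer configuration keys, but the helper function and mode names are illustrative, and no broker connection is made):

```python
# Illustrative mapping of the three producer acknowledgment modes described
# above onto Kafka's "acks" producer setting. No broker is contacted here.
ACKS_MODES = {
    "fire_and_forget": {"acks": 0},     # don't wait for any acknowledgment
    "leader_only":     {"acks": 1},     # acknowledged once written to the leader
    "all_in_sync":     {"acks": "all"}, # acknowledged when fully committed
}

def producer_config(mode, bootstrap="localhost:9092"):
    """Build a producer config dict for the chosen acknowledgment mode."""
    cfg = {"bootstrap.servers": bootstrap, "retries": 3}
    cfg.update(ACKS_MODES[mode])
    return cfg

print(producer_config("all_in_sync"))
```

With `acks=all`, a send is only acknowledged once the message is committed, trading latency for the strongest durability; `acks=0` gives the opposite trade-off.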

These basic guarantees are the foundation of a reliable system, but by themselves they don't make the system fully reliable. There are trade-offs involved in building a reliable system, and Kafka was designed to let administrators and developers decide how much reliability they need through configuration parameters that control these trade-offs.
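To make the trade-off concrete: with a replication factor of 3 and `min.insync.replicas=2` (both real Kafka settings), a partition can keep accepting `acks=all` writes while one broker is down. A rough sketch of that arithmetic, simplified to ignore ISR shrink timing:

```python
def write_availability(replication_factor, min_insync_replicas):
    """Broker failures a partition can absorb while still accepting
    acks=all writes, in a simplified model: replicas beyond the
    required in-sync minimum are the failure budget."""
    return max(replication_factor - min_insync_replicas, 0)

print(write_availability(3, 2))  # → 1: one broker can fail, writes continue
print(write_availability(3, 3))  # → 0: any failure blocks acks=all writes
```

Raising `min.insync.replicas` strengthens durability but shrinks the failure budget for writes; this is exactly the kind of dial Kafka exposes rather than deciding for you.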

That's all from me for today. In my next article, I'll describe Kafka's replication mechanism and how it contributes to reliability.




Manik Khandelwal