A banking transaction has to turn into a fraud score in seconds. In retail, an inventory count that is out of sync with orders makes campaigns misfire. In these scenarios, batch pipelines simply cannot keep up, and Apache Kafka-based event streaming architectures take over.
What Kafka Actually Does
Kafka is not a distributed message queue — it is a durable distributed log. Producer systems write messages to topics; consumers read from those topics. Messages remain readable during their retention window (days, weeks or indefinitely), which is what sets Kafka apart from a classic message queue.
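To make the log semantics concrete, here is a minimal Java sketch using the standard Kafka client, assuming a local broker on localhost:9092 and a hypothetical topic named transactions. The producer appends an event; the consumer re-reads from the earliest retained offset, which is exactly what a classic queue would not allow after consumption.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class LogSemanticsDemo {
    public static void main(String[] args) {
        // Producer: append an event to the "transactions" topic (hypothetical name).
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            producer.send(new ProducerRecord<>("transactions", "tx-42", "{\"amount\": 100}"));
        }

        // Consumer: read from the earliest retained offset. The event stays readable
        // for the whole retention window instead of disappearing once consumed.
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "demo-reader");
        consumerProps.put("auto.offset.reset", "earliest");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(List.of("transactions"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset=%d key=%s value=%s%n", record.offset(), record.key(), record.value());
            }
        }
    }
}
```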
Architectural Patterns
Three common patterns dominate Kafka-based architectures:
- Event Sourcing: application state is stored as a log of events and the current state is rebuilt by replaying them in order.
- CDC (Change Data Capture): tools like Debezium stream every change in the operational database into Kafka, letting the analytical layer sync as a stream rather than a batch.
- Stream Processing: Kafka Streams, Apache Flink or Spark Structured Streaming perform windowed aggregations on events in real time (a minimal Kafka Streams sketch follows this list).
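As an illustration of the stream-processing pattern, here is a minimal Kafka Streams sketch (assuming Kafka Streams 3.x) that counts events per customer in one-minute tumbling windows. The topic name, application id and key layout are hypothetical, not taken from any specific deployment.

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.TimeWindows;

public class TransactionWindowCount {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "tx-window-count"); // hypothetical app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        // Assumed topic layout: key = customer_id, value = raw transaction payload.
        KStream<String, String> transactions =
                builder.stream("transactions", Consumed.with(Serdes.String(), Serdes.String()));

        transactions
                .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
                // One-minute tumbling windows: each event falls into exactly one window.
                .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(1)))
                .count()
                .toStream()
                .foreach((windowedKey, count) ->
                        System.out.printf("customer=%s window=%s count=%d%n",
                                windowedKey.key(), windowedKey.window(), count));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```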
Production Decisions That Really Matter
- Replication factor 3: the minimum standard to avoid data loss.
- Partitioning strategy: the key determines how order is preserved within a partition; customer_id is by far the most common choice for customer-level processing (see the producer sketch after this list).
- Schema Registry: managing schema compatibility between producers and consumers via Avro or Protobuf saves production from chaos.
- Monitoring: consumer lag, broker disk utilisation and under-replicated partition count are the three metrics that must be watched continuously.
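A minimal producer sketch configured along these lines, assuming a hypothetical customer-events topic: acks=all and enable.idempotence=true complement a replication factor of 3 on the broker side, and keying records by customer_id keeps each customer's events ordered within a single partition.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class CustomerEventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // Durability: wait for all in-sync replicas and avoid duplicate writes on retry.
        props.put("acks", "all");
        props.put("enable.idempotence", "true");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String customerId = "cust-1001"; // hypothetical key
            // All events with the same customer_id hash to the same partition, preserving their order.
            producer.send(new ProducerRecord<>("customer-events", customerId, "{\"event\":\"card_swipe\"}"));
            producer.send(new ProducerRecord<>("customer-events", customerId, "{\"event\":\"fraud_check\"}"));
            producer.flush();
        }
    }
}
```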
When Kafka Is the Right Enterprise Choice
Not every data flow should move to Kafka. Batch pipelines that run once a day are perfectly fine on Airflow-driven ETL. Kafka's value emerges when latency is critical (sub-second) and when multiple consumers use the same event for different purposes.
A Practical Example
At a major Turkish private bank, a CDC-based Kafka integration cut end-of-day report runtime from six hours to twelve minutes. The key success factors were setting up the Schema Registry on day one and splitting consumer groups by business domain (risk, CRM, analytics).
