As Confluent puts it: “At its heart lies the humble, immutable commit log, and from there you can subscribe to it, and publish data to any number of systems or real-time applications.
“Unlike messaging queues, Kafka is a highly scalable, fault tolerant distributed system, allowing it to be deployed for applications like managing passenger and driver matching at Uber.”
Kafka has four key APIs:
The Producer API allows an application to publish a stream of records to one or more Kafka topics.
The Consumer API allows an application to subscribe to one or more topics and process the stream of records produced to them.
The Streams API allows an application to act as a stream processor, consuming an input stream from one or more topics and producing an output stream to one or more output topics, effectively transforming the input streams to output streams.
The Connector API allows building and running reusable producers or consumers that connect Kafka topics to existing applications or data systems. For example, a connector to a relational database might capture every change to a table.
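To make the four roles concrete, here is a minimal in-memory sketch. It is a toy model, not Kafka's client library: the names ToyBroker, ToyProducer, ToyConsumer, uppercase_stream and table_change_connector are hypothetical, and each topic is assumed to be a plain append-only list with no partitions, replication or networking.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a Kafka cluster: one append-only list per topic."""

    def __init__(self):
        self._logs = defaultdict(list)

    def append(self, topic, record):
        """Append a record to a topic's log and return its offset."""
        self._logs[topic].append(record)
        return len(self._logs[topic]) - 1

    def read(self, topic, offset):
        """Return every record in the topic from `offset` onward."""
        return self._logs[topic][offset:]


class ToyProducer:
    """Producer role: publish records to one or more topics."""

    def __init__(self, broker):
        self._broker = broker

    def send(self, topic, record):
        return self._broker.append(topic, record)


class ToyConsumer:
    """Consumer role: subscribe to topics and track a read position per topic."""

    def __init__(self, broker, topics):
        self._broker = broker
        self._offsets = {topic: 0 for topic in topics}

    def poll(self):
        """Return unread (topic, record) pairs and advance the offsets."""
        batch = []
        for topic in list(self._offsets):
            records = self._broker.read(topic, self._offsets[topic])
            batch.extend((topic, record) for record in records)
            self._offsets[topic] += len(records)
        return batch


def uppercase_stream(broker, source, sink):
    """Streams role: consume one topic, transform it, produce to another."""
    consumer = ToyConsumer(broker, [source])
    producer = ToyProducer(broker)
    for _, record in consumer.poll():
        producer.send(sink, record.upper())


def table_change_connector(broker, topic, changes):
    """Connector role: copy change events from an external system into a topic."""
    producer = ToyProducer(broker)
    for change in changes:
        producer.send(topic, change)


broker = ToyBroker()
ToyProducer(broker).send("greetings", "hello")
ToyProducer(broker).send("greetings", "world")
uppercase_stream(broker, "greetings", "shouted")
table_change_connector(broker, "db-changes", [{"op": "insert", "id": 1}])

consumer = ToyConsumer(broker, ["shouted", "db-changes"])
received = consumer.poll()
print(received)
```

Because each consumer keeps its own offsets, a second consumer created later would still read the same records from the start. The log itself is never modified by reading.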
What is Apache Kafka? And What is It Used For?
One of Kafka's most popular uses today is stream processing: querying continuous data streams to detect changing conditions.
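As a rough illustration of "detecting changing conditions" on a continuous stream, the sketch below flags any reading that jumps well above the recent average. The function name detect_spikes, the window size and the threshold are all assumptions chosen for the example. A real deployment would read from a Kafka topic rather than a Python iterator.

```python
from collections import deque


def detect_spikes(readings, window=3, threshold=1.5):
    """Yield (index, value) whenever a value exceeds `threshold` times
    the average of the previous `window` values."""
    recent = deque(maxlen=window)
    for index, value in enumerate(readings):
        if len(recent) == window and value > threshold * (sum(recent) / window):
            yield index, value
        recent.append(value)


# An iterator stands in for an unbounded stream of sensor readings.
stream = iter([10, 11, 9, 10, 30, 12, 11])
alerts = list(detect_spikes(stream))
print(alerts)  # [(4, 30)]
```

The detector holds only the last few values in memory, which is what lets this style of query run over a stream that never ends.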
Many Kafka users process data in multi-stage pipelines, where raw input data is consumed and then aggregated, enriched, or otherwise transformed into new “topics” for follow-up processing.
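A multi-stage pipeline of this kind can be sketched in a few lines. Everything here is illustrative: the topic names, the CITY_BY_DRIVER lookup table and the ride events are made up, and topics are modelled as plain in-memory lists processed in one pass rather than as live Kafka topics.

```python
from collections import defaultdict

# Each topic is modelled as a plain append-only list of records.
topics = defaultdict(list)

# Hypothetical lookup table used to enrich raw events.
CITY_BY_DRIVER = {"d1": "Berlin", "d2": "Paris"}


def enrich(raw_topic, enriched_topic):
    """Stage 1: read raw events and attach the driver's city."""
    for event in topics[raw_topic]:
        city = CITY_BY_DRIVER.get(event["driver"], "unknown")
        topics[enriched_topic].append(dict(event, city=city))


def aggregate(enriched_topic, totals_topic):
    """Stage 2: sum fares per city and publish one record per city."""
    totals = defaultdict(float)
    for event in topics[enriched_topic]:
        totals[event["city"]] += event["fare"]
    for city, total in sorted(totals.items()):
        topics[totals_topic].append({"city": city, "total_fare": total})


topics["rides-raw"].extend([
    {"driver": "d1", "fare": 10.0},
    {"driver": "d2", "fare": 7.5},
    {"driver": "d1", "fare": 2.5},
])
enrich("rides-raw", "rides-enriched")
aggregate("rides-enriched", "fares-by-city")
print(topics["fares-by-city"])
```

Each stage writes to a new topic instead of modifying its input, so the raw and enriched data remain available to other consumers.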