Apache Kafka Streaming
kafka.apache.org
About this website
Apache Kafka is a distributed event streaming platform originally developed at LinkedIn by Jay Kreps, Neha Narkhede, and Jun Rao. It was open-sourced in 2011 and is now maintained under the Apache Software Foundation. In 2014 the original creators founded Confluent, which commercializes the platform. Kafka provides high-throughput, low-latency publish-subscribe messaging, distributed commit log storage, and real-time stream processing, and can handle millions of messages per second with millisecond latency. More than 80% of Fortune 100 companies use it, including Netflix, Airbnb, Uber, LinkedIn, Twitter, and Goldman Sachs, for real-time data pipelines, event-driven architectures, and stream processing applications.

Kafka's architecture is a distributed, partitioned commit log. Messages are organized into topics, and each topic is split into partitions spread across multiple broker nodes. A partition is an ordered, append-only sequence of messages, each identified by an incrementing offset. Consumer groups enable parallel consumption: each consumer in a group reads from a subset of the partitions, so both topic throughput and consumer processing capacity scale horizontally with the number of partitions.

For durability, each partition can be replicated across multiple brokers, and the producer's acks setting controls how many replicas must acknowledge a write. Depending on configuration, Kafka supports at-most-once and at-least-once delivery, and effectively-once (exactly-once) semantics when idempotent producers and transactions are used. The in-sync replicas (ISR) mechanism tracks which replicas are fully caught up with the partition leader to keep data consistent, and a new leader is elected automatically when a broker fails.

The Kafka Streams library provides a full stream processing DSL, and the Kafka Connect framework provides source and sink connectors. The surrounding ecosystem adds Confluent's Schema Registry for Avro and Protobuf serialization and KSQL (now ksqlDB) for SQL-based stream processing. Kafka is designed for data engineering.
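The partition and consumer group model described above can be sketched in a few lines of plain Python. This is a toy in-memory model, not the Kafka client API: the names `Partition`, `Topic`, `partition_for`, and `assign_partitions` are hypothetical, the CRC32 hash stands in for Kafka's real partitioner, and the round-robin assignment is only one of several strategies a real consumer group might use.

```python
from dataclasses import dataclass, field
import zlib


@dataclass
class Partition:
    """An ordered, append-only message sequence; the list index is the offset."""
    messages: list = field(default_factory=list)

    def append(self, value):
        self.messages.append(value)
        return len(self.messages) - 1  # offset assigned to the new message


@dataclass
class Topic:
    name: str
    num_partitions: int
    partitions: list = field(init=False)

    def __post_init__(self):
        self.partitions = [Partition() for _ in range(self.num_partitions)]

    def partition_for(self, key: str) -> int:
        # Illustrative hash only (assumption): real Kafka clients use a
        # different hash function in their default partitioner.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions

    def send(self, key: str, value):
        """Append to the partition chosen by the key; return (partition, offset)."""
        p = self.partition_for(key)
        offset = self.partitions[p].append((key, value))
        return p, offset


def assign_partitions(num_partitions: int, consumers: list) -> dict:
    """Round-robin sketch: every partition is owned by exactly one group member."""
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment


if __name__ == "__main__":
    topic = Topic("orders", num_partitions=4)
    print(topic.send("user-1", "created"))
    print(topic.send("user-1", "paid"))
    print(assign_partitions(topic.num_partitions, ["consumer-a", "consumer-b"]))
```

Messages with the same key land in the same partition, so their relative order is preserved by the offsets. Adding consumers to the group, up to the partition count, spreads the partitions over more readers, which is where the horizontal scaling comes from.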