Apache Kafka Event Streaming Platform
github.com
About this website
Apache Kafka is a free, open-source distributed event streaming platform used for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. It was originally developed at LinkedIn by Jay Kreps, Neha Narkhede, and Jun Rao around 2010, was open-sourced in 2011, and became a top-level Apache project in 2012. Kafka is now maintained under the Apache Software Foundation, with Confluent (the commercial company founded by the original developers in 2014) as a major contributor.

Key features:
Distributed pub/sub: Kafka implements a distributed publish-subscribe model in which producers write events to topics and consumers read from them. Topics are partitioned for parallelism, and each partition is an append-only, immutable log.
Horizontal scalability: Kafka clusters scale horizontally across multiple broker nodes. Large deployments at companies like LinkedIn, Netflix, Uber, and Airbnb reportedly handle up to trillions of messages per day. A topic can have hundreds of partitions distributed across brokers.
Durability: events are persisted to disk and replicated across a configurable number of brokers (a replication factor of 3 is typical), providing fault tolerance. Data is retained for a configurable period rather than deleted after consumption.
High throughput: Kafka can reach millions of messages per second through sequential disk I/O, zero-copy data transfer (sendfile), and efficient batching.
Consumer groups: consumers organize into groups, and each partition is assigned to exactly one consumer within a group, so the group's members share the work of reading a topic.
Exactly-once semantics: Kafka supports exactly-once processing through idempotent producers and the transactional API.
Kafka Connect: a framework for connecting Kafka with external systems.
Kafka Streams: a lightweight stream processing library.
Implementation: Java/Scala codebase.
License: Apache-2.0.
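To make the partitioned-log and consumer-group ideas above concrete, here is a minimal in-memory sketch in Python. It is a toy model and not the Kafka client API: the names `Topic`, `partition_for`, `append`, `read`, and `assign_partitions` are hypothetical, CRC32 stands in for Kafka's real key hashing, and the round-robin assignment is only one simple strategy chosen for illustration.

```python
import zlib


class Topic:
    """Toy in-memory topic: a fixed number of append-only partition logs.

    Hypothetical illustration only; this is not how the Kafka client is used.
    """

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Hash the key so that the same key always maps to the same partition.
        # Assumption: CRC32 is a stand-in for the hash a real partitioner uses.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        # Records are only ever added at the end of a partition's log.
        partition = self.partition_for(key)
        log = self.partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1  # (partition, offset of the new record)

    def read(self, partition, offset, max_records=10):
        # Reading never removes records; a consumer just tracks its own offset.
        return self.partitions[partition][offset:offset + max_records]


def assign_partitions(num_partitions, consumers):
    """Give each partition to exactly one consumer in the group (round-robin)."""
    if not consumers:
        raise ValueError("a consumer group needs at least one consumer")
    ordered = sorted(consumers)
    assignment = {consumer: [] for consumer in ordered}
    for partition in range(num_partitions):
        assignment[ordered[partition % len(ordered)]].append(partition)
    return assignment


topic = Topic("orders", num_partitions=4)
p1, o1 = topic.append("user-42", "created")
p2, o2 = topic.append("user-42", "paid")

# Same key, same partition: the two events keep their relative order.
first_read = topic.read(p1, 0)
second_read = topic.read(p1, 0)  # re-reading returns the same records

group = assign_partitions(4, ["c1", "c2"])
# group == {'c1': [0, 2], 'c2': [1, 3]}
```

Because both events share the key `user-42`, they land in the same partition at offsets 0 and 1, which is how per-key ordering is preserved in this model. Adding a consumer to the group and calling `assign_partitions` again redistributes the partitions, loosely mirroring a rebalance.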