ClickHouse Columnar Database
github.com
About this website
ClickHouse is a free and open-source column-oriented database management system (DBMS) for online analytical processing (OLAP), designed to generate analytical reports in real time using SQL queries. It was originally developed at Yandex, the search and technology company headquartered in Moscow, by Alexey Milovidov starting in 2009 for the Yandex.Metrica web analytics service, often described as the second-largest web analytics platform in the world. ClickHouse was open-sourced in 2016 and has since been adopted by major companies including Cloudflare, Uber, Bloomberg, Spotify, eBay, and Comcast.

Key features:

Columnar storage: data is stored by column rather than by row, so a query reads only the columns it references from disk. Storing similar values together also compresses well, with ratios of roughly 5-10x commonly cited, and speeds up aggregation queries.

Vectorized query execution: data is processed in batches (vectors of about 65,536 rows) rather than one row at a time, which improves CPU cache efficiency.

Performance: ClickHouse can process hundreds of millions to billions of rows per second on a single server, with queries typically completing in milliseconds to seconds. For analytical workloads this can make it 100-1000x faster than traditional row-oriented databases, depending on the data and the query.

SQL support: a rich SQL dialect with JOINs (hash join, merge join), subqueries, window functions, arrays, dictionaries, materialized views, and approximate aggregate functions (uniqCombined, quantileTDigest) that trade exactness for speed.

MergeTree engine: the primary table engine, using an LSM-tree-like structure in which data is written as sorted parts that are merged in the background. It supports data compression and optional TTL-based data expiration.

Specialized engines: the Kafka engine for streaming ingestion, Distributed for queries across shards, MaterializedView for pre-aggregated data, and ReplacingMergeTree for deduplication.

Sharding and replication: horizontal scaling via sharding, and high availability via replicated tables that use ZooKeeper or ClickHouse Keeper for coordination.

Platform and license: written in C++, cross-platform, licensed under Apache-2.0.
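The columnar storage and batch processing ideas described above can be illustrated with a minimal sketch in plain Python. This is not ClickHouse code and does not reflect its internals; the sample rows, the field names (user_id, url, duration_ms), and the helper functions to_columns and sum_in_blocks are all hypothetical, chosen only to show why an aggregate over one column does not need to touch the others.

```python
from array import array

# Hypothetical toy data: each event has three fields.
ROWS = [
    {"user_id": 1, "url": "/home", "duration_ms": 120},
    {"user_id": 2, "url": "/docs", "duration_ms": 340},
    {"user_id": 1, "url": "/docs", "duration_ms": 90},
    {"user_id": 3, "url": "/home", "duration_ms": 450},
]


def to_columns(rows):
    """Pivot row dicts into one contiguous sequence per column."""
    return {
        "user_id": array("q", (r["user_id"] for r in rows)),
        "url": [r["url"] for r in rows],
        "duration_ms": array("q", (r["duration_ms"] for r in rows)),
    }


def sum_in_blocks(column, block_size=65536):
    """Sum one column block by block, without reading any other column."""
    total = 0
    for start in range(0, len(column), block_size):
        block = column[start:start + block_size]  # one batch of values
        total += sum(block)
    return total


columns = to_columns(ROWS)
# A tiny block size is used here only so the loop runs more than once.
total_duration = sum_in_blocks(columns["duration_ms"], block_size=2)
print(total_duration)
```

In the row layout, computing the same total means visiting every dict and skipping past user_id and url. In the column layout, only the duration_ms sequence is read, which is the access pattern that makes aggregation over a few columns of a wide table cheap.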