Apache Doris Analytical Database

doris.apache.org

About this website

Apache Doris is a high-performance, real-time analytical database built on an MPP (Massively Parallel Processing) architecture and designed for sub-second queries and high-concurrency real-time analytics. It was originally developed at Baidu in 2009 as Palo (later renamed Doris), was contributed to the Apache Software Foundation in 2018, and became a top-level project in 2022.

Key features:

Query engine and storage: an MPP distributed query engine with vectorized execution runs high-throughput analytical queries across multiple compute nodes, achieving sub-second response times on terabyte-scale datasets. The hybrid storage format uses columnar storage optimized for OLAP workloads and offers three data models: Aggregate (pre-aggregation), Unique (primary key deduplication), and Duplicate (raw data storage).

Ingestion and updates: data can be loaded in real time through Stream Load (HTTP), Routine Load (Kafka), Broker Load (HDFS/S3), and INSERT INTO SELECT, covering both batch and streaming data with exactly-once semantics. The Unique Key model supports partial column updates and Merge-on-Write (MoW) for efficient upserts without read amplification.

Query acceleration: rollups and materialized views pre-aggregate data and accelerate queries through automatic query rewriting. Colocate Join places rows with the same join keys on the same nodes so that joins run locally. Bitmap and HLL (HyperLogLog) types provide efficient distinct counting. An inverted index supports fast full-text search and equality filtering.

Data management: dynamic partitioning creates and expires partitions automatically for time-series data. Hot and cold data separation moves older data to S3-compatible object storage for cost-effective long-term storage.

SQL and operations: standard SQL is supported, including window functions, CTEs, UNION, JOIN, and subqueries. MySQL protocol compatibility allows connections from MySQL clients, JDBC, and BI tools. High availability is provided through data replication, failover, and online schema changes. Cloud-native deployment on Kubernetes is supported.

Apache Doris is used by Baidu, ByteDance, Tencent, Meituan, and JD.com for real-time analytics and reporting.
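
As an illustration of the Stream Load (HTTP) ingestion path mentioned above, the following is a minimal Python sketch that only builds the request and does not send it. It assumes the commonly documented endpoint layout `/api/{db}/{table}/_stream_load` on the frontend HTTP port (8030 by default) with HTTP Basic authentication. The host name, database, table, label, and the helper name `build_stream_load_request` are hypothetical and chosen for illustration only.

```python
import base64
import urllib.request


def build_stream_load_request(fe_host, db, table, payload, label,
                              user="root", password="", http_port=8030):
    """Build (but do not send) an HTTP PUT request for a Doris Stream Load.

    Hypothetical helper: the URL layout and header names follow the commonly
    documented Stream Load interface; check them against your Doris version.
    """
    url = f"http://{fe_host}:{http_port}/api/{db}/{table}/_stream_load"
    # HTTP Basic authentication: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",
        # A unique label lets the server reject a repeated load of the same batch.
        "label": label,
        "format": "csv",
        "column_separator": ",",
    }
    return urllib.request.Request(url, data=payload, headers=headers, method="PUT")


# Example values below are placeholders, not a real cluster.
req = build_stream_load_request(
    "fe.example.internal", "example_db", "events",
    b"2024-01-01,42,1\n", label="events_20240101_001",
)
print(req.get_method(), req.full_url)
```

Sending is deliberately left out. In practice the frontend usually redirects the request to a backend node, so the client has to re-send the credentials and body after the redirect; the curl-based examples typically pass `--location-trusted` for this reason, and `urllib` does not follow such a redirect for a PUT on its own.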
