Chroma AI Vector Database

www.trychroma.com

About this website

Chroma is an open source search infrastructure platform for AI applications. Through a unified API it provides vector similarity search, full-text search with BM25 and SPLADE sparse vectors, regex and trigram search, and metadata filtering. Released under Apache 2.0, it has over 27 thousand GitHub stars, 15 million monthly NPM and PyPI downloads, and adoption in over 90 thousand open source codebases, and it has become the default embedding store for retrieval-augmented generation pipelines across the AI ecosystem. Customers include Capital One, UnitedHealthcare, Mintlify, Weights and Biases, and Conduit.

On 100,000 vectors at 384 dimensions, the query engine achieves a p50 latency of 20 milliseconds with a warm cache and 650 milliseconds with a cold cache, with write throughput of 30 megabytes per second per collection. The architecture uses object storage (S3 and GCS) as the primary storage tier, with query-aware tiering between a hot in-memory cache, a warm SSD cache, and cold object storage. This delivers costs up to 10 times lower than memory-based vector databases while maintaining recall rates of 90 to 100 percent. Each database supports 1 million collections, and each collection supports 5 million records. The write-ahead log (wal3) provides durability on top of object storage.

Collection forking duplicates a dataset copy-on-write, so versioning, A/B testing, and staged rollouts do not require physically copying data. Chroma Sync automatically ingests data from S3 buckets, GitHub repositories, and web pages through crawling, scraping, chunking, and embedding pipelines, without custom integration code. SDK clients are available for Python, TypeScript, and Rust.

Enterprise features include bring-your-own-cloud deployment in customer VPCs, multi-region replication, point-in-time recovery, and AWS PrivateLink connectivity. The Context-1 research project explores self-editing search agents that improve retrieval quality.
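The copy-on-write idea behind collection forking can be pictured with a small toy model. The sketch below is only an illustration of the general technique, not Chroma's implementation or API: `ToyCollection`, `_Layer`, and every method name are hypothetical, and records are plain strings rather than embeddings. A fork freezes the records written so far into a shared read-only layer, and each side then keeps its own small overlay of later writes.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Dict, Optional

# Sentinel stored in an overlay to hide a record that exists in a shared layer.
_TOMBSTONE = object()


@dataclass(frozen=True)
class _Layer:
    """Hypothetical frozen snapshot shared by a collection and its forks."""
    data: Dict[str, object]
    parent: Optional["_Layer"]


class ToyCollection:
    """Hypothetical in-memory collection; not part of any Chroma SDK."""

    def __init__(self, name: str, base: Optional[_Layer] = None) -> None:
        self.name = name
        self._base = base                      # shared, never mutated again
        self._local: Dict[str, object] = {}    # private writes since last fork

    def upsert(self, record_id: str, document: str) -> None:
        self._local[record_id] = document

    def delete(self, record_id: str) -> None:
        self._local[record_id] = _TOMBSTONE

    def get(self, record_id: str) -> Optional[str]:
        # Private overlay first, then shared layers from newest to oldest.
        if record_id in self._local:
            value = self._local[record_id]
            return None if value is _TOMBSTONE else value
        layer = self._base
        while layer is not None:
            if record_id in layer.data:
                value = layer.data[record_id]
                return None if value is _TOMBSTONE else value
            layer = layer.parent
        return None

    def fork(self, name: str) -> "ToyCollection":
        # Freeze current writes into a shared layer. No records are copied:
        # both collections just point at the same layer object.
        shared = _Layer(data=self._local, parent=self._base)
        self._base = shared
        self._local = {}
        return ToyCollection(name, base=shared)
```

In this model a fork costs the same regardless of collection size, and writes made on either side after the fork stay invisible to the other, which is the property that makes forks usable for versioning and staged rollouts.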
