BentoML Model Serving Framework

bentoml.com

About this website

BentoML is an open-source framework for serving, deploying, and operating machine learning models in production. It provides a unified platform for model packaging, serving, and deployment across cloud and edge environments. Founded by Chaoyu Yang and Bowen Sun in 2019 and headquartered in the San Francisco Bay Area, BentoML has raised over $15 million in funding from investors including Battery Ventures and Uncorrelated Ventures.

Key features:

Framework support: PyTorch, TensorFlow, Keras, scikit-learn, XGBoost, LightGBM, Hugging Face Transformers, ONNX, Stable Diffusion, and large language models (LLMs) via vLLM and TGI integration.
Service definition: Python decorators (@bentoml.service, @bentoml.api) declare model serving APIs, with automatic input/output validation via Pydantic.
Runner architecture: each model runs as a Runner with configurable batching, concurrency, GPU allocation, and resource limits, and work is distributed automatically across multiple processes or containers.
Adaptive batching: individual requests are grouped into batches so that GPU inference, for example on transformer models, runs efficiently.
Bento packaging: a bento is a self-contained, distributable archive that bundles the model, code, dependencies, and Docker base image specification for reproducible deployments.
BentoML Yatai: a Kubernetes-native deployment platform for managing bentos at scale, with canary deployments, traffic splitting, and auto-scaling.
BentoCloud: a managed serverless platform for deploying bentos without infrastructure management.
CLI and Python SDK: tools for building, containerizing, and deploying bentos.
Docker integration: bentoml containerize produces an OCI-compliant Docker image.
Serving protocols: gRPC and REST APIs, with async API support.
Model registry: versioning and storage for models.
Monitoring and observability: OpenTelemetry integration.
Compatibility: MLflow model format.
Requirements and license: Python 3.8+ required; open source under Apache-2.0.
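To make the adaptive batching idea concrete, here is a minimal, framework-independent sketch in plain Python. It is not BentoML's implementation and does not use the BentoML API. The class name MicroBatcher and the parameters max_batch_size and max_latency_s are hypothetical, chosen only for illustration. The sketch shows the general technique: single requests are queued, and a worker flushes a batch either when it is full or when a short latency budget expires.

```python
import asyncio


class MicroBatcher:
    """Hypothetical helper: groups single requests into batches for batch_fn."""

    def __init__(self, batch_fn, max_batch_size=8, max_latency_s=0.01):
        self.batch_fn = batch_fn            # takes a list of inputs, returns a list of outputs
        self.max_batch_size = max_batch_size
        self.max_latency_s = max_latency_s  # how long the first request may wait for company
        self.queue = asyncio.Queue()
        self.batch_sizes = []               # recorded only so the demo can inspect batching

    async def submit(self, item):
        # Each caller gets a future that the worker resolves with its own result.
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((item, fut))
        return await fut

    async def run(self):
        loop = asyncio.get_running_loop()
        while True:
            # Block until at least one request arrives, then start the latency clock.
            batch = [await self.queue.get()]
            deadline = loop.time() + self.max_latency_s
            while len(batch) < self.max_batch_size:
                remaining = deadline - loop.time()
                if remaining <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self.queue.get(), remaining))
                except asyncio.TimeoutError:
                    break
            # One call to the model function for the whole batch.
            outputs = self.batch_fn([item for item, _ in batch])
            self.batch_sizes.append(len(batch))
            for (_, fut), out in zip(batch, outputs):
                fut.set_result(out)


async def demo():
    # The "model" here is a stand-in that doubles each input.
    batcher = MicroBatcher(lambda xs: [x * 2 for x in xs], max_batch_size=4)
    worker = asyncio.create_task(batcher.run())
    results = await asyncio.gather(*(batcher.submit(i) for i in range(10)))
    worker.cancel()
    return results, batcher.batch_sizes


results, batch_sizes = asyncio.run(demo())
print(results, batch_sizes)
```

Ten concurrent requests reach the stand-in model in batches of at most four, and each caller still receives only its own result. A production batcher would also need error propagation and a policy for tuning the latency budget, both of which this sketch leaves out.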
