BentoML

Model serving and monitoring

BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models.

Supports multiple ML frameworks, including TensorFlow, PyTorch, Keras, XGBoost, and more
Cloud-native deployment with Docker, Kubernetes, AWS, Azure, and many more
High-performance online API serving and offline batch serving
Web dashboards and APIs for model registry and deployment management (a minimal service definition is sketched below)
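For illustration, here is a minimal sketch of a service definition following BentoML's classic 0.x-style API; the IrisClassifier name and the scikit-learn model are placeholder choices, and exact APIs differ across BentoML releases (the 1.x series reworked the service API significantly):

```python
# iris_classifier.py -- illustrative service definition (0.x-style API)
import pandas as pd
from bentoml import BentoService, api, artifacts, env
from bentoml.adapters import DataframeInput
from bentoml.frameworks.sklearn import SklearnModelArtifact


@env(infer_pip_packages=True)                # dependencies are discovered automatically
@artifacts([SklearnModelArtifact('model')])  # declares the trained model this service packs
class IrisClassifier(BentoService):

    @api(input=DataframeInput(), batch=True)  # batch=True enables adaptive micro-batching
    def predict(self, df: pd.DataFrame):
        # Arbitrary Python code can run alongside the trained model here
        return self.artifacts.model.predict(df)
```

The decorators declare the service's environment, model artifacts, and API endpoints; each @api method becomes a REST endpoint with an auto-generated OpenAPI spec.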

Features

Production-ready online serving:

Support for multiple ML frameworks, including PyTorch, TensorFlow, scikit-learn, XGBoost, and many more
Containerized model server for production deployment with Docker, Kubernetes, OpenShift, AWS ECS, Azure, GCP GKE, etc.
Adaptive micro-batching for optimal online serving performance
Discover and package all dependencies automatically, including PyPI packages, conda packages, and local Python modules (see the packaging example after this list)
Serve compositions of multiple models
Serve multiple endpoints in one model server
Serve any Python code along with trained models
Automatically generate REST API spec in Swagger/OpenAPI format
Prediction logging and feedback logging endpoints
Health check endpoint and Prometheus /metrics endpoint for monitoring
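As a sketch of the packaging workflow under the same 0.x-style API assumed above, the snippet below trains a toy scikit-learn model, packs it into the service defined earlier, and saves a versioned bundle; the iris_classifier module name is a hypothetical carry-over from the previous example:

```python
# Illustrative packaging script for the IrisClassifier sketched above
from sklearn import datasets, svm

from iris_classifier import IrisClassifier  # hypothetical module from the earlier sketch

# Train a toy model
iris = datasets.load_iris()
clf = svm.SVC(gamma='scale')
clf.fit(iris.data, iris.target)

# Pack the trained model into the service and save a versioned bundle
# to the local BentoML repository (dependencies are inferred at save time)
service = IrisClassifier()
service.pack('model', clf)
saved_path = service.save()
print(saved_path)

# The saved bundle can then be served locally or containerized, e.g.:
#   bentoml serve IrisClassifier:latest
#   bentoml containerize IrisClassifier:latest
```

Saving produces a self-contained bundle (service code, dependency spec, and model artifacts) that the model server, Docker image, and deployment tooling all consume.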

