Model serving and monitoring

Merlin is a platform for deploying and serving machine learning models. The project was born of the belief that model deployment should be:

Easy and self-serve: Humans should not become the bottleneck for deploying models into production.
Scalable: A deployed model should be able to handle Gojek scale and beyond.
Fast: The framework should let users iterate quickly.
Cost efficient: It should provide all of the benefits above in a cost-efficient manner.


Project: A project represents a namespace for a collection of models. For example, a project could be food recommendations, driver allocation, ride pricing, etc.

Model: Every model is associated with one (and only one) project and one model endpoint. A model can also have zero or more model versions. In MLflow's entity hierarchy, a model corresponds to an MLflow experiment.

Model Endpoint: Every model has its own endpoint, which contains the routing rule to the active model version endpoint.

Model Version: A model version represents an iteration within a model and is associated with a run within MLflow. A model version can be deployed as a service, and there can be multiple deployments of the same model version, each with a different endpoint.

Model Version Endpoint: A model version endpoint is a way to obtain model inference results in real-time, over the network (HTTP).

Environment: An environment has two important properties: a name and a Kubernetes cluster. The name is user-facing and is used to determine the target Kubernetes cluster to which a model will be deployed.
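
The relationships between these entities can be sketched as a small data model. The Python below is only an illustration of the hierarchy described above, not the Merlin SDK or its API: every class, field, and method name is hypothetical, the URLs are placeholders, and the assumption that a model endpoint routes to exactly one active model version endpoint is a simplification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Environment:
    name: str     # user-facing name
    cluster: str  # target Kubernetes cluster (hypothetical identifier)


@dataclass
class VersionEndpoint:
    environment: Environment
    url: str  # placeholder HTTP URL, not a real Merlin address


@dataclass
class ModelVersion:
    version_id: int
    mlflow_run_id: str  # a model version maps to an MLflow run
    endpoints: List[VersionEndpoint] = field(default_factory=list)

    def deploy(self, environment: Environment, url: str) -> VersionEndpoint:
        # One version may be deployed several times, each with its own endpoint.
        endpoint = VersionEndpoint(environment, url)
        self.endpoints.append(endpoint)
        return endpoint


@dataclass
class ModelEndpoint:
    # Assumed simplification: the routing rule points at one active endpoint.
    active: Optional[VersionEndpoint] = None

    def route(self) -> str:
        if self.active is None:
            raise RuntimeError("no active model version endpoint")
        return self.active.url


@dataclass
class Model:
    name: str
    # repr/compare disabled to avoid recursing through the Project back-reference.
    project: "Project" = field(repr=False, compare=False)
    endpoint: ModelEndpoint = field(default_factory=ModelEndpoint)
    versions: List[ModelVersion] = field(default_factory=list)

    def new_version(self, mlflow_run_id: str) -> ModelVersion:
        version = ModelVersion(len(self.versions) + 1, mlflow_run_id)
        self.versions.append(version)
        return version


@dataclass
class Project:
    name: str  # namespace for a collection of models
    models: Dict[str, Model] = field(default_factory=dict)

    def add_model(self, name: str) -> Model:
        model = Model(name, self)  # each model belongs to exactly one project
        self.models[name] = model
        return model
```

In this sketch, promoting a new iteration amounts to deploying a second model version and pointing the model endpoint's `active` field at the new version endpoint. Callers of the model endpoint keep using the same entry point while the routing rule changes behind it.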

Official website

Tutorial and documentation
