Merlin is a platform for deploying and serving machine learning models. The project was born of the belief that model deployment should be:
Easy and self-serve: Humans should not become the bottleneck for deploying models into production.
Scalable: A deployed model should be able to handle Gojek scale and beyond.
Fast: The framework should let users iterate quickly.
Cost efficient: It should provide all of the benefits above in a cost-efficient manner.
Project: A project represents a namespace for a collection of models. For example, a project could be food recommendations, driver allocation, ride pricing, etc.
Model: Every model is associated with one (and only one) project and one model endpoint. A model can also have zero or more model versions. In MLflow’s entity hierarchy, a model corresponds to an MLflow experiment.
Model Endpoint: Every model has its own endpoint, which contains the routing rule to the active model version endpoint.
Model Version: A model version represents an iteration within a model. A model version is associated with a run within MLflow. A model version can be deployed as a service, and there can be multiple deployments of a model version, each with a different endpoint.
Model Version Endpoint: A model version endpoint is a way to obtain model inference results in real time over the network (HTTP).
Environment: An environment has two important properties: a name and a Kubernetes cluster. The name is a user-facing property that determines the target Kubernetes cluster to which a model will be deployed.
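
The relationships between these concepts can be sketched as a small data model. This is only an illustration of the hierarchy described above, not Merlin's actual SDK or schema: every class, field, and method name, along with the example names and URLs, is a hypothetical chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Environment:
    """A user-facing name mapped to a target Kubernetes cluster."""
    name: str
    cluster: str


@dataclass
class VersionEndpoint:
    """An HTTP endpoint serving one deployment of a model version."""
    environment: Environment
    url: str


@dataclass
class ModelVersion:
    """One iteration of a model, tied to a single MLflow run."""
    version_id: int
    mlflow_run_id: str
    endpoints: List[VersionEndpoint] = field(default_factory=list)

    def deploy(self, environment: Environment, url: str) -> VersionEndpoint:
        # Each deployment gets its own endpoint; a version may have several.
        endpoint = VersionEndpoint(environment, url)
        self.endpoints.append(endpoint)
        return endpoint


@dataclass
class ModelEndpoint:
    """The model's single endpoint, routing to the active version endpoint."""
    url: str
    active: Optional[VersionEndpoint] = None


@dataclass
class Model:
    """Belongs to exactly one project; corresponds to an MLflow experiment."""
    name: str
    project_name: str
    endpoint: ModelEndpoint
    versions: List[ModelVersion] = field(default_factory=list)


@dataclass
class Project:
    """A namespace for a collection of models."""
    name: str
    models: List[Model] = field(default_factory=list)

    def add_model(self, model_name: str, endpoint_url: str) -> Model:
        # Creating the model here ties it to this project and gives it
        # its one model endpoint.
        model = Model(model_name, self.name, ModelEndpoint(endpoint_url))
        self.models.append(model)
        return model


# Usage with made-up names and URLs (assumptions for illustration only).
staging = Environment(name="staging", cluster="k8s-staging")
project = Project(name="ride-pricing")
model = project.add_model("surge-model", "http://surge-model.example.com")

v1 = ModelVersion(version_id=1, mlflow_run_id="run-abc")
model.versions.append(v1)
v1_endpoint = v1.deploy(staging, "http://surge-model-1.example.com")

# Point the model endpoint's routing rule at the deployed version.
model.endpoint.active = v1_endpoint
```

In this sketch the model endpoint holds a reference to one active version endpoint, which mirrors the routing rule described above: callers keep using the model endpoint's URL while the active version behind it can change.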