MLOps Software Review

A review of 250+ MLOps tools and solutions

This free tool helps you select the right MLOps tools for your business based on our experts’ review.

Luigi 

Open source

Luigi is a Python (3.6, 3.7, 3.8, 3.9 tested) package that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more.

Read More

Metaflow

Open source

Metaflow is a human-friendly Python library that helps scientists and engineers build and manage real-life data science projects.

Read More

Kedro

Open source

Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code.

Read More

Flyte 

Open source

Flyte’s main purpose is to increase the development velocity for data processing and machine learning, enabling large-scale compute execution without the operational overhead.

Read More

MLRun

Open source

MLRun is an end-to-end open-source MLOps solution to manage and automate your entire analytics and machine learning

Read More

Couler 

Open source

Couler aims to provide a unified interface for constructing and managing workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.

Read More

Kale

Open source

KALE (Kubeflow Automated pipeLines Engine) is a project that aims at simplifying the Data Science experience of deploying Kubeflow Pipelines workflows.

Read More

Prefect

Commercial

Prefect is a new workflow management system, designed for modern infrastructure and powered by the open-source Prefect

Read More

Automate Studio

Commercial

Organizations embarking on intelligent process automation initiatives can rapidly build and deploy AI-powered workflows and integrate resulting insights into business applications and processes.

Read More

ZenML

Open source

ZenML is an extensible, open-source MLOps framework to create production-ready machine learning pipelines. It has a simple, flexible syntax, is cloud and tooling agnostic, and has

Read More

Argo 

Open source

Argo Workflows is an open source container-native workflow engine for orchestrating parallel jobs on Kubernetes. Argo Workflows is implemented as a Kubernetes CRD (Custom Resource Definition).

Read More

MLlib 

Open source

MLlib is Spark’s machine learning (ML) library. Its goal is to make practical machine learning scalable and easy. At a high level, it provides tools such as

Read More

Yellowbrick

Open source

Yellowbrick is a suite of visual diagnostic tools called “Visualizers” that extend the scikit-learn API to allow human steering of the model selection process.

Read More

Mahout

Open source

Apache Mahout(TM) is a distributed linear algebra framework and mathematically expressive Scala DSL designed to let mathematicians,

Read More

Netron 

Open source

Netron is a viewer for neural network, deep learning and machine learning models.

Read More

JAX

Open source

JAX is Autograd and XLA, brought together for high-performance machine learning research.

Read More

Horovod 

Open source

Horovod was originally developed by Uber to make distributed deep learning fast and easy to use, bringing model training time

Read More

Manifold 

Open source

Manifold is a model-agnostic visual debugging tool for machine learning, developed at Uber.

Read More

H2O-3 

Open source

H2O is an open source, in-memory, distributed, fast, and scalable machine learning and predictive analytics platform that allows you to build machine learning models on big data

Read More

Fiber 

Open source

Fiber is a Python distributed computing library for modern computer clusters, developed at Uber.

Read More

Weld  

Open source

Weld is a compiler and runtime for improving the performance of data-intensive applications.

Read More

DeepSpeed 

Open source

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

Read More

Dask 

Open source

Dask is a flexible library for parallel computing in Python. Dask is composed of two parts:

Read More

Vulkan Kompute  

Open source

Vulkan Kompute is a general-purpose Vulkan compute framework. It is fast, lightweight, mobile-enabled, and optimized for advanced GPU data processing use cases.

Read More

CuPy

Open source

CuPy is an implementation of a NumPy-compatible multi-dimensional array on CUDA.

Read More

Vaex  

Open source

Vaex is a Python library for lazy out-of-core DataFrames (similar to pandas), used to visualize and explore big tabular datasets.

Read More

CuML 

Open source

cuML is a suite of fast, GPU-accelerated machine learning algorithms designed for data science and analytical tasks. Our API mirrors

Read More

CuDF  

Open source

cuDF is a Python GPU DataFrame library (built on the Apache Arrow columnar memory format) for loading, joining, aggregating,

Read More

WhyLogs 

Open source

whylogs is an open source standard for data and ML logging.

Read More

TPOT

Open source

TPOT stands for Tree-based Pipeline Optimization Tool. Consider TPOT your Data Science Assistant. TPOT is a Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Read More

Vespa 

Open source

Vespa provides metrics integration with CloudWatch, Datadog and Prometheus / Grafana, as well as a JSON HTTP API.

Read More

Singa 

Open source

Apache SINGA is an Apache Top Level Project, focusing on distributed training of deep learning and machine learning models

Read More

Ray 

Open source

Ray provides a simple, universal API for building distributed applications. Ray accomplishes this mission by:

Read More

Triton Inference Server 

Open source

Triton Inference Server provides a cloud and edge inferencing solution optimized for both CPUs and GPUs. Triton supports an HTTP/REST and GRPC protocol that allows remote clients

Read More

TorchServe

Open source

TorchServe is a flexible and easy to use tool for serving PyTorch models.

Read More

Rapids 

Open source

The RAPIDS suite of open source software libraries and APIs gives you the ability to execute end-to-end data science and analytics pipelines entirely on GPUs. Licensed under Apache 2.0, RAPIDS is incubated by NVIDIA® based on extensive hardware and data science experience.

Read More

TensorFlow Serving 

Open source

TensorFlow Serving is a flexible, high-performance serving system for machine learning models, designed for production environments

Read More

Tempo 

Open source

Tempo is a python SDK for data scientists to help them move their models to production. It has 4 core goals:

Read More

Petastorm 

Open source

Petastorm is an open source data access library developed at Uber ATG.

Read More

NumpyGroupies 

Open source

This package consists of a small library of optimised tools for doing things that can roughly be considered “group-indexing operations”.

Read More

Streamlit 

Open source

Streamlit is an open-source Python library that makes it easy to create and share beautiful, custom web apps for machine learning

Read More

Seldon

Commercial

Seldon Core converts your ML models (TensorFlow, PyTorch, H2O, etc.) or language wrappers

Read More

Numba 

Open source

Numba is a compiler for Python array and numerical functions that gives you the power to speed up your applications with high performance functions written directly in Python.

Read More

Modin 

Open source

Modin is an early-stage project at UC Berkeley’s RISELab designed to facilitate the use of distributed computing for Data Science. It is a multiprocess Dataframe library with an identical API to pandas that allows users to speed up their Pandas workflows.

Read More

Redis-AI 

Open source

RedisAI is a Redis module for executing Deep Learning/Machine Learning models and managing their data. Its purpose is being a “workhorse” for model serving, by providing

Read More

Model Server for Apache MXNet (MMS) 

Open source

Multi Model Server (MMS) is a flexible and easy to use tool for serving deep learning models trained using any ML/DL framework.

Read More

Merlin 

Open source

Merlin is a platform for deploying and serving machine learning models. The project was born of the belief that model deployment should be:

Read More

PredictionIO 

Open source

Apache PredictionIO® is an open source Machine Learning Server built on top of a state-of-the-art open source stack for developers

Read More

m2cgen 

Open source

m2cgen (Model 2 Code Generator) is a lightweight library that provides an easy way to transpile trained statistical models into native code (Python, C, Java, Go, JavaScript, Visual Basic, C#, PowerShell, R, PHP, Dart, Haskell, Ruby, F#, Rust).

Read More

KFServing

Open source

The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable.

Read More

Jina  

Open source

Jina is a neural search framework that empowers anyone to build SOTA and scalable deep learning search applications in minutes.

Read More

Opyrator

Open source

Instantly turn your Python functions into production-ready microservices. Deploy and access your services via HTTP API or interactive UI. Seamlessly export your

Read More

OpenScoring 

Open source

REST web service for scoring PMML models. Openscoring is a Java service that provides a JSON REST interface to the Java Predictive Model Markup Language (PMML) evaluator JPMML

Read More

Hydrosphere 

Open source

Hydrosphere is a platform for deploying, versioning, and monitoring your machine learning models in production. It is language-agnostic and framework-agnostic, with support for all major programming languages and frameworks, including Python, Java, TensorFlow, and PyTorch.

Read More

GraphPipe 

Open source

GraphPipe is a protocol and collection of software designed to simplify machine learning model deployment and decouple it from framework-specific model implementations.

Read More

ModelDB

Open source

ModelDB: An open-source system for Machine Learning model versioning, metadata, and experiment management.

Read More

ForestFlow 

Open source

ForestFlow is a scalable policy-based cloud-native machine learning model server. ForestFlow strives to strike a balance between the flexibility it offers data scientists and the adoption of standards while reducing friction between Data Science, Engineering and Operations teams.

Read More

MLflow

Open source

MLflow is an open source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.

Read More

Keepsake

Open source

The Keepsake Python library is used to create experiments and checkpoints in your train

Read More

Fiddler

Commercial

Fiddler is a platform for monitoring, explaining, and analyzing machine learning models in production.

Read More

Guild AI

Open source

Guild AI brings systematic control to machine learning to help you build better

Read More

Comet

Commercial

Comet enables data scientists and teams to track, compare, explain and optimize experiments and models across the model’s

Read More

Evidently 

Open source

Evidently helps evaluate and monitor machine learning models in production. It generates interactive reports or JSON profiles from pandas DataFrames or CSV files.

Read More

Aim

Open source

Aim is an open-source comparison tool for AI experiments. With more resources and complex models more experiments

Read More

DeepDetect  

Open source

DeepDetect is a deep learning API and server written in C++11, along with a pure Web Platform for training and managing models.

Read More

Cortex 

Commercial

Cortex’s AI analyzes all the content from:

Read More

Visual Studio Code

Commercial

Visual Studio Code is a lightweight but powerful source code editor which runs on your desktop and is available for Windows, macOS

Read More

BudgetML

Open source

BudgetML is perfect for practitioners who would like to quickly deploy their models to an endpoint, but not waste a lot of time, money, and effort trying to figure out how to do this end-to-end.

Read More

Thonny

Open source

Thonny is an integrated development environment for Python that is designed for beginners. It supports different ways of stepping

Read More

BentoML

Open source

BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models.

Read More

Spyder

Open source

Spyder is a powerful scientific environment written in Python, for Python, and designed by and for scientists, engineers and

Read More

Backprop  

Open source

Backprop is a serverless model platform that makes it simple for developers to use machine learning models in any application.

Read More

RStudio

Commercial

RStudio is an integrated development environment (IDE) for R. It includes a

Read More

XAI – eXplainableAI 

Open source

XAI is a machine learning library designed with AI explainability at its core. XAI contains various tools that enable analysis and evaluation of data and models.

Read More

PyCharm

Commercial

PyCharm is a dedicated Python Integrated Development Environment (IDE) providing a wide range of essential tools for Python

Read More

woe 

Open source

Tools for Weight of Evidence (WoE) transformation, mostly used in scorecard models for credit rating.

Read More
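
To make the WoE transformation mentioned for the woe package more concrete, here is a minimal sketch in plain Python. It is not the woe package's API: the function name `weight_of_evidence` and the `smoothing` parameter are hypothetical, and it assumes the common convention WoE = ln(share of goods / share of bads) per category, with additive smoothing so that empty cells do not produce a division by zero.

```python
import math
from collections import Counter


def weight_of_evidence(categories, labels, smoothing=0.5):
    """Return {category: WoE} for a binary target (1 = good, 0 = bad).

    Hypothetical helper for illustration; `smoothing` is added to every
    cell count so that categories with no goods or no bads stay finite.
    """
    good = Counter(c for c, y in zip(categories, labels) if y == 1)
    bad = Counter(c for c, y in zip(categories, labels) if y == 0)
    cats = sorted(set(categories))

    # Smoothed totals keep the per-category shares summing to 1.
    total_good = sum(good.values()) + smoothing * len(cats)
    total_bad = sum(bad.values()) + smoothing * len(cats)

    woe = {}
    for c in cats:
        share_good = (good[c] + smoothing) / total_good
        share_bad = (bad[c] + smoothing) / total_bad
        woe[c] = math.log(share_good / share_bad)
    return woe


if __name__ == "__main__":
    grades = ["A", "A", "B", "B"]
    repaid = [1, 1, 0, 0]
    print(weight_of_evidence(grades, repaid))
```

A positive value marks a category where goods are over-represented, a negative value one where bads are. Some scorecard texts use the inverse ratio, which only flips the sign.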

TreeInterpreter 

Open source

Package for interpreting scikit-learn’s decision tree and random forest predictions.

Read More

themis-ml  

Open source

themis-ml defines discrimination as the preference (bias) for or against a set of social groups that results in the unfair treatment of its members with respect to some outcome.

Read More

Themis 

Open source

Themis is an open-source high-level cryptographic services library for securing data during authentication, storage, messaging, network exchange, etc.

Read More

TensorFlow Model Analysis

Open source

TensorFlow Model Analysis (TFMA) is a library for evaluating TensorFlow models.

Read More

TensorFlow’s Lucid

Open source

Lucid is a collection of infrastructure and tools for research in neural network interpretability.

Read More

TensorFlow’s CleverHans

Open source

This repository contains the source code for CleverHans, a Python library to benchmark machine learning systems’ vulnerability to adversarial examples. You can learn more about such vulnerabilities on the accompanying blog.

Read More

Eclipse

Open source

The Eclipse Foundation provides our global community of individuals and organizations with a mature, scalable, and business-friendly environment for open source software

Read More

TensorBoard’s What-If Tool

Open source

The What-If Tool (WIT) provides an easy-to-use interface for expanding understanding of black-box classification and regression ML models.

Read More

Atom

Open source

Atom is a hackable text editor for the 21st century, built on Electron, and based on everything we love about our favorite editors

Read More

Snitch ai

Commercial

Automated scientific validation for your ML models in a few clicks.

Read More

Skater

Open source

Skater is a unified framework to enable model interpretation for all forms of models, to help build the interpretable machine learning systems often needed for real-world use cases. (The authors note that they are actively working towards enabling faithful interpretability for all forms of models.)

Read More

Anaconda

Commercial

Anaconda Individual Edition is a free, easy-to-install package manager, environment manager, and Python distribution with a

Read More

Shapash

Open source

Shapash is a Python library which aims to make machine learning interpretable and understandable by everyone. It provides several types of visualization that display explicit labels that everyone can understand.

Read More

JSON Schema

Open source

JSON Schema is a vocabulary that allows you to annotate and validate JSON documents.

Read More

Great Expectations

Open source

Great Expectations is the leading tool for validating, documenting, and profiling

Read More

SHAP 

Open source

SHAP (SHapley Additive exPlanations) is a game theoretic approach to explain the output of any machine learning model.

Read More
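
The game-theoretic idea behind SHAP can be illustrated with an exact Shapley value computation on a toy model. This is a sketch, not the SHAP package's API: `shapley_values`, `explain`, and `toy_model` are hypothetical names, "absent" features are assumed to be replaced by a fixed baseline value, and the brute-force enumeration of all orderings is only feasible for a handful of features (real tools rely on approximations).

```python
from itertools import permutations


def shapley_values(value, n_players):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = [0.0] * n_players
    orderings = list(permutations(range(n_players)))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            with_player = coalition | {player}
            # Marginal contribution of `player` when joining `coalition`.
            totals[player] += value(with_player) - value(coalition)
            coalition = with_player
    return [t / len(orderings) for t in totals]


def explain(model, x, baseline):
    """Attribute model(x) - model(baseline) to the individual features of x."""
    def value(coalition):
        # Features outside the coalition fall back to the baseline value.
        mixed = [x[i] if i in coalition else baseline[i] for i in range(len(x))]
        return model(mixed)
    return shapley_values(value, len(x))


def toy_model(x):
    # One additive term and one interaction term.
    return 2 * x[0] + x[1] * x[2]


if __name__ == "__main__":
    print(explain(toy_model, [1, 2, 3], [0, 0, 0]))
```

In this toy case the additive term is credited entirely to the first feature, while the interaction term is split evenly between the two features that form it, and the attributions sum to the difference between the prediction and the baseline prediction.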

SAGE

Open source

Sage is an open source project and completely free to use.

Read More

responsibly 

Open source

Toolkit for Auditing and Mitigating Bias and Fairness of Machine Learning Systems.

Read More

rationale

Open source

Rationale is inspired by RamdaJS. It is a collection of helper utility functions that are absent in the OCaml/ReasonML standard library.

Read More

pyBreakDown 

Open source

The Break Down method has moved to the dalex Python package, which is actively maintained.

Read More

Cerberus

Open source

Cerberus provides powerful yet simple and lightweight data validation functionality out of the box and is designed to be easily

Read More

mljar-supervised

Open source

mljar-supervised is an Automated Machine Learning Python package that works with tabular data. It is designed to save time for data scientists.

Read More

Spark Streaming 

Open source

Spark Streaming is an extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of

Read More

MindsDB  

Open source

MindsDB enables advanced predictive capabilities directly in your Database.

Read More

Kafka Streams

Open source

Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance

Read More

LOFO Importance  

Open source

LOFO (Leave One Feature Out) Importance calculates the importances of a set of features based on a metric of choice, for a model of choice, by iteratively removing each feature from the set, and evaluating the performance of the model, with a validation scheme of choice, based on the chosen metric.

Read More
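
The leave-one-feature-out procedure described for LOFO Importance can be sketched in a few lines of plain Python. This is not the LOFO package's API: `lofo_importance` and `knn_accuracy` are hypothetical names, the "model of choice" is assumed to be a 1-nearest-neighbour classifier, the "validation scheme" is leave-one-out, and the "metric" is accuracy.

```python
def knn_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the given feature indices."""
    correct = 0
    for i, row in enumerate(rows):
        best_j, best_d = None, float("inf")
        for j, other in enumerate(rows):
            if i == j:
                continue
            d = sum((row[f] - other[f]) ** 2 for f in features)
            if d < best_d:  # ties keep the earliest neighbour
                best_j, best_d = j, d
        correct += labels[best_j] == labels[i]
    return correct / len(rows)


def lofo_importance(rows, labels, features, score=knn_accuracy):
    """Importance of a feature = drop in score when that feature is left out."""
    baseline = score(rows, labels, features)
    importances = {}
    for f in features:
        reduced = [g for g in features if g != f]
        importances[f] = baseline - score(rows, labels, reduced)
    return importances


if __name__ == "__main__":
    # Column 0 separates the classes; column 1 is an uninformative constant.
    rows = [(0, 5), (1, 5), (2, 5), (10, 5), (11, 5), (12, 5)]
    labels = [0, 0, 0, 1, 1, 1]
    print(lofo_importance(rows, labels, features=[0, 1]))
```

A large positive importance means the score drops when the feature is removed. A value near zero suggests the feature adds little, and a negative value suggests it may be hurting the model.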

IBM Stream Analytics

Commercial

IBM® Streaming Analytics for IBM Cloud is powered by IBM® Streams, an advanced analytic platform that you can use to ingest, analyze, and correlate information as it arrives from different types of data sources in real time.

Read More

LIME

Open source

This project is about explaining what machine learning classifiers (or models) are doing.

Read More

Google Cloud DataFlow

Commercial

Google Cloud Dataflow is a fully managed service for executing Apache Beam pipelines within the Google Cloud Platform ecosystem

Read More

Lightly  

Open source

Lightly is a computer vision framework for self-supervised learning.

Read More

L2X  

Open source

Code for replicating the experiments in the paper Learning to Explain: An Information-Theoretic Perspective on Model Interpretation at ICML 2018, by Jianbo Chen, Mitchell Stern, Martin J. Wainwright, Michael I. Jordan.

Read More

Faust 

Open source

Faust is a stream processing library, porting the ideas from Kafka Streams to Python.

Read More

Brooklin

Open source

Brooklin is a distributed system intended for streaming data between various heterogeneous source and destination systems

Read More

Azure Stream Analytics

Commercial

Azure Stream Analytics is a real-time analytics and complex event-processing engine that is designed to analyze and process high

Read More

Apache Samza

Open source

Apache Samza is a scalable data processing engine that allows you to process and analyze your data in real-time.

Read More

Apache Flink

Open source

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded

Read More

Amazon Kinesis

Commercial

Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly

Read More

Talend Open Studio for Data Integration

Commercial

Talend Open Studio for Data Integration is free-to-download software to kickstart your first data integration and ETL projects.

Read More

Spark

Open source

Apache Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Java, Scala, Python and R,

Read More

Snakemake

Open source

The Snakemake workflow management system is a tool to create reproducible and scalable data analyses. Workflows are described via a human readable

Read More

keras-vis

Open source

keras-vis is a high-level toolkit for visualizing and debugging your trained keras neural net models.

Read More

SETL

Open source

SETL (pronounced “settle”) is a Scala ETL framework powered by Apache Spark that helps you structure your Spark ETL projects,

Read More

InterpretML

Open source

InterpretML is an open-source package that incorporates state-of-the-art machine learning interpretability techniques under one roof.

Read More

Prefect Core

Open source

The prefect Python library includes everything you need to design, build, test, and run powerful data applications. Instantly upgrade

Read More

PipelineX

Open source

PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more

Read More

Integrated-Gradients 

Open source

Integrated Gradients (IG) computes the gradient of the model’s prediction output with respect to its input features and requires no modification to the original deep neural network.

Read More
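
As a rough illustration of the Integrated Gradients idea, the sketch below accumulates gradients along the straight line from a baseline input to the actual input and scales them by the input difference. The name `integrated_gradients` and its parameters are hypothetical, the model is assumed to be any Python function of a list of floats, and gradients are approximated with central finite differences here, whereas real implementations would use the framework's automatic differentiation.

```python
def integrated_gradients(f, x, baseline, steps=200, eps=1e-5):
    """Approximate IG attributions of f at x relative to `baseline`.

    Uses a midpoint Riemann sum over the straight-line path and
    central finite differences in place of true gradients.
    """
    n = len(x)
    avg_grads = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(n):
            up = list(point)
            down = list(point)
            up[i] += eps
            down[i] -= eps
            grad_i = (f(up) - f(down)) / (2 * eps)
            avg_grads[i] += grad_i / steps
    # Scale the path-averaged gradient by the distance from the baseline.
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grads)]


if __name__ == "__main__":
    def model(v):
        return v[0] ** 2 + 3 * v[1]

    attributions = integrated_gradients(model, [2.0, 1.0], [0.0, 0.0])
    print(attributions)
    # Completeness check: attributions should sum to f(x) - f(baseline).
    print(sum(attributions), model([2.0, 1.0]) - model([0.0, 0.0]))
```

Because the model is only ever called, never altered, the sketch reflects the point made above that the original network needs no modification.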

Oozie

Open source

Oozie v3 is a server based Bundle Engine that provides a higher-level oozie abstraction that will batch a set of coordinator applications.

Read More

iNNvestigate 

Open source

In recent years, neural networks have furthered the state of the art in many domains, such as object detection and speech recognition.

Read More

Neuraxle

Open source

Neuraxle is a Machine Learning (ML) library for building machine learning pipelines.

Read More

IBM AI Fairness 360  

Open source

AI Fairness 360, an LF AI incubation project, is an extensible open source toolkit that can help users examine, report, and mitigate discrimination and bias in machine learning models throughout the AI application lifecycle.

Read More

IBM AI Explainability 360 

Open source

The AI Explainability 360 toolkit, an LF AI Foundation incubation project, is an open-source library that supports the interpretability and explainability of datasets and machine learning models.

Read More

Luigi

Open source

The purpose of Luigi is to address all the plumbing typically associated with long-running batch processes.

Read More

GEBI

Open source

Global Explanations for Bias Identification. With our proposed method, we have identified four different clusters.

Read More

Informatica Power Center

Commercial

PowerCenter is a scalable, high-performance foundation for on-premises data integration initiatives, including anal

Read More

FairML  

Open source

FairML is a Python toolbox for auditing machine learning models for bias.

Read More

Hadoop

Open source

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models

Read More

Fairlearn

Open source

Fairlearn is a Python package that empowers developers of artificial intelligence (AI) systems to assess their system’s fairness and mitigate any observed unfairness issues.

Read More

FACETS 

Open source

The facets project contains two visualizations for understanding and analyzing machine learning datasets: Facets Overview and Facets Dive.

Read More

Gokart 

Open source

Gokart addresses reproducibility, task dependencies, constraints of good code, and ease of use for machine learning pipelines.

Read More

Genie

Open source

GenieAnalytics provides deep and powerful big-data analytic ability that delivers immediate operational insights for your business.

Read More

Flyte

Open source

The Workflow Automation Platform for Complex, Mission-Critical Data and ML Processes at Scale

Read More

Dagster

Open source

Dagster is a data orchestrator. It lets you define pipelines (DAGs) in terms of the data flow

Read More

Bonobo 

Open source

Bonobo is a lightweight Extract-Transform-Load (ETL) framework for Python 3.5+.

Read More

Basin

Open source

Extract, transform, load using visual programming that can run Spark jobs on any environment

Read More

Azkaban

Open source

Azkaban is a batch workflow job scheduler created at LinkedIn to run Hadoop jobs.

Read More

Argo Workflows

Open source

Argo Workflows is an open source container-native workflow engine for orchestrating parallel jobs on Kubernetes.

Read More

Apache Nifi 

Open source

Put simply, NiFi was built to automate the flow of data between systems. While the term ‘dataflow’ is used in a variety of contexts, we use it here to mean the automated and managed flow

Read More

Airflow

Open source

Airflow is a platform that lets you build and run workflows. A workflow is represented as a DAG (a Directed Acyclic Graph),

Read More

ELI5 

Open source

ELI5 is a Python library that lets you visualize and debug various machine learning models using a unified API.

Read More

Pinecone 

Commercial

Pinecone is a fully managed vector database that makes it easy to add vector search to production applications. It combines vector search

Read More

DeepVis Toolbox 

Open source

This is the code required to run the Deep Visualization Toolbox, as well as to generate the neuron-by-neuron visualizations using regularized optimization.

Read More

Milvus 

Open source

Milvus is an open-source vector database built to power AI applications and vector similarity search.

Read More

DeepLIFT

Open source

This version of DeepLIFT has been tested with Keras 2.2.4 & tensorflow 1.14.0. See this FAQ question for information on other implementations of DeepLIFT that may work with different versions of tensorflow/pytorch, as well as a wider range of architectures. See the tags for older versions.

Read More

Marquez

Open source

Marquez is an open source metadata service for the collection, aggregation, and visualization of a data ecosystem’s metadata.

Read More

lakeFS 

Open source

lakeFS is an open source platform that delivers resilience and manageability to object-storage based data lakes.

Read More

ContrastiveExplanation (Foil Trees) 

Open source

Contrastive Explanation provides an explanation for why an instance had the current outcome (fact) rather than a targeted outcome of interest (foil).

Read More

Intake

Open source

Intake is a lightweight set of tools for loading and sharing data in data science projects. Intake helps you:

Read More

Captum 

Open source

Captum (“comprehension” in Latin) is an open source, extensible library for model interpretability built on PyTorch.

Read More

DVC

Open source

Data Version Control is a new type of data versioning, workflow, and experiment management software, that builds upon Git

Read More

anchor 

Open source

An anchor explanation is a rule that sufficiently “anchors” the prediction locally – such that changes to the rest of the feature values of the instance do not matter. In other words, for instances on which the anchor holds, the prediction is (almost) always the same.

Read More

Dolt 

Open source

Dolt is a version controlled relational database. Dolt implements a superset of MySQL.

Read More

Delta Lake

Open source

Delta Lake is an open source project that enables building a Lakehouse architecture on top of data lakes. Delta Lake provides ACID transactions,

Read More

Alibi 

Open source

Alibi is designed to help explain the predictions of machine learning models and gauge the confidence of those predictions.

Read More

Arrikto

Commercial

A complete machine learning platform that simplifies, accelerates, and secures model development through production

Read More

Visual Object Tagging Tool (VOTT)

Open source

An open source annotation and labeling tool for image and video assets.

Read More

Valohai

Commercial

Valohai is all about taking away the not-so-fun parts of machine learning.

Read More

VGG Image Annotator (VIA)

Open source

VGG Image Annotator is a simple and standalone manual annotation software for image, audio and video. VIA runs in a web browser and does not

Read More

Sagemaker

Commercial

Amazon SageMaker helps data scientists and developers to prepare, build, train, and deploy high-quality machine learning (ML) models quickly by bringing together a broad set of capabilities purpose-built for ML.

Read More

V7 Darwin

Commercial

To enable any business, large and small, to leverage the sense of sight and automate any visual task. To achieve this, we must

Read More

Superintendent

Open source

superintendent provides an ipywidget-based interactive labelling tool for your data. It allows you to flexibly label all kinds of data.

Read More

Polyaxon

Open source

Polyaxon is a platform for building, training, and monitoring large scale deep learning applications. We are making a system to solve reproducibility, automation, and scalability for machine learning applications.

Read More

Super Annotate Data Labelling

Commercial

SuperAnnotate is the end-to-end image and video annotation platform to annotate, train, and automate your computer vision pipeline.

Read More

Semantic Segmentation Editor

Open source

A web based labeling tool for creating AI training data sets (2D and 3D).

Read More

Pachyderm

Commercial

Pachyderm is a tool for version-controlled, automated, end-to-end data pipelines for data science.

Read More

Sagemaker ground truth

Commercial

Amazon SageMaker Ground Truth is a fully managed data labeling service that makes it easy to build highly accurate training datasets for

Read More

Neu.ro

Commercial

Neu.ro platform seamlessly stitches your on-prem and cloud resources, deploys pipelines, integrates your open-source and commercial development tools.

Read More

PixelAnnotationTool

Read More

Modzy 

Commercial

Modzy is the ModelOps and MLOps software platform for businesses to deploy, manage, and get value from AI—at scale.

Read More

OpenLabeling

Open source

Image labeling in multiple annotation formats: PASCAL VOC (= darkflow)

Read More

ML Workspace 

Open source

The ML workspace is an all-in-one web-based IDE specialized for machine learning and data science.

Read More

LynxKite

Open source

LynxKite is an open-source “one stop shop” graph data science platform.

Read More

Kubeflow 

Open source

The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable.

Read More

KNIME

Commercial

At KNIME, we build software to create and productionize data science using one easy and intuitive environment, enabling every stakeholder in the data science process to focus on what they do best.

Read More

MedTagger

Open source

MedTagger is a collaborative framework for annotating medical datasets

Read More

makesense.ai

Open source

makesense.ai is a free-to-use online tool for labelling photos. Because it runs in the browser, it does not require any complicated installation – just visit the website and you are ready to go.

Read More

Iguazio

Commercial

The Iguazio Data Science Platform transforms AI projects into real-world business outcomes.

Read More

LabelImg

Open source

LabelImg is a graphical image annotation tool. It is written in Python and uses Qt for its graphical interface.

Read More

Label Studio

Open source

Label Studio is an open source data labeling tool. It lets you label data types like audio,

Read More

IBM Watson Studio

Commercial

IBM Watson® Studio empowers data scientists, developers and analysts to build, run and manage AI models, and optimize decisions anywhere on IBM Cloud Pak® for Data.

Read More

ImgLab

Open source

A web based tool to label images for objects that can be used to train dlib or other object detectors

Read More

Hopsworks

Commercial

Hopsworks is a managed platform for scale-out data science, with support for both GPUs and Big Data, in a familiar development environment.

Read More

H2O

Commercial

H2O is an in-memory platform for distributed, scalable machine learning.

Read More

Gradient

Commercial

Gradient is a Paperspace product that simplifies developing, training, and deploying machine learning models.

Read More

Domino

Commercial

Domino is a data science platform that enables fast, reproducible, and collaborative work on data products like models, dashboards, and data pipelines.

Read More

DataRobot

Commercial

The DataRobot Automated Machine Learning product accelerates your AI success by combining cutting-edge machine learning technology with the team you have in place.

Read More

Dataiku

Commercial

Dataiku is an artificial intelligence and machine learning company which was founded in 2013. In December 2019,

Read More

ImageTagger

Open source

This is a collaborative online tool for labeling image data.

Read More

Figure Eight

Commercial

Figure Eight Federal is critical in the creation of the highest quality Decision-Grade

Read More

Doccano 

Open source

doccano is an open source text annotation tool for humans.

Read More

Dataloop

Commercial

Enterprise-grade data platform for vision AI systems in development and in production.

Read More

Computer Vision Annotation Tool (CVAT)

Open source

CVAT is a free, online, interactive video and image annotation tool for computer vision.

Read More

COCO Annotator

Open source

COCO Annotator is a web-based image annotation tool designed for versatility and for efficiently labeling images to create training data

Read More

DAGsHub

Open source

DAGsHub is a platform for data scientists and machine learning engineers to version their data, models, experiments, and code.

Read More

CNVRG

Commercial

cnvrg.io is a machine learning platform built by data scientists, for data scientists.

Read More

ClearML

Commercial

ClearML is an open source platform that automates and simplifies developing and managing machine learning solutions for thousands of data science teams all over the world.

Read More

Bodywork

Open source

Bodywork deploys machine learning projects developed in Python to Kubernetes.

Read More

AzureML

Commercial

Azure Machine Learning is a cloud service for accelerating and managing the machine learning project lifecycle.

Read More

Algorithmia

Commercial

Algorithmia provides the fastest time to value for enterprise machine learning. Rapidly deploy, serve, and manage machine learning models at scale.

Read More

aiWARE 

Commercial

The Veritone aiWARE platform for Enterprise AI provides real-time input adapters, hundreds of AI engines across over 20 cognitive categories

Read More

Kyso 

Commercial

One central knowledge hub, so everyone can learn from and take action on your data insights.

Read More

Knowledge Repo

Open source

The Knowledge Repo project is focused on facilitating the sharing of knowledge between data scientists and other technical roles using data formats and tools that make sense in these professions.

Read More

Talend Data Fabric

Commercial

Talend Data Fabric combines Talend products into a common set of powerful, easy-to-use solutions.

Read More

Metacat

Open source

Metacat is a flexible, open source metadata catalog and data repository that targets scientific data, particularly from ecology and environmental science. Metacat accepts XML

Read More

Magda

Open source

Magda is a data catalog system that provides a single place where all of your organization’s data can be catalogued, enriched,

Read More

Tune 

Open source

Tune is a Python library for experiment execution and hyperparameter tuning at any scale.

Read More

Informatica Data Catalog

Commercial

Informatica Enterprise Data Catalog is an AI-powered data catalog that provides a machine-learning-based discovery engine to scan and catalog data assets

Read More

IBM Data Catalog

Commercial

IBM Watson Knowledge Catalog is an open and intelligent data catalog for enterprise data and AI model governance, quality, and collaboration.

Read More

Talos

Open source

Talos radically changes the ordinary Keras, TensorFlow (tf.keras), and PyTorch workflow by fully automating hyperparameter tuning and model evaluation.

Read More

Google Data Catalog

Commercial

GCP Data Catalog is a metadata management service, available on Google Cloud, that is rapidly gaining ground.

Read More

Scikit Optimize

Open source

Scikit-Optimize, or skopt, is a simple and efficient library to minimize (very) expensive and noisy black-box functions.

Read More

DataHub

Open source

DataHub is an open-source metadata platform for the modern data stack.

Read More

CKAN

Open source

The Comprehensive Knowledge Archive Network (CKAN) is an open-source open data portal for the storage and distribution of open data.

Read More

Optuna 

Open source

Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning.

Read More

Katib 

Open source

Katib is a Kubernetes-native project for automated machine learning (AutoML). Katib supports Hyperparameter Tuning, Early Stopping and Neural Architecture Search.

Read More

Hyperopt 

Open source

Hyperopt is a Python library for serial and parallel optimization over awkward search spaces, which may include real-valued, discrete, and conditional dimensions.

Read More

Hyperas

Open source

A very simple convenience wrapper around Hyperopt for fast prototyping with Keras models.

Read More

Veri 

Open source

Veri is a feature-label store: it stores features as keys and labels as values. Values can only be queried by k-nearest-neighbor (kNN) search over the features.

Read More

Ivory  

Open source

ivory defines a specification for how to store feature data and provides a set of tools for querying it.

Read More

Hopsworks Feature Store  

Open source

Hopsworks and its Feature Store are an open source data-intensive AI platform used for the development and operation of machine learning models at scale.

Read More

Azure Data Catalog

Commercial

Azure Data Catalog is an enterprise-wide metadata catalog enabling self-service data asset discovery. It’s a fully managed service in Azure.

Read More

Feast

Open source

Feast (Feature Store) is an operational data system for managing and serving machine learning features to models in production.

Read More

ByteHub 

Commercial

ByteHub is a Python-based feature store designed to be as easy-to-use and familiar to data scientists as possible.

Read More

Atlan Data Catalog

Commercial

Atlan is a modern data catalog and discovery tool. A data catalog is a neatly organized inventory of data assets across all your data sources.

Read More

Apache Atlas

Open source

Apache Atlas is an open source metadata management and governance system designed to help you easily find, organize, and manage data assets.

Read More

Butterfree

Open source

Butterfree is a feature store which, as the name suggests, is an organized set of features for machine learning models.

Read More

Amundsen 

Open source

Amundsen is a data discovery and metadata engine for improving the productivity of data analysts, data scientists and engineers when interacting with data.

Read More

Papertrail

Commercial

SolarWinds® Papertrail™ provides cloud-based log management that seamlessly aggregates logs from applications, servers, network devices, services, platforms, and much more.

Read More

Weights & Biases

Commercial

Weights & Biases is the machine learning platform for developers to build better models faster.

Read More

Healthchecks.io

Commercial

Healthchecks is a cron job monitoring service. You can use Healthchecks.io for lightweight server monitoring:

Read More

Cronitor

Commercial

Cronitor is a web-based tracking application that monitors, alerts, and analyzes scheduled computer processes.

Read More

Sacred

Open source

Sacred is a tool to configure, organize, log and reproduce computational experiments.

Read More

GitHub Actions

GitHub Actions makes it easy to automate all your software workflows, now with world-class CI/CD. Build, test, and deploy your code

Read More

CML

Open source

Continuous Machine Learning (CML) is an open-source CLI tool for implementing continuous integration & delivery (CI/CD) with

Read More

Neptune AI

Commercial

Neptune is a metadata store for MLOps, built for teams that run a lot of experiments.

Read More

Azure DevOps

Commercial

Azure DevOps provides developer services that allow teams to plan work, collaborate on code development, and build and deploy applications.

Read More

AWS CodePipeline

Commercial

AWS CodePipeline is a fully managed continuous delivery service that helps you automate your release pipelines for fast and reliable application and infrastructure updates.

Read More
