DVC

Data Version Control (DVC) is a new type of software for data versioning, workflow management, and experiment management. It builds on Git, although it can also work stand-alone. DVC narrows the gap between established engineering tool sets and data science needs, letting users take advantage of new features while reusing existing skills and intuition.

Features

1. Git-compatible
2. Storage-agnostic
3. Reproducible
4. Low-friction branching
5. Metric tracking
6. ML pipeline framework
7. Language- & framework-agnostic
8. HDFS, Hive & Apache Spark
9. Failure tracking
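
To make the first two features more concrete, here is a minimal Python sketch of the general idea behind pairing Git with a separate data store: the large file is copied into a content-addressed cache, and only a small pointer file (which Git could track) stays next to it. This is an illustration under stated assumptions, not DVC's actual implementation or file format. The function names `track` and `restore`, the `.pointer.json` suffix, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_path(cache_dir: Path, checksum: str) -> Path:
    """Map a checksum to its location inside the cache directory."""
    return cache_dir / checksum[:2] / checksum[2:]


def track(data_file: Path, cache_dir: Path) -> Path:
    """Copy data_file into the cache and write a small pointer file.

    The pointer file is the only thing a Git repository would need to store.
    (Hypothetical format; DVC uses its own metadata files.)
    """
    checksum = file_digest(data_file)
    cached = cache_path(cache_dir, checksum)
    cached.parent.mkdir(parents=True, exist_ok=True)
    if not cached.exists():
        shutil.copy2(data_file, cached)
    pointer = data_file.with_name(data_file.name + ".pointer.json")
    pointer.write_text(json.dumps({"sha256": checksum, "path": data_file.name}, indent=2))
    return pointer


def restore(pointer: Path, cache_dir: Path) -> Path:
    """Recreate the data file described by a pointer file from the cache."""
    meta = json.loads(pointer.read_text())
    target = pointer.with_name(meta["path"])
    shutil.copy2(cache_path(cache_dir, meta["sha256"]), target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp) / "workspace"
        cache = Path(tmp) / "cache"
        workspace.mkdir()
        data = workspace / "train.csv"
        data.write_text("x,y\n1,2\n")
        pointer = track(data, cache)
        data.unlink()                  # simulate a fresh checkout without the data
        restored = restore(pointer, cache)
        print(restored.read_text())
```

Because the cache is keyed only by content hash, the cache directory in this sketch could sit on any storage backend, which is the intuition behind a storage-agnostic design. Switching between versions of a dataset then amounts to switching between small pointer files.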

Official website

Tutorial and documentation
