Pinecone 

Pinecone is a fully managed vector database that makes it easy to add vector search to production applications. It combines vector search libraries, capabilities such as filtering, and distributed infrastructure to provide high performance and reliability at any scale.

Features
1. Similarity search
2. Image similarity search
3. Audio similarity search
4. Question answering with similarity search
5. Product recommendation …
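To make similarity search with filtering concrete, here is a minimal sketch in plain Python. It is not Pinecone's API: the `search` function, the toy `index`, and the `predicate` parameter are hypothetical names chosen for illustration, and a brute-force scan stands in for the indexing a production vector database would typically use.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def search(index, query, k=2, predicate=None):
    """Return the ids of the k stored vectors most similar to `query`.

    `index` maps id -> (vector, metadata). `predicate` is an optional
    metadata filter (hypothetical; applied before scoring).
    """
    scored = []
    for item_id, (vector, metadata) in index.items():
        if predicate is not None and not predicate(metadata):
            continue
        scored.append((cosine_similarity(query, vector), item_id))
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:k]]


# Toy data: 2-dimensional "embeddings" with a metadata field.
index = {
    "shoe": ([1.0, 0.0], {"category": "apparel"}),
    "boot": ([0.9, 0.1], {"category": "apparel"}),
    "lamp": ([0.0, 1.0], {"category": "home"}),
}

nearest = search(index, [1.0, 0.0], k=2)
# nearest == ["shoe", "boot"]
home_only = search(index, [1.0, 0.0], k=2,
                   predicate=lambda m: m["category"] == "home")
# home_only == ["lamp"]
```

The same pattern underlies the listed use cases: images, audio clips, questions, and products are each mapped to vectors by some embedding model, and the search step stays the same.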

Milvus 

Milvus is an open-source vector database built to power AI applications and vector similarity search. It is available in two forms: Milvus standalone and Milvus cluster.

Features
1. Millisecond search on trillion vector datasets
2. Simplified unstructured data management
3. Reliable, always-on vector database
4. Highly scalable and elastic
5. Hybrid search
6. Unified Lambda structure
7. Community supported, industry recognized …

Marquez

Marquez is an open source metadata service for the collection, aggregation, and visualization of a data ecosystem's metadata. It maintains the provenance of how datasets are consumed and produced, provides global visibility into job runtime and frequency of dataset access, centralizes dataset lifecycle management, and much more. Marquez was released and open sourced by …

lakeFS 

lakeFS is an open source platform that delivers resilience and manageability to object-storage-based data lakes. With lakeFS you can build repeatable, atomic, and versioned data lake operations – from complex ETL jobs to data science and analytics.

Features
1. Exabyte-scale version control
2. Git-like operations: branch, commit, merge, revert
3. Zero-copy branching for frictionless experiments
4. Full …
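The Git-like operations and zero-copy branching listed above can be pictured with a small toy model. This is only a sketch under stated assumptions, not how lakeFS is implemented or called: `ToyRepo` and all of its methods are hypothetical, commits are plain dictionaries mapping paths to object keys, the merge is a naive "source wins" overlay, and revert simply moves the branch pointer back to the parent commit.

```python
class ToyRepo:
    """Toy model of Git-like versioning over object-store keys (illustrative only)."""

    def __init__(self):
        self.commits = {0: {}}       # commit id -> {path: object key}
        self.parents = {0: None}     # commit id -> parent commit id
        self.branches = {"main": 0}  # branch name -> commit id
        self._next_id = 1

    def branch(self, name, source):
        # "Zero copy": only a pointer is duplicated, no data objects.
        self.branches[name] = self.branches[source]

    def commit(self, branch, changes):
        parent = self.branches[branch]
        snapshot = {**self.commits[parent], **changes}
        commit_id = self._next_id
        self._next_id += 1
        self.commits[commit_id] = snapshot
        self.parents[commit_id] = parent
        self.branches[branch] = commit_id
        return commit_id

    def merge(self, source, dest):
        # Naive merge: paths from `source` overwrite those on `dest`.
        return self.commit(dest, self.commits[self.branches[source]])

    def revert(self, branch):
        # Simplification: step the branch back to its parent commit.
        parent = self.parents[self.branches[branch]]
        if parent is not None:
            self.branches[branch] = parent

    def read(self, branch):
        return dict(self.commits[self.branches[branch]])


repo = ToyRepo()
repo.commit("main", {"data/a.csv": "obj-1"})
repo.branch("experiment", "main")
repo.commit("experiment", {"data/a.csv": "obj-2"})

before_merge = repo.read("main")   # main is untouched by the experiment
repo.merge("experiment", "main")
after_merge = repo.read("main")    # main now points at the new object
repo.revert("main")
after_revert = repo.read("main")   # back to the pre-merge state
```

The point of the sketch is that experiments on a branch never touch `main` until a merge, and that undoing a change is a metadata operation rather than a data copy.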

Intake

Intake is a lightweight set of tools for loading and sharing data in data science projects. Intake helps you:

Features
1. Load data from a variety of formats (see the current list of known plugins) into containers you already know, like Pandas dataframes, Python lists, NumPy arrays, and more.
2. Convert boilerplate data loading code into …

DVC

Data Version Control (DVC) is a new type of data versioning, workflow, and experiment management software that builds upon Git (although it can work stand-alone). DVC reduces the gap between established engineering tool sets and data science needs, allowing users to take advantage of new features while reusing existing skills and intuition.

Features
1. Git-compatible
2. Storage …

Dolt 

Dolt is a version-controlled relational database. Dolt implements a superset of MySQL. It is compatible with MySQL and provides extra constructs exposing the version control features, which are closely modeled on Git.

Features
1. Compatible
2. Lineage & Time Travel
3. Collaboration

Delta Lake

Delta Lake is an open source project that enables building a Lakehouse architecture on top of data lakes. Delta Lake provides ACID transactions, scalable metadata handling, and unifies streaming and batch data processing on top of existing data lakes, such as S3, ADLS, GCS, and HDFS.

Features
1. ACID Transactions
2. Scalable Metadata Handling
3. Time Travel …
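One way to picture how a table can offer both atomic commits and time travel is an append-only log of file actions that readers replay up to a chosen version. The sketch below is a toy model under that assumption, not Delta Lake's API or on-disk format: `ToyTable`, `commit`, and `files_at` are hypothetical names, and the file names are made up.

```python
class ToyTable:
    """Toy append-only transaction log (illustrative only)."""

    def __init__(self):
        # One entry per committed version; each entry is a list of
        # ("add" | "remove", file_name) actions applied together.
        self.log = []

    def commit(self, actions):
        """Append all actions as a single new version and return its number."""
        self.log.append(list(actions))
        return len(self.log) - 1

    def files_at(self, version=None):
        """Replay the log up to `version` (latest if None) and list live files."""
        if version is None:
            version = len(self.log) - 1
        if not 0 <= version < len(self.log):
            raise ValueError(f"unknown version: {version}")
        live = set()
        for actions in self.log[: version + 1]:
            for op, name in actions:
                if op == "add":
                    live.add(name)
                elif op == "remove":
                    live.discard(name)
        return sorted(live)


table = ToyTable()
v0 = table.commit([("add", "part-0.parquet")])
v1 = table.commit([("add", "part-1.parquet")])
# A rewrite swaps one file for another in a single commit.
v2 = table.commit([("remove", "part-0.parquet"), ("add", "part-2.parquet")])

latest = table.files_at()       # ["part-1.parquet", "part-2.parquet"]
original = table.files_at(v0)   # ["part-0.parquet"]
```

Because a version only becomes visible once its whole entry is appended, a reader sees either all of a commit's actions or none of them, and older versions stay readable as long as their files are retained.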

Arrikto

Arrikto is a complete machine learning platform that simplifies, accelerates, and secures model development through production.

Features
1. Simplified deployment
2. ML monitoring
3. Life cycle management
4. Compliance
