Modin 

Modin is an early-stage project at UC Berkeley’s RISELab designed to facilitate the use of distributed computing for data science. It is a multiprocess DataFrame library with an API identical to pandas that lets users speed up their pandas workflows. Features: Modin can give the user the opportunity to extend …

MLlib 

MLlib is Spark’s machine learning (ML) library. Its goal is to make practical machine learning scalable and easy. At a high level, it provides tools such as: ML Algorithms: common learning algorithms such as classification, regression, clustering, and collaborative filtering; Featurization: feature extraction, transformation, dimensionality reduction, and selection; Pipelines: tools for …

Mahout

Apache Mahout(TM) is a distributed linear algebra framework and mathematically expressive Scala DSL designed to let mathematicians, statisticians, and data scientists quickly implement their own algorithms. Apache Spark is the recommended out-of-the-box distributed backend, and Mahout can be extended to other distributed backends. Mathematically Expressive Scala DSL; Support for Multiple Distributed Backends …

JAX

JAX is Autograd and XLA, brought together for high-performance machine learning research. With its updated version of Autograd, JAX can automatically differentiate native Python and NumPy functions. It can differentiate through loops, branches, recursion, and closures, and it can take derivatives of derivatives of derivatives. It supports reverse-mode differentiation (a.k.a. backpropagation) …
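As a rough illustration of the "derivatives of derivatives of derivatives" claim, the sketch below nests `jax.grad` three times over a function that contains an ordinary Python loop. The function `poly` and the evaluation point 2.0 are made up for this example and do not come from the JAX project text.

```python
import jax


def poly(x):
    """Hypothetical example function: x + x**2 + x**3, built with a plain Python loop."""
    total = 0.0
    for k in range(1, 4):
        total = total + x ** k
    return total


# Each call to jax.grad returns a new function that computes the derivative
# of its argument, so the calls can be nested.
d1 = jax.grad(poly)   # 1 + 2x + 3x^2
d2 = jax.grad(d1)     # 2 + 6x
d3 = jax.grad(d2)     # 6

# jax.grad expects a floating-point input, hence 2.0 rather than 2.
first = float(d1(2.0))
second = float(d2(2.0))
third = float(d3(2.0))
print(first, second, third)
```

Working the derivatives by hand at x = 2 gives 17, 14, and 6, which is what the three calls should return up to floating-point rounding.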

Horovod 

Horovod was originally developed by Uber to make distributed deep learning fast and easy to use, bringing model training time down from days and weeks to hours and minutes. With Horovod, an existing training script can be scaled up to run on hundreds of GPUs in just a few lines of …

H2O-3 

H2O is an open source, in-memory, distributed, fast, and scalable machine learning and predictive analytics platform that allows you to build machine learning models on big data and provides easy productionalization of those models in an enterprise environment. H2O’s core code is written in Java. Inside H2O, a Distributed Key/Value store …

Fiber 

Fiber is an Express-inspired web framework built on top of Fasthttp, the fastest HTTP engine for Go. It is designed to ease things up for fast development with zero memory allocation and performance in mind. Features: Robust routing; Serve static files; Extreme performance; Low memory footprint; API endpoints; Middleware & Next support …

Weld  

Weld is a compiler and runtime for improving the performance of data-intensive applications. It enables powerful compiler optimizations and automatic parallelization across functions by expressing the core computations in libraries using a small common intermediate representation and a lazy runtime API. Features: Weld is integrated into many Java EE application servers …

DeepSpeed 

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. 10x Larger Models; 10x Faster Training; Minimal Code Change. Features: Extreme scale: using the current generation of GPU clusters with hundreds of devices, DeepSpeed's 3D parallelism can efficiently train deep learning models with trillions of parameters. …

Dask 

Dask is a flexible library for parallel computing in Python. Dask is composed of two parts: (1) Dynamic task scheduling optimized for computation. This is similar to Airflow, Luigi, Celery, or Make, but optimized for interactive computational workloads. (2) “Big Data” collections like parallel arrays, dataframes, and lists that extend common interfaces like …
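To make the idea of task scheduling more concrete, here is a toy sketch in plain Python. It is not Dask's scheduler: it only shows the general shape of a task graph, written as a dictionary in which a tuple means "call this function on these arguments", evaluated in dependency order. The `run` helper and the graph keys are hypothetical names chosen for this example.

```python
from operator import add, mul


def run(graph, key, cache=None):
    """Evaluate `key` in a toy task graph, computing each task at most once."""
    if cache is None:
        cache = {}
    if key in cache:
        return cache[key]

    node = graph[key]
    if isinstance(node, tuple) and node and callable(node[0]):
        func, *args = node
        # String arguments that name other keys are dependencies: resolve them first.
        resolved = [
            run(graph, a, cache) if isinstance(a, str) and a in graph else a
            for a in args
        ]
        result = func(*resolved)
    else:
        # Anything that is not a task tuple is treated as literal data.
        result = node

    cache[key] = result
    return result


# A small graph: (a + b) * (a + b), where "sum" is shared by both operands.
graph = {
    "a": 2,
    "b": 3,
    "sum": (add, "a", "b"),
    "product": (mul, "sum", "sum"),
}

result = run(graph, "product")
print(result)
```

This toy evaluator runs tasks one at a time. A real scheduler would additionally decide which independent tasks can run in parallel and where.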
