Amazon Kinesis

Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. Amazon Kinesis offers key capabilities to cost-effectively process streaming data at any scale, along with the flexibility to choose the tools that best suit the requirements …

Talend Open Studio for Data Integration

Talend Open Studio for Data Integration is free-to-download software to kickstart your first data integration and ETL projects.

Features
- Free, open source, Apache license
- RDBMS connectors: Oracle, Teradata, Microsoft SQL Server
- SaaS connectors: Marketo, Salesforce, NetSuite
- Packaged apps: SAP, Microsoft Dynamics, SugarCRM

Spark

Apache Spark is a unified analytics engine for large-scale data processing. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for machine …

Snakemake

The Snakemake workflow management system is a tool to create reproducible and scalable data analyses. Workflows are described via a human-readable, Python-based language. They can be seamlessly scaled to server, cluster, grid and cloud environments, without the need to modify the workflow definition. Finally, Snakemake workflows can entail a …

SETL

SETL (pronounced “settle”) is a Scala ETL framework powered by Apache Spark that helps you structure your Spark ETL projects, modularize your data transformation logic and speed up your development.

Features
With SETL, an ETL application can be represented by a Pipeline. A Pipeline contains multiple Stages. In each stage, we …
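
The Pipeline-of-Stages idea can be sketched in a few lines of plain Python. This is only an illustration of the structure, not SETL's Scala API: the names `Stage`, `Pipeline`, `add_stage` and `build_example_pipeline` are hypothetical, and because the excerpt is cut off before it says what happens inside a stage, the sketch simply assumes each stage holds a list of functions applied in order.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

# Hypothetical names for illustration only; this is not SETL's Scala API.
Step = Callable[[Any], Any]


@dataclass
class Stage:
    """A named group of steps, applied in order (an assumption of this sketch)."""
    name: str
    steps: List[Step] = field(default_factory=list)

    def run(self, data: Any) -> Any:
        for step in self.steps:
            data = step(data)
        return data


@dataclass
class Pipeline:
    """An ordered list of stages; each stage receives the previous stage's output."""
    stages: List[Stage] = field(default_factory=list)

    def add_stage(self, stage: Stage) -> "Pipeline":
        self.stages.append(stage)
        return self  # returning self allows chained calls

    def run(self, data: Any) -> Any:
        for stage in self.stages:
            data = stage.run(data)
        return data


def build_example_pipeline() -> Pipeline:
    # Three toy stages: clean the input, transform it, then summarise it.
    extract = Stage("extract", [lambda rows: [r.strip() for r in rows]])
    transform = Stage("transform", [lambda rows: [r.upper() for r in rows if r]])
    load = Stage("load", [lambda rows: {"count": len(rows), "rows": rows}])
    return Pipeline().add_stage(extract).add_stage(transform).add_stage(load)
```

Calling `build_example_pipeline().run([" a ", "", "b"])` passes the list through the three stages in sequence. Keeping each stage as a separate object is what makes the transformation logic modular: a stage can be tested or replaced without touching the others.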

Prefect Core

The prefect Python library includes everything you need to design, build, test, and run powerful data applications. Instantly upgrade your existing code with workflow best practices, and use the Prefect UI to orchestrate and monitor everything.

Features
A proper automation framework has three critical components:
- Workflow definition
- Workflow engine
- Workflow state …

PipelineX

PipelineX: Python package to build ML pipelines for experimentation with Kedro, MLflow, and more.

Features
- HatchDict: Python in YAML/JSON
- Flex-Kedro: Kedro plugin for flexible config
- MLflow-on-Kedro: Kedro plugin for MLflow users
- Kedro-Extras: Kedro plugin to use various Python packages

Oozie

Oozie v3 is a server-based Bundle Engine that provides a higher-level Oozie abstraction for batching a set of coordinator applications. The user can start, stop, suspend, resume, or rerun a set of coordinator jobs at the bundle level, resulting in better and easier operational control. Oozie v2 is a server-based Coordinator …

Neuraxle

Neuraxle is a Machine Learning (ML) library for building machine learning pipelines.

Features
- Component-Based: Build encapsulated steps, then compose them to build complex pipelines.
- Evolving State: Each pipeline step can fit and evolve through the learning process.
- Hyperparameter Tuning: Optimize your pipelines using AutoML, where each pipeline step has its own …

Metaflow

Metaflow is a human-friendly Python library that helps scientists and engineers build and manage real-life data science projects. Metaflow was originally developed at Netflix to boost the productivity of data scientists who work on a wide variety of projects from classical statistics to state-of-the-art deep learning.

Features
Model with your favorite …
