Vaex  


Vaex is a Python library for lazy, out-of-core DataFrames (similar to Pandas), used to visualize and explore big tabular datasets. It can calculate statistics such as mean, sum, count, and standard deviation on an N-dimensional grid at up to a billion (10^9) objects/rows per second. Visualization is done using histograms, density plots, and 3D volume rendering, allowing interactive exploration of big data. Vaex uses memory mapping, a zero-memory-copy policy, and lazy computations for best performance (no memory wasted).
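To make the memory-mapping and grid-statistics ideas above concrete, here is a minimal standard-library sketch. It is not Vaex's implementation: the file layout (raw native-endian float64 values) and the helper names write_doubles and binned_mean are assumptions made purely for illustration. The column is read through a memory map, so the operating system pages data in on demand, and a per-bin mean is accumulated in a single pass.

```python
import mmap
import os
import struct
import tempfile


def write_doubles(path, values):
    """Write values as raw native-endian float64 (hypothetical file layout)."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"{len(values)}d", *values))


def binned_mean(path, lo, hi, n_bins):
    """Mean per bin on a regular 1-D grid over [lo, hi), read via a memory map."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        # The views reinterpret the mapped bytes as float64 without copying them.
        with memoryview(mm) as raw, raw.cast("d") as values:
            for v in values:
                if lo <= v < hi:
                    # Map the value to its bin index; clamp to guard against rounding.
                    i = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
                    sums[i] += v
                    counts[i] += 1
    # Empty bins have no mean, so they are reported as None.
    return [s / c if c else None for s, c in zip(sums, counts)]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "column.f64")
    write_doubles(path, [0.5, 1.5, 2.5, 3.5, 7.0])
    means = binned_mean(path, 0.0, 8.0, 4)
```

A pure-Python loop like this is far slower than the throughput quoted above. The sketch only shows the access pattern: data stays on disk, nothing is copied into a second buffer, and the statistic is built up bin by bin.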

Features

Performance: works with huge tabular data, processing up to a billion (10^9) rows/second

Lazy / Virtual columns: computed on the fly, without wasting RAM

Memory efficient: no memory copies when doing filtering/selections/subsets.

Visualization: directly supported, a one-liner is often enough.

User-friendly API: you only need to deal with the DataFrame object, and tab completion plus docstrings will help you out (for example, ds.mean<tab>). It feels very similar to Pandas.

Lean: separated into multiple packages

Official website

Tutorial and documentation
