Vaex


Vaex is a Python library for lazy, out-of-core DataFrames (similar to Pandas), used to visualize and explore big tabular datasets. It can calculate statistics such as mean, sum, count, and standard deviation on an N-dimensional grid at up to a billion (10^9) rows per second. Visualization uses histograms, density plots, and 3D volume rendering, which allows interactive exploration of big data. For best performance, Vaex uses memory mapping, a zero-memory-copy policy, and lazy computations, so no memory is wasted.

Features

Performance: works with huge tabular data and processes up to a billion rows per second.

Lazy / Virtual columns: computed on the fly, without wasting RAM.

Memory efficient: no memory copies when doing filtering, selections, or subsets.

Visualization: directly supported, a one-liner is often enough.

User-friendly API: you only need to deal with the DataFrame object, and tab completion plus docstrings will help you out (for example, ds.mean<tab>). It feels very similar to Pandas.

Lean: separated into multiple packages

