Magda Data Catalog

Magda is a data catalog system that provides a single place where all of your organization’s data can be catalogued, enriched, searched, tracked and prioritized – whether big or small, internally or externally sourced, available as files, databases or APIs. With Magda, your data analysts, scientists and engineers can easily find useful data with powerful discovery features, properly understand what they’re using thanks to metadata enhancement and authoring tools, and make data-informed decisions with confidence as a result of history tracking and duplication detection.


* Powerful and scalable search based on Elasticsearch
* Quick and reliable aggregation of external sources of datasets
* An unopinionated central store of metadata, able to cater for most metadata schemas
* Federated authentication via Passport.js – log in via Google, Facebook, WSFed, AAF or CKAN, and easily create new providers
* Based on Kubernetes for cloud agnosticism – deployable to nearly any cloud, on-premises, or on a local machine
* Easy installation and upgrades, provided you know Kubernetes
* Extensions are added as new Docker images in the cluster, and hence can be developed in any language

Official website

Tutorial and documentation

