Metadata types & instances:
* Pre-defined types for various Hadoop and non-Hadoop metadata
* Ability to define new types for the metadata to be managed
* Types can have primitive attributes, complex attributes, and object references, and can inherit from other types
* Instances of types, called entities, capture metadata object details and their relationships
* REST APIs for working with types and instances, allowing easier integration
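
As a rough illustration of defining a new type, the sketch below builds the JSON body for a type-definition request. It assumes the Atlas v2 REST endpoint `POST /api/atlas/v2/types/typedefs` and the built-in `DataSet` supertype. The type name `sample_report`, its attributes, and the helper `build_entity_typedef` are hypothetical, and only a minimal subset of attribute fields is shown.

```python
import json


def build_entity_typedef(name, super_types, attributes):
    """Build a typedefs payload containing one entity type.

    `attributes` maps attribute name -> Atlas type name (e.g. "string").
    The field set here is a minimal assumption, not the full schema.
    """
    return {
        "entityDefs": [
            {
                "name": name,
                "superTypes": list(super_types),  # inheritance from other types
                "attributeDefs": [
                    {
                        "name": attr_name,
                        "typeName": type_name,
                        "isOptional": True,
                        "cardinality": "SINGLE",
                    }
                    for attr_name, type_name in attributes.items()
                ],
            }
        ]
    }


# Hypothetical type with two primitive attributes.
payload = build_entity_typedef(
    "sample_report",
    ["DataSet"],
    {"owner_team": "string", "row_count": "long"},
)
print(json.dumps(payload, indent=2))
```

The printed JSON could then be sent as the request body with any HTTP client; authentication and error handling are left out of this sketch.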
Classification:
* Ability to dynamically create classifications – like PII, EXPIRES_ON, DATA_QUALITY, SENSITIVE
* Classifications can include attributes – like expiry_date attribute in EXPIRES_ON classification
* Entities can be associated with multiple classifications, enabling easier discovery and security enforcement
* Propagation of classifications via lineage – automatically ensures that classifications follow the data as it moves through successive processing steps
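
The idea behind propagation can be sketched with a toy in-memory model. This is not Atlas's implementation; it only illustrates how a classification applied to one entity could reach everything downstream of it in the lineage graph. The entity names and the `propagate` helper are hypothetical.

```python
from collections import deque


def propagate(edges, tagged):
    """Return the effective classifications for every reachable entity.

    edges:  dict mapping an entity to the entities derived from it.
    tagged: dict mapping an entity to classifications applied directly.
    """
    effective = {entity: set(tags) for entity, tags in tagged.items()}
    queue = deque(tagged)
    while queue:
        current = queue.popleft()
        for child in edges.get(current, []):
            child_tags = effective.setdefault(child, set())
            missing = effective[current] - child_tags
            if missing:
                # Only revisit a child when it gained something new,
                # so the loop terminates even if the graph has cycles.
                child_tags |= missing
                queue.append(child)
    return effective


lineage = {
    "raw_customers": ["clean_customers"],
    "clean_customers": ["customer_report"],
}
result = propagate(lineage, {"raw_customers": {"PII"}})
print(result["customer_report"])
```

In this model, tagging `raw_customers` as PII is enough for `customer_report` to carry the same classification without anyone tagging it by hand.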
Lineage:
* Intuitive UI to view lineage of data as it moves through various processes
* REST APIs to access and update lineage
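
A minimal sketch of how a lineage request URL might be assembled, assuming the Atlas v2 endpoint `GET /api/atlas/v2/lineage/{guid}` with its `direction` and `depth` query parameters. The base URL, the GUID, and the `lineage_url` helper are placeholders for illustration.

```python
from urllib.parse import urlencode

VALID_DIRECTIONS = {"INPUT", "OUTPUT", "BOTH"}


def lineage_url(base_url, guid, direction="BOTH", depth=3):
    """Build the URL for fetching lineage around one entity."""
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"direction must be one of {sorted(VALID_DIRECTIONS)}")
    query = urlencode({"direction": direction, "depth": depth})
    return f"{base_url.rstrip('/')}/api/atlas/v2/lineage/{guid}?{query}"


# Placeholder host and GUID; substitute real values for an actual server.
url = lineage_url("http://localhost:21000/", "abc-123")
print(url)
```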
Search/Discovery:
* Intuitive UI to search entities by type, classification, attribute value or free-text
* Rich REST APIs to search by complex criteria
* SQL-like query language to search entities – Domain Specific Language (DSL)
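
To give a feel for the DSL, the sketch below builds a search request URL, assuming the Atlas v2 endpoint `GET /api/atlas/v2/search/dsl` with `query`, `limit`, and `offset` parameters. The query text, the table name `customers`, the host, and the `dsl_search_url` helper are illustrative assumptions; check the DSL grammar of your Atlas version before relying on the exact syntax.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def dsl_search_url(base_url, dsl_query, limit=10, offset=0):
    """Build a DSL search URL; urlencode handles spaces and quotes."""
    query = urlencode({"query": dsl_query, "limit": limit, "offset": offset})
    return f"{base_url.rstrip('/')}/api/atlas/v2/search/dsl?{query}"


# Hypothetical query: Hive tables named "customers".
dsl = 'hive_table where name = "customers"'
search_url = dsl_search_url("http://localhost:21000", dsl)
print(search_url)

# Decoding the query string recovers the original DSL text.
decoded = parse_qs(urlsplit(search_url).query)["query"][0]
print(decoded)
```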
Security & Data Masking:
* Fine-grained security for metadata access, enabling controls on access to entity instances and on operations such as adding, updating, and removing classifications
* Integration with Apache Ranger enables authorization/data-masking on data access based on classifications associated with entities in Apache Atlas. For example:
  * who can access data classified as PII or SENSITIVE
  * customer-service users can only see the last 4 digits of columns classified as NATIONAL_ID
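
The effect of a classification-based masking rule can be illustrated with a toy model. In a real deployment the rule lives in an Apache Ranger policy and is enforced by the data-access engine, not in application code; the functions `mask_show_last_4` and `visible_value`, and the group name used below, are hypothetical stand-ins.

```python
def mask_show_last_4(value, mask_char="x"):
    """Mask every letter or digit except the last four; keep separators."""
    kept = 0
    out = []
    for ch in reversed(value):
        if ch.isalnum():
            out.append(ch if kept < 4 else mask_char)
            kept += 1
        else:
            out.append(ch)  # punctuation such as '-' is left as-is
    return "".join(reversed(out))


def visible_value(value, column_classifications, user_groups):
    """Apply the example rule: customer-service sees masked NATIONAL_ID."""
    if "NATIONAL_ID" in column_classifications and "customer-service" in user_groups:
        return mask_show_last_4(value)
    return value


print(visible_value("123-45-6789", {"NATIONAL_ID"}, {"customer-service"}))
print(visible_value("123-45-6789", {"NATIONAL_ID"}, {"compliance"}))
```

Because the rule is keyed on the classification rather than on a specific column, any column later tagged NATIONAL_ID would be covered by the same policy.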