Delta Lake


Delta Lake is an open-source project for building a Lakehouse architecture on top of data lakes. It provides ACID transactions and scalable metadata handling, and it unifies streaming and batch data processing on top of existing data lake storage such as S3, ADLS, GCS, and HDFS.

Features

1. ACID Transactions
2. Scalable Metadata Handling
3. Time Travel (data versioning)
4. Open Format
5. Unified Batch and Streaming Source and Sink
6. Schema Enforcement
7. Schema Evolution
8. Audit History
9. Updates and Deletes
10. 100% Compatible with Apache Spark API
11. Delta Everywhere
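
To make a few of the listed features concrete (time travel, audit history, updates and deletes), here is a minimal toy sketch in plain Python. It is not the Delta Lake API and does not reflect how Delta Lake stores data: `ToyVersionedTable`, `Commit`, and all method names are hypothetical, chosen only for illustration. The idea shown is that if every change is recorded as a new numbered commit, older versions stay readable and the commit list doubles as a history.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, object]


@dataclass
class Commit:
    version: int      # monotonically increasing version number
    operation: str    # "WRITE", "UPDATE", or "DELETE" (labels are assumptions)
    rows: List[Row]   # full snapshot of the table after this commit


@dataclass
class ToyVersionedTable:
    """Hypothetical in-memory table; an illustration, NOT the Delta Lake API."""
    log: List[Commit] = field(default_factory=list)

    def _commit(self, operation: str, rows: List[Row]) -> int:
        # Each change becomes one new commit; earlier commits are never modified.
        version = len(self.log)
        self.log.append(Commit(version, operation, [dict(r) for r in rows]))
        return version

    def read(self, version: Optional[int] = None) -> List[Row]:
        # Time travel: read the latest snapshot, or the one at a given version.
        if not self.log:
            return []
        commit = self.log[-1] if version is None else self.log[version]
        return [dict(r) for r in commit.rows]

    def append(self, rows: List[Row]) -> int:
        return self._commit("WRITE", self.read() + list(rows))

    def update(self, predicate: Callable[[Row], bool], changes: Row) -> int:
        new_rows = [{**r, **changes} if predicate(r) else r for r in self.read()]
        return self._commit("UPDATE", new_rows)

    def delete(self, predicate: Callable[[Row], bool]) -> int:
        return self._commit("DELETE", [r for r in self.read() if not predicate(r)])

    def history(self) -> List[Tuple[int, str]]:
        # Audit history: which operation produced each version.
        return [(c.version, c.operation) for c in self.log]


table = ToyVersionedTable()
table.append([{"id": 1, "status": "new"}, {"id": 2, "status": "new"}])
table.update(lambda r: r["id"] == 1, {"status": "done"})
table.delete(lambda r: r["id"] == 2)

print(table.read())      # current contents
print(table.read(0))     # contents as of version 0
print(table.history())   # one entry per commit
```

The toy keeps a full snapshot per commit purely for simplicity, which would not scale to real data sizes.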

Official website

Tutorial and documentation
