Open Source Reliability for Data Lake with Apache Spark by Michael Armbrust

Details

Abstract: Delta Lake is an open source storage layer that brings reliability to data lakes. It provides ACID transactions and scalable metadata handling, and it unifies streaming and batch data processing. Delta Lake runs on top of your existing data lake and is fully compatible with Apache Spark APIs.

In this talk, we will cover:
- All technical aspects of Delta Lake features
- What’s coming
- How to get started using it
- How to contribute

Bio: Michael Armbrust is a committer and PMC member of Apache Spark and the original creator of Spark SQL. He currently leads the team at Databricks that designed and built Structured Streaming and Databricks Delta. He received his PhD from UC Berkeley in 2013, and was advised by Michael Franklin, David Patterson, and Armando Fox. His thesis focused on building systems that allow developers to rapidly build scalable interactive applications, and specifically defined the notion of scale independence. His interests broadly include distributed systems, large-scale structured storage, and query optimization.

About the Location:

After turning off Jefferson onto Alla Road, free street parking is available at the first right, Coral Tree Place. From there you can enter the Reserve business park through the Gate 1 main entry, which is the gate near the end of the road, or through a pedestrian entrance near Gate 2.

Space 900 is the large building that is closest to the Gate 1 main entrance.

Special thanks to Verizon Digital Media (https://eng.vdms.io/) for hosting this meetup.
Follow the VDMS engineering team on Twitter at @engage_vdms (https://twitter.com/engage_vdms) and tweet your thanks.