Delta Lake, Spark, And The Incremental ETL Architecture (IN PERSON/VIRTUAL)


Details
The talk will be on the 4th floor of 1801 California Street. Security staff will let you in; just tell them you are here for the Meetup at Ibotta. I will be downstairs in the lobby by the security desk. You will need to sign NDAs, and then we'll take you upstairs.
+++
Join us at Ibotta HQ as we host John O'Dwyer, a Senior Solutions Architect at Databricks, who will talk about Incremental ETL and Delta Lake.
Schedule:
6:30 - 7:00 meet and greet
7:00 - 8:00 presentation
Incremental ETL in a conventional data warehouse has been possible for some time, but scale, cost, the need to account for state, and the lack of access for machine learning make it less than ideal. Until now, Incremental ETL in a data lake has not been possible, because of difficulties such as updating data and identifying changed data in a big data table. Incremental ETL also makes the medallion table architecture possible and efficient, so that all consumers of data can have the correct curated data sets for their needs. We will discuss the advances in Delta Lake, Spark, and Databricks that make Incremental ETL possible, as well as the architecture as a whole.
Speaker Bio:
John O’Dwyer is a Senior Solutions Architect at Databricks where he helps empower the Databricks, Spark, Delta Lake and MLflow communities. He is a hands-on big data and machine learning engineer with extensive experience developing internet-scale infrastructure, data platforms, and predictive analytics systems. He has an MS from the University of Colorado and a BS from Ohio University. His current technical focuses include Distributed Systems, Apache Spark, Delta Lake, and Machine Learning.
