TBDEG - Architecture of Apache Iceberg with Alex Merced!


Details
At this meetup, we'll be talking with Dremio's Alex Merced about Apache Iceberg:
Data lakes were built to democratize data: to allow more and more people, tools, and applications to make use of it. A key capability needed to achieve this is hiding the complexity of the underlying data structures and physical data storage from users. The de facto standard has been the Hive table format, released by Facebook in 2009, which addresses some of these problems but falls short at data, user, and application scale. So what is the answer? Apache Iceberg.
The Apache Iceberg table format is now used and contributed to by many leading tech companies, including Netflix, Apple, Airbnb, LinkedIn, Dremio, Expedia, and AWS. Alex Merced, Developer Advocate at Dremio, will present the architectural details of why the Hive table format falls short and how the Iceberg table format resolves those shortcomings, as well as the benefits that stem from Iceberg’s approach.
You will learn:
- The issues that arise when using the Hive table format at scale, and why we need a new table format
- How a straightforward, elegant change in table format structure has enormous positive effects
- The underlying architecture of an Apache Iceberg table, how a query against an Iceberg table works, and how the table’s underlying structure changes as CRUD operations are performed on it
- The resulting benefits of this architectural design
Bio: Alex Merced is a Developer Advocate for Dremio with a history of creating content to enable developers of all types through personal projects like DevNursery.com, The Web Dev 101 Podcast, and the DataNation podcast. He has worked as a developer at companies including Crossfield Digital, CampusGuard, and GenEd Systems, and as an instructor for General Assembly bootcamps.
Looking forward to seeing you all soon!
