Details

Apache Iceberg has quickly become the backbone of modern data lakes, but maintaining tables efficiently is just as critical as building them. This session dives into the art of Iceberg table maintenance, from compaction strategies to metadata cleanup, with a focus on balancing query performance and compute cost. Attendees will walk away with actionable strategies and best practices to keep their Iceberg tables lean, fast, and future-proof.

Who Should Attend

  • Data Engineers managing large-scale Iceberg deployments
  • Platform Engineers optimizing lakehouse infrastructure costs
  • Data Architects designing scalable data lake solutions
  • DevOps Engineers responsible for data pipeline maintenance
  • Technical Leaders overseeing data platform performance and budgets

Webinar Agenda

  1. Introduction & The Maintenance Challenge: Why Iceberg table maintenance is critical for production data lakes
  2. Compaction Strategies Deep Dive: Bin-packing vs. sorting vs. Z-ordering, and when to use each approach
  3. Metadata & Snapshot Management: Snapshot expiration policies, orphan file cleanup, and manifest rewrites
  4. File Layout Optimization: Solving the small file problem and right-sizing files for optimal performance
  5. Cost-Performance Optimization Framework: Measuring ROI of maintenance operations and scheduling strategies
  6. Q&A