AI-Powered Data Exploration: Interacting with Apache Iceberg via Spark and LLMs

Hosted By
Vincent M.

Details

This presentation explores how integrating LLMs with Apache Spark and Apache Iceberg can provide an intuitive chat interface for data interaction. We’ll show how this combination lets users query massive datasets and extract insights using natural language. At Azul, we have gathered an enormous amount of log data over the years on downloads of our free, open source JVM, all stored in Apache Iceberg. In this session, we’ll explore the potential of combining Iceberg, Spark, and LLMs:

Natural Language Querying: By leveraging LLMs, we can turn natural-language questions into Spark operations that query the underlying Iceberg dataset. Users no longer need to write complex SQL or PySpark code, which makes data exploration more accessible.

AI-Enhanced Insight Generation: The integration allows LLMs not only to retrieve data but also to generate summaries, identify patterns, and perform trend analysis directly from the structured information stored in Iceberg tables.

Integrated Solution: How we built a solution that combines Iceberg, Spark, and GenAI to interact with the download data.
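
To make the natural-language querying idea concrete, here is a minimal sketch in Python of one way such a pipeline could be wired together. It is not the speakers' implementation. The table name `catalog.db.downloads`, the schema passed in, and the `llm` and `run_sql` callables are all hypothetical placeholders. In a real setup, `run_sql` might wrap PySpark's `spark.sql(...)` against an Iceberg catalog, and `llm` would call whichever model is in use.

```python
import re
from typing import Callable

# Hypothetical Iceberg table name; the abstract does not give the real one.
TABLE = "catalog.db.downloads"


def build_prompt(question: str, schema: dict) -> str:
    """Describe the table to the model and ask for a single SQL query."""
    cols = ", ".join(f"{name} {dtype}" for name, dtype in schema.items())
    return (
        f"Table {TABLE} has columns: {cols}.\n"
        f"Write one Spark SQL SELECT statement that answers: {question}\n"
        "Return only the SQL."
    )


def is_read_only(sql: str) -> bool:
    """Coarse guard: accept a single statement that starts with SELECT.

    This is a sanity check on model output, not a security boundary.
    """
    stmt = sql.strip().rstrip(";").strip()
    if ";" in stmt:  # more than one statement
        return False
    return re.match(r"(?i)select\b", stmt) is not None


def answer(
    question: str,
    schema: dict,
    llm: Callable[[str], str],
    run_sql: Callable[[str], object],
):
    """Translate a question to SQL with the model, check it, then run it."""
    sql = llm(build_prompt(question, schema))
    if not is_read_only(sql):
        raise ValueError(f"Refusing to run non-SELECT statement: {sql!r}")
    return run_sql(sql)
```

Passing the model and the executor in as plain callables keeps the sketch runnable without a Spark cluster, and makes it easy to swap in a stub for either side while experimenting.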

Atlanta Java Users Group
FREE