Experience building an index in a Data Lakehouse, with Paola Pardo

Hosted By
Ferran Galí i R. and 3 others
Details

Happy new year, Sparklers!
We hope you all had an excellent start to 2022 and are looking for new challenges. Two months have gone by since our last meeting, where we promised to be more active... So here’s the first step forward!

This February we are inviting you to what will be our first in-person event in almost 2 years! This time, our co-organizer Paola Pardo will once again share her knowledge in a talk titled "Experience building an index in a data lakehouse".

See you on Wednesday, 23rd of February at 19:00 @ Aticco Verdaguer

We want to thank our two sponsors for this event:

  • Aticco Workspaces, for offering their amazing venue in support of an innovative, entrepreneurial and creative community!
  • Qbeast, who will treat us to drinks and pizza!

Don't miss it!

Abstract:

The Big Data ecosystem is moving towards a Data Lakehouse architecture, which combines the best of Data Lakes and Data Warehouses to add a much-needed metadata management layer on top of the storage. At Qbeast, we built an extension that brings functionalities such as multi-column indexing and efficient sampling to your data lakehouse.

In this talk, we will take a deep dive into the internals of Qbeast-spark, the open-source implementation based on Apache Spark and Delta Lake. We will explain how the Qbeast Format organizes the data and answers a query using only the insights stored in the metadata. And, of course, we will cover the different optimization problems we have faced during development!

Bio:

Paola Pardo is one of the co-founders of Qbeast, a spin-off of the Barcelona Supercomputing Center (BSC) that uses a patented indexing technology to store and query big data more efficiently. She developed big data software at the BSC before joining the Qbeast team, and graduated from the UPC with a thesis on data storage push-down optimization for Apache Spark. She is currently developing Qbeast-spark and advocating for open-source technologies that support the growth of data analytics, data science, and data engineering.

COVID-19 safety measures

Masks required
Event will be indoors
We will ask for your contact details on arrival so that we can notify you if any attendee tests positive for COVID-19.
The event host is instituting the above safety measures for this event. Meetup is not responsible for ensuring, and will not independently verify, that these precautions are followed.
Barcelona Spark Meetup
Aticco Verdaguer - El teu coworking a Gràcia
C/ de Provença, 339 · Barcelona, CT