1. Asurion's "GPS" of their data lake; 2. Intro to Apache Iceberg views

Name: 1. Asurion's "GPS" of their data lake; 2. Intro to Apache Iceberg views
Start: 2022-04-28T13:00:00-07:00
End: 2022-04-28T16:00:00-07:00
Location: SPIN Chicago

Hosted By

Jason H. and Dremio

Details

All in-person attendees will receive an Apache Iceberg t-shirt!

Only 40 attendees are allowed due to limited space.

This event is part of a hybrid virtual and in-person meetup, held at the same time as the virtual session and another in-person watch party in Chicago. The two talks will be presented virtually and broadcast to a screen at the in-person events, with live Q&A from both virtual and in-person attendees.

The in-person attendees will have the opportunity to socialize and grab free food and free drinks before and after the talks.

Meetup event links for the virtual event and other in-person watch party:

Agenda (times in CDT):

3:00pm - 3:30pm - Check in and networking
3:30pm - 4:00pm - Asurion’s GPS of their data lake - Rajesh Gundugollu, Principal Architect, Asurion
4:00pm - 4:30pm - An Intro to Apache Iceberg views - Eduard Tudenhoefner, OSS Developer, Dremio
4:30pm - 6:00pm - Socializing and networking

This meetup will consist of two talks:

DPS (Data Positioning System): The GPS for your data lake - Rajesh Gundugollu, Principal Architect, Asurion

This presentation is about our internal product called DPS. We intentionally do not call it a Data Catalog because it is much more than that. It gives users and platform owners everything they need to know about the data in the Data Platform, all in one place, via a simple search-driven UI.

We brought together Data Assets, Columns, Data Movement Jobs, Users, Infrastructure, operational data, and even documentation and help about the Data Platform into one pane of glass. All of this is presented via a simplified, interactive, and easy-to-understand interface. A lot of information about Data Assets, such as lineage, impact analysis, operational metrics, quality metrics, and regulatory metrics, comes together in one place.
With this presentation, we also want to share how we overcame the metadata culture hurdle, how we built this ourselves, and how we innovated by using a graph-style data model without a graph database.
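
For readers unfamiliar with the idea of a graph-style model without a graph database, here is a minimal sketch of one common approach. It is not a description of how DPS is built, which the abstract does not detail. It assumes lineage is stored as rows in a plain relational edge table and walked with a recursive query. The table name `edges`, the asset names, and the function `downstream_assets` are all hypothetical.

```python
import sqlite3


def downstream_assets(conn, start):
    """Return every node reachable from `start` (a simple impact analysis)."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(name) AS (
            SELECT dst FROM edges WHERE src = ?
            UNION  -- UNION (not UNION ALL) drops repeats, so cycles terminate
            SELECT e.dst FROM edges e JOIN reach r ON e.src = r.name
        )
        SELECT name FROM reach ORDER BY name
        """,
        (start,),
    ).fetchall()
    return [r[0] for r in rows]


# Hypothetical lineage: assets and jobs are nodes, each row is a directed edge.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL, kind TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?)",
    [
        ("raw.claims", "job.clean_claims", "read_by"),
        ("job.clean_claims", "curated.claims", "writes"),
        ("curated.claims", "job.claims_report", "read_by"),
        ("job.claims_report", "report.claims_daily", "writes"),
        ("raw.devices", "job.device_load", "read_by"),
    ],
)

impacted = downstream_assets(conn, "raw.claims")
print(impacted)
```

In this sketch, asking what is affected by a change to `raw.claims` returns the jobs and assets downstream of it, while the unrelated `raw.devices` branch is left out.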

--------------------------------------------

Intro to Apache Iceberg views - Eduard Tudenhoefner, OSS Developer, Dremio

In open architectures, different engines are used for the workloads they were designed for and handle best. When multiple engines work on the same datasets, they all need to agree on what the dataset is. Apache Iceberg gives us that capability, and it works well when you primarily have one engine doing the writing and one engine doing the downstream analytics. However, when multiple engines serve downstream user-facing analytics, each engine also needs to apply business logic to give end users the answers they're looking for.

When using multiple engines for downstream analytics, there are generally three options:

1. Each engine has its own definition of the business logic on top of these shared tables.
2. Route other engines’ access through a single engine, which technologies like Apache Arrow Flight make more feasible.
3. Centrally define the business logic in a way all engines can make use of. This has generally not been possible for the vast majority of organizations in the past. This is the approach Apache Iceberg views aim to enable.
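
The third option can be illustrated with a toy sketch. This is not the Apache Iceberg view API. It assumes a hypothetical central catalog (`CENTRAL_VIEWS`) that holds one SQL definition of the business logic, and it uses two independent in-memory SQLite connections as stand-ins for two query engines reading the same shared table. All names here are made up for illustration.

```python
import sqlite3

# Hypothetical central catalog: the business logic is defined exactly once.
CENTRAL_VIEWS = {
    "active_customers": "SELECT id, name FROM customers WHERE status = 'active'",
}

# Stand-in for the shared table that every engine can read.
ROWS = [(1, "Ana", "active"), (2, "Bo", "churned"), (3, "Cy", "active")]


def make_engine():
    """Create a stand-in query engine with access to the shared table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, status TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", ROWS)
    return conn


def query_view(engine, view_name):
    """Look up the view in the central catalog, then run it on this engine."""
    sql = CENTRAL_VIEWS[view_name]
    return engine.execute(f"SELECT * FROM ({sql}) ORDER BY id").fetchall()


engine_a, engine_b = make_engine(), make_engine()
result_a = query_view(engine_a, "active_customers")
result_b = query_view(engine_b, "active_customers")
print(result_a == result_b, result_a)
```

Because neither engine keeps its own copy of the definition, changing the entry in the catalog changes the answer for both at once, which is the property the first option lacks.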

In this talk, we'll provide an intro to Iceberg views and how they can be useful to you.


Open Data Lakehouse Meetups - Global


Thursday, April 28, 2022
1:00 PM to 4:00 PM PDT

SPIN Chicago

344 N State St · Chicago, IL
