Uber Data Engineering Night


Details
Welcome to the first Uber Data Engineering Night!
We’re starting the Uber Data Engineering meetup series to share some of the knowledge that we, along with guest speakers from other tech companies, have gathered through developing data solutions at scale. This evening, speakers from Airbnb, WeWork, and Uber will present on defining and managing metrics, the evolution and future of the Big Data ecosystem, and using data to create better experiences for your users.
Agenda:
6:00pm - 6:30pm - Doors open, food and drinks
6:30pm - 6:40pm - Welcome and Introductions - Nicolas Garcia Belmonte, Uber
6:40pm - 7:00pm - Changing the Paradigm on Metric Management - Lauren Chircus, Airbnb
7:00pm - Doors close
7:00pm - 7:20pm - Using Data to Design a 360-Degree View of the Uber Experience - Tanvi Kothari, Uber
7:20pm - 7:40pm - From Flat Files to Deconstructed Database: The Evolution and Future of the Big Data Ecosystem - Julien Le Dem, WeWork
7:45pm - 8:15pm - Networking
_______________________________________________________________________________________
Using data to design a 360-degree view of the Uber experience by Tanvi Kothari, Engineering Manager, Data @ Uber
To create the best transportation experiences possible on the Uber platform, it is important to fully understand the needs of our users. We collated data from multiple sources to build a single 360-degree view that provides a consistent and holistic picture of a user's journey. This 360 profile helps teams gain valuable insights into the user’s preferences and enables us to provide better experiences on our apps.
_______________________________________________________________________________________
Changing the Paradigm on Metric Management @Airbnb by Lauren Chircus, Product Manager, Data Quality & Analytics Frameworks @ Airbnb
Developing source-of-truth data and metrics has been a perpetual quest at Airbnb. As the complexity and size of the company grow, so does the number of key metrics and dimensional cuts that teams track. Tribal knowledge of important data does not scale to large teams, and not all teams have the data engineering skills to build and maintain pipelines.
To scale metric definition and discovery, we built a metrics framework to be the source of truth for metric definitions. This framework calculates metrics daily, feeds those metrics to reporting, experimentation, and anomaly detection applications, makes them searchable in Airbnb's Dataportal, and abstracts data lifecycle management issues such as schema evolution and backfilling. This solves the challenge of inconsistent metric definitions, improving the efficiency of Data Scientists defining metrics and freeing them from pipeline management. We have seen rapid adoption of this framework, ultimately driving business value by increasing the trustworthiness and availability of metrics.
_______________________________________________________________________________________
From Flat Files to Deconstructed Database: The Evolution and Future of the Big Data Ecosystem by Julien Le Dem, Principal Engineer @ WeWork
This talk will go over key open source components of the Big Data ecosystem (including Apache Calcite, Parquet, Arrow, Avro, Kafka, and batch and streaming systems) and will describe how they relate to each other and make our Big Data ecosystem more of a database and less of a file system. Parquet is the columnar data layout that optimizes data at rest for querying. Arrow is the in-memory representation for maximum-throughput execution and overhead-free data exchange. Calcite is the optimizer that makes the most of our infrastructure capabilities. We’ll discuss the emerging components that are still missing, or haven’t yet become standard, to fully materialize the transformation to an extremely flexible database that lets you innovate with your data.