How Uber Uses Open Source Big Data Technologies

Hosted By
Christina
Details

Uber Engineering is excited to invite you to this open source meetup in Palo Alto, where we will highlight how we use Hadoop and Spark to manage data at a global scale. Presentations will cover how our Data Platform team handles the massive amount of data coming in from cities all over the world; development to improve Hadoop’s scalability in partnership with the open source community; solutions for operating Hadoop infrastructure across a large organization; and how we manage 80,000-plus Spark apps.

Join us to learn more about how Uber embraces open source technology and culture to create an open and innovative engineering organization.

Agenda
6:00pm - 6:30pm - Doors open, food, and drinks
6:30pm - 6:40pm - Unique data challenges at Uber scale - Suresh Srinivas
6:40pm - 7:00pm - Future of HDFS at Uber - CR Hota and Chao Sun
7:00pm - 7:20pm - Managing Apache Hadoop at Scale - Mithun Mathew and Liang Gao
7:20pm - 7:35pm - Keeping Up With Apache Spark’s Popularity at Uber - Abhishek Modi
7:35pm - 8:00pm - Q&A and Networking

Unique data challenges at Uber scale
Speaker: Suresh Srinivas
Uber's mission is to bring transportation to everyone, everywhere. This mission has powered Uber's exponential growth at a global scale, starting with ridesharing and expanding into many offerings, such as Uber Eats, Uber Freight, Uber for Business, Uber Delivery, and Uber Health. Data is at the core of Uber’s business of creating great experiences and efficiencies for our users and partners. The scale of the data, coupled with its speed and the need to derive insights and make decisions faster, brings unique challenges related to agility and technology. This talk will cover how the Data Platform team at Uber handles these challenges and the key learnings from that work.

Future of HDFS at Uber
Speaker: CR Hota, Chao Sun
In the last few years, Uber's primary Hadoop cluster has grown to hold ~100 PB of data. This volume of data, along with the limits on scaling a single Hadoop cluster, has created new challenges in scaling the current Hadoop infrastructure while still maintaining a single unified view of data. This talk will focus on how Uber's Hadoop team is designing a solution, working closely with the open-source community, to tackle this problem of scale and efficiency. The talk will touch on how various new features in Hadoop, such as Router-based Federation, Observer NameNode, erasure coding, and Tiering Service, along with related projects, will help shape Hadoop's future at Uber.

Managing Apache Hadoop at Scale
Speaker: Mithun Mathew, Liang Gao
Over the last three years, Hadoop at Uber has grown from a footprint of 50 servers to more than 15,000. In this presentation, we focus on solutions we built for problems that arise from managing open source Hadoop in a large-scale production environment. We will cover three main topics:
A resource management system that empowers teams to manage their own resources and align them with the organizational budget.
Our automated cluster deployment and management solution, which has helped our team deploy and operate Hadoop clusters across 15,000+ servers.
Our future work on anomaly detection and auto-remediation as we continue to scale our infrastructure.

Keeping Up With Apache Spark’s Popularity at Uber
Speaker: Abhishek Modi
There are 80,000 Spark apps that run at Uber every day. Let’s talk about how Uber’s Spark Compute Infra Service handles this load. In this talk, we’ll discuss how Uber’s Spark ecosystem started out as a number of different Spark builds and semi-managed clusters, and how it evolved into what we have today: managed builds and clusters, as well as automated monitoring, profiling, autotuning, and dependency management! We’ll also talk about the issues we face today and how we plan to tackle them.

Please join our Uber Open Source Facebook page for updates: https://www.facebook.com/uberopensource/

Uber Engineering Events - San Francisco
900 Arastradero Rd. · Palo Alto, CA