
Building an In-house Elastic Hadoop/Spark Service on Multi-cloud Environments

Hosted By
T.J. B.

Details

We are very happy to have Robin Li and Meng Meng (now at Twitch) from Tapjoy talk about Tapjoy's Data Platform. I have known Robin for several years, and he has done a great job building Tapjoy's data platform. I have invited him to give several tech talks in the past. This should be a great talk for companies that want to build an elastic cloud platform.

Title:

Building an In-house Elastic Hadoop/Spark Service on Multi-cloud Environments

Summary

In 2017, Tapjoy migrated from our on-premises data center to the cloud and put our own framework into production. It runs MapReduce and Spark in an ephemeral, elastic way and uses cloud storage for the Data Warehouse. The framework runs on both AWS and Google Cloud and uses Hortonworks' stack distribution. Most of the workload runs on Spot (Preemptible) instances, mixed with On-Demand (non-Preemptible) instances. We'd like to share our implementation of this framework, as well as the lessons we learned about what it takes to run in-house, large-scale, EMR-style clusters on multiple clouds in production.

Speakers: Robin Li, Meng Meng

Robin Li, Director of Data Engineering and Data Science Engineering at Tapjoy, leads the team that builds Tapjoy's Data Platforms and Decision Engine for optimization & personalization. Prior to Tapjoy, he worked in different roles at Credit Suisse. Robin received a Master's degree in Computer Science from Imperial College London.

Meng Meng

Meng Meng is a Software Engineer at Twitch. Before joining Twitch, Meng worked as a Sr. Data Analytics Engineer at Tapjoy for over 3 years, building and maintaining Tapjoy’s data platform for data science and analytics. Prior to Tapjoy, Meng worked at the marketing agency Merkle Inc. Meng holds a Bachelor of Computer Science from Tongji University and a Master of Information Technology from Bentley University.

Agenda:

6:00 pm -- 6:30 pm Check-in/Networking/Light Dinner
6:30 pm -- 6:40 pm Introduction
6:40 pm -- 7:40 pm Main Talk + Q&A
7:50 pm -- 8:05 pm Networking
8:05 pm -- 8:30 pm Closing

SF Big Analytics
Workday Inc
160 Spear St Ste 1750 · San Francisco, CA