#32. Scaling Hive via Mesos - P. Szostek (Criteo)

Hosted By
KlaudiaZ

Details

We are happy to invite you to the 32nd meetup of WHUG. We'll be pleased to host Pawel Szostek from Criteo.

Title: Scaling Hive via Mesos

Abstract:

Hive is the main data transformation tool at Criteo, used by hundreds of analysts and thousands of automated jobs every day. Pawel Szostek will discuss the evolution of Criteo’s Hive platform from an error-prone add-on installed on some spare machines to a best-in-class installation capable of self-healing and automatically scaling to handle its growing load.
The resulting platform is based on Mesos. Mesos has allowed Criteo to scale on demand and better utilize resources, iterate on development much faster than on bare metal, and roll out new versions seamlessly without downtime for its users. Finally, it has allowed the company to eliminate the last single point of failure (SPOF) in its Hive stack. Pawel will detail Criteo’s data architecture and explain how the company solved challenges in security, monitoring, and load balancing on multiple layers. He will also discuss the gains made by this process.

Speaker:

Pawel Szostek is a senior software engineer on Criteo’s analytics data storage team, where he works on various projects around Hive and Vertica. Previously, he was a researcher at CERN in Geneva and a software engineer at ICM UW.

---
Free pizza & drinks

Pizza and drinks, sponsored by GetInData, will be available for participants during the meetup. If you would like some, please sign up using this form: https://goo.gl/forms/0aFiP1w9fJOfyiVb2
This will help us estimate how much pizza and how many drinks to order :)

See you at the meeting!

Warsaw Data Tech Talks (Poland)