17. Druid at Criteo and History of Hadoop at Spotify.


Details
Agenda:
• 17.45: eat, drink, socialize
• 18.00: first talk: Interactive Analytics at Scale with Druid
Speaker: Julien is leading the team building the next-gen campaign management and analytics platform for Criteo, one of the biggest ad tech firms in the world. Previously, he spent 5 years working as a software architect for a hedge fund specializing in algorithmic trading. He helped the company move from monthly releases to deploying several times a day by introducing agile practices and by re-architecting monolithic applications into dozens of micro-services. He also co-founded two startups.
An active member of the Alt.net France community and a regular speaker, Julien likes talking about SOA, distributed systems, DDD, polyglot persistence, Agile practices and, more recently, Druid.
Abstract: How do you run analytics queries on a dataset that grows by billions of entries a day? What if you need to be able to drill into it, filter it or aggregate it by any available dimension? On demand, without precomputing, and with sub-second latency? Oh, and this isn't for an internal dashboard. The interface is customer-facing and is going to be accessed by thousands of clients.
In this talk, I'll tell you how Criteo, one of the biggest ad tech firms in the world, uses Druid to build its new analytics platform.
Druid is an open-source, real-time data store designed to power interactive applications at scale. I’ll walk you through its architecture, explain how it scales and how data is stored on disk and in memory to serve queries faster than you can blink.
• 18.45: eat, drink, socialize (more)
• 18.55: second talk: The Evolution of Hadoop at Spotify - Through Failures and Pain
Speakers: Josh Baer and Rafal Wojdyla.
Josh ‘joined the band’ at Spotify in early 2013 and has worked on a small team focused on stabilizing and enhancing the Hadoop infrastructure, performing multiple migrations and upgrades, and growing the cluster from 190 nodes to over 1300.
Rafal is an engineer at Spotify and a member of the Hadoop squad responsible for operating, maintaining and growing one of the biggest Hadoop clusters in Europe.
Abstract: The quickest way to learn and evolve infrastructure is by encountering obstacles and being forced to overcome limitations that keep you inches away from project goals. At Spotify, we’ve encountered many of these obstacles and frustrations as we grew our Hadoop cluster from a few machines in an office closet aggregating played song events for financial reports, to our current 1300-node cluster that plays a large role in many features that you see in our application today.
Two members of Spotify’s Hadoop ‘squad’ will weave in war stories, failures, frustrations and lessons learned to describe the Hadoop/Big Data architecture at Spotify and talk about how that architecture has evolved.
We’ll talk about how and why we use a number of tools, including Apache Falcon and Apache Bigtop to test changes; Apache Crunch to build features and provide analytics; and Snakebite and Luigi, two in-house tools created to overcome common frustrations.
• 19.40: drink, socialize (even more) & have a chance to win free O'Reilly books
Food and drinks will be provided by Criteo (http://www.criteo.com/) (thanks).
Strata + Hadoop World 2015 discount:
To receive a 20% discount on Strata + Hadoop World 2015, use code SHUG20. More info about the conference is available here (http://oreil.ly/UK15SHW).
