
How Netflix Handles Data Streams + Intro to Apache Kudu


Details

We're lucky to have two awesome talks this month in the amazing space at New Relic!

Schedule

Doors open @ 6pm

6:30pm: networking and light food & beer (if you desire)

7pm: Talk 1 - How Netflix Handles Data Streams Up to 8M Events/sec

7:45pm: Talk 2 - Intro to Apache Kudu by Galvanize

8:30pm: Wrap up

Talk #1 - How Netflix Handles Data Streams Up to 8M Events/sec

Peter Bakas, Director of Engineering at Netflix, will provide an overview of Keystone, Netflix's new data pipeline. The talk covers how Netflix migrated from Suro to Keystone, including the reasons behind the transition and the challenges of achieving zero loss while processing over 500 billion events daily. Hear in detail how they deploy, operate, and scale Kafka, Samza, Docker, and Apache Mesos in AWS to manage 8 million events and 17 GB per second at peak.

Speaker: Peter Bakas - Director of Engineering, Real-Time Data Infrastructure, Netflix

Talk #2 - Intro to Apache Kudu

Big Data applications need to ingest streaming data and analyze it. HBase is great at ingesting streaming data but not good at analytics. HDFS is great at analytics but not at ingesting streaming data. As a result, applications frequently ingest data into HBase and then move it to HDFS for analytics.

What if you could use a single system for both use cases? This could dramatically simplify your data pipeline architecture.

Enter Kudu. Kudu is a storage system that sits between HDFS and HBase. It is good at both ingesting streaming data and analyzing it using Spark, MapReduce, and SQL.

Speaker: Asim Jalis - Lead Instructor, Data Engineering Immersive (http://www.galvanize.com/courses/data-engineering), Galvanize SF