Stream Processing with Apache Kafka & Apache Samza


Welcome to the upcoming Stream Processing Meetup hosted by LinkedIn in Sunnyvale! This meetup focuses on Apache Kafka, Apache Samza, and related streaming technologies.
Location: Unify Conference Room, LinkedIn Corporate HQ in Sunnyvale. We will be on the 1st floor of 950 W Maude Ave, Sunnyvale, CA 94085.
Agenda:
5:00 PM: Doors open and catered food available
5:00 - 6:00 PM: Networking
6:00 - 6:30 PM: High-performance data replication at Salesforce with Mirus
Paul Davidson, Salesforce
At Salesforce we manage high-volume Apache Kafka clusters in a growing number of data centers around the globe. In the past we relied on Kafka's MirrorMaker tool for cross-data-center replication, but as the volume and variety of data increased, we needed a new solution to maintain a high standard of service reliability. In this talk, we will describe Mirus, our open-source data replication tool based on Kafka Connect. Mirus was designed for reliable, high-performance data replication at scale. It successfully replaced MirrorMaker at Salesforce and has now been running reliably in production for more than a year. We will give an overview of the Mirus design and discuss the lessons we learned deploying, tuning, and operating Mirus in a high-volume production environment.
6:30 - 7:00 PM: Defending users from abuse using Stream Processing at LinkedIn
Bhargav Golla, LinkedIn
With more than half a billion users, how can one effectively, reliably, and scalably classify them as good or bad actors? This talk will highlight how the Anti-Abuse team at LinkedIn leverages stream processing technologies such as Samza and Brooklin to keep good users in a trusted environment free of bad actors.
7:00 - 7:30 PM: Enabling Mission-critical Stateful Stream Processing with Samza
Ray Manpreet Singh Matharu, LinkedIn
Samza powers a variety of large-scale, business-critical stateful stream processing applications at LinkedIn. Their scale necessitates using persistent and replicated local state. Unfortunately, hard failures can cause a loss of this local state, and re-caching it can incur downtime ranging from a few minutes to hours! In this talk, we describe the systems and protocols we have devised to bound that downtime to a few seconds. We detail the tradeoffs our approach brings and how we tackle them in production at LinkedIn.
7:30 - 8:00 PM: Additional networking and Q&A
RSVP:
Please RSVP only if you plan to attend in person. Our facility can host 250 guests.
Parking:
You can park in the uncovered lot alongside the building or in the parking garage next to the building.
NDA:
You will need to sign a standard NDA when you enter the lobby.
Food & Drink:
Food & drink will be provided.
Can’t join us in person?
Join us online. A link will be posted 30 minutes before the meetup.
Want to talk at a future meetup?
Please contact us via the “Contact” button on meetup.com.
We are hiring in Systems & Infra!
Senior SWE: https://www.linkedin.com/jobs/view/1375331316/
Staff SWE: https://www.linkedin.com/jobs/view/1183754281/
