Stream Processing with Apache Kafka & Apache Samza (July 2018)


Welcome to the upcoming Stream Processing Meetup hosted by LinkedIn! This event focuses on Apache Kafka, Apache Samza, and related streaming technologies.

The main event will take place at our Sunnyvale office, and we will also host a "viewing party" in San Francisco.


Main Event -
Yosemite Conference Room, LinkedIn Corporate HQ in Sunnyvale.
2nd floor of 605 W Maude Ave, Sunnyvale, CA. (Capacity for 200)

Viewing Party -
Lotta’s Fountain Conference Room, LinkedIn in San Francisco at 222 2nd Street, San Francisco, CA. (Capacity for 70)

6:00 PM: Doors open
6:00-6:35 PM: Networking & Welcome

6:35-7:10 PM: Beam me up Samza: How we built a Samza Runner for Apache Beam (Xinyu Liu, LinkedIn)
Apache Beam provides an easy-to-use and powerful model for state-of-the-art stream and batch processing, portability across a variety of languages, and the ability to converge offline and nearline data processing.

At LinkedIn, we have developed a Samza Runner to leverage the cutting-edge features of Beam. This runner combines the large-scale stream processing capabilities and first-class state support in Samza with the advancements in Beam data processing. In this talk, we will discuss the Beam API, its implementation in Samza, and the benefits of the runner to the Samza and Beam communities.

7:15-7:50 PM: uReplicator: Uber Engineering’s Scalable Robust Kafka Replicator (Hongliang Xu, Uber)

At Uber, we operate 20+ Kafka clusters to collect system and application logs as well as event data from rider and driver apps. We need a Kafka replication solution to replicate data between Kafka clusters across multiple data centers for different purposes. This talk will introduce the history behind uReplicator and its high-level architecture. As the original uReplicator ran into scalability challenges and operational overhead as the scale of our Kafka clusters increased, we built the Federated uReplicator, which addresses these issues and provides an extensible architecture for further scaling.

7:55-8:30 PM: Concourse - Near-real-time notifications platform at LinkedIn (Ajith Muralidharan & Vivek Nelamangala, LinkedIn)

Concourse is LinkedIn’s first near-real-time targeting and scoring platform for notifications. In this talk, we will provide an in-depth overview of the design and discuss various scaling optimizations. We'll explain how Concourse can score millions of notifications per second, while supporting the use of feature-rich machine learning models based on terabytes of feature data.

Please RSVP *only* if you plan to attend in person. Our facility can host ~200 guests in Sunnyvale and ~70 guests in San Francisco.

You can park in the uncovered lot alongside the building or in the parking garage next to it.

You will need to sign a standard NDA when you enter the lobby.

Food & Drink:
Food (pizza, wings) & drink (water, beer, wine) will be provided.

Live Stream:

Want to talk at a future meetup?
Please contact us via the “Contact” button in