Stream Processing with Apache Kafka & Apache Samza



Welcome to the upcoming Stream Processing Meetup hosted by LinkedIn in Sunnyvale. This meetup focuses on Apache Kafka, Apache Samza, and related streaming technologies.

Location: Yosemite Conference Room, LinkedIn Corporate HQ in Sunnyvale. We will be on the 2nd floor of 605 W Maude Ave, Sunnyvale, CA 94085.


6 PM: Doors open

6-6:35 PM: Networking & Welcome

6:35-7:10 PM: Stream processing using Samza-SQL@LinkedIn (Srinivasulu Punuru, LinkedIn)

Imagine if you could develop and run a stream processing job in a few minutes, and imagine if the vast majority of your organization (business analysts, product managers, data scientists) could do this on their own, without needing a development team.

The need for real-time insights into big data is increasing at a rapid pace. The traditional Java-based model of developing, deploying, and managing stream processing applications is becoming a major constraint.

With Samza SQL we can simplify application development by enabling users to create stream processing applications and get real-time insights into their business using SQL.

In this talk, we try to answer the following questions:

• How can SQL be used to perform stream processing?

• How is Samza SQL implemented, and what does its architecture look like?

• How can you deploy Samza SQL in your company?

7:15-7:50 PM: Streaming data pipelines @ Slack (Ananth Packkildurai, Slack)

Slack is a communication and collaboration platform for teams. Our millions of users spend 10+ hours connected to the service on a typical working day. They expect reliability, low latency, and extraordinarily rich client experiences across a wide variety of devices and network conditions. It is crucial for developers to get real-time insights into Slack's operational metrics.
In this talk, I will describe how our data platform has evolved from a batch system to near real-time. I will also touch on how Samza helps us build low-latency data pipelines and an experimentation framework.

7:55-8:30 PM: Improving Kafka at-least-once performance (Ying Zheng, Uber)

At Uber, we are seeing an increasing demand for Kafka at-least-once delivery. So far, we have been running a dedicated at-least-once Kafka cluster with special settings. With a very low workload, the dedicated at-least-once cluster has been working well for more than a year. When we wanted to allow at-least-once producing on the regular Kafka clusters, producing performance became a concern. We spent some effort on this issue in recent months and managed to reduce at-least-once producer latency by about 80% with code changes and configuration tuning. Most of these improvements also help increase Kafka throughput and reduce Kafka end-to-end latency in general, not only for at-least-once.


Please RSVP *only* if you plan to attend in person. Our facility can host 200 guests.


You can park in the uncovered parking area along the building or in the parking garage next to the building.


You will need to sign a standard NDA when you enter the lobby.

Food & Drink:

Food & drink will be provided.

Can’t join us in person?

Live Stream will be available here:

Want to talk at a future meetup?

Please contact us via the “Contact” button in