Stream Processing with Apache Kafka & Apache Samza


Details
Welcome:
Welcome to the upcoming Stream Processing Meetup hosted by LinkedIn in Sunnyvale. This meetup focuses on Apache Kafka, Apache Samza, and related streaming technologies.
Location: Yosemite Conference Room, LinkedIn Corporate HQ in Sunnyvale. We will be on the 2nd floor of 605 W Maude Ave, Sunnyvale, CA 94085.
Agenda:
6 PM: Doors open
6-6:35 PM: Networking & Welcome
6:35-7:10 PM: Stream processing using Samza-SQL@LinkedIn (Srinivasulu Punuru, LinkedIn)
Imagine if you could develop and run a stream processing job in a few minutes, and imagine if the vast majority of your organization (business analysts, product managers, data scientists) could do this on their own without needing a development team.
The need for real-time insights into big data is increasing at a rapid pace. The traditional Java-based model of developing, deploying, and managing stream processing applications is becoming a major constraint.
With Samza SQL, we can simplify application development by enabling users to create stream processing applications and get real-time insights into their business using SQL.
In this talk, we try to answer the following questions:
• How can the SQL language be used to perform stream processing?
• How is Samza SQL implemented, and what is its architecture?
• How can you deploy Samza SQL in your company?
7:15-7:50 PM: Streaming data pipelines @ Slack (Ananth Packkildurai, Slack)
Slack is a communication and collaboration platform for teams. Our millions of users spend 10+ hours connected to the service on a typical working day. They expect reliability, low latency, and extraordinarily rich client experiences across a wide variety of devices and network conditions. It is crucial for developers to get real-time insights into Slack's operational metrics.
In this talk, I will describe how our data platform is evolving from a batch system to near real-time. I will also touch on how Samza helps us build low-latency data pipelines and an experimentation framework.
7:55-8:30 PM: Improving Kafka at-least-once performance (Ying Zheng, Uber)
At Uber, we are seeing an increasing demand for Kafka at-least-once delivery. So far, we have been running a dedicated at-least-once Kafka cluster with special settings. With a very low workload, the dedicated at-least-once cluster has been working well for more than a year. When we wanted to allow at-least-once producing on the regular Kafka clusters, producing performance became a concern. We spent some effort on this issue in recent months and managed to reduce at-least-once producer latency by about 80% with code changes and configuration tuning. Most of these improvements also help increase Kafka throughput and reduce Kafka end-to-end latency in general, not only for at-least-once.
RSVP:
Please RSVP only if you plan to attend in person. Our facility can host 200 guests.
Parking:
You can park in the uncovered parking that is along the building or in the parking garage located next to the building.
NDA:
You will need to sign a standard NDA when you enter the lobby.
Food & Drink:
Food & drink will be provided.
Can’t join us in person?:
A live stream will be available here: https://primetime.bluejeans.com/a2m/live-event/ezeuvzqd
Want to talk at a future meetup?:
Please contact us via the “Contact” button on meetup.com.
