February Samza Meetup


Details
This month we have the following talks planned:
- StatServer-Samza: Near Real-time Analytics by Tomy Tsai (LinkedIn)
Synopsis: StatServer is a near real-time analytics service that is widely used at LinkedIn and is being migrated to the Samza platform.
In this presentation we will talk about how and why we are re-designing and implementing StatServer in Samza.
Bio: Tomy (Chang-Ming) Tsai is a senior software engineer at LinkedIn, where he has been working on the online relevance infrastructure team.
One of his projects was to maintain and deploy an online count aggregation and query service, StatServer, for near real-time machine learning applications. He also helped deploy the re-designed StatServer in Samza.
- Optimizing Streaming SQL Queries by Julian Hyde (Hortonworks)
Synopsis: What is SamzaSQL, and what might I use it for? Does this mean that Samza is turning into a database? What is a query optimizer, and what can it do for my streaming queries?
Bio: Julian Hyde is an expert in query optimization, in-memory analytics, and streaming. He is PMC chair of Apache Calcite, the query planning framework behind Hive, Drill, Kylin and Phoenix. He was the original developer of the Mondrian OLAP engine, and is an architect at Hortonworks.
- Bridging the Gap: Connecting AWS and Kafka by Ryanne Dolan (LinkedIn) and Jason Li (LinkedIn)
Synopsis: Kinesis to Kafka Bridge is a Samza job that replicates AWS Kinesis to a configurable set of Kafka topics and vice versa. It enables integration between AWS and the rest of LinkedIn. It supports replicating streams in any LinkedIn fabric, any AWS account, and any AWS region.
DynamoDB Stream to Kafka Bridge is built on top of Kinesis to Kafka Bridge. It enables data replication from AWS DynamoDB to LinkedIn. In this presentation we will talk about how we designed the system and how we use it at LinkedIn.
Bio: Ryanne Dolan is a software engineer at LinkedIn. He was originally part of the Bizo acquisition. He works remotely from Missouri on Bizo's AWS infra and streams. His LinkedIn profile is https://www.linkedin.com/in/ryannedolan
Jason Li is a senior software engineer at LinkedIn, where he has been working on the SlideShare rich media platform infrastructure team. One of his projects was to create a data replication pipeline from AWS DynamoDB to LinkedIn Data Center. His LinkedIn profile is https://www.linkedin.com/in/jasonpengfeili
Agenda
6:00 - Networking
6:30 - Opening - Clark Haskins (LinkedIn)
6:35 - StatServer-Samza: Near Real-time Analytics by Tomy Tsai (LinkedIn)
7:15 - Optimizing Streaming SQL Queries by Julian Hyde (Hortonworks)
8:00 - Bridging the Gap: Connecting AWS and Kafka by Ryanne Dolan (LinkedIn) and Jason Li (LinkedIn)
8:45 - Q&A
Please only RSVP if you plan to attend in person. We will be streaming and recording the event.
If you are interested in presenting at any current or upcoming Samza meetup, please send an email to SamzaMeetups@linkedin.com
