
【Flink+Kappa+ @Uber】and【Flink+Pulsar, streaming-first, unified data processing】

Hosted By
Bowen and Haitao W.

Details

Don't forget to share this event with your friends, colleagues, or anyone who is interested in Apache Flink, stream processing, unified data processing, or real-time machine learning! We look forward to seeing you soon!

------------------------------------
TALK #1: Moving from Lambda and Kappa Architectures to Kappa+ with Flink at Uber (~45min)

Speaker:
Roshan Naik is a technical lead for Uber's Real-time Streaming Analytics Platform, working on engineering challenges related to large-scale stream processing. Before joining Uber, Roshan architected Apache Storm 2.0's new high-performance execution engine and authored Apache Hive's transactional streaming ingest APIs. He is a committer on Flume, Streamline, and Storm. He is also the author of Castor, an open source C++ library that brings the logic paradigm to C++.

Abstract:
Stream processing is not generally suitable for offline data processing... or is it? Kappa+ is a simple and powerful new streaming architecture developed at Uber that greatly broadens the scope of stream processing. Whether your real-time infrastructure processes data at Uber scale (well over a trillion messages daily) or only a fraction of that, chances are your stack faces various design decisions based on the perceived limitations of stream processing. These include whether to use streaming or batch processing, whether to leverage a Lambda or Kappa architecture for backfilling data (i.e., reprocessing archived data), how to transition from streaming offline to online or vice versa, and whether to switch to unified APIs like Beam, SQL, etc.

This talk introduces Uber's Kappa+ architecture and its benefits. We will also discuss how this architecture can be easily implemented on top of your existing Apache Flink deployment today, without the need for new APIs or changes to the runtime.

------------------------------------
TALK #2: When Apache Pulsar meets Apache Flink (~45min)

Speaker:
Sijie Guo is the founder of StreamNative, an infrastructure startup focused on building a cloud-native event streaming service around Apache Pulsar. Previously, he was the tech lead for the Messaging Group at Twitter, and worked on push notification infrastructure at Yahoo. He is also the VP of Apache BookKeeper and a PMC member of Apache Pulsar.

Abstract:
Apache Pulsar and Apache Flink share a similar view on how the data and computation levels of an application can be “streaming-first”, with batch as a special case of streaming. With Apache Pulsar’s Segmented-Stream storage and Apache Flink’s steps to unify batch and stream processing workloads under one framework, there are numerous ways of integrating the two technologies to provide elastic data processing at massive scale and build a real streaming warehouse.

In this talk, Sijie Guo from the Apache Pulsar community will give an overview of Apache Pulsar and how it provides the unified data view needed to fully leverage Apache Flink's unified computation runtime for elastic data processing. He will share the latest integrations between Apache Pulsar and Apache Flink, especially around effectively-once processing and schema integration.

------------------------------------
AGENDA:

  • 5:30pm - 6pm Food and networking
  • 6pm - 6:10pm Meetup introduction and Flink's community status update, by Bowen
  • 6:10pm - 8pm Two talks, ~45min each

LOCATION: Uber Seattle Engineering Office, 1191 2nd Ave #1200 (12th floor), Seattle, WA 98101

DATE: Aug 22, 2019 (Thursday)

EVENT SPONSOR this time: Uber

Pizza and drinks will be provided.

**************
**************
If you are interested in giving talks or sponsoring our next event, please contact @Bowen.
**************
**************

Seattle Flink Meetup