Stream Event Processing in Scale with Apache Flink and Couchbase


Details
18:00 - 18:30 - Mingling
18:30 - 18:40 - Proofpoint Introduction
18:40 - 19:25 - Using a NoSQL Database in Event Processing - David Ostrovsky @ Couchbase
19:25 - 20:10 - Apache Flink: Building a Stream Processor for fast analytics, event-driven applications, event time, and tons of state - Robert Metzger @ data Artisans
Title:
Using a NoSQL Database in Event Processing
Abstract:
Is it possible to build an event processing system without a database? Despite interesting ideas about turning the database inside out, the practical answer is still usually ‘not really’. No matter what, you need to keep your state somewhere convenient and fast, and you need to create, aggregate and store your materialized views of this data somewhere that’s easy to query. However, once you reach significant scale, there aren’t many databases that can keep up with processing billions of events per day.
So let’s talk about when and how to use a distributed database in event processing. We’ll discuss some common patterns for using a database at various stages of an event processing pipeline, from source, to sink, to side-effect. We’ll examine the differences between a distributed message system like Kafka and a database that supports streaming data in and out, in this case Couchbase Server. And we’ll demonstrate some of the key concepts with common tools like Storm, Spark and Flink.
Bio:
David Ostrovsky (https://www.linkedin.com/in/davidostrovsky/) is a Senior Solutions Architect at Couchbase (https://www.couchbase.com/) and a big data geek who enjoys talking about databases and distributed systems.
Title:
Apache Flink: Building a Stream Processor for fast analytics, event-driven applications, event time, and tons of state
Abstract:
In the talk, we’ll first give an overview of Apache Flink (https://flink.apache.org/), a distributed stateful stream processor. The overview will cover the main features, such as state, time and snapshots.
Then, we’ll look at the available APIs in Flink and some implementation details. For example, we will describe how jobs are deployed, how the checkpointing mechanism works and how "good old" SQL can also be used on data streams.
At the end of the talk, we’ll also show how some of the biggest companies in the world are using Flink in production.
Bio:
Robert Metzger (https://www.linkedin.com/in/metzgerrobert/) is an Engineering Manager and Co-founder at data Artisans, the company behind the open source Apache Flink project, which brings real-time data applications to the enterprise. Robert leads the team that's building dA Platform, which enables turnkey stream processing with Apache Flink.
Robert studied Computer Science at TU Berlin and worked at IBM Germany and at the IBM Almaden Research Center in San Jose. He is a frequent speaker at conferences such as the Hadoop Summit, ApacheCon, and meetups around the world. Robert is also a member of the Apache Software Foundation, serving as a PMC member on the Flink and Bahir projects.
Location:
