Building a real-time complex event processing platform with Apache Flink


Details
• What we'll do
For the first meetup of 2018 we are back at Rental Cars. If you are just starting to look at alternatives for highly scalable data ingestion pipelines, you will no doubt be looking at Apache Spark and the newer kid on the block, Apache Flink. Our friends from GetInData are travelling all the way from Warsaw to present Apache Flink, showcasing a real-world telco use case.
Agenda
6.30pm Network, pizza and beer
7.00pm Building a real-time complex event processing platform with Apache Flink - lessons learned
8.30pm Networking
The detail:
Building a real-time complex event processing platform with Apache Flink - lessons learned
Abstract: Our presentation is based on our recent experience of building a platform for processing massive streams of telco events in real time with Apache Flink. The platform was jointly built by GetInData (www.getindata.com) and Kcell (the leading telco in Kazakhstan) in just a few months, and it currently runs in production, ingesting and processing over 150k messages per second, or a terabyte of data per day, on a still-small cluster.
We will start with a brief overview of the concept of stateful stream processing and how Flink implements it. We will then talk about how we used it to build a dynamic, scalable and cost-effective platform that can process millions of events in seconds. We will also focus on the challenges we faced during development, such as schema evolution of our data, testing business logic, monitoring our solution and using best-of-breed open-source projects. At the end we will offer some tips on what to pay attention to when developing a similar solution.
Bios:
Dawid Wysakowicz:
Data Engineer at GetInData, working to help people and companies succeed with Apache Flink and stream processing. He actively participates in the Flink community, which resulted in him becoming a Flink committer. He first became interested in Big Data technologies in 2015 while writing his Master's thesis on a Distributed Genomic Data Warehouse.
Grzegorz Kołakowski:
A software engineer with five years of experience. Recently he has become a great enthusiast of stream processing and related open-source tools, in particular Apache Flink and Apache Kafka. He is currently a data engineer at GetInData, helping companies build scalable, distributed systems for storing and processing big data volumes.
