Kafka on Kubernetes & Pipelines with Hadoop & Probabilistic Programming

Hosted By
Christine K. and Andreas P.
Details

19:00 Doors open & Get-Together
19:30-20:00 Emiliano Tomaselli, Olgierd Grodzki: Streaming Data Platform with Kafka and Kubernetes
20:00-20:25 Andreas Pawlik: Data Pipelining and Deep Learning Pipelines with Hadoop
20:25-20:35 Break
20:35-21:20 Alexey Kuntsevich: Data-Driven Decision Making with Probabilistic Programming

Emiliano Tomaselli, Olgierd Grodzki: Streaming Data Platform with Kafka and Kubernetes

In this session, Emiliano Tomaselli and Olgierd Grodzki from Data Reply present on Kafka and Kubernetes. Kafka has become the de facto standard for building a streaming architecture. Many organizations want to run Kafka "as a service" - on premise or in the cloud - and use it to enable their developers to create apps, data pipelines, and more.
To make the platform deployment scalable, fault-tolerant, and cloud-native, we decided to take advantage of one of the most popular open-source systems for orchestrating containerized applications: Kubernetes.
We will show you the use cases we developed on the customer side using these platforms, some challenges that we faced (e.g. security, ACLs, and more), and how we tried to solve them by developing custom tools and applications.
To conclude, we will also present one of the solutions we adopted to bring automation into the Kubernetes ecosystem with CI/CD pipelines.

Andreas Pawlik: Data Pipelining and Deep Learning Pipelines with Hadoop

Deep learning benefits from training and validation on large amounts of data. I will outline the challenges of running deep learning workflows on large data sets, discuss how Hadoop/Mesos can help address these challenges, and explore the benefits and limitations of the approach.

Alexey Kuntsevich: Data-Driven Decision Making with Probabilistic Programming

No modern company can avoid shifting towards data-driven decision making if it wants to stay successful and competitive. At the same time, the experience and knowledge gathered inside the company has to be incorporated into these data-driven processes and automation. Probabilistic programming is able to combine automation, the uncertainty that is part of any business, and existing knowledge from domain experts. The potential applications of probabilistic programming are nearly limitless, from image reconstruction and spam filtering to economic modelling and anomaly detection. Moreover, probabilistic programming frameworks exist for nearly all popular development stacks. With several simple examples, I'd like to show how to translate business reasoning into code and how to use a human-in-the-loop model to control and manage the uncertainty of the environment.
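The talk's premise, translating business reasoning plus expert knowledge into code under uncertainty, can be illustrated with a tiny Bayesian sketch in plain Python. This is a minimal grid approximation, not an example from the talk; the scenario and all numbers below are hypothetical:

```python
# Hypothetical scenario: infer a landing page's conversion rate from
# 12 conversions in 40 visits, combining the observed data with a
# domain expert's prior belief that rates sit around 20%.

# Candidate conversion rates on a 1%..99% grid.
grid = [i / 100 for i in range(1, 100)]

def prior(p):
    # Expert knowledge encoded as an (unnormalized) Beta(2, 8) density,
    # whose mean is 0.2 -- "rates are usually around 20%".
    return p * (1 - p) ** 7

def likelihood(p, conversions=12, visits=40):
    # Binomial likelihood of the observed data (unnormalized).
    return p ** conversions * (1 - p) ** (visits - conversions)

# Posterior is proportional to prior * likelihood, normalized over the grid.
unnorm = [prior(p) * likelihood(p) for p in grid]
total = sum(unnorm)
posterior = [w / total for w in unnorm]

# The posterior mean lands between the prior mean (0.2) and the
# observed rate (12/40 = 0.3): the data update, but do not erase,
# the expert's belief.
post_mean = sum(p * w for p, w in zip(grid, posterior))
print(f"posterior mean conversion rate: {post_mean:.3f}")
```

A probabilistic programming framework (such as PyMC or Stan) automates exactly this step at scale: the model is declared, and inference over it is handled by the framework rather than written by hand.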

Munich Data Engineering Meetup