Data Engineering for Artificial Intelligence


This is the third instalment of our series of Data Engineering Meetups: ‘Data Engineering for Artificial Intelligence’.
For this edition, we have an amazing lineup of speakers from both Zalando and beyond. There will be plenty of time for discussions before, during, and after the talks. Interested in data engineering? Join us, and share your story!
Schedule:
18:00 - 18:30 Doors Open: drinks, food, and discussions
18:30 - 19:00 Keynote by Eric Bowman, VP of Engineering at Zalando
19:00 - 19:20 KSQL – The Open Source SQL Streaming Engine for Apache Kafka, by Kai Waehner, Technology Evangelist at Confluent
19:20 - 19:40 Why and How to Leverage the Power and Simplicity of SQL on Apache Flink, by Fabian Hüske, Software Engineer at data Artisans
19:40 - 20:10 Break, drinks, food, and discussions
20:10 - 20:30 Asset Management for Machine Learning, by Georg Hildebrand, Data Scientist at Zalando
20:30 - 20:50 Vehicle and Property Price Prediction at Scout24, by Sebastian Bolz and Maik Goetze, Scout24
20:50 - 21:45 Networking
21:45 Event ends
Enabling Machine Learning at Zalando
Speaker: Eric Bowman, VP of Engineering, Zalando
Abstract:
We have embarked on an exciting journey to enable machine learning at Zalando, unleashing new opportunities for our customers and engineers alike. Leveraging our data treasure is a company-wide priority, and during this talk we will discuss what we have achieved and built so far and take a look at the engineering challenges that lie ahead.
KSQL – The Open Source SQL Streaming Engine for Apache Kafka
Speaker: Kai Waehner, Technology Evangelist, Confluent
Abstract:
The rapidly expanding world of stream processing can be daunting. KSQL is the open-source, Apache 2.0-licensed streaming SQL engine for Apache Kafka, which aims to make stream processing available to everyone. This session introduces the concepts, architecture, use cases, and benefits of KSQL. A live demo shows how to set up and use KSQL quickly and easily on top of your Kafka ecosystem.
Why and How to Leverage the Power and Simplicity of SQL on Apache Flink
Speaker: Fabian Hüske, Software Engineer, data Artisans
Abstract:
SQL is the lingua franca of data processing, and everybody working with data knows SQL. Apache Flink provides SQL support for querying and processing batch and streaming data. Flink’s SQL support powers large-scale production systems at Alibaba, Huawei, and Uber. Based on Flink SQL, these companies have built systems for their internal users as well as publicly offered services for paying customers. In my talk, I will discuss why you should, and how you can, leverage the simplicity and power of SQL on Flink, even if you are not Alibaba or Uber.
Asset Management for Machine Learning
Speaker: Georg Hildebrand, Data Scientist, Zalando
Abstract:
Extracted features and trained models are first-class data assets for modern data-driven companies. Their management and governance are therefore key to the success of many data science projects. In this talk, we will look at options for managing feature and model data and discuss tools and candidate architectures for this task.
Vehicle and Property Price Prediction at Scout24
Speakers: Sebastian Bolz and Maik Goetze, Scout24
Abstract:
Scout24 operates Europe’s largest online automotive sales platform and Germany’s largest online platform for buying, selling, and renting properties. As part of our goal to become the authority on prices in the automotive and property markets, we have developed several generations of machine learning models that predict prices for our users’ vehicles and properties. In this talk, we will describe the current generation of these models and some important lessons learned during their development.
***
We aim for short, practical talks about experiences with data engineering at scale. We don’t do sales pitches, and instead focus on sharing experiences and lessons learned the hard way.
If you'd like to talk at our next Meetup, get in touch.
