Distributed Data To The Max

Name: Distributed Data To The Max
Start: 2020-02-13T17:30:00+01:00
End: 2020-02-13T20:30:00+01:00
Location: Picnic HQ

Hosted by Fabio T.

Meet the group

Reactive Amsterdam

Details

Distributed Data. Let's hear about it from those who are best at it.

The night will be opened by Gerard Maas, author of "Stream Processing With Apache Spark" and engineer at Lightbend. He will be followed by a cascade of wisdom from Ana Henriques Narciso, Data Engineer at our wonderful host Picnic. I am super excited about this event and I'm sure you will be too after reading their abstracts!

🍕 17:30 doors open + pizza
🎙 Talk #1: Streaming Applications on Kubernetes with Cloudflow
🔈 Talk #2: Monolithic Data Warehouse & Micro-Services: analytics meets operations at Picnic
🏠 21:00 End of Stream

🎙 Talk #1: Streaming Applications on Kubernetes with Cloudflow

Distributed streaming applications offer us a scalable approach to process data streams and deliver valuable real-time insights from the data. “Data is best eaten as fresh as possible.” Creating a fully-fledged streaming application usually requires developing several steps and assembling them into a ‘data pipeline’. Deploying such applications can become quite complicated, as we need to ensure that the resources for each piece are provisioned while preserving end-to-end application and data consistency.

In this talk, we introduce Cloudflow, the latest open-source project from Lightbend. Cloudflow aims at reducing the complexity of creating and deploying streaming applications on Kubernetes. Using a financial fraud detection application as a practical example, we are going to explore the Cloudflow API, learn how to assemble an application using a Blueprint, and run it locally to verify that it works before deploying to a Kubernetes cluster with a YAML-free experience.

At the end of this presentation, you will have a pretty good idea of how to start using Cloudflow for your projects.

Gerard Maas is a Principal Engineer at Lightbend. He is a creative soul with a particular interest in streaming systems. He currently contributes to the engineering of Cloudflow at Lightbend, where he focuses on the integration of Structured Streaming and other stream processing technologies. He is the main author of the O’Reilly book Stream Processing with Apache Spark (http://shop.oreilly.com/product/0636920047568.do).

🔈 Talk #2: Monolithic Data Warehouse & Micro-Services: analytics meets operations at Picnic

At Picnic, every decision is backed by data from dozens of systems. Have you ever wondered how one builds a monolithic Data Warehouse (DWH) in a micro-services world? In a business that is ever-changing?
In this talk, we will tell you about our journey of building the ultimate centralized analytical source of truth to fulfill all of the business's data desires while keeping developers happy. We have done hundreds of analytics projects involving both batch and real-time event loading. Consumer-driven contracts, lots of data modelling, and, guess what, the right process are some of the ingredients of our magic formula.

Ana Henriques Narciso is a Data Engineer turned Product Owner at Picnic. She worked as a Business Intelligence consultant for 4 years in Lisbon, before joining Picnic more than 3 years ago. She believes the world can be modeled in a Data Warehouse and all answers can be obtained using SQL. Besides nerding out with developers and enabling the business to make the right decisions, she is a fierce metalhead and loves to play guitar.
