5th Apache Flink Meetup Munich w/ Flink @ Workday & Image Processing with Flink

Details

We are excited to announce the next meetup in Munich, with talks on "Tenant (source)-based encryption" by Enrico Agnoli, Workday, and "Image processing with Flink" by Apache Flink PMC member Marton Balassi.

DATE: Thursday, March 21

TALK #1: Tenant (source)-based encryption in Flink

At Workday Inc. we process data for thousands of customers, and our strict security regulations demand that we always encrypt customer data at rest and in transit. That means each piece of data must always be stored encrypted with the customer's key.

This is a challenge in a Data Streaming platform like Flink, where data may be persisted in multiple phases:

  • Storage of States in Checkpoints or Savepoints
  • Temporary fs storage for time-window aggregation
  • Common spilling to disk when heap is full

On top of that, we need to consider that data may be transformed as it moves through a Flink dataflow. After each transformation we still need to retain the context required to encrypt the data correctly.

We solved this challenge by extending the serialization library (Avro) to encrypt data at serialization time.
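To make the idea of encrypting at serialization time concrete, here is a minimal Java sketch. It is not Workday's implementation and does not use Avro or Flink APIs. The class name TenantEncryptingSerializer, the in-memory key map, and the wire layout (tenant id, IV, ciphertext) are all assumptions made for illustration. The point it shows is that the tenant id travels with every serialized record, so whatever component later writes the bytes to a checkpoint or spills them to disk only ever sees data encrypted with that tenant's key.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a serializer wrapper that encrypts each record
// with the key of the tenant (source) the record belongs to.
public class TenantEncryptingSerializer {
    private static final int IV_BYTES = 12;   // recommended IV size for AES-GCM
    private static final int TAG_BITS = 128;  // GCM authentication tag length

    // Assumption: keys live in memory; a real system would use a key service.
    private final Map<String, SecretKey> tenantKeys = new HashMap<>();
    private final SecureRandom random = new SecureRandom();

    public void registerTenant(String tenantId) {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            tenantKeys.put(tenantId, generator.generateKey());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Layout: [int tenantLength][tenant id bytes][12-byte IV][ciphertext + tag]
    public byte[] serialize(String tenantId, byte[] plainRecord) {
        byte[] tenant = tenantId.getBytes(StandardCharsets.UTF_8);
        byte[] iv = new byte[IV_BYTES];
        random.nextBytes(iv); // fresh IV for every record
        byte[] cipherText = crypt(Cipher.ENCRYPT_MODE, tenantId, iv, tenant, plainRecord);
        return ByteBuffer.allocate(4 + tenant.length + IV_BYTES + cipherText.length)
                .putInt(tenant.length).put(tenant).put(iv).put(cipherText).array();
    }

    public byte[] deserialize(byte[] blob) {
        ByteBuffer in = ByteBuffer.wrap(blob);
        byte[] tenant = new byte[in.getInt()];
        in.get(tenant);
        byte[] iv = new byte[IV_BYTES];
        in.get(iv);
        byte[] cipherText = new byte[in.remaining()];
        in.get(cipherText);
        String tenantId = new String(tenant, StandardCharsets.UTF_8);
        return crypt(Cipher.DECRYPT_MODE, tenantId, iv, tenant, cipherText);
    }

    // Reads the tenant id from the clear-text header without decrypting.
    public static String tenantOf(byte[] blob) {
        ByteBuffer in = ByteBuffer.wrap(blob);
        byte[] tenant = new byte[in.getInt()];
        in.get(tenant);
        return new String(tenant, StandardCharsets.UTF_8);
    }

    private byte[] crypt(int mode, String tenantId, byte[] iv, byte[] aad, byte[] input) {
        SecretKey key = tenantKeys.get(tenantId);
        if (key == null) {
            throw new IllegalArgumentException("unknown tenant: " + tenantId);
        }
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(mode, key, new GCMParameterSpec(TAG_BITS, iv));
            cipher.updateAAD(aad); // bind the tenant id header to the ciphertext
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch the tenant id is kept in clear text in the header and authenticated as additional data, so a record can be routed or traced to its tenant without being decrypted, and tampering with the header makes decryption fail.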

In this talk we will walk through the complexity of runtime encryption in a multi-tenant data streaming world, how we solved it, and how we support data traceability for GDPR.

Speaker: Enrico Agnoli, Senior Software Development Engineer at Workday https://www.linkedin.com/in/enricoagnoli/

TALK #2: Image Processing with Flink

On paper, image processing seems to be a perfect match for big data tools. Image data is inherently large and unstructured, and processing it requires immense processing power and flexible, rich data manipulation languages. However, bridging the gaps between the standard image tools and the big data world poses new challenges for organizations: image processing algorithms gain a significant speedup from GPUs, many of the most popular libraries are written in Python, organizations often accumulate hundreds of TBs of image data, and most image sets require fast random access from downstream applications. In this talk we explore how we can use Flink to satisfy these requirements and how it squares off against Spark in this domain.

This talk builds on previous work with Jan Kunigk and Dr. Mirko Kampf presented at Strata London 2018. [1,2]

[1] https://www.oreilly.com/library/view/strata-data-conference/9781492025993/video320493.html
[2] https://conferences.oreilly.com/strata/strata-eu-2018/public/schedule/detail/65287

Speaker: Marton Balassi

Marton Balassi is an Apache Flink PMC member and has driven big data adoption at around 50 customers as a Senior Solutions Architect at Cloudera. He has opened his own consulting company and has been enjoying the freelancer lifestyle since 1st March.

LOCATION: Mindspace Viktualienmarkt, Rosental 7, München (Level 2)

Agenda:

  • 6pm - 6:30pm: Food & Networking
  • 6:30pm - 7pm: Tenant (source)-based encryption in Flink
  • 7pm - 7:30pm: Image Processing with Flink
  • 7:30pm - 9pm: Networking