Python Streaming Pipelines with Beam on Flink

Name: Python Streaming Pipelines with Beam on Flink
Start: 2018-11-13T18:00:00Z
End: 2018-11-13T20:00:00Z
Location: 4th Floor Holden House, eOffice Soho

Hosted by Christos H.

Apache Flink London Meetup

Details

Python is popular amongst data scientists and engineers for data processing tasks. The big data ecosystem, however, has traditionally been rather JVM-centric. Often Java (or Scala) is the only viable option for implementing data processing pipelines. That sometimes poses an adoption barrier for organizations that have already invested in other language ecosystems.

The Apache Beam project provides a unified programming model for data processing, and its ongoing portability effort aims to enable multiple language SDKs (currently Java, Python and Go) on a common set of runners. Python streaming on the Apache Flink runner is one such combination. Let’s take a look at how the Flink runner translates the Beam model into the native DataStream (or DataSet) API, how the runner is changing to support portable pipelines, how Python user code execution is coordinated with gRPC-based services, and how a sample pipeline runs on Flink.


Sponsors

Data Reply
