Past Meetup

Three Talks: Flink @ Mux, Flink Connectors, Flink's Modular Deployment Model


101 people attended


Details

We are co-hosting with the Bay Area Big Data and Scalable Systems Meetup (http://www.meetup.com/San-Francisco-Bay-Area-Big-Data-and-Scalable-Systems/events/235184933/).

Please join us for this great night with multiple talks about Apache Flink! We have Scott from Mux, as well as Robert and Max, who are both Apache Flink committers and PMC members.

6:00 - 6:30 Food / Networking

6:30 Talks begin

Talk 1: Real-Time Anomaly Detection with Flink @ Mux (Scott Kidder)

Mux is using Apache Flink to identify anomalies in the distribution & playback of digital video for major video streaming websites. Flink's streaming-first design, intuitive API, event-time processing, and stateful job execution set it apart from the crowd. Scott Kidder will describe the Apache Flink deployment at Mux, leveraging Docker, AWS Kinesis, Zookeeper, HDFS, and InfluxDB.

Talk 2: Connecting Apache Flink to the World (Robert Metzger)

Getting data in and out of Flink in a reliable fashion is one of the most important tasks of a stream processor. This talk will review the most important and frequently used connectors in Flink. Apache Kafka and Amazon Kinesis Streams both fall into the same category of distributed, high-throughput, and durable publish-subscribe messaging systems. The talk will explain how the connectors in Flink for these systems are implemented. In particular, we’ll focus on how we ensure exactly-once semantics while consuming data and how offsets/sequence numbers are handled. We will also review two generic tools in Flink for connectors: a message-acknowledging source for classical message queues (like those implementing AMQP) and a generic write-ahead log sink that uses Flink’s state backend abstraction. The objective of the talk is to explain the internals of the streaming connectors so that people can understand their behavior, configure them properly, and implement their own connectors.

Talk 3: Dynamic and Modular Cluster Deployments with Apache Flink (Maximilian Michels)

There is an increasing variety of methods and frameworks to deploy clusters. In its early days, Flink was deployed on premises or in an IaaS cloud environment. Then cluster managers like YARN and Mesos changed the way users managed their resources. Now, we live in the container world, where Docker combined with Kubernetes adds another way to deploy applications.

The lifespan of processes, fault tolerance, and multi-tenancy are fundamentally different across on-premises, YARN, Mesos, and Docker deployments. The goal of the Apache Flink community was to develop a deployment technique that would enable all these paradigms to be supported well. In this talk, we will learn about the changes to Flink and how they enable isolated job deployments as well as sessionized multi-user clusters across different cluster deployment scenarios.

About Scott Kidder:

Scott Kidder is a Software Engineer at Mux in San Francisco, where he has developed a real-time anomaly-detection & alerting system that relies heavily on Apache Flink. Prior to joining Mux, Scott worked at Brightcove on the Zencoder cloud-based video-encoding service, where he introduced machine-learning techniques to intelligently scale computing resources in response to changes in load.

About Robert Metzger:

Robert Metzger is a PMC member of the Apache Flink® project, a co-founder and software engineer at data Artisans, and a member of the Apache Software Foundation.

He has worked on many Flink components including the Kafka and YARN connectors. He is a frequent speaker at conferences such as the Hadoop Summit, ApacheCon, QCon and meetups around the world.

Robert studied Computer Science at TU Berlin and worked at IBM Germany and at the IBM Almaden Research Center in San Jose.

About Maximilian Michels:

Max is an Apache Flink and Apache Beam committer and PMC member. After his studies at Free University of Berlin and Istanbul University, he worked as a research assistant in parallel and distributed computing at Zuse Institute Berlin. Later, he joined data Artisans to work on Apache Flink. Max enjoys learning about open-source development, distributed systems, infrastructure, and how to bring more women into tech.