【Alibaba, Google, Uber】Apache Flink with Hive, TensorFlow, Beam and AthenaX

Hosted By
Bowen and Haitao W.

Details

Welcome to the Apache Flink community! This time we are thrilled to have speakers from Alibaba, Google, and Uber to talk about Apache Flink with Hive, TensorFlow, Beam, and AthenaX.

Please share this event with your friends, colleagues, or anyone who is interested in Apache Flink and stream processing! We look forward to seeing you soon!

------------------------------------
DATE: Feb 21, 2019 (Thursday)

LOCATION: Google Seattle Lakeside, 837 N 34th St, Seattle, WA 98103. Room: Baseflow

------------------------------------
TALK #1: Integrate Apache Flink with Apache Hive (~30 min)

Speakers:
Xuefu Zhang, Senior Staff Engineer at Alibaba and PMC member/committer of Apache Hive; Bowen Li, Senior Engineer at Alibaba

Abstract:
Along with the community's effort, engineers at Alibaba have explored Flink's potential as an execution engine not just for stream processing but also for batch processing. The findings are encouraging, so we have started working to make Flink's batch capabilities full-fledged, especially in SQL support. Comparing Flink with mature SQL tools, we identified a major gap: the lack of good integration with the Hive ecosystem. This integration is crucial to the success of Flink SQL and batch processing, because users have more than likely already established a data ecosystem around Hive. We have therefore decided to promote close integration of Flink with the Hive ecosystem. In this talk we will outline our proposal and roadmap, and share our latest development with a demo.

------------------------------------
TALK #2: TensorFlow data preparation on Apache Beam using Portable Flink Runner (~30 min)

Speaker:
Ankur Goenka, Software Engineer at Google and committer of Apache Beam

Speaker Bio:
Ankur has been building large-scale distributed systems throughout his career. Lately he has been focusing on building platforms for data processing at scale, and he is currently adding support for Apache Flink to Apache Beam.

------------------------------------
TALK #3: AthenaX: Unified Stream & Batch Processing using SQL at Uber (~30 min)

Speaker: Zhenqiu Huang, Senior Software Engineer at Uber's Streaming Processing Team

Abstract:
AthenaX is Uber's streaming analytics platform that enables users to run production-quality, large-scale streaming analytics using SQL. It supports many critical real-time business use cases at Uber. For example, Uber's Risk & Fraud team uses it to compute near-real-time features to fight fraud such as payment fraud, account takeover, and promotion abuse; these features are used for both online and offline model training. AthenaX is built on top of Apache Calcite and Apache Flink's SQL API, which unifies stream and batch processing. In this talk, we will present the design and architecture of AthenaX, and share our production experience.

Speaker Bio:
Zhenqiu works on AthenaX, Uber's unified SQL-based stream analytics engine, which currently powers more than 1,000 production real-time data analytics pipelines, along with the corresponding batch pipelines for ad hoc backfill and periodic data quality enhancement.

------------------------------------
AGENDA:

  • 5:30pm - 6pm Food and networking
  • 6pm - 6:10pm Meetup introduction and Flink's community status update, by Bowen
  • 6:10pm - 7:45pm Three talks, ~30 min each

EVENT SPONSOR this time: Google

Food and drinks will be provided.

TRANSPORTATION: Public transportation is a convenient way to get here. Street parking is also available.

**************
**************
If you are interested in giving talks or sponsoring our next event, please contact @Bowen.
**************
**************

Seattle Flink Meetup