【Alibaba, Google, Uber】Apache Flink with Hive, TensorFlow, Beam and AthenaX


Details
Welcome to the Apache Flink community! This time we are thrilled to have speakers from Alibaba, Google, and Uber talk about Apache Flink with Hive, TensorFlow, Beam, and AthenaX.
Please share this event with your friends, colleagues, or anyone who is interested in Apache Flink and stream processing! We look forward to seeing you soon!
------------------------------------
DATE: Feb 21, 2019 (Thursday)
LOCATION: Google Seattle Lakeside, 837 N 34th St, Seattle, WA 98103. Room: Baseflow
------------------------------------
TALK #1: Integrate Apache Flink with Apache Hive (~30min)
Speakers:
Xuefu Zhang, Senior Staff Engineer at Alibaba and PMC member/committer of Apache Hive; Bowen Li, Senior Engineer at Alibaba
Abstract:
Alongside the community's efforts, engineers at Alibaba have explored Flink's potential as an execution engine not just for stream processing but also for batch processing. The findings are encouraging, so we have started an effort to make Flink's batch capabilities full-fledged, especially in SQL support. When comparing Flink to a mature SQL tool, we identified a major gap: good integration with the Hive ecosystem. This is crucial to the success of Flink SQL and batch processing, because a user has more than likely already established a data ecosystem around Hive. Therefore, we have decided to promote close integration of Flink with the Hive ecosystem. In this talk we will outline our proposal and roadmap, and share our latest development with a demo.
------------------------------------
TALK #2: TensorFlow data preparation on Apache Beam using Portable Flink Runner (~30 min)
Speaker:
Ankur Goenka, Software Engineer at Google and committer of Apache Beam
Speaker Bio:
Ankur has been building large-scale distributed systems throughout his career. Lately he has been focusing on building platforms for data processing at scale, and he is currently adding support for Apache Flink to Apache Beam.
------------------------------------
TALK #3: AthenaX: Unified Stream & Batch Processing using SQL at Uber (~30 min)
Speaker: Zhenqiu Huang, Senior Software Engineer at Uber's Streaming Processing Team
Abstract:
AthenaX is Uber's streaming analytics platform that enables users to run production-quality, large-scale streaming analytics using SQL. It is used by many critical real-time business use cases at Uber. For example, Uber's Risk & Fraud team uses it to compute near-real-time features to fight fraud such as payment fraud, account takeover, and promotion abuse, and these features are used for both online and offline model training. AthenaX is built on top of Apache Calcite and Apache Flink's SQL API, which unifies stream & batch processing. In this talk, we will present the design & architecture of AthenaX, and share our production experience.
Speaker Bio:
Zhenqiu works on AthenaX, Uber's unified SQL-based stream analytics engine, which currently powers over 1,000 production real-time data analytics pipelines, along with the corresponding batch pipelines for ad hoc backfill and periodic data quality enhancement.
------------------------------------
AGENDA:
- 5:30pm - 6pm Food and networking
- 6pm - 6:10pm Meetup introduction and Flink's community status update, by Bowen
- 6:10pm - 7:45pm Three talks, ~30min each
EVENT SPONSOR this time: Google
Food and drinks will be provided.
TRANSPORTATION: Public transportation is convenient. Street parking is also available.
**************
**************
If you are interested in giving talks or sponsoring our next event, please contact @Bowen.
**************
**************
