Evening with Google Cloud, Distributed DataFrame, and Apache Flink

Hosted By
Henry S.

Details

Please join us for another exciting evening of sharing knowledge and experience with the Apache Flink community.

Tentative schedule:

6pm-6:30pm - Doors open and socializing

6:30pm-8pm - Talks

Abstracts

• Apache Flink Community Updates

We will share current Apache Flink community updates, including releases, community growth, new features, adoption, and upcoming meetups.

• Google Cloud Platform and Apache Flink

Using Apache Flink with Google Cloud Dataproc (Google's managed Hadoop MapReduce, Spark, Pig, and Hive service) and Cloud Bigtable (Google's high-performance NoSQL database).

• Building Interactive Big Apps on Flink & Spark using DDF (Distributed DataFrame) - http://ddf.io

Enterprise users today demand the ability to glean insights from their disparate data spread across varied transactional and analytics sources; hence, analytics application developers need the ability to connect to varied data & compute engines such as Spark, Flink, Cassandra, etc.

A key pain point for developers is the lack of a uniform API across data & compute engines, a limitation which adversely impacts developer productivity, while also restricting dataflow across different engines. DDF (Distributed DataFrame) is a simple but powerful API above and across multiple engines. Using DDF, developers reap significant benefits including (1) a unified and highly productive API for data/compute access, (2) the ability to process data at-source, bypassing the absolute requirement for a Hadoop data lake, and (3) future-proofing against rapidly shifting economics of specific data engines.

To date, DDF has been implemented on Spark, Flink, and other engines. In this talk we demonstrate, for the first time, a business-analyst-friendly real-time data exploration and visualization system working directly with Flink. We will show how a business user can enter natural-language questions about their data and get real-time answers from Flink in the form of visual charts and tables. We’ll also show interaction with the DDF-on-Flink API at the developer level, share the challenges and lessons learned in realizing this vision on Flink, and compare and contrast that with the same experience on Spark.

Speaker Bios

• Henry Saputra

Henry is a PMC member of Apache Flink and a member of the Apache Software Foundation. He is also a member of the Apache Incubator PMC and was a mentor of Apache Flink while it was still in incubation.

Currently Henry is working on distributed systems and big data application platforms.

• Christopher Nguyen, Founder and CEO, Adatao

Christopher is the CEO & co-founder of Adatao. Previously, he served as engineering director of Google Apps and co-founded two other successful startups.

As a professor, he co-founded the Computer Engineering program at HKUST.

He earned his BS degree summa cum laude from the University of California, Berkeley, and a Ph.D. from Stanford, where he created the first standard-encoding Vietnamese software suite, authored RFC 1456, and contributed to Unicode 1.1.

He is a co-creator of the open-source Distributed DataFrame project http://ddf.io.

• Rohit Rai, Founder and CEO of Tuplejump

Rohit is the founder and CEO of Tuplejump, Inc. and oversees the research and product development operations of the company.

He is the author of the book Real-Time Web Application Development. He is the creator of Calliope, the first connector for Cassandra & Spark, and of the play-yeoman sbt plugin, and has contributed to many open source projects.

He is an expert in Scala, Akka, Spark, Cassandra, and distributed systems in general. Over the past 10 years, he has helped several companies, including a few Fortune 100 companies, establish their (big) data analytics strategy, infrastructure, and the required solutions.

• Les Vogel

Les has been a software engineer for over 40 years and worked for Google in developer relations for four years. He has worked with Apple, TVA, Motorola, Boeing, Ashton-Tate, and many others.

He's best known for AirPort wireless networking, but has also worked on flood management, solar housing, handwriting recognition, and a spreadsheet, and has written a couple of operating systems.

Bay Area Apache Flink Meetup
Google
1200 Crittenden Lane · Mountain View, CA