Past Meetup

Big Data Application Meetup 03/16


153 people went


Shoutout to HPE for sponsoring this event!


6:00 - 6:30 - Socialize over food and beer(s)

6:30 - 8:00 - Talks


Talk #1: Introduction to Apache Beam (FKA Google's Dataflow), by Jean-Baptiste Onofré, Talend

Talk #2: Harnessing the power of unstructured data using Haven OnDemand, by Phong Vu, HPE

Talk #3: Introducing EsgynDB, based on Apache Trafodion, by Rao Kakarlamudi, Esgyn


Talk #1: Introduction to Apache Beam (FKA Google's Dataflow), by James Malone, Google & Jean-Baptiste Onofré, Talend

Apache Beam (formerly the Google Cloud Dataflow SDK) is a unified model and set of language-specific SDKs for defining and executing data processing workflows. Pipelines designed with Beam simplify the mechanics of large-scale batch and streaming data processing and can run on a number of runtimes, such as Apache Flink, Apache Spark, and Google Cloud Dataflow (a cloud service).

This talk will introduce the Beam programming model and show how you can use it to design your pipelines, which move data through PCollections and apply PTransforms to them. You will see how the same code is "translated" to a target runtime by a runner specific to that runtime. You will also get an overview of the current roadmap and its interesting new features.
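
To make the idea of "one pipeline definition, several runners" concrete, here is a minimal toy sketch in plain Python. It does not use the Beam SDK, and every name in it (Pipeline, apply, split_words, count_per_element, eager_runner, lazy_runner) is a hypothetical stand-in chosen for illustration. It only assumes that a pipeline is a recorded list of transforms over a collection, and that a runner decides how those transforms are executed.

```python
from collections import Counter


class Pipeline:
    """Toy pipeline (not the Beam API): records transforms, runs nothing."""

    def __init__(self, source):
        self.source = list(source)   # stands in for an input collection
        self.transforms = []         # stands in for the applied transforms

    def apply(self, transform):
        self.transforms.append(transform)
        return self                  # return self so calls can be chained


def split_words(lines):
    """Transform: turn a collection of lines into a collection of words."""
    for line in lines:
        yield from line.split()


def count_per_element(words):
    """Transform: count occurrences of each word, sorted by word."""
    return sorted(Counter(words).items())


def eager_runner(pipeline):
    """Hypothetical runner: materializes the collection after every step."""
    data = pipeline.source
    for transform in pipeline.transforms:
        data = list(transform(data))
    return data


def lazy_runner(pipeline):
    """Hypothetical runner: chains the steps and materializes only at the end."""
    data = iter(pipeline.source)
    for transform in pipeline.transforms:
        data = transform(data)
    return list(data)


# The pipeline is defined once...
pipeline = (
    Pipeline(["big data", "big pipelines"])
    .apply(split_words)
    .apply(count_per_element)
)

# ...and executed by two different runners without changing its definition.
eager_result = eager_runner(pipeline)
lazy_result = lazy_runner(pipeline)
print(eager_result)
print(lazy_result)
```

Both runners produce the same word counts because the pipeline definition carries no execution details. In this sketch, that separation is what lets the same definition be handed to different execution strategies.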

Talk #2: Harnessing the power of unstructured data using Haven OnDemand, by Phong Vu, HPE

HPE Haven OnDemand is a platform for building data-rich applications and analytics using text analysis, speech recognition, image analysis, indexing and search APIs. Simply put, developers and businesses use the Haven OnDemand APIs to add advanced capabilities such as natural language processing, machine learning, and predictive analytics to their applications.

This talk will focus on the Haven OnDemand platform’s capabilities for human information analytics and for building advanced unstructured text indexes.

Talk #3: Introducing EsgynDB, based on Apache Trafodion, by Rao Kakarlamudi, Esgyn

Introducing EsgynDB based on Apache Trafodion, the Big Data database that revolutionizes the way you manage Big Data and Hadoop. With EsgynDB, you can now run your transactional and enterprise operational reporting workloads on Hadoop, and avoid being locked into those expensive, proprietary database vendors.

By consolidating your workloads onto the same platform, you can derive business insight faster and cheaper than ever before. You can adopt EsgynDB to enable a Big Data strategy that simplifies and modernizes your operational data management, as illustrated by the following use cases from early adopters:

• Gain real-time views and analytics on security data collected from IT infrastructure, firewalls, and web traffic worldwide

• Monitor a transit fleet to optimize and maintain efficiency in real time and perform historical reporting for future planning

• Offload historic data from expensive transactional systems to lower costs and differentiate customer experience by enriching transactional data with other data sources

• Transform traditional back office services to deliver service capabilities over the Internet


• Jean-Baptiste Onofré is an Apache Software Foundation member. He is a PMC member, champion, mentor, and committer on several Apache projects in different domains, such as middleware, system integration, and big data. He is a software architect and fellow at Talend.

• Phong Vu is a developer evangelist for Haven OnDemand at Hewlett Packard Enterprise. Prior to joining HPE in 2015, Phong worked at Nokia and Microsoft on mobile technologies and Web services architecture.

• Rao Kakarlamudi is Principal Architect at Esgyn. Esgyn promotes EsgynDB, which is built on Apache Trafodion and brings true enterprise relational database capabilities to the Hadoop ecosystem. Prior to Esgyn, Rao was one of the main architects of HP's Neoview/Seaquest enterprise data warehouse and business intelligence (EDW/BI) database product. Neoview/Seaquest is a massively parallel processing (MPP) system designed from the ground up to support operational BI on a very large scale. Rao specializes in real-time and operational BI, with expertise in workload management and database connectivity. He holds patents on various techniques used in EDW/BI databases.


Cask HQ is a few minutes' walk from the California Avenue Caltrain Station.

Also, Cask HQ has its own parking lot, but it will certainly not accommodate all guests. Please use parking lots available nearby: