SF: Apache MADlib + Apache HAWQ for advanced SQL machine learning on Hadoop

NOTE: THIS MEETUP IS IN SAN FRANCISCO at 5:45pm PDT

6:00-6:30 pm Food and networking
6:30-8:00 pm Talk and Q&A
8:00-8:30 pm Wind down

The growing Apache ecosystem just got bigger and better -- now with the ability to crunch vast volumes of data using fully ANSI-compliant SQL and at-scale machine learning algorithms.

Apache HAWQ has been years in the making and derives its heritage from Greenplum Database and PostgreSQL. HAWQ enables developers, analysts, data scientists, and engineers to run advanced SQL queries, transform data sets of extreme size, visualize data with standard tools, and seamlessly run R and Python in a highly distributed fashion, all in the same environment. Invoke powerful machine learning and advanced statistical functions using Apache MADlib, and build models on billions of rows of data.

Speakers will discuss the following:

• 20-30 min - HAWQ, MADlib, Journey to Apache (Frank McQuillan)

• 20-30 min - MADlib architecture and functional demo on how to use MADlib/PivotalR (Rahul Iyer)

• 20-30 min - Data science perspective and DS demo (Sarah Aerni)

Speakers:
Frank McQuillan is Director of Product Management at Pivotal, focusing on analytics and machine learning. Prior to Pivotal, Frank worked on projects in the areas of online advertising technology and robotics.

Sarah Aerni is a Principal Data Scientist at Pivotal leading the San Francisco practice. She executes projects with customers ranging from pharmaceutical companies and healthcare providers to financial institutions. Before Pivotal, Sarah obtained her PhD in Biomedical Informatics from Stanford University, performing research at the interface of biomedicine and machine learning. She also co-founded a company offering expert services in informatics to both academia and industry.

Rahul Iyer is a Senior Developer in the Predictive Analytics team at Pivotal. He has a background in the field of robotics and machine learning, and holds a PhD in Computer Science from the University of Texas at Austin.

Can't join us in person? No need to RSVP--you can participate via LiveStream:
https://livestream.com/pemo/apache

Folks participating via LiveStream can ask questions here:
https://www.sli.do/da5dl6hq

Want to give a talk? Have a talk idea? Submit your idea here! (http://bit.ly/SubmitIdeaHere)

Data Engineers Guild
Pivotal
875 Howard Street, 5th Floor · San Francisco, CA