Big Data Analytics: Scalable machine learning using open-source tools with Rahul Iyer
5:30-6:30 pm Pizza and networking
6:30-8:00 pm Talk and Q&A
8:00-8:30 pm Wind down
With the explosion of big data, fast and inexpensive analytics has become a key basis of competition in many industries. Extracting value from big data can be complex and requires advanced skills.
At Pivotal, we are building open-source solutions (MADlib, PivotalR, PyMadlib) to simplify this process for the user, while maintaining the efficiency necessary for big data analysis.
This talk introduces MADlib, an open-source library of SQL-based algorithms for machine learning, data mining and statistics that run at large scale within a database engine, with no need to import or export data to other tools.
It gives an overview of the library’s architecture and compares its statistical methods with those available in other open-source packages such as Spark and Mahout.
We also introduce PivotalR, an R-based wrapper for MADlib that gives data scientists and programmers access to the power of MADlib with the ease of use of R.
About the speaker: Rahul Iyer
Rahul Iyer is a Senior Developer in the Predictive Analytics team at Pivotal. He has a background in the field of robotics and machine learning, and holds a PhD in Computer Science from the University of Texas at Austin.
As part of his doctoral research, he built computational simulations of human walking using realistic musculoskeletal models. At Pivotal, he primarily works on developing novel solutions for large-scale machine learning problems, contributing to open-source tools like MADlib and PivotalR. Outside of work, he enjoys travel, photography and racquetball.