H2O: Open Source ML Engine for Big Data


Details
This meetup will focus on H2O (http://0xdata.com/), an open source (https://github.com/0xdata/h2o) machine learning engine. H2O is designed from the ground up to support big data: it runs on top of HDFS and supports distributed algorithms. You can train models from the command line or through a web interface, which makes it accessible to data scientists of all skill levels.
The H2O team recently added an R interface (http://docs.0xdata.com/Ruser/top.html), so you can run distributed ML algorithms from inside R. The project keeps adding new algorithms and currently includes fast (and in some cases distributed) implementations of popular methods such as GBM (http://docs.0xdata.com/tutorial/gbm.html), Random Forest (http://docs.0xdata.com/tutorial/rf.html), and deep learning (http://www.youtube.com/watch?v=GZU57OCSZy4).
Our speakers are Nidhi Mehta (http://www.linkedin.com/pub/nidhi-mehta/13/695/964) and Amy Wang (http://www.linkedin.com/pub/amy-wang/55/387/650) from 0xdata (http://0xdata.com/about/), the company behind H2O. Nidhi has a background in Physics and an interest in large scale data analysis and machine learning algorithms. Amy has a background in Applied Mathematics and Statistics.
Nidhi's talk will focus on GBM and algorithm correctness. She will discuss how the H2O team tests its algorithms and share best practices for validating that an implementation is correct. Amy will talk about the h2o R package (http://cran.r-project.org/web/packages/h2o/index.html) and how to use it to connect to your data in Tableau.
Thanks again to our event space sponsor, GeekdomSF (http://geekdomsf.com/), for hosting our meetup at their venue.
We will provide pizza (sponsored by 0xdata) and refreshments. Hope to see you there!
7:00 - 7:30pm: Food & Networking
7:30 - 8:15pm: Nidhi's talk
8:15 - 9:00pm: Amy's talk
9:00 - 9:30pm: Socializing
