Scalable Machine Learning in R and Python with H2O


Details
Join us for our first East Bay meetup!!
Agenda:
6:00 - 6:30 - Doors Open & Pizza
6:30 - 7:30 - Erin's talk on H2O in R & Python
7:30 - 8:00 - Q&A, Network & Finish Pizza
Talk Abstract:
The focus of this presentation is scalable machine learning using the h2o R and Python packages. H2O (https://github.com/h2oai/h2o-3) is an open source, distributed machine learning platform designed for big data, with the added benefit that it is easy to use on a laptop as well as on a multi-node Hadoop or Spark cluster. The core machine learning algorithms of H2O are implemented in high-performance Java; however, fully featured APIs are available in R, Python, Scala, and REST/JSON, as well as through a web interface.
Because H2O's algorithm implementations are distributed, the software can scale to very large datasets that may not fit into RAM on a single machine. H2O currently features distributed implementations of Generalized Linear Models, Gradient Boosting Machines, Random Forest, Deep Neural Nets, Stacked Ensembles (aka "Super Learners"), dimensionality reduction methods (PCA, GLRM), clustering algorithms (K-means), and anomaly detection methods, among others.
R and Python code examples using H2O will be demoed live and made available on GitHub (https://github.com/h2oai/h2o-tutorials) so that participants can follow along on their laptops if they choose. For those interested in running the code on a multi-node Amazon EC2 cluster, an H2O AMI is also available.
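The scaling claim in the abstract rests on a general pattern: each node summarizes only the rows it holds, and the small summaries are then merged. The sketch below illustrates that pattern in plain Python for a mean and variance. It is not H2O's implementation or API; `PartialStats`, `map_chunk`, `combine`, and `chunked_mean_variance` are hypothetical names, and the in-memory lists stand in for data partitions that would live on separate machines.

```python
from dataclasses import dataclass
from functools import reduce
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class PartialStats:
    """Fixed-size summary of one data partition (hypothetical helper type)."""
    n: int
    total: float
    total_sq: float


def map_chunk(chunk: Sequence[float]) -> PartialStats:
    # "Map" step: each worker scans only its own partition.
    return PartialStats(len(chunk), float(sum(chunk)), float(sum(x * x for x in chunk)))


def combine(a: PartialStats, b: PartialStats) -> PartialStats:
    # "Reduce" step: merging is associative, so partials can be combined in any order.
    return PartialStats(a.n + b.n, a.total + b.total, a.total_sq + b.total_sq)


def chunked_mean_variance(chunks: List[Sequence[float]]) -> Tuple[float, float]:
    """Population mean and variance computed from per-partition summaries."""
    merged = reduce(combine, (map_chunk(c) for c in chunks), PartialStats(0, 0.0, 0.0))
    if merged.n == 0:
        raise ValueError("no data")
    mean = merged.total / merged.n
    # Sum-of-squares formula: simple, but can lose precision on large values.
    variance = merged.total_sq / merged.n - mean * mean
    return mean, variance


if __name__ == "__main__":
    # Two "partitions" standing in for rows held on two different nodes.
    print(chunked_mean_variance([[1.0, 2.0], [3.0, 4.0, 5.0]]))
```

Only the three numbers in each summary cross the network in this pattern, so the full dataset never has to fit in one machine's memory.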
Speaker Bio:
Dr. Erin LeDell is a Machine Learning Scientist at H2O.ai (https://www.h2o.ai/), the company that produces the open source machine learning platform H2O. Erin received her Ph.D. in Biostatistics with a Designated Emphasis in Computational Science and Engineering from UC Berkeley. Before joining H2O.ai, she was the Principal Data Scientist at Wise.io (acquired by GE in 2016) and at Marvin Mobile Security (acquired by Veracode in 2012), and the founder of DataScientific, Inc.
