Scalable Machine Learning in R with H2O

This is a past event

48 people went


We will have pizza and beer. Doors open at 6:00 pm, and the talk starts at 6:30 pm.

Hashtag for R meetups: #pdxrlang


The discussion will begin with a brief overview of the current machine learning landscape in R. After the introduction, we will discuss H2O, a scalable open source machine learning library. H2O has APIs in R, Python, Scala and Java, and the focus of this talk will be the `h2o` and `h2oEnsemble` R packages. All of H2O's algorithm implementations are distributed, which allows the software to scale to big data. H2O can be used to speed up machine learning problems on your laptop (as a local multicore cluster), or it can be used in a multi-node cluster setting (for example, on Amazon EC2). H2O currently features distributed implementations of GLM, GBM, Random Forest and Deep Neural Nets. H2O.ai, the company behind H2O, is based in Mountain View, CA and has a scientific advisory council composed of well-known contributors to the machine learning community: Trevor Hastie, Rob Tibshirani and Stephen Boyd, all from Stanford University.

Speaker Bio:

Erin is a Statistician and Machine Learning Scientist at H2O.ai, and the author of several R packages. Erin received her Ph.D. in Biostatistics with a Designated Emphasis in Computational Science and Engineering from the University of California, Berkeley. Her research focuses on ensemble machine learning, learning from imbalanced binary-outcome data, influence-curve-based variance estimation and statistical computing.