"Official" February 2019 BARUG Meetup

Details

Image from: https://bit.ly/2SyjaNm

Agenda:
6:30 - Pizza and Networking
7:00 - Announcements
7:05 - Joseph Rickert - Searching for R packages (Lightning talk)
7:20 - Ali Zaidi, Bob Horton, and Mario Inchiosa - Using transfer learning, active learning, and hyperparameter tuning to train high-performance text classifiers with limited labels
8:05 - Erin LeDell - Scalable Automatic Machine Learning with H2O
####################
Searching for R packages

Joseph Rickert

Finding the right R package to do something of interest is one of the most vexing problems for new R users. I will highlight a few R packages that are useful for searching for other packages, and describe a simple strategy for using them.

####################
Using transfer learning, active learning, and hyperparameter tuning to train high-performance text classifiers with limited labels

Ali Zaidi, Bob Horton, and Mario Inchiosa

The labels given to training examples are one of the main ways human judgement is represented in a form that computers can learn from. Unfortunately, even when data is plentiful, labels suitable for supervised machine learning may not be. If determining a label requires significant effort or expertise, collecting labels can be slow or expensive. We will show three approaches that can be combined to help you make the best use of a limited labeling budget:
1) Use transfer learning from complex language models trained on large datasets to generate features that can be used by simple classifiers capable of learning from small datasets.
2) Employ these classifiers in an active learning process that judiciously selects the most useful cases to label, so you can iteratively build better models.
3) Optimize the whole process by carefully tuning hyperparameters.
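
For readers new to active learning, the second approach can be sketched in a few lines. This is a minimal illustration in Python rather than the speakers' code: the nearest-centroid classifier, the margin-based uncertainty score, the toy one-dimensional features (standing in for features produced by a pre-trained language model), and names such as `active_learning` and `oracle` are all assumptions made for the example. Transfer learning and hyperparameter tuning are not shown.

```python
import random


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(labeled):
    """Train a nearest-centroid classifier: one centroid per class."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_class.items()}


def predict(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda y: dist2(model[y], x))


def margin(model, x):
    """Gap between the two nearest centroids; a small gap means the model is unsure."""
    d = sorted(dist2(c, x) for c in model.values())
    return d[1] - d[0]


def active_learning(pool, oracle, seed_labeled, budget):
    """Spend `budget` label requests on the cases the current model is least sure about."""
    labeled = list(seed_labeled)
    pool = list(pool)
    for _ in range(budget):
        model = fit(labeled)
        query = min(pool, key=lambda p: margin(model, p))  # most uncertain case
        pool.remove(query)
        labeled.append((query, oracle(query)))  # ask the (hypothetical) human labeler
    return fit(labeled), labeled


# Toy data: the "oracle" plays the role of the human expert providing labels.
rng = random.Random(0)
pool = [(rng.uniform(-3.0, 3.0),) for _ in range(200)]


def oracle(x):
    return int(x[0] > 0)


seed_labeled = [((-2.5,), 0), ((2.5,), 1)]
model, labeled = active_learning(pool, oracle, seed_labeled, budget=10)
```

Only 12 of the 200 cases end up labeled (2 seeds plus 10 queries). Each query goes to the point nearest the current decision boundary, which is where a new label changes the model most.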

####################
Scalable Automatic Machine Learning with H2O

Erin LeDell

The focus of this presentation is scalable and automatic machine learning using the H2O machine learning platform. We will provide a brief overview of the field of Automatic Machine Learning, followed by a detailed look inside H2O's AutoML algorithm, available in the "h2o" R package. H2O AutoML provides an easy-to-use interface that automates data pre-processing and the training and tuning of a large selection of candidate models (including multiple stacked ensemble models for superior model performance). Because the H2O platform is distributed, H2O AutoML can scale to very large datasets. The result of an AutoML run is a "leaderboard" of H2O machine learning models, which can be easily exported for use in production. R code examples are available on GitHub for participants to follow along on their laptops.
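
To make the "leaderboard" idea concrete, here is a toy sketch in plain Python. It does not use the H2O API and is not the talk's R code: the candidate models (a mean predictor and two k-nearest-neighbour regressors), the synthetic data, and names such as `build_leaderboard` are assumptions made for the example. The ensemble here is a plain average of the candidates, a much simpler stand-in for the stacked ensembles described above.

```python
import random


def mean_model(train):
    """Baseline: always predict the mean training response."""
    avg = sum(y for _, y in train) / len(train)
    return lambda x: avg


def knn_model(k):
    """Return a builder for a k-nearest-neighbour regressor on 1-D inputs."""
    def build(train):
        def predict(x):
            nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
            return sum(y for _, y in nearest) / len(nearest)
        return predict
    return build


def mse(predict, data):
    """Mean squared error of a fitted model on (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)


def build_leaderboard(candidates, train, valid):
    """Fit every candidate, add a simple averaging ensemble, rank by validation MSE."""
    fitted = {name: build(train) for name, build in candidates.items()}
    base = list(fitted.values())  # snapshot so the ensemble does not include itself
    fitted["average_ensemble"] = lambda x: sum(f(x) for f in base) / len(base)
    rows = [(name, mse(f, valid)) for name, f in fitted.items()]
    return sorted(rows, key=lambda row: row[1])  # best (lowest error) first


# Synthetic data: y = 2x plus Gaussian noise.
rng = random.Random(1)
data = []
for _ in range(100):
    x = rng.uniform(0.0, 10.0)
    data.append((x, 2.0 * x + rng.gauss(0.0, 1.0)))
train, valid = data[:80], data[80:]

candidates = {"mean": mean_model, "knn_1": knn_model(1), "knn_5": knn_model(5)}
board = build_leaderboard(candidates, train, valid)
for name, score in board:
    print(f"{name:18s} {score:.3f}")
```

The first row of `board` is the model you would carry forward. A real AutoML system also automates pre-processing and hyperparameter search, and distributes the work across a cluster.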