"Official" December 2016 BARUG Meeting

Hosted by Joseph Rickert

Details

Agenda:
6:30 - Food and networking
7:00 - Announcements
7:05 - Nicole White: Codenames: Playing Spymaster with R
7:25 - John Mount: Cleaning real world data in R using the vtreat package
8:00 - Norm Matloff: recsys: an Advanced Tool for Recommender Systems

Note: Due to illness Jeremy Stanley's talk has been canceled.

#------------------

Nicole White

Codenames: Playing Spymaster with R

In Codenames (https://boardgamegeek.com/boardgame/178900/codenames), a popular party game, two teams compete to identify all of their words (or codenames) on a grid of 25 words. One player on each team (called the spymaster) is tasked with giving one-word clues to their teammates to help them identify their words. In this presentation, I'll talk about using R to automate the spymaster's task. Each codename on the board is treated as a document and machine learning techniques are used to find similarities among the codenames, cluster them, and determine the best one-word clue for each cluster. See my blog post (https://nicolewhite.github.io/2016/07/19/spymaster.html) for more details.
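As a rough illustration of the idea in the abstract (not the speaker's actual code), the sketch below treats each codename as a small "document" of term counts, measures similarity between codenames, clusters them, and picks one clue per cluster. The term counts are invented toy data, and cosine similarity with average-linkage hierarchical clustering is an assumption; the abstract only says "machine learning techniques".

```r
# Toy term-count matrix: rows are codenames, columns are candidate clue words.
# All counts are invented for illustration only.
docs <- matrix(
  c(5, 3, 0, 0,
    4, 5, 1, 0,
    0, 1, 5, 3,
    0, 0, 4, 5),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("apple", "banana", "car", "train"),
                  c("fruit", "sweet", "vehicle", "travel"))
)

# Cosine similarity between codenames: scale each row to unit length,
# then take all pairwise dot products.
unit <- docs / sqrt(rowSums(docs^2))
sim <- unit %*% t(unit)

# Hierarchical clustering on cosine distance, cut into two clusters.
# (The number of clusters is an assumption chosen for this toy board.)
clusters <- cutree(hclust(as.dist(1 - sim), method = "average"), k = 2)

# For each cluster, use the candidate word with the largest total count
# across the cluster's codenames as the one-word clue.
clues <- sapply(split(rownames(docs), clusters), function(members) {
  names(which.max(colSums(docs[members, , drop = FALSE])))
})

print(clusters)
print(clues)
```

A real spymaster would also need to steer clear of the opposing team's words, which this sketch ignores.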

#------------------

John Mount

Cleaning real world data in R using the vtreat (https://cran.r-project.org/web/packages/vtreat/index.html) package

I’ll share some typical examples of analysis-killing real-world data issues and show how to quickly and correctly prepare data for predictive modeling using the R package vtreat. vtreat is an R data.frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. I will work through preparing string-valued variables for analysis, handling missing values, and dealing with new values that appear after model deployment. I will also discuss some potential pitfalls of data preparation (nested model bias) and how we avoid them. Users have said vtreat has saved their projects and data science careers. I will show why vtreat should become a key part of your predictive modeling workflow.
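To make the three data issues named above concrete, here is a hand-rolled base R sketch. It does not use vtreat and is not how vtreat is implemented; the data, the `plan` list, and the `apply_plan` helper are all hypothetical. It only shows the general pattern of learning a treatment from training data and applying it unchanged to later data.

```r
# Invented training data and later "deployment" data.
train <- data.frame(
  city = c("SF", "Oakland", "SF", NA),
  income = c(50, NA, 70, 60),
  stringsAsFactors = FALSE
)
new_data <- data.frame(
  city = c("SF", "Berkeley"),  # "Berkeley" never appeared in training
  income = c(NA, 80),
  stringsAsFactors = FALSE
)

# Design step: record everything needed from the training data only.
plan <- list(
  levels = sort(unique(train$city[!is.na(train$city)])),
  income_mean = mean(train$income, na.rm = TRUE)
)

# Apply step (hypothetical helper): produces an all-numeric, NA-free frame.
apply_plan <- function(d, plan) {
  out <- data.frame(
    # Missing numeric values: impute the training mean and keep an indicator.
    income_clean = ifelse(is.na(d$income), plan$income_mean, d$income),
    income_isNA = as.numeric(is.na(d$income))
  )
  # String-valued variable: one indicator column per level seen in training.
  # Missing or novel levels simply get zeros in every indicator.
  for (lev in plan$levels) {
    out[[paste0("city_", lev)]] <- as.numeric(!is.na(d$city) & d$city == lev)
  }
  out
}

treated_train <- apply_plan(train, plan)
treated_new <- apply_plan(new_data, plan)
print(treated_new)
```

The sketch leaves out nested model bias, which arises when a treatment that uses the outcome is designed on the same rows later used to fit the model.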

#------------------

Norm Matloff

"recsys: an Advanced Tool for Recommender Systems"

The notion of collaborative filtering for recommender systems will be introduced, and several methods will be discussed, some existing and some novel. Our R package 'rectools', which implements these methods, will be introduced, and examples given. Applications range from the "traditional," i.e. marketing, to the innovative, such as medicine.
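For readers new to collaborative filtering, the following base R sketch shows one textbook variant: user-based nearest-neighbor prediction. It is not the rectools API and not necessarily one of the methods covered in the talk; the ratings matrix and the `user_sim` and `predict_rating` helpers are invented for illustration.

```r
# Invented ratings: rows are users, columns are items, NA means "not rated".
ratings <- matrix(
  c(5, 4, NA, 1,
    4, 5, 2, 1,
    1, 2, 5, 4,
    2, 1, 4, 5),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("u", 1:4), paste0("item", 1:4))
)

# Cosine similarity of two users over the items both have rated.
user_sim <- function(a, b) {
  both <- !is.na(a) & !is.na(b)
  sum(a[both] * b[both]) / sqrt(sum(a[both]^2) * sum(b[both]^2))
}

# Predict a missing rating as the similarity-weighted mean of the ratings
# given by the k most similar users who have rated the item.
predict_rating <- function(ratings, user, item, k = 2) {
  raters <- setdiff(rownames(ratings)[!is.na(ratings[, item])], user)
  sims <- sapply(raters, function(r) user_sim(ratings[user, ], ratings[r, ]))
  top <- names(sort(sims, decreasing = TRUE))[seq_len(min(k, length(sims)))]
  sum(sims[top] * ratings[top, item]) / sum(sims[top])
}

pred <- predict_rating(ratings, "u1", "item3")
print(pred)
```

Because the prediction is a weighted mean with positive weights, it always lies between the lowest and highest neighbor ratings for that item.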

Bay Area useR Group (R Programming Language)