
"Official" December 2016 BARUG Meeting

  • Instacart

    50 Beale Street, 11th floor, San Francisco, CA (map)


  • Agenda:
    6:30 - Food and networking
    7:00 - Announcements
    7:05 - Nicole White: Codenames: Playing Spymaster with R
    7:20 - Jeremy Stanley: XGBoost with Quantile Regression for Predicting Variability in Delivery Times
    7:50 - John Mount: Cleaning real world data in R using the vtreat package
    8:20 - Norm Matloff: recsys: an Advanced Tool for Recommender Systems

    #------------------
    Nicole White

    Codenames: Playing Spymaster with R

    In Codenames, a popular party game, two teams compete to identify all of their words (or codenames) on a grid of 25 words. One player on each team (called the spymaster) is tasked with giving one-word clues to their teammates to help them identify their words. In this presentation, I'll talk about using R to automate the spymaster's task. Each codename on the board is treated as a document and machine learning techniques are used to find similarities among the codenames, cluster them, and determine the best one-word clue for each cluster. See my blog post for more details.

    #------------------
    Jeremy Stanley

    XGBoost with Quantile Regression for Predicting Variability in Delivery Times

    At Instacart, we optimize shopper routing to balance how efficiently we fulfill orders against the risk of late deliveries. By predicting quantiles of the delivery time for routes in planning, we can estimate the chance that a route will result in late deliveries.

    In this talk, we will cover:
    * Quantile estimation with check loss function
    * A smooth approximation that is twice differentiable
    * Approximate quantile regression in XGBoost in R with custom objective functions
    * Visualizing delivery time variability on maps using ggmap
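
As a rough illustration of the first three bullets, here is a minimal sketch in plain Python (not the speaker's R code). It assumes one common smooth surrogate for the check loss, tau*u + alpha*log(1 + exp(-u/alpha)); the talk may use a different approximation. The function names and the smoothing parameter alpha are hypothetical. A custom boosting objective would consume the gradient and Hessian with respect to the prediction, which is what the last function returns.

```python
import math

def check_loss(u, tau):
    """Check (pinball) loss for residual u = y - prediction at quantile tau."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def _sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def smooth_check_loss(u, tau, alpha=0.1):
    """Assumed smooth surrogate: tau*u + alpha*log(1 + exp(-u/alpha)).

    It approaches the check loss as alpha -> 0 and is twice differentiable.
    """
    z = -u / alpha
    # Stable softplus: log(1 + exp(z)).
    softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return tau * u + alpha * softplus

def grad_hess_wrt_prediction(y, pred, tau, alpha=0.1):
    """Gradient and Hessian of the surrogate with respect to the prediction.

    Since u = y - pred, the gradient changes sign relative to d/du,
    while the second derivative is unchanged.
    """
    s = _sigmoid((y - pred) / alpha)
    grad_u = tau - (1.0 - s)          # d/du of the surrogate
    hess = s * (1.0 - s) / alpha      # strictly positive curvature
    return -grad_u, hess
```

A smaller alpha tracks the check loss more closely but makes the Hessian nearly zero away from u = 0, so the choice of alpha is a trade-off that would need tuning.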

    #------------------
    John Mount

    Cleaning real world data in R using the vtreat package

    I’ll share some typical examples of analysis-killing real-world data issues and show how to quickly and correctly prepare data for predictive modeling using the R package vtreat. vtreat is an R data.frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. I will work through preparing string-valued variables for analysis, handling missing values, and dealing with new values that appear after model deployment. I will also discuss some potential pitfalls of data preparation (nested model bias) and how we avoid them. Users have said vtreat has saved their projects and data science careers. I will show why vtreat should become a key part of your predictive modeling workflow.

    #------------------
    Norm Matloff

    recsys: an Advanced Tool for Recommender Systems

    The talk will introduce collaborative filtering for recommender systems and discuss several methods, some existing and some novel. Our R package 'rectools', which implements these methods, will be introduced with examples. Applications include both the "traditional," i.e., marketing, and the innovative, such as medical.





  • John M.

    Nina Zumel and I just got our formal article on vtreat methodology up on arXiv: https://arxiv.org/abs/1611.09477 . (this isn't homework and nobody has to read ahead!)

    November 29

  • David W.

    Already had xgboost v 0.4.4 (from CRAN), so was a) surprised that the github install failed and then b) that it was trying to install v[masked]

    November 29
