Random Forest Classification Workshop: Practical Tips

Hosted By
Sinziana E.
Details

Event and refreshments sponsored by OnDeck. Space sponsored by ThoughtWorks.

Agenda:

10:00am - 12:00pm: Talk by Anita Schmid and Christine Hurtubise

12:00pm - 1:00pm: Lunch

1:00pm - 3:00pm: Breakouts

Please bring a laptop as this is a hands-on workshop. Knowledge of either R or Python (with Pandas and Scikit-learn) is required.

During the morning session Anita will describe the random forest algorithm for classification and go over some use cases.

During the afternoon breakouts, participants will work in teams on a classification problem using a public data set: either the Airline on-time performance data (http://stat-computing.org/dataexpo/2009/) or Kaggle's Titanic data set (https://www.kaggle.com/c/titanic). Participants will be asked to download these data sets before the workshop. Tutors from OnDeck's data science team (Anaelle Bohbot, Siying Chen, Justin Law and Abhra Mitra) will guide the teams during the breakout session, and at the end of the workshop each team will present its solution. Participants can use either R or Python, whichever language they are more accustomed to.
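
For participants who want a feel for the breakout task beforehand, here is a minimal Python sketch of fitting a random forest classifier with Pandas and Scikit-learn. It is not the workshop's official material. The data is synthetic and generated in the script, the column names (age, fare, is_female, survived) are hypothetical stand-ins loosely modeled on the Titanic problem, and the labeling rule is made up purely so the model has something to learn.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data, NOT the real Titanic file.
# All column names here are hypothetical.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.integers(1, 80, size=n),
    "fare": rng.uniform(5, 100, size=n),
    "is_female": rng.integers(0, 2, size=n),
})

# Made-up labeling rule so that the target is learnable from the features.
df["survived"] = ((df["is_female"] == 1) | (df["fare"] > 80)).astype(int)

X = df[["age", "fare", "is_female"]]
y = df["survived"]

# Hold out 25% of the rows to estimate accuracy on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A forest of 100 trees; random_state fixes the bootstrap and feature sampling.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
importances = dict(zip(X.columns, model.feature_importances_))

print(f"held-out accuracy: {accuracy:.3f}")
for name, value in importances.items():
    print(f"{name}: {value:.3f}")
```

With a real data set, the synthetic DataFrame would be replaced by a file loaded with pd.read_csv, and categorical or missing values would need to be handled before fitting.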

Anita Schmid, Senior Data Scientist at OnDeck (http://www.ondeck.com/), @OnDeckCapital (https://twitter.com/OnDeckCapital)

In 2014, Anita joined OnDeck, a small business lending company, where she works closely with the Sales and Marketing departments. She works on Marketing Models, Sales Optimization, Funnel Reporting, Attribution and A/B testing. She has a Diploma (equivalent to MSc) in Physics from the ETH (Swiss Federal Institute of Technology) in Zurich, Switzerland, and earned a Ph.D. in the field of Systems Neuroscience at the same institution before moving to New York in 2006. Before transitioning to Data Science with the help of the Insight Data Science Fellows Program (http://insightdatascience.com/), she worked as research faculty at Weill Cornell Medical College in NYC.

Christine Hurtubise (@cfhurtub) currently works as a Manager in the Risk department at OnDeck, where she leads model validation initiatives. She previously worked at SunGard (now FIS) as a Senior Consultant, focusing on developing credit risk models for regional and commercial banks. Christine has worked with technology platforms and frameworks which predict customer and portfolio behavior for the past six years. Prior to joining the industry, she worked on data visualization techniques in a biology research lab at the University of Pennsylvania. Christine graduated magna cum laude in 2008 from the Mathematics department at the University of Pennsylvania.

NYC Women in Machine Learning & Data Science
ThoughtWorks
99 Madison Ave., 15th Floor · New York, NY