ATOM is meeting on Tuesday, 18th December, 6:30pm, at Galvanize!
Our discussion this month will be led by Brandon Sherman. We'll be discussing a paper entitled "Statistical Modeling: The Two Cultures".
In 2001, the field of statistics worked primarily by assuming an underlying data model, then trying to infer the parameters of that model. Breiman critiques this paradigm as leading to irrelevant theory and questionable conclusions, and as preventing statisticians from working on many interesting problems. He suggests a switch to algorithmic models that aim to maximize predictive accuracy instead of estimating an underlying data model.
Some context: Leo Breiman is among the most influential people in modern data analysis. He co-introduced CART decision trees in 1984, introduced bagging in 1994, and introduced random forests a few months before this paper was published.
Some questions to think about while reading this paper:
1. Which aspects of Breiman's argument do you agree with, and which do you disagree with?
2. This paper was written in 2001. Do you think it reflects the state of statistics and machine learning in 2018?
3. How should we consider tradeoffs between model accuracy and interpretability?
Advanced Topics on Machine Learning (ATOM) is a learning and discussion group for cutting-edge machine learning techniques in the real world. We work through winning Kaggle competition entries or real-world ML projects, learning from those who have successfully applied sophisticated data science pipelines to complex problems.
As a discussion group, we strongly encourage participation, so be sure to read up on the topic of conversation beforehand!
ATOM can be found on PuPPy’s Slack under the channel #atom, and on PuPPy’s Meetup.com events.
We're kindly hosted by Galvanize (https://www.galvanize.com). Thank you!