DataTalks #31: Understanding Overfitting in Machine Learning ⛹️‍♀️🤾‍♂️🤽‍♀️🧠
Our 31st DataTalks meetup will be held online and will focus on overfitting in machine learning.
Zoom link: https://us02web.zoom.us/j/84384138987?pwd=VitvVEtBRy9DQXFZbXMrMTAvRFVJZz09
Agenda:
16:30 - 17:15 – When Machine Learning Hits Reality – Dana Racah and Inbal Budowski-Tal, EverCompliant
17:20 - 18:05 – How long does your data live? Test-set re-use in modern ML ♻️ – Gal Yona, Weizmann Institute of Science
---------------------
When Machine Learning Hits Reality – Dana Racah and Inbal Budowski-Tal, EverCompliant
We did everything by the book.
We split our dataset into train, validation, and test sets. We checked the learning curve to make sure the model was not overfitted. We gathered another large dataset and evaluated the model against it as a final validation of its performance. And yet, after deploying to production, the model's performance was far lower than what we had measured. Why, oh why???
In this talk, we will explain what went wrong and describe how we test our models now as a result of this experience.
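The failure mode described above can be illustrated with a minimal sketch (a toy simulation with made-up data, not EverCompliant's actual pipeline): a model that scores well on held-out data drawn from the training distribution still degrades in production when the data drifts.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, shift=0.0):
    """1-D feature; the true decision boundary sits at `shift`."""
    x = rng.normal(loc=shift, scale=1.0, size=n)
    y = (x > shift).astype(int)
    return x, y

def fit_threshold(x, y):
    # Toy "model": a threshold at the midpoint between the class means.
    return (x[y == 1].mean() + x[y == 0].mean()) / 2

def accuracy(x, y, t):
    return float(((x > t).astype(int) == y).mean())

# Train and held-out test data are drawn from the SAME distribution.
x_tr, y_tr = make_data(5000)
x_te, y_te = make_data(5000)
t = fit_threshold(x_tr, y_tr)
test_acc = accuracy(x_te, y_te, t)   # looks great on the held-out set

# Production data has drifted: the true boundary moved to 1.0,
# but the deployed model still uses the threshold it learned at ~0.
x_pr, y_pr = make_data(5000, shift=1.0)
prod_acc = accuracy(x_pr, y_pr, t)   # noticeably worse

print(f"held-out accuracy: {test_acc:.2f}, production accuracy: {prod_acc:.2f}")
```

No amount of careful splitting of the original dataset detects this, since every split comes from the pre-drift distribution.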
About the speakers: Inbal is the Director of AI at EverCompliant. Dana is a data scientist at EverCompliant.
---------------------
How long does your data live? Test-set re-use in modern ML ♻️ – Gal Yona, Weizmann Institute of Science
In modern ML, the community continuously evaluates models on the same datasets, often with the same train-test splits. This creates a feedback loop: future models implicitly depend on the test sets. This adaptive setting, in which models are not independent of the test set they are evaluated on, carries exponentially worse generalization guarantees than the non-adaptive setting. This raises suspicion about the statistical validity of our results, and about recent progress in general: are we still making progress on the underlying tasks, or have we simply “exhausted” our existing datasets? More generally, how long does data “live” in modern ML applications?
In this talk I will discuss two recent clever attempts to answer the above questions, as well as their (somewhat surprising, given the above backdrop) findings. The first approach uses replication studies of common vision benchmarks and the second conducts a meta-analysis of overfitting on Kaggle competitions. We’ll conclude by highlighting practical takeaways this line of work may suggest for increasing the longevity of ML benchmarks in your organizational workflow.
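The adaptive feedback loop can be simulated in a few lines (a toy illustration of the phenomenon, not taken from the papers below): if the community repeatedly selects models by their score on one fixed test set, the best reported score climbs well above what the selected model achieves on fresh data.

```python
import numpy as np

rng = np.random.default_rng(1)
n_test = 200
y_test = rng.integers(0, 2, n_test)      # labels of a fixed, reused test set

best_acc, best_preds = 0.0, None
for _ in range(1000):                    # 1000 "models", all random guessers
    preds = rng.integers(0, 2, n_test)
    acc = float((preds == y_test).mean())
    if acc > best_acc:                   # the community keeps the top scorer
        best_acc, best_preds = acc, preds

# On genuinely fresh labels, the selected "best model" is still a coin flip.
y_fresh = rng.integers(0, 2, n_test)
fresh_acc = float((best_preds == y_fresh).mean())

print(f"best reused-test accuracy: {best_acc:.2f}, fresh-data accuracy: {fresh_acc:.2f}")
```

The gap between the two numbers is pure selection bias: every model here has true accuracy 0.5, yet adaptively reusing the test set makes the leaderboard leader look meaningfully better than chance.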
Papers:
http://proceedings.mlr.press/v97/recht19a/recht19a.pdf
http://papers.neurips.cc/paper/9117-a-meta-analysis-of-overfitting-in-machine-learning.pdf
About the speaker: Gal is a Computer Science Ph.D. student at the Weizmann Institute of Science.
---------------------
Zoom link: https://us02web.zoom.us/j/84384138987?pwd=VitvVEtBRy9DQXFZbXMrMTAvRFVJZz09
