
DataTalks #31: Understanding Overfitting in Machine Learning ⛹️‍♀️🤾‍♂️🤽‍♀️🧠

Hosted By
Shay Palachy A.

Details

DataTalks #31: Understanding Overfitting in Machine Learning ⛹️‍♀️🤾‍♂️🤽‍♀️🧠

Our 31st DataTalks meetup will be held online and will focus on overfitting in machine learning.

𝗭𝗼𝗼𝗺 𝗹𝗶𝗻𝗸: https://us02web.zoom.us/j/84384138987?pwd=VitvVEtBRy9DQXFZbXMrMTAvRFVJZz09

𝗔𝗴𝗲𝗻𝗱𝗮:
🔶 16:30 - 17:15 – When Machine Learning Hits Reality 🧱 – Dana Racah and Inbal Budowski-Tal, EverCompliant
🔴 17:20 - 18:05 – How long does your data live? Test-set re-use in modern ML ♻️ – Gal Yona, Weizmann Institute of Science

---------------------

𝗪𝗵𝗲𝗻 𝗠𝗮𝗰𝗵𝗶𝗻𝗲 𝗟𝗲𝗮𝗿𝗻𝗶𝗻𝗴 𝗛𝗶𝘁𝘀 𝗥𝗲𝗮𝗹𝗶𝘁𝘆 🧱 – 𝗗𝗮𝗻𝗮 𝗥𝗮𝗰𝗮𝗵 𝗮𝗻𝗱 𝗜𝗻𝗯𝗮𝗹 𝗕𝘂𝗱𝗼𝘄𝘀𝗸𝗶-𝗧𝗮𝗹

We did everything by the book.

We split our dataset into train, validation, and test sets. We checked the learning curve to make sure the model was not overfitting. We gathered another large dataset and evaluated the model against it as a final validation of its performance. And yet, after deploying to production, the model's performance was much lower than what we had measured. Why, oh why??? 😱😭
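For a concrete picture of the setup described above, here is a minimal Python sketch of such a train/validation/test workflow with an extra held-out dataset. It is not the speakers' actual code; the dataset, split ratios, and model are illustrative.

# A minimal sketch (illustrative, not the speakers' actual pipeline) of the
# evaluation setup described above.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Generate data once; pretend the last 2,000 rows were gathered separately
# later, as the "large additional dataset" used for a final check.
X_all, y_all = make_classification(n_samples=12_000, n_features=20, random_state=0)
X, y = X_all[:10_000], y_all[:10_000]
X_extra, y_extra = X_all[10_000:], y_all[10_000:]

# Train / validation / test split (60 / 20 / 20).
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("validation accuracy:   ", model.score(X_val, y_val))
print("test accuracy:         ", model.score(X_test, y_test))
print("extra-dataset accuracy:", model.score(X_extra, y_extra))
# All three numbers can agree and still overestimate production performance
# if the data the model sees in production drifts away from these sets.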

In this talk, we will explain what went wrong and how, as a result of this experience, we now test our models.

𝗕𝗶𝗼: Inbal is the Director of AI at EverCompliant. Dana is a data scientist at EverCompliant.

---------------------

𝗛𝗼𝘄 𝗹𝗼𝗻𝗴 𝗱𝗼𝗲𝘀 𝘆𝗼𝘂𝗿 𝗱𝗮𝘁𝗮 𝗹𝗶𝘃𝗲? 𝗢𝗻 𝘁𝗵𝗲 𝘁𝗵𝗲𝗼𝗿𝘆 𝗮𝗻𝗱 𝗽𝗿𝗮𝗰𝘁𝗶𝗰𝗲 𝗼𝗳 𝘁𝗲𝘀𝘁-𝘀𝗲𝘁 𝗿𝗲-𝘂𝘀𝗲 𝗶𝗻 𝗺𝗼𝗱𝗲𝗿𝗻 𝗺𝗮𝗰𝗵𝗶𝗻𝗲 𝗹𝗲𝗮𝗿𝗻𝗶𝗻𝗴 ♻️ – 𝗚𝗮𝗹 𝗬𝗼𝗻𝗮

In modern ML, the community continuously evaluates models on the same datasets, often with the same train-test splits. This creates a feedback loop: future models implicitly depend on the test sets. This adaptive setting, in which models are not independent of the test set they are evaluated on, comes with exponentially worse generalization guarantees than the non-adaptive setting. This casts doubt on the statistical validity of our results, and on recent progress in general: Are we still making progress on the underlying tasks, or have we simply “exhausted” our existing datasets? More generally, how long does data “live” in modern ML applications?
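To make that feedback loop concrete, here is a toy simulation (illustrative only, not from the talk): a thousand candidate "models" that are all no better than chance are compared on one shared test set, and the apparent best one is reported. The reported accuracy is inflated purely because the same test set was reused adaptively.

# Toy illustration (not from the talk): reusing one test set to pick among
# many models inflates the reported accuracy of the chosen model.
import numpy as np

rng = np.random.default_rng(0)
n_test = 2_000      # size of the shared test set
n_models = 1_000    # number of models compared against that same test set

# Labels of the fixed test set; every "model" below guesses at random,
# so the true accuracy of each one is exactly 50%.
y_test = rng.integers(0, 2, size=n_test)

best_observed = 0.0
for _ in range(n_models):
    preds = rng.integers(0, 2, size=n_test)   # a random classifier's predictions
    acc = (preds == y_test).mean()            # accuracy on the reused test set
    best_observed = max(best_observed, acc)   # keep the apparently best model

print("true accuracy of every model: 0.500")
print(f"best accuracy on the reused test set: {best_observed:.3f}")  # typically ~0.53
# The gap between the two numbers is pure overfitting to the test set.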

In this talk, I will discuss two clever recent attempts to answer these questions, as well as their findings, which are somewhat surprising given the backdrop above. The first approach uses replication studies of common vision benchmarks, and the second conducts a meta-analysis of overfitting in Kaggle competitions. We'll conclude by highlighting practical takeaways this line of work suggests for increasing the longevity of ML benchmarks in your organization's workflow.

𝗣𝗮𝗽𝗲𝗿 𝗹𝗶𝗻𝗸𝘀:
http://proceedings.mlr.press/v97/recht19a/recht19a.pdf
http://papers.neurips.cc/paper/9117-a-meta-analysis-of-overfitting-in-machine-learning.pdf

𝗕𝗶𝗼: Gal is a Computer Science Ph.D. student at the Weizmann Institute of Science.

---------------------

𝗭𝗼𝗼𝗺 𝗹𝗶𝗻𝗸: https://us02web.zoom.us/j/84384138987?pwd=VitvVEtBRy9DQXFZbXMrMTAvRFVJZz09

DataHack - Data Science, Machine Learning & Statistics