As companies collect more and more data, there is a growing need for trained professionals to build models out of those data. Humans, however, are costly and do not scale well. At the opposite end of the spectrum, tools for automated data analysis promise to do everything with one click; in reality, they are often just a list of pre-compiled models, with huge variance in accuracy and interpretability.
In this talk, we will try to find some middle ground by taking a fresh look at "symbolic regression", i.e. the idea that you can fit arbitrary datasets by evolving a population of mathematical expressions until you find one that fits the data well enough. Instead of taking the standard optimization route provided by genetic algorithms, we will borrow ideas from probabilistic programming to build a small "robot scientist" with WebPPL: if mathematics is a language, why can't we use ideas from natural language processing to explore the space of expressions?
We will conclude with some comments on model explainability and recent academic work in "compositional" data analysis.
Topics: symbolic regression, AutoML, probabilistic programming.
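To make the idea of exploring the space of expressions a bit more concrete, here is a minimal Python sketch, not the WebPPL code from the talk. It treats expressions as sentences drawn from a small probabilistic grammar and keeps the sampled expression with the lowest mean squared error. The grammar, the 0.4 stopping probability, and all function names are illustrative assumptions, and plain random sampling stands in for the more sophisticated inference a probabilistic programming language would provide.

```python
import math
import random

# Hypothetical toy grammar: the operators and terminals are illustrative choices.
BINARY_OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}
TERMINALS = ["x", 1.0, 2.0, 3.0]


def sample_expr(rng, depth=0, max_depth=3):
    """Sample an expression tree: a terminal, or (op, left, right)."""
    # Stop at max depth, or with probability 0.4 (an assumed prior).
    if depth >= max_depth or rng.random() < 0.4:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(BINARY_OPS))
    left = sample_expr(rng, depth + 1, max_depth)
    right = sample_expr(rng, depth + 1, max_depth)
    return (op, left, right)


def evaluate(expr, x):
    """Evaluate an expression tree at the point x."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return BINARY_OPS[op](evaluate(left, x), evaluate(right, x))
    return x if expr == "x" else expr


def mse(expr, data):
    """Mean squared error of an expression on (x, y) pairs."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in data) / len(data)


def to_string(expr):
    """Render an expression tree as a readable formula."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return f"({to_string(left)} {op} {to_string(right)})"
    return str(expr)


def search(data, n_samples=5000, seed=0):
    """Sample expressions from the grammar and keep the best-fitting one."""
    rng = random.Random(seed)
    best, best_err = None, math.inf
    for _ in range(n_samples):
        candidate = sample_expr(rng)
        err = mse(candidate, data)
        if err < best_err:
            best, best_err = candidate, err
    return best, best_err


if __name__ == "__main__":
    # Toy dataset generated from y = x * x + 1.
    data = [(float(x), float(x * x + 1)) for x in range(-5, 6)]
    best, err = search(data)
    print(to_string(best), err)
```

Because the result is a readable formula rather than a vector of weights, this kind of search ties in naturally with the explainability theme of the talk. Blind sampling of this sort is not guaranteed to recover the generating expression, which is part of the motivation for smarter ways of exploring the space.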
Bio:
Jacopo Tagliabue was co-founder, CTO and nerd in chief at Tooso - a Gartner "Cool Vendor" company acquired by Coveo in 2019. Before Tooso, he was a "Best & Brightest" fellow at the Santa Fe Institute and led the Data Science team of AxonVibe in New York. In previous lives, he managed to get a Ph.D. (UNISR/MIT), do scienc-y things for a professional basketball team, simulate a pre-Columbian civilization and give an academic talk on video games (among other improbable "achievements"). His research and industry work has been featured several times in international conferences and the general press.
This event will be held in English.
Agenda:
18:45 Doors opening
19:00 Data Science Milan community opening
19:20 Talk