PyData Paris - October 2022 Meetup

Hosted By
Sylvain C. and Sandrine P.

Details

Mark your calendar for the next session of the PyData Paris Meetup, on October 20th, 2022. This Meetup will be hosted by OVHcloud, 19 place Françoise Dorin, 75017 Paris.

The speakers for this session are Mitzi Morris and Maria Teleńczuk.

Schedule

7:00pm - 7:15pm: Community announcements
7:15pm - 8:00pm: Mitzi Morris - Probabilistic Programming with CmdStanPy and plotnine
8:00pm - 8:45pm: Maria Teleńczuk - Standardized benchmarks for Federated Learning
8:45pm - 9:30pm: Buffet

Abstracts

Mitzi Morris - Probabilistic Programming with CmdStanPy and plotnine

This talk is a gentle introduction to multilevel regression modeling in Stan
using the CmdStanPy interface for inference and prediction and the plotnine package for visualization. We demonstrate key best practices for Bayesian data analysis and prediction.

Probabilistic programs use imperfect data to make inferences about a stochastic process.
For structured data, we use multilevel models to describe relationships in the data.
The program outputs - estimates and predictions - are the result of conditioning the model on the data.
Prior predictive and posterior predictive checks determine whether or not the model is well-specified for the data.

Data visualization drives design, testing, and documentation.
We will show how to create beautiful layered plots in plotnine that
provide an intuitive way to explore the data and the resulting inferences and predictions.
This talk is based on the Stan Case Study
"Multilevel regression modeling with CmdStanPy and plotnine".

Maria Teleńczuk - Standardized benchmarks for Federated Learning

Federated Learning is a new and growing field where the goal is to train an ML model from data stored in different centers.
In this talk, I will introduce federated learning (FL) and discuss examples where it is useful. I will give a high-level overview of the most common FL strategies. I will then discuss why standard datasets (benchmarks) are needed, especially in cross-silo FL. As a hands-on example, I will demonstrate how to easily access datasets with open source code and benchmark your next contribution to FL.

Bios

Mitzi Morris is a member of the Stan Development Team and serves on the Stan Governing Body.
Since 2017 she has been a full-time Stan developer, working for Professor Andrew Gelman at Columbia University,
where she has contributed to the core Stan C++ platform and developed CmdStanPy, a modern Python interface for Stan.
She is also an active Stan user, developing, publishing, and presenting on Bayesian models for disease mapping.
Prior to that, she worked as a software engineer in both academia and industry,
working on natural language processing and search applications as well as data analysis pipelines for genomics and bioinformatics.

Maria Teleńczuk is a Data Scientist at Owkin in the Federated Learning Group. She holds a PhD in Computational Neuroscience. Before joining Owkin, she worked at Inria, contributing to open source projects such as Scikit-learn and Ramp. She is also an organizer of PyLadies Paris.

OVHcloud
42 Av. de la Porte de Clichy · Paris