
PyData Paris - March 2024 Meetup

Hosted By
Sylvain C. and Sandrine P.

Details

Mark your calendar for the next session of the PyData Paris Meetup, on March 21st, 2024. This Meetup will be hosted by Scaleway, Europe's empowering cloud provider, at the Iliad group office, 16 rue de la Ville-l'Évêque, 75008 Paris.
The speakers for this session, which will be dedicated to Taipy, are Alexandre Sajus and Florian Jacta.

Schedule
7:00pm - 7:15pm: Community announcements & short address by Fred Bardolle, Lead Product Manager AI at Scaleway.
7:15pm - 7:45pm: Get the best from your scikit-learn classifier: trusted probabilities and optimal binary decision, Guillaume Lemaître
7:45pm - 8:30pm: Deploy your Data Project on the Web using only Python, Alexandre Sajus & Florian Jacta
8:30pm - 9:30pm: Buffet

Speakers

Alexandre Sajus is a customer success engineer at Taipy. He graduated with a Master of Engineering from Centrale Paris. Florian Jacta is a data scientist and community manager at Taipy.

Guillaume Lemaitre is an open-source scientific software developer at :probabl. and a core developer of the scikit-learn project.

Abstracts

Deploy your Data Project on the Web using only Python, Alexandre Sajus & Florian Jacta
In the Python ecosystem, many packages are available for running algorithms, training models, and visualizing data. Despite this, over 85% of data science projects stay at the proof-of-concept stage and never reach production. With Taipy, Python developers can build great pilots as well as stunning production-ready web applications designed for end users. A minimal sketch of what this looks like in practice is shown below.
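The snippet below is a minimal sketch, not taken from the talk, of what "using only Python" means with Taipy's GUI: a page is declared as a Markdown string bound to Python variables and served as a web application. The page content and variable names are purely illustrative.

```python
# Minimal illustrative sketch of a Taipy web app (not from the talk).
# Pages are Markdown strings; {name} is bound to the Python variable below,
# and <|{name}|input|> renders an input control tied to it.
from taipy.gui import Gui

name = "PyData Paris"

page = """
# Hello, {name}!

Enter your name: <|{name}|input|>
"""

if __name__ == "__main__":
    Gui(page).run()  # serves the page as a local web application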

Get the best from your scikit-learn classifier: trusted probabilities and optimal binary decision, Guillaume Lemaitre
When operating a classifier in a production setting (i.e. the predictive phase), practitioners are potentially interested in two different outputs: a "hard" decision used to drive a business decision and/or a "soft" decision giving a confidence score for each potential decision (usually related to class probabilities).
Scikit-learn does not provide any flexibility to go from "soft" to "hard" predictions: it uses a cut-off point at a confidence score of 0.5 (or 0 when using decision_function) to get class labels. However, optimizing a classifier so that its confidence scores are close to the true probabilities (i.e. a calibrated classifier) does not guarantee accurate "hard" predictions with this heuristic. Conversely, training a classifier for optimum "hard" prediction accuracy (with the cut-off constrained to 0.5) does not guarantee a calibrated classifier.
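As a small illustrative sketch (not part of the abstract) of the default behaviour described above: for a binary scikit-learn classifier, predict() amounts to thresholding predict_proba at 0.5.

```python
# Illustrative sketch: predict() for a binary classifier is a fixed 0.5 cut-off
# applied to the "soft" predictions returned by predict_proba.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)
clf = LogisticRegression().fit(X, y)

proba_positive = clf.predict_proba(X)[:, 1]        # "soft" predictions
hard_default = clf.predict(X)                      # "hard" predictions
hard_manual = (proba_positive > 0.5).astype(int)   # fixed 0.5 cut-off

assert np.array_equal(hard_default, hard_manual)
```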
In this talk, we will present a new scikit-learn meta-estimator allowing us to get the best of both worlds: a calibrated classifier providing optimum "hard" predictions. This meta-estimator will land in a future version of scikit-learn: https://github.com/scikit-learn/scikit-learn/pull/26120.
We will provide some insights on how to obtain accurate probabilities and predictions, and illustrate how to use this model in practice on different use cases: cost-sensitive problems and imbalanced classification problems.
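For reference, the meta-estimator from the pull request above eventually shipped in scikit-learn 1.5 as TunedThresholdClassifierCV. The sketch below assumes that version and uses a made-up cost-sensitive scorer (the costs are illustrative, not from the talk) to show how the cut-off is tuned instead of being fixed at 0.5.

```python
# Illustrative sketch (assumes scikit-learn >= 1.5, where the meta-estimator
# from the pull request above is available as TunedThresholdClassifierCV).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer
from sklearn.model_selection import TunedThresholdClassifierCV, train_test_split

X, y = make_classification(n_samples=2_000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def business_gain(y_true, y_pred):
    # Hypothetical cost-sensitive metric: a missed positive costs 5, a false alarm costs 1.
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return -(5 * fn + 1 * fp)

# The meta-estimator searches, by cross-validation, for the cut-off that
# maximizes the business metric instead of using the default 0.5.
model = TunedThresholdClassifierCV(
    LogisticRegression(),
    scoring=make_scorer(business_gain),
)
model.fit(X_train, y_train)

print(f"tuned cut-off: {model.best_threshold_:.3f}")
print(f"test gain: {business_gain(y_test, model.predict(X_test)):.0f}")
```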

PyData Paris
Iliad
16 Rue de la Ville-l'Évêque · Paris