Details

The meetup will be hosted at Doctrine's office.

Doctrine's office can only accommodate around 40-50 people, so seats are on a first-come, first-served basis. You also need to register on this page; registrations will be checked.

The meetup will also be available online at this link: https://meet.google.com/wrr-hihm-war

----------
Talks:

Alex Combessie & Jean-Marie John-Mathews, Co-Founders, giskard.ai

Biases in NLP models: What They Are & Where to Find Them?

Many ML biases have been identified in NLP over the past few years. For example, popular word embedding algorithms exhibit stereotypical biases, such as gender bias. Amazon’s automated resume screening discriminated against women in 2015. NLP applications’ biased decisions not only perpetuate historical biases and injustices but potentially amplify existing biases at an unprecedented scale and speed.

As a data scientist, how do you find them? How do you correct them? Over the past couple of years, the FairML community has produced many research papers on mitigation strategies for NLP.

Drawing on research that applies the principles of behavioral testing from software engineering, we will present a very recent task-agnostic methodology for testing NLP models. We will then show a practical application by introducing AI Inspect, a platform for identifying biases and implementing tests.
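To give a flavor of what a behavioral test for an NLP model can look like, here is a minimal sketch of an invariance test: swapping gendered words in the input should not change the prediction. Everything in it is an illustrative assumption (the `SWAPS` word list, the deliberately biased `toy_model`, and the helper names); it is not the speakers' methodology and not the AI Inspect API.

```python
import re

# Hypothetical word pairs used to build counterfactual inputs.
SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his",
         "man": "woman", "woman": "man"}


def swap_gender(text):
    """Replace each gendered word with its counterpart (output is lowercase for swapped words)."""
    def repl(match):
        word = match.group(0)
        return SWAPS.get(word.lower(), word)
    return re.sub(r"\b\w+\b", repl, text)


def toy_model(text):
    """Stand-in classifier, deliberately biased against inputs containing 'she'."""
    tokens = text.lower().split()
    score = 1 if "engineer" in tokens else 0
    if "she" in tokens:
        score -= 1  # the bias the test is meant to expose
    return "accept" if score > 0 else "reject"


def invariance_failures(model, texts):
    """Return the inputs whose prediction changes after the gender swap."""
    return [t for t in texts if model(t) != model(swap_gender(t))]


samples = [
    "he is an engineer",
    "she is an engineer",
    "the applicant is an engineer",
]
failures = invariance_failures(toy_model, samples)
for text in failures:
    print("prediction changes under gender swap:", text)
```

In practice, `toy_model` would be replaced by a call to the real model, and the perturbation by whatever transformation the prediction is expected to be invariant to.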

Hugo Mailfait, Kili Technology

Drift detection: How can I efficiently retrain my model?

Do you think a product recommendation model trained before COVID-19 could work equally well during the pandemic? What should I do when changes appear in the data passed to my model? This phenomenon is called concept drift, and if it is not tackled, model results will gradually deteriorate until the outputs can no longer be trusted.

How can I detect concept drift in streaming data? Several drift detection techniques exist, from error-based ones to algorithms relying on the data distribution. In this talk, we will compare several of these techniques, using real-world examples of drifting data streams. Then, we will present the challenge of model retraining (e.g., what is the most relevant dataset to use?) and propose some solutions.
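As an illustration of the error-based family mentioned above, here is a minimal sketch in the spirit of the Drift Detection Method (DDM): it tracks the running error rate of a model on a stream and signals drift when that rate rises well above its best observed level. The class name, the thresholds, and the synthetic stream are assumptions made for this example; it is a simplified variant, not necessarily one of the techniques compared in the talk.

```python
import math


class SimpleDDM:
    """Simplified error-rate drift detector (assumed parameters, no warning level)."""

    def __init__(self, min_samples=30, drift_level=3.0):
        self.min_samples = min_samples
        self.drift_level = drift_level
        self.n = 0
        self.errors = 0
        self.p_min = float("inf")
        self.s_min = float("inf")

    def update(self, error):
        """Feed one prediction outcome (True if the model was wrong); return True on drift."""
        self.n += 1
        self.errors += int(error)
        p = self.errors / self.n               # running error rate
        s = math.sqrt(p * (1 - p) / self.n)    # its standard deviation
        if self.n < self.min_samples:
            return False
        if p + s < self.p_min + self.s_min:    # remember the best level seen so far
            self.p_min, self.s_min = p, s
        return p + s > self.p_min + self.drift_level * self.s_min


# Synthetic stream: 10% errors for 200 samples, then 50% errors for 100 samples.
stream = [i % 10 == 9 for i in range(200)] + [i % 2 == 1 for i in range(200, 300)]

detector = SimpleDDM()
first_drift = None
for index, error in enumerate(stream):
    if detector.update(error):
        first_drift = index
        break

print("drift first signaled at sample index:", first_drift)
```

A detector like this only needs the model's errors, so it requires labels to arrive; distribution-based techniques instead compare the input data itself between a reference window and a recent window.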

Related topics

Natural Language Processing
Data Science
