This event was canceled

Interfacing Data science models with ORM and Word2vec for Topic Modeling

Hosted By
Olivier B.
Details

This Meetup features two talks on better approaches to everyday data science work. First, Benjamin Denis will present a better way to build and maintain models using object-relational mapping (ORM). Then, Étienne Labrie-Dion will present a better way to do topic modeling using word2vec.

### Speakers ###

Benjamin Denis, Behaviour Interactive

Étienne Labrie-Dion, GSoft

Interested in giving a talk? Reach out to us at olivier.blais@moov.ai

### Schedule ###

6:00 | Doors open

6:30 | First Presentation

Interfacing Data science models with Object Relational Mapping, Benjamin Denis

Agility and robustness are core to our Analytics team. With ORM, we enabled our data science teams to ship their models (analytics, ML) independently, with minimal help on ETL.

To build and maintain models, we developed Python objects that represent the tables in our databases. These objects let us interact with our databases more efficiently, and they also give us a code representation and automatic documentation of our models. With these tools, data scientists can focus more on their core job and less on engineering and documentation. In this talk, we will explain why we developed these tools, how we use them, and what the next steps are.
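As a rough illustration of the idea of Python objects standing in for database tables, here is a minimal sketch using only the standard library. It is not the tooling presented in the talk: the `PlayerSession` table, its columns, and the helpers `create_table`, `insert`, `select_all`, and `describe` are all hypothetical names chosen for the example, and SQLite stands in for whatever database is actually used.

```python
import sqlite3
from dataclasses import dataclass, fields
from typing import ClassVar

# Mapping from Python field types to SQLite column types (assumption: only these three).
SQL_TYPES = {int: "INTEGER", float: "REAL", str: "TEXT"}


@dataclass
class PlayerSession:
    """One play session per row (hypothetical table for this sketch)."""

    TABLE: ClassVar[str] = "player_session"  # table name; not a column
    player_id: int
    minutes_played: float


def create_table(conn, model):
    """Create the table described by the model class if it does not exist."""
    cols = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in fields(model))
    conn.execute(f"CREATE TABLE IF NOT EXISTS {model.TABLE} ({cols})")


def insert(conn, row):
    """Insert one model instance as a row, using bound parameters."""
    names = [f.name for f in fields(row)]
    columns = ", ".join(names)
    placeholders = ", ".join("?" for _ in names)
    conn.execute(
        f"INSERT INTO {row.TABLE} ({columns}) VALUES ({placeholders})",
        [getattr(row, name) for name in names],
    )


def select_all(conn, model):
    """Read every row of the table back as model instances."""
    columns = ", ".join(f.name for f in fields(model))
    cursor = conn.execute(f"SELECT {columns} FROM {model.TABLE}")
    return [model(*values) for values in cursor]


def describe(model):
    """Build plain-text documentation from the class definition itself."""
    lines = [f"{model.TABLE}: {model.__doc__}"]
    lines += [f"  {f.name} ({SQL_TYPES[f.type]})" for f in fields(model)]
    return "\n".join(lines)


conn = sqlite3.connect(":memory:")
create_table(conn, PlayerSession)
insert(conn, PlayerSession(player_id=1, minutes_played=42.5))
rows = select_all(conn, PlayerSession)
documentation = describe(PlayerSession)
```

Because the table schema, the queries, and the documentation are all derived from one class definition, changing a column means changing a single place. A production setup would more likely rely on an established ORM library than on hand-written helpers like these.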

7:15 | Second Presentation

Word2vec for Topic Modeling, Étienne Labrie-Dion

Employee engagement tools such as Officevibe can collect thousands of user comments when deployed in an average company. Modern natural language processing (NLP) techniques such as topic modeling can summarize these comments to extract important trends. However, building a great user experience for topic modeling is challenging. For example, relying on a set of fixed, predetermined models (e.g., LDA) limits your capacity to detect emerging trends. On the other hand, regularly deploying new models can break continuity with pre-existing topics and lead to a frustrating user experience.

In this talk, we will go over the process of building Outline, our topic modeling solution for user comments. Our solution efficiently maps keywords to user comments using word embeddings built from a custom word2vec model. This approach creates automatically named topics that are easy for users to interpret. In addition, topics are not limited to the words that actually appear in a comment; they are picked from a list of keywords that best represent the comment's overall meaning. We will show the strengths and limitations of our implementation and discuss the challenges we faced.
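To make the keyword-to-comment mapping concrete, here is a minimal sketch of one common way to do it: average the word vectors of a comment and pick the keyword whose vector is closest by cosine similarity. This is an assumption about the general technique, not a description of how Outline is implemented. The three-dimensional vectors in `EMBEDDINGS` are made up by hand for the example (a real system would load vectors from a trained word2vec model), and `KEYWORDS`, `embed_comment`, and `best_keyword` are hypothetical names.

```python
import math

# Toy, hand-made "embeddings" (assumption: real ones come from a trained word2vec model).
EMBEDDINGS = {
    "salary": (0.9, 0.1, 0.0),
    "pay": (0.8, 0.2, 0.1),
    "raise": (0.85, 0.15, 0.05),
    "manager": (0.1, 0.9, 0.1),
    "feedback": (0.2, 0.7, 0.3),
    "workload": (0.1, 0.2, 0.9),
    "overtime": (0.2, 0.1, 0.85),
    "deadlines": (0.05, 0.3, 0.8),
}

# Candidate topic names (hypothetical list for this sketch).
KEYWORDS = ["salary", "manager", "workload"]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def embed_comment(comment):
    """Average the vectors of the known words; None if no word is known."""
    vectors = [EMBEDDINGS[w] for w in comment.lower().split() if w in EMBEDDINGS]
    if not vectors:
        return None
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def best_keyword(comment):
    """Return the keyword whose vector is closest to the comment vector."""
    vector = embed_comment(comment)
    if vector is None:
        return None
    return max(KEYWORDS, key=lambda k: cosine(vector, EMBEDDINGS[k]))


topic = best_keyword("too much overtime and tight deadlines")
```

Here the comment never contains the word "workload", yet that keyword is selected because "overtime" and "deadlines" sit near it in the toy vector space. The sketch splits on whitespace only, so punctuation handling and out-of-vocabulary words would need more care in practice.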

### Speaker Bios ###
Benjamin Denis
Benjamin Denis is a Data Scientist - Tech Lead for analytics at Behaviour Interactive. He's responsible for developing and ensuring good data practices to deliver amazing player experiences.

As part of that role, he helped design and implement a big data pipeline and showed that HR data could help product owners with project management.

Étienne Labrie-Dion
Étienne is a Data Scientist at GSoft, a software company known for its Officevibe and ShareGate brands. Inside the GLab, GSoft's R&D division, he builds and tests new artificial intelligence products and functionalities. Étienne's contributions at GSoft include an anomaly detection module for ShareGate: Overcast, a time series prediction algorithm for the discontinued Snoozit product, and topic modeling for Officevibe. Before working as a data scientist, Étienne was a microscopy and neuroscience specialist in academia. In addition to his scientific research, he established and led a microscopy core facility at McGill and produced an online class for Université Laval.

------------------------
Hosted by Behaviour Interactive

MTL Data

Behaviour Interactive
6666 Rue Saint-Urbain #500 · Montréal, QC