Ryan Chesler will present his work on the Kaggle Toxic Comment Classification competition, where he placed 3rd out of 4551 (abstract below). This will be followed by a discussion and networking session.
Introduction and announcements (6:00 PM - 6:15 PM)
Main Talk (6:15 PM - 7:00 PM)
Discussion and Networking (7:00 PM - 8:00 PM)
ScaleMatrix has graciously agreed to host us:
The ScaleMatrix Launch Center is located behind the Fun Bike Center on Kearny Villa Rd.
5795 Kearny Villa Rd., San Diego, CA 92123
Identifying toxic comments online is a key capability for facilitating productive discourse. Kaggle recently hosted the Toxic Comment Classification Challenge, whose objective was to identify types of toxic comments such as threats, obscenity, insults, and identity-based hate. In this presentation, Ryan Chesler will give an overview of his work on this competition, where he ultimately placed 3rd out of 4551. Ryan will start with the basics of how words are represented as numbers, including Word2Vec using Gensim and the scikit-learn classes CountVectorizer and TfidfVectorizer. He will then move on to more advanced topics and discuss the models he used for the Toxic Comment challenge, including what worked and what didn't. The talk will have content for both beginners and more experienced practitioners, and code snippets and visualizations will be available for download.
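For attendees new to the topic, here is a minimal sketch of the idea behind count-based and TF-IDF word vectors, written in plain Python rather than with scikit-learn. It is not taken from Ryan's materials: the toy comments, the whitespace tokenizer, and the smoothed IDF formula are all assumptions chosen for illustration, and CountVectorizer and TfidfVectorizer differ in details such as tokenization and normalization.

```python
import math
from collections import Counter

# Hypothetical toy comments, invented for illustration only.
comments = [
    "you are great",
    "you are awful",
    "great talk great slides",
]

# Tokenize by lowercasing and splitting on whitespace (a simplification).
tokenized = [c.lower().split() for c in comments]

# Build a sorted vocabulary so each word gets a fixed column index.
vocab = sorted({word for doc in tokenized for word in doc})


def count_vector(doc):
    """Return raw word counts for one document, ordered by vocab."""
    counts = Counter(doc)
    return [counts[word] for word in vocab]


# Bag-of-words matrix: one row per comment, one column per vocabulary word.
count_matrix = [count_vector(doc) for doc in tokenized]

# Document frequency: number of comments that contain each word.
n_docs = len(tokenized)
doc_freq = [sum(1 for doc in tokenized if word in doc) for word in vocab]

# Smoothed inverse document frequency (an assumed, commonly used variant):
# words that appear in fewer comments receive a larger weight.
idf = [math.log((1 + n_docs) / (1 + df)) + 1 for df in doc_freq]

# TF-IDF matrix: raw counts scaled by each word's IDF weight (no normalization).
tfidf_matrix = [[tf * w for tf, w in zip(row, idf)] for row in count_matrix]

for word, weight in zip(vocab, idf):
    print(f"{word}: idf={weight:.3f}")
```

Each comment becomes a fixed-length list of numbers that a classifier can consume. Words that occur in only one comment, such as "awful", get a higher IDF weight than words shared across comments, such as "you".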