A talk on Natural Language Processing (NLP) by Thomas Levi, Senior Data Scientist @ POF.
NLP to Find User Archetypes for Search & Matching
Plenty of Fish (POF), the world’s largest free dating site, would like to match users with, and let them search for, people with similar interests. However, users enter their interests as free text on their profiles. This presents difficult problems in clustering, searching and machine learning if we want to move beyond simple ‘exact match’ solutions to deeper archetypal user profiling and thematic search. Common issues include misspellings, synonyms (e.g. biking, cycling and bicycling) and related interests (e.g. snowboarding and skiing), all at a scale of several million users. In this talk I will demonstrate how we built a system using topic modelling with Latent Dirichlet Allocation (LDA) on a vocabulary of several hundred thousand words across more than ten million North American users, and explore its applications at POF.
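For readers new to LDA, here is a rough illustration of the idea on a toy scale. It is not POF's implementation, which the abstract does not describe; it is a minimal pure-Python sketch of one common way to fit LDA (collapsed Gibbs sampling), and the function name `fit_lda`, the hyperparameter values and the four sample profiles are all assumptions made up for the example. A system at the scale described in the talk would need a far more efficient approach.

```python
import random


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA (hypothetical helper, illustration only).

    docs is a list of token lists; returns (vocab, topic_word_counts, theta).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # tokens in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # times word w is assigned to topic k
    topic_total = [0] * num_topics                              # total tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)

    # Repeatedly resample each token's topic given all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Smoothed per-document topic mixture; each row sums to 1.
    theta = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    return vocab, topic_word, theta


# Made-up interest lists standing in for free-text profile interests.
profiles = [
    ["biking", "cycling", "hiking"],
    ["cycling", "bicycling", "hiking"],
    ["snowboarding", "skiing", "mountains"],
    ["skiing", "snowboarding", "winter"],
]
vocab, topic_word, theta = fit_lda(profiles, num_topics=2)
for profile, mixture in zip(profiles, theta):
    print(profile, [round(p, 2) for p in mixture])
```

Each row of `theta` is a profile's mixture over topics. Profiles that share words tend to end up with similar mixtures, which is what would let a search or matching system compare users by theme rather than by exact string match.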
About Thomas Levi
Thomas Levi started out with a doctorate in Theoretical Physics and String Theory from the University of Pennsylvania in 2006. His post-doctoral studies in cosmology and string theory, during which he wrote 19 papers garnering 650+ citations, then took him to NYU and finally UBC. In 2012, he decided to move into industry and took on the role of Senior Data Scientist at POF. Thomas has been involved in diverse projects such as behaviour analysis, social network analysis, scam detection, bot detection, matching algorithms, topic modelling and semantic analysis.
• 6:00PM Doors are open, feel free to mingle
• 6:30PM Presentation starts
• ~8:00 Off to a nearby restaurant (Mr Brownstones) for food, drinks, and breakout discussions
By transit, there are a number of high-frequency buses (check Google Maps or the Translink site for your particular case) that will get you there. For drivers, there is a fair bit of street parking (free and pay) in the area, especially after 6PM.
How to Contact Us / Re Comments
Please note that any comments you add to this event (below) will be e-mailed to all members of the group. We're trying to avoid spamming the list, so please do not use comments for jokes, job postings, requests for programming help or anything else off-topic. If you have questions or need to contact us, use the 'contact us' link on the left. Thanks!