A talk on Natural Language Processing (NLP) by Thomas Levi, Senior Data Scientist @ POF.
As the world’s largest free dating site, Plenty Of Fish would like to match users with, and let them search for, people with similar interests. However, we allow our users to enter their interests as free text on their profiles. This presents a difficult problem in clustering, search and machine learning if we want to move beyond simple ‘exact match’ solutions to deeper archetypal user profiling and thematic search. Common issues include misspellings, synonyms (e.g. biking, cycling and bicycling) and related interests (e.g. snowboarding and skiing), all at a scale of several million users. In this talk I will demonstrate how we built a system using topic modelling with Latent Dirichlet Allocation (LDA) on a vocabulary of several hundred thousand words across more than ten million North American users, and explore its applications at POF.
Thomas Levi started out with a doctorate in Theoretical Physics and String Theory from the University of Pennsylvania in 2006. His post-doctoral studies in cosmology and string theory, during which he wrote 19 papers garnering 650+ citations, then took him to NYU and finally UBC. In 2012, he decided to move into industry and took on the role of Senior Data Scientist at POF. Thomas has been involved in diverse projects such as behaviour analysis, social network analysis, scam detection, bot detection, matching algorithms, topic modelling and semantic analysis.
• 6:00 PM Doors open, feel free to mingle
• 6:30 PM Presentations start
• 8:00 PM Off to a nearby watering hole (Mr. Brownstone?) for a pint, food, and/or breakout discussions
By transit, there are a number of high-frequency buses that will get you there (check Google Maps or the TransLink site for your particular case). For drivers, there is a fair bit of street parking (free and pay) in the area, especially after 6.