November Meetup: Big Data Analysis with Topic Models


Details
Now that we've hit our stride, November's meetup will be at the same bat-time (second Wednesday of the month at 6:30pm) and same bat-channel (Stetsons Famous Bar & Grill (http://stetsons-dc.com/index.php) in Adams Morgan). Please join us for a captivating presentation, stimulating conversation, and refreshing libations.
This month we've got a special guest: Prof. Jordan Boyd-Graber (http://www.umiacs.umd.edu/~jbg/) from UMIACS (http://www.umiacs.umd.edu/), who'll be presenting a single hour-long talk.
==========
Big Data Analysis with Topic Models: Human Interaction, Streaming Computation, and Social Science Applications
A common information need is to understand large, unstructured datasets: millions of e-mails during e-discovery, a decade's worth of scientific correspondence, or a day’s tweets. In the last decade, topic models have become a common tool for navigating such datasets. This talk investigates the foundational research that makes successful tools for these data exploration tasks possible: how to know when you have an effective model of the dataset; how to correct bad models; how to scale to large datasets; and how to detect framing and spin using these techniques. After introducing topic models, I argue that traditional measures of topic model quality, borrowed from machine learning, are inconsistent with how topic models are actually used. In response, I describe interactive topic modeling, a technique that enables users to impart their insights and preferences to models in a principled, interactive way. I will then address the computational and statistical limits of existing approaches and show how streaming topic models, with an "infinite vocabulary", can be applied to real-world online datasets. Finally, I’ll discuss ongoing collaborations with political scientists to use these techniques to detect spin and framing in political and online interactions.
Speaker Bio: Jordan Boyd-Graber is an assistant professor in the University of Maryland's iSchool and the Institute for Advanced Computer Studies. He graduated from Princeton University in 2010; his PhD thesis, "Linguistic Extensions of Topic Models", was advised by David Blei.
==========
We'll meet at 6:30pm in the upstairs bar at Stetsons near the intersection of 16th and U Streets NW in Adams Morgan. Introductions & announcements will start around 7:00pm, and the presentation will begin at 7:30pm. Afterwards, there'll be plenty of time for follow-up questions, networking, and drinks.
Please note: Stetsons is a 21 and over venue.
DC NLP meets on the second Wednesday of each month to network, socialize, and learn about the interesting work folks are doing in natural language processing, computational linguistics, text analytics, and more.
Do you have something you'd like to share with the group? Let us know! We're always looking for speakers to give talks at future meetups. And don't forget to follow @DCNLP (https://twitter.com/DCNLP/) on Twitter!
