Daniel Peterson of the Center for Computational Language and Education Research (CLEAR) at CU Boulder will be presenting an overview of topic modeling.
Topic modeling is currently receiving a great deal of interest because it successfully condenses an information-rich corpus into small, often semantically interpretable clusters of terms (topics). These topics are actually full unigram language models: each one represents the sorts of terms that get used in a particular style of document. Topic modeling works by positing a probabilistic generative model that created the documents, and then using Bayesian inference to infer the parameters underlying this model, given its output (the corpus of interest). In this talk I will give an overview of the generative model and the sampling process, and try to convey an understanding of why and how topic modeling works.
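As a rough sketch of that generative story (not material from the talk itself; the vocabulary, sizes, and Dirichlet hyperparameters below are invented for illustration), each topic is drawn as a unigram distribution over the vocabulary, each document draws its own mixture of topics, and every word is generated by first picking a topic and then picking a word from it. Bayesian inference runs this process in reverse to recover the topics and mixtures from the observed documents.

```python
# Minimal sketch of an LDA-style generative model, using only numpy.
# All quantities here are toy/illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

vocab = ["gene", "dna", "cell", "ball", "team", "score"]  # toy vocabulary
n_topics, n_docs, doc_len = 2, 5, 20
alpha, beta = 0.5, 0.5  # assumed Dirichlet hyperparameters

# Each topic is a full unigram language model: a distribution over the vocabulary.
topics = rng.dirichlet([beta] * len(vocab), size=n_topics)

documents = []
for _ in range(n_docs):
    # Each document mixes the topics in its own proportions.
    theta = rng.dirichlet([alpha] * n_topics)
    words = []
    for _ in range(doc_len):
        z = rng.choice(n_topics, p=theta)        # choose a topic for this word
        w = rng.choice(len(vocab), p=topics[z])  # choose a word from that topic
        words.append(vocab[w])
    documents.append(words)

print(documents[0])
```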
Daniel Peterson grew up in Wyoming. He has a BA in Mathematics and an MS in Electrical Engineering, both from the University of Wyoming. He is currently in the second year of a PhD program in Natural Language Processing at the University of Colorado. His current interests and research are in unsupervised clustering algorithms, including topic modeling and Bayesian modeling in general. With a background in math and an interest in a project, you can get pretty far.
6:00PM: network and socialize
7:30PM: more socializing
We will be meeting in the Muenzinger Psychology Building Room D428/430. This is the large conference room on the fourth floor of the south half of Muenzinger.