Michigan Information Retrieval Enthusiasts Group Quarterly Meetup


Details
Schedule:
- 6:00pm-6:45pm pizza
- 6:45pm-8:00pm presentations
- 8:00pm-9:00pm discussion
Presentations:
- Bayesian Language Model
This talk presents a Bayesian language model, originally described in (Teh 2006), which uses a hierarchical Pitman-Yor process to describe the distribution of n-grams in an n-gram language model and which allows for a Bayesian back-off and smoothing strategy. The language model, which assumes a power-law prior over the n-gram space, compares favorably with language models based upon state-of-the-art empirical n-gram smoothing techniques. Because the background required to understand the model is somewhat difficult, and most of it does not appear in (Teh 2006), that material is also presented in some detail. In particular, background information related to the Dirichlet distribution and the Dirichlet process is given. The Dirichlet process is then related to the Pitman-Yor process, and the hierarchical Pitman-Yor process is also presented.
Speaker:
Craig Wright has more than 10 years of professional software experience in the computer-aided engineering industry and is focused on developing technologies that enable the intelligent integration of disparate physics simulation tools. His company, Comet Solutions, is an industry thought leader in this area and is working to make complex simulation processes accessible to non-expert analysts throughout the product design process. In April, Craig earned an MSE in EE:Systems at the University of Michigan with a broad emphasis on intelligent systems.
- Using GATE for Word Polarity in Context Classification
GATE (General Architecture for Text Engineering) is open source software for creating text processing workflows. Core GATE includes tools for solving many text engineering issues: modelling and persistence of specialised data structures; measurement, evaluation, benchmarking; visualisation and editing of annotations, ontologies, parse trees, etc.; extraction of training instances for machine learning; pluggable machine learning implementations. This tutorial will show how to use GATE for advanced machine learning applications. Detecting word polarity in context will be used as an example to show some of the GATE features. The tutorial project is based on recent sentiment analysis research, specifically the work by Theresa Wilson, Janyce Wiebe, and Paul Hoffmann, "Recognizing Contextual Polarity: An Exploration of Features for Phrase-Level Sentiment Analysis", 2009. An SVM classifier is trained and evaluated using different features (words, part of speech, negations, etc.).
Speaker:
Ivan Provalov is a software developer and architect with over 15 years of professional experience in enterprise architecture design and development. He currently holds an information architect position at Cengage Learning, where he works on the search engine platform team. Ivan is also taking courses toward a Master's in CS at the University of Michigan - Dearborn. His professional interests are in the areas of information retrieval and natural language processing.
