Name: Rephil: Extracting Concepts from Text
Start: 2012-01-10T19:00:00-08:00
End: 2012-01-10T22:00:00-08:00
Location: Shopzilla Inc

This talk will describe Rephil, a system used throughout Google to identify the concepts or topics that underlie a given piece of text. Rephil determines, for example, that "apple pie" falls under some of the same topics as "chocolate cake", but has little in common with "apple ipod". The concepts used by Rephil are not pre-specified; instead, they are derived by an unsupervised learning algorithm running on massive amounts of text. The result of this learning process is a Rephil model -- a giant Bayesian network with concepts as nodes. I will discuss the structure of Rephil models, the distributed machine learning algorithm that we use to build these models from terabytes of data, and the Bayesian network inference algorithm that we use to identify concepts in new texts under tight time constraints. I will also discuss how Rephil relates to ongoing academic research on probabilistic topic models.

Rob Zinkov

LA Machine Learning

Technology

Machine Learning

Data Science

Big Data

Artificial Intelligence

Natural Language Processing

Data Mining

Computer Science

Predictive Analytics
