
Re: [Cleveland-AI-ML-support-group] Topic modeling w/ neural nets

From: Joe
Sent on: Monday, November 7, 2011 1:13 AM
 The softmax paper is pretty mind-bending. It sounds very cutting-edge, given that it claims to be both faster than LDA and to have the critical additional feature of applying combinations of categories much more precisely. That feature seems like a really huge benefit -- for example, it would allow much more targeted demographic categorization across many more categories.

 Here are a few thoughts on implementing it:
-We might want to start with LDA on a very simple made-up data set first, for practice and to get familiar with the results, since the paper contrasts LDA with the seemingly more involved softmax model. I would test/compare the improved softmax categorization against LDA on a contrived data set before trying the variable-document-length feature.
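A minimal sketch of that contrived data set (the documents, vocabulary, and helper names here are all made up for illustration): building fixed-length word-count vectors that either LDA or the softmax RBM could consume.

```python
from collections import Counter

# Contrived two-topic corpus: two "sports" docs and two "cooking" docs.
docs = [
    "ball goal team ball win",
    "team goal goal win ball",
    "oven recipe flour oven bake",
    "bake recipe flour flour oven",
]

# Fix a shared vocabulary so every document maps to the same-length vector.
vocab = sorted({w for d in docs for w in d.split()})
word_index = {w: i for i, w in enumerate(vocab)}

def doc_to_counts(doc):
    """Bag-of-words count vector -- the 'visible' data for LDA or the RBM."""
    counts = [0] * len(vocab)
    for word, n in Counter(doc.split()).items():
        counts[word_index[word]] = n
    return counts

count_matrix = [doc_to_counts(d) for d in docs]
```

Each row sums to the document's length N, which is exactly the quantity the softmax model later uses to scale the hidden biases for variable-length documents.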

-The way to make the RBM for each individual document a 'softmax' seems to be:
a. use equation 3 for the conditional probability of what would normally be the RBM input cells.
b. equation 6 corresponds to setting the weights for the graph edges, I think.
c. equation 5 corresponds to summing the 'activation energy': the conditional probabilities times the weights.
d. the bias for each RBM is set in a softmax-specific way related to the length of each document.
e. I suspect there is some additional factor in how this works that I'm overlooking -- like, are the RBMs for different documents supposed to affect each other in some way during the above steps? The answer must be 'yes', right?
f. The paper's additional new twist of Monte Carlo-based estimation of the RBM partition function, and their large data set, should be left for after getting the simplest possible data and algorithms working.
g. They appear to have used Matlab, so presumably Octave would be good to start with. There may not be any code samples for this yet, other than the authors' Matlab code.
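To make steps a-d concrete, here is a rough NumPy sketch of one CD-1 update for a single document. This is my reading of the paper's conditionals, not the authors' code; the variable names, learning rate, and toy numbers are all mine. On point e, my understanding is that the per-document RBMs interact only by sharing the same weights and biases (that's the "replicated" part); each document just contributes its own gradient.

```python
import numpy as np

rng = np.random.default_rng(0)

K, F = 6, 3  # vocabulary size, number of hidden topic units (toy sizes)
W = 0.01 * rng.standard_normal((K, F))  # word-to-topic weights, shared across documents
b = np.zeros(K)  # visible (word) biases
a = np.zeros(F)  # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

v = np.array([3, 1, 1, 0, 0, 0], dtype=float)  # word counts for one toy document
N = v.sum()                                    # document length

# Steps c/e: hidden units receive summed input from all word counts;
# the hidden bias is scaled by N (step d), handling variable document length.
p_h = sigmoid(v @ W + N * a)
h = (rng.random(F) < p_h).astype(float)

# Step a: the visible layer is one softmax over the vocabulary, replicated
# N times -- reconstruct the document by drawing N words from it.
p_v = softmax(b + W @ h)
v_recon = rng.multinomial(int(N), p_v).astype(float)
p_h_recon = sigmoid(v_recon @ W + N * a)

# CD-1 parameter update (data statistics minus reconstruction statistics).
lr = 0.1
W += lr * (np.outer(v, p_h) - np.outer(v_recon, p_h_recon))
b += lr * (v - v_recon)
a += lr * N * (p_h - p_h_recon)
```

Nothing here depends on other documents except through the shared W, b, and a, so scaling to a real corpus just means looping this update over the count vectors (mini-batched, per the paper).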


--- On Sun, 11/6/11, Timmy Wilson <[address removed]> wrote:

From: Timmy Wilson <[address removed]>
Subject: [Cleveland-AI-ML-support-group] Topic modeling w/ neural nets
To: [address removed]
Date: Sunday, November 6, 2011, 12:42 PM

Inspired by these two great talks:

- Geoffrey Hinton -- The Next Generation of Neural Networks --

- Andrew Ng -- Unsupervised Feature Learning and Deep Learning --

i'm interested in using deep learning to model latent topics

i did some digging, and found Ruslan Salakhutdinov's -- Replicated
Softmax: an Undirected Topic Model --

The model can be efficiently trained using Contrastive
Divergence, it has a better way of dealing with documents
of different lengths, and computing the posterior distribution
over the latent topic values is easy. We will also demonstrate
that the proposed model is able to generalize much better
compared to a popular Bayesian mixture model, Latent
Dirichlet Allocation (LDA) [2], in terms of both the
log-probability on previously unseen documents and the
retrieval accuracy.


The proposed model has several key advantages: the
learning is easy and stable, it can model documents of
different lengths, and computing the posterior distribution
over the latent topic values is easy. Furthermore, using
stochastic gradient descent, scaling up learning to billions
of documents would not be particularly difficult.

i want to 'cobble together' a distributed python implementation --
she'll feel right at home in -- if
Radim will have her :]

i figured i'd spam everyone that may be interested, and ask/plead for
help/existing code examples

This message was sent by Timmy Wilson ([address removed]) from Cleveland AI + ML support group.
