Sent on: Monday, November 7, 2011 1:13 AM
The softmax paper is pretty mind-bending. It sounds very cutting-edge, given that it claims to be faster than LDA while also modeling combinations of categories much more precisely. That feature would seem to be a really huge benefit, for example allowing much more targeted demographic categorization across many more categories.
Here are a few thoughts on implementing it:
-might want to start with LDA on a very simple made-up data set first, for practice and to get familiar with the results, since the paper contrasts LDA results with the seemingly more involved softmax model. I would get the better softmax categorization feature tested/compared against LDA on a contrived data set first, before trying the variable-document-length feature.
-The way to make the RBM for each individual document a 'softmax' seems to be:
a. to use equation 3 for the conditional probability over what would normally be the RBM's visible (input) units.
b. equation 6 corresponds to setting the weights for the edges of the graph, I think.
c. equation 5 corresponds to summing the 'activation energy' from the conditional probabilities times the weights.
d. the bias for each RBM is set in a softmax-specific way relating to the length of each document.
e. I suspect there is some additional factor in how this works that I'm overlooking, like: are the RBMs for different documents supposed to affect each other in some way during the above steps? The answer must be 'yes', right?
f. The paper's additional new twist of Monte Carlo-based estimation of the RBM partition function, and their large data set, should be left until after the simplest possible data and algorithms are working.
g. They appear to have used Matlab, so presumably Octave would be a good place to start. There may not be any code samples for this yet, other than the authors' Matlab code.
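To make steps a-e concrete, here is a minimal NumPy sketch of the two conditionals as I understand them (Python rather than Octave, just because it's easier to share here). All the names (W, a, b, K, F) and the toy count vector are made up for illustration; the hidden-bias scaling by document length D corresponds to point d, and the per-word softmax to point a. On point e, my reading is that the coupling across documents comes from every document's RBM sharing the same weight matrix W during training, rather than from documents interacting directly at inference time.

```python
import numpy as np

rng = np.random.default_rng(0)

K, F = 5, 3                               # vocab size, number of hidden (topic) units
W = 0.01 * rng.standard_normal((K, F))    # weights, shared by all documents (tied)
b = np.zeros(K)                           # visible (word) biases
a = np.zeros(F)                           # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def p_h_given_v(v):
    """Hidden-unit activation for one document, given its word-count
    vector v. The hidden bias is scaled by the document length D,
    which is the softmax-specific bias handling from point d."""
    D = v.sum()
    return sigmoid(v @ W + D * a)

def p_v_given_h(h):
    """Softmax over the vocabulary given the hidden state (point a):
    every replicated visible unit uses the same shared weights."""
    scores = b + W @ h
    e = np.exp(scores - scores.max())     # subtract max for numerical stability
    return e / e.sum()

# toy document: word counts over a 5-word vocabulary
v = np.array([3.0, 0.0, 1.0, 0.0, 2.0])
ph = p_h_given_v(v)                       # F hidden probabilities
pv = p_v_given_h(ph)                      # distribution over the K vocabulary words
```

With the biases at zero and tiny random weights, `ph` starts out near 0.5 everywhere and `pv` near uniform; training (e.g. contrastive divergence) would be the next step on top of these two functions.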
--- On Sun, 11/6/11, Timmy Wilson <[address removed]> wrote: