Skip to content

Spectral Clustering - NU ML Journal Club

Photo of Jeremy Watt
Hosted By
Jeremy W.
Spectral Clustering - NU ML Journal Club

Details

Below is a standard dataset used to illustrate the fundamental weakness of the standard K-means clustering algorithm: that it assumes incoming data lie in convex globular clusters.

http://photos1.meetupstatic.com/photos/event/e/1/5/2/event_253497682.jpeg

You see two subgroups of data - the two concentric rings - and so if we were to cluster this set we would want to recover those two rings as our clusters. But thats not what K-means with K = 2 will do if we apply it to this dataset. Below is the result of applying K-means with K = 2 to the dataset (on the left), and the correct separation given by spectral clustering - the topic of this journal club meeting - on the right.

http://photos2.meetupstatic.com/photos/event/e/2/4/2/event_253497922.jpeg http://photos2.meetupstatic.com/photos/event/e/2/c/4/event_253498052.jpeg

Now this concentric ring set may be somewhat extreme, but we’re all adults here - do we really expect that the data we want to cluster in our various walks of life naturally lies in convex globular clusters (as is assumed by K-means)? Not likely.

So in this journal club meeting we’re going to learn about a clustering approach that throws out that assumption and does something different: spectral clustering. Spectral clustering clusters data points by treating them not as distinct convex globular clusters as in K-means but as connected subgraphs. This change in perspective allows separation of non-convex globular clusters like the concentric rings example above - something K-means just can’t do.

Here's the tutorial paper on spectral clustering that we'll read and discuss

http://www.kyb.mpg.de/fileadmin/user_upload/files/publications/attachments/Luxburg07_tutorial_4488%5b0%5d.pdf

You can also find a set of notes I wrote explaining spectral clustering - as well as a few cool MATLAB toys for playing around with different clustering algorithms (used to generate the images in this post) - below

http://neonwatty.wordpress.com/2013/07/03/an-intro-to-spectral-clustering-complete-with-illustrative-matlab-toys/

Photo of Northwestern Machine Learning Meetup group
Northwestern Machine Learning Meetup
See more events
Ford Motor Company Engineering Design Center Room 3.340
2133 Sheridan Road Evanston 60208 · Evanston, IL