addressalign-toparrow-leftarrow-rightbackbellblockcalendarcameraccwchatcheckchevron-downchevron-leftchevron-rightchevron-small-downchevron-small-leftchevron-small-rightchevron-small-upchevron-upcircle-with-checkcircle-with-crosscircle-with-pluscrosseditemptyheartexportfacebookfolderfullheartglobegmailgoogleimageimagesinstagramlinklocation-pinmagnifying-glassmailminusmoremuplabelShape 3 + Rectangle 1outlookpersonplusprice-ribbonImported LayersImported LayersImported Layersshieldstartickettrashtriangle-downtriangle-uptwitteruseryahoo

Spectral Clustering - NU ML Journal Club

Below is a standard dataset used to illustrate the fundamental weakness of the standard K-means clustering algorithm: that it assumes incoming data lie in convex globular clusters.

You see two subgroups of data - the two concentric rings - and so if we were to cluster this set we would want to recover those two rings as our clusters. But thats not what K-means with K = 2 will do if we apply it to this dataset.  Below is the result of applying K-means with K = 2 to the dataset (on the left), and the correct separation given by spectral clustering - the topic of this journal club meeting - on the right.

Now this concentric ring set may be somewhat extreme, but we’re all adults here - do we really expect that the data we want to cluster in our various walks of life naturally lies in convex globular clusters (as is assumed by K-means)? Not likely.

So in this journal club meeting we’re going to learn about a clustering approach that throws out that assumption and does something different: spectral clustering. Spectral clustering clusters data points by treating them not as distinct convex globular clusters as in K-means but as connected subgraphs. This change in perspective allows separation of non-convex globular clusters like the concentric rings example above - something K-means just can’t do.

Here's the tutorial paper on spectral clustering that we'll read and discuss

http://www.kyb.mpg.de/fileadmin/user_upload/files/publications/attachments/Luxburg07_tutorial_4488%5b0%5d.pdf

You can also find a set of notes I wrote explaining spectral clustering - as well as a few cool MATLAB toys for playing around with different clustering algorithms (used to generate the images in this post) - below

http://neonwatty.wordpress.com/2013/07/03/an-intro-to-spectral-clustering-complete-with-illustrative-matlab-toys/

 



Join or login to comment.

  • Anup

    Excellent ! Got to meet new people and the topic was well explained.

    July 3, 2013

  • Jeremy W.

    My notes and MATLAB code are now up - see the hyperlink at the end of the meetup announcement.

    July 3, 2013

    • Jeremy W.

      Apologies - some people told me they got an error earlier when attempting to access my site. This should be fixed now.

      July 3, 2013

  • Jeremy W.

    John Fahrenbach gave me the heads up on the scikit clustering package - complete with descriptions, python code, and links to relevant papers located here - http://scikit-learn.org/stable/modules/clustering.html

    John gives top marks to DBSCAN!

    July 1, 2013

  • Jeremy W.

    Apologies - we'll meet Wed July 3 instead of Thurs (which is a holiday this week).

    June 29, 2013

14 went

Our Sponsors

People in this
Meetup are also in:

Sign up

Meetup members, Log in

By clicking "Sign up" or "Sign up using Facebook", you confirm that you accept our Terms of Service & Privacy Policy