# Spectral Clustering - NU ML Journal Club

• Jul 3, 2013 · 5:00 PM

Below is a standard dataset used to illustrate the fundamental weakness of the standard K-means clustering algorithm: that it assumes incoming data lie in convex globular clusters.

You see two subgroups of data - the two concentric rings - and so if we were to cluster this set we would want to recover those two rings as our clusters. But thats not what K-means with K = 2 will do if we apply it to this dataset.  Below is the result of applying K-means with K = 2 to the dataset (on the left), and the correct separation given by spectral clustering - the topic of this journal club meeting - on the right.

Now this concentric ring set may be somewhat extreme, but we’re all adults here - do we really expect that the data we want to cluster in our various walks of life naturally lies in convex globular clusters (as is assumed by K-means)? Not likely.

So in this journal club meeting we’re going to learn about a clustering approach that throws out that assumption and does something different: spectral clustering. Spectral clustering clusters data points by treating them not as distinct convex globular clusters as in K-means but as connected subgraphs. This change in perspective allows separation of non-convex globular clusters like the concentric rings example above - something K-means just can’t do.

Here's the tutorial paper on spectral clustering that we'll read and discuss

You can also find a set of notes I wrote explaining spectral clustering - as well as a few cool MATLAB toys for playing around with different clustering algorithms (used to generate the images in this post) - below

http://neonwatty.wordpress.com/2013/07/03/an-intro-to-spectral-clustering-complete-with-illustrative-matlab-toys/

• ##### Anup

Excellent ! Got to meet new people and the topic was well explained.

July 3, 2013

• ##### Jeremy W.

My notes and MATLAB code are now up - see the hyperlink at the end of the meetup announcement.

July 3, 2013

• ##### Jeremy W.

Apologies - some people told me they got an error earlier when attempting to access my site. This should be fixed now.

July 3, 2013

• ##### Jeremy W.

John Fahrenbach gave me the heads up on the scikit clustering package - complete with descriptions, python code, and links to relevant papers located here - http://scikit-learn.org/stable/modules/clustering.html

John gives top marks to DBSCAN!

July 1, 2013

• ##### Jeremy W.

Apologies - we'll meet Wed July 3 instead of Thurs (which is a holiday this week).

June 29, 2013

### 14 went

• ##### Anup
• A former member
• A former member

### Evanston, IL

Founded Aug 30, 2012

#### Organizers:

• ##### EECS Department

Provides space for our events!

#### People in this Meetup are also in:

• ##### Data Science Chicago

3,825 Data Scientists

855 Peers

• ##### Girl Develop It Chicago

3,727 Nerdettes

• ##### Chicago Big Data Analytics Meetup

1,324 Chicago Big Data Analytics Fans

1,531 Coders

• ##### Big Data & Analytics Meetup, Chicago

2,139 Members