Introduction to k-means

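k-means is one of the simplest machine learning algorithms, which is probably why it is so popular. The goal is to partition a dataset into k groups (clusters) so that the points inside each group are close to each other, measured by squared distance to the group's mean. Finding the exact optimum of this objective is NP-hard. The standard iterative heuristic (Lloyd's algorithm, which alternates between assigning points to the nearest center and recomputing the centers, much like expectation maximization) is far faster, but it can converge to a local optimum. Because the starting centers are usually chosen at random, one of the major problems of k-means is that it can return a different solution every time we run it.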
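
To make the iterative heuristic concrete, here is a minimal sketch in R of the assign-then-update loop. It is an illustration only: the data are made up, and lloyd_kmeans is a hypothetical helper name, not a function that ships with R. For real work the built-in kmeans from the stats package is the better choice.

```r
# Minimal sketch of Lloyd's iteration (illustration only, not stats::kmeans).
# `lloyd_kmeans` is a hypothetical helper name.
lloyd_kmeans <- function(x, k, max_iter = 100) {
  x <- as.matrix(x)
  # Initialise the centers with k distinct rows chosen at random
  centers <- x[sample(nrow(x), k), , drop = FALSE]
  cluster <- rep(0L, nrow(x))
  for (iter in seq_len(max_iter)) {
    # Assignment step: squared Euclidean distance from every point to every center
    d2 <- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, centers[j, ])^2))
    new_cluster <- max.col(-d2, ties.method = "first")
    if (all(new_cluster == cluster)) break  # assignments stable: converged
    cluster <- new_cluster
    # Update step: move each center to the mean of the points assigned to it
    for (j in seq_len(k)) {
      if (any(cluster == j)) {
        centers[j, ] <- colMeans(x[cluster == j, , drop = FALSE])
      }
    }
  }
  # Total within-cluster sum of squares: the objective being minimised
  wss <- sum((x - centers[cluster, , drop = FALSE])^2)
  list(cluster = cluster, centers = centers, tot.withinss = wss)
}

# Made-up data: two well-separated blobs of 50 points each
set.seed(1)
x <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 5, sd = 0.3), ncol = 2))
fit <- lloyd_kmeans(x, k = 2)
table(fit$cluster)
```

Each pass can only lower the total within-cluster sum of squares or leave it unchanged, which is why the loop stops, but the point where it stops depends on the random starting centers.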

Time to play
The dataset is a little bit big, so let's take a shortcut and keep only the first 40000 lines (the header line counts as one of them). From your Linux command line type
head -n 40000 flights_planes_weather.csv > flights_small
Then start R and load the data:
> library(stats)
> mydata <- read.csv("flights_small")
> summary(mydata)

Now let's make a dataset with some of the weather conditions:

> weather_data <- mydata[, c("w_fog", "w_heatindexm", "w_pressurem", "w_precipm", "w_pressurei", "w_precipi", "w_snow", "w_rain")]
> var(weather_data)

Oops, the result is full of NA because many rows have missing values. Drop the incomplete rows and look again:

> weather_data <- na.omit(weather_data)
> summary(weather_data)
> var(weather_data)
> nrow(weather_data)

Not many rows are left.
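
To see why so few rows survive, it can help to count the missing values per column before dropping anything. The sketch below uses a small made-up data frame (toy, with invented values) rather than the flights file, so the numbers say nothing about the real data.

```r
# Toy data frame standing in for weather_data (all values are made up)
toy <- data.frame(
  w_fog       = c(0, 1, 0, NA, 0),
  w_pressurem = c(1012, 1008, NA, 1015, 1010),
  w_precipm   = c(NA, NA, NA, 0.5, 0.0)
)

# Number of missing values in each column
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# na.omit() keeps only the rows with no NA in any column
complete <- na.omit(toy)
cat("rows before:", nrow(toy), " rows after:", nrow(complete), "\n")
```

If one column turns out to hold most of the NAs, leaving that column out may keep far more rows than na.omit on the full set.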

Now run k-means with 3 clusters:
> result <- kmeans(weather_data, 3)
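
Since kmeans starts from randomly chosen centers, two runs of the same call can disagree. A sketch of the usual remedies, on made-up data rather than the flights file: set.seed makes a run repeatable, and the nstart argument of kmeans tries several random starts and keeps the best one.

```r
set.seed(42)  # fix the random initialisation so the run is repeatable

# Made-up data: three groups of 100 points in two dimensions
x <- rbind(matrix(rnorm(200, mean = 0), ncol = 2),
           matrix(rnorm(200, mean = 4), ncol = 2),
           matrix(rnorm(200, mean = 8), ncol = 2))

single <- kmeans(x, centers = 3)               # one random start
multi  <- kmeans(x, centers = 3, nstart = 25)  # best of 25 random starts

# A lower total within-cluster sum of squares means a tighter clustering
cat("one start: ", single$tot.withinss, "\n")
cat("25 starts: ", multi$tot.withinss, "\n")
```

The value 25 is an arbitrary choice for this sketch; more starts cost more time but make a poor local optimum less likely.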

What does the result mean? Print it and look at the cluster centers and sizes.
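
As a starting point for reading the output, the sketch below fits kmeans on made-up data and pulls out the main components of the returned object. The component names (centers, size, cluster, withinss, tot.withinss, betweenss, totss) are the ones kmeans returns; the data and the name fit are invented for the example.

```r
set.seed(1)
# Synthetic two-column data with three loose groups (not the flights data)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 5), ncol = 2),
           matrix(rnorm(100, mean = 10), ncol = 2))
colnames(x) <- c("a", "b")
fit <- kmeans(x, centers = 3, nstart = 10)

fit$centers        # one row per cluster: the mean of its points
fit$size           # number of points in each cluster
head(fit$cluster)  # cluster label (1 to 3) assigned to each row
fit$withinss       # within-cluster sum of squares, one value per cluster
fit$tot.withinss   # sum of the above: the quantity k-means tries to minimise
fit$betweenss / fit$totss  # share of the total sum of squares between clusters
```

The cluster numbers themselves carry no meaning; only which rows share a label matters, and the labels can be permuted from one run to the next.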

Now let's normalize the variance of each column, so that variables measured on large scales do not dominate the distances.
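
One common way to do this in R is the scale function, which subtracts each column's mean and divides by its standard deviation. The sketch below assumes that is the normalization intended here; the data frame toy and its column names are made up for illustration.

```r
set.seed(1)
# Two made-up columns on very different scales (names are illustrative)
toy <- data.frame(
  pressure = rnorm(100, mean = 1013, sd = 8),    # varies by several units
  precip   = rnorm(100, mean = 0.2,  sd = 0.05)  # varies by hundredths
)
sapply(toy, var)  # pressure dominates the variance before scaling

scaled <- scale(toy)   # subtract the column mean, divide by the column sd
apply(scaled, 2, var)  # each column now has variance 1

# Cluster on the scaled columns so both contribute equally to the distances
fit <- kmeans(scaled, centers = 3, nstart = 10)
fit$size
```

A column that is constant after na.omit has standard deviation zero, so scale would turn it into NaN; such a column would need to be dropped before clustering.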
