Intermediate R workshop: Logistic Regression, PCA and clustering

NYC Open Data


Social Media:

Twitter: @Vivian__Zhang, @SupStat, @NycDataSci

Learn with our NYC Data Science Program. We have provided corporate and individual training for more than 40 firms in NYC alone, and we offer a 12-week immersive program as well as weekend and weeknight Data Science training.


Ilan Man, adjunct instructor at NYC Data Science Academy, will present this workshop.

Ilan Man works in Strategy Operations at Squarespace, where he analyzes event-based datasets and builds predictive models in R and Python.


In this talk, Ilan will build on the topics from his earlier presentation (Machine Learning in R) and cover the following:

- Logistic Regression: determining and approximating the cost function
- PCA: eigenvalue decomposition and derivation (basic knowledge of linear algebra assumed)
- Clustering: three common clustering techniques, with examples, pros and cons
- Decision Trees: various implementations, entropy and other tree features

Some prior knowledge of these algorithms is helpful, but not assumed.

His talk will be in R; however, knowledge of R is not necessary to attend. If you'd like to follow along, having RStudio installed is recommended, as Ilan will be using datasets from the UCI Machine Learning Repository.