Tidy data

Details

This month we are incredibly excited to host Hadley Wickham (http://had.co.nz/), Assistant Professor of Statistics at Rice University and creator of many of the most popular R packages on CRAN.

It's often said that 80% of the effort of analysis is spent just getting the data ready to analyse: the process of data cleaning. Data cleaning is not only a vital first step, but it is often repeated multiple times over the course of an analysis as new problems come to light. Despite the amount of time it takes up, there has been little research on how to clean data well. Part of the challenge is the breadth of activities that cleaning encompasses, from outlier checking to date parsing to missing value imputation. To get a handle on the problem, this talk focusses on a small, but important, subset of data cleaning that I call data "tidying": getting the data into a format that is easy to manipulate, model, and visualise.

In this talk you'll see some of the crazy data sets that I've struggled with over the years, and learn the basic tools for making messy data tidy. I'll also discuss tidy tools: tools that take tidy data as input and return tidy data as output. The idea of a tidy tool is useful for critiquing existing R functions, and helps to explain why some tasks that seem like they should be easy are in fact quite hard. This work ties together reshape2 (http://cran.r-project.org/web/packages/reshape2/index.html), plyr (http://cran.r-project.org/web/packages/plyr/) and ggplot2 (http://had.co.nz/ggplot2/) with a consistent philosophy of data. Once you master this data format, you'll find it much easier to manipulate, model and visualise your data.
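
As a rough illustration of what making messy data tidy can look like, here is a minimal sketch in plain Python rather than the R packages named above. It assumes one common kind of mess, where column headers hold values instead of variable names. The table contents are made up, and the `melt` helper is hypothetical: it borrows its name from reshape2's `melt` function but is not that package's code.

```python
# Made-up messy table: the column names "low" and "high" are really
# values of a variable (here called "level"), not variables themselves.
messy = [
    {"treatment": "a", "low": 10, "high": 25},
    {"treatment": "b", "low": 12, "high": 31},
]


def melt(rows, id_vars, var_name="variable", value_name="value"):
    """Hypothetical helper: return one row per (identifier, measured column) pair."""
    tidy_rows = []
    for row in rows:
        # Identifier columns are copied unchanged into every output row.
        ids = {key: row[key] for key in id_vars}
        for column, value in row.items():
            if column not in id_vars:
                # The former column name becomes a value of var_name.
                tidy_rows.append({**ids, var_name: column, value_name: value})
    return tidy_rows


tidy = melt(messy, id_vars=["treatment"], var_name="level", value_name="count")
for row in tidy:
    print(row)
```

In the result every row is a single observation and every key is a variable, which is the kind of shape that grouping, modelling, and plotting code tends to expect.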

This meetup will follow our usual schedule, with pizza and networking starting at 6:15 PM, Prof. Wickham's talk at 7 PM, and a post-meetup reception at a nearby bar.

New York Open Statistical Programming Meetup
Microsoft
1290 Avenue of the Americas, 6th Floor · New York, NY