Past Meetup

Data Science with the Tidyverse

282 people went

Price: $5.00 per person

Details

We are very excited to be hosting Hadley Wickham again in September! He'll be discussing the tidyverse and how all his packages fit together.

If you want even more Hadley, you can also sign up for his two-day master class (http://www.eventbrite.com/e/master-r-developer-workshop-new-york-city-tickets-21347014495?discount=NYCMeetup) in September. RStudio (http://www.rstudio.com) has graciously offered members of our group a $200 discount with code NYCMeetup (http://www.eventbrite.com/e/master-r-developer-workshop-new-york-city-tickets-21347014495?discount=NYCMeetup).

This month's meetup is sponsored by O'Reilly, who has given us a 20% discount to Strata (http://oreil.ly/25TWXOg) (September 26-29) with code UGNYHACKR20 (http://oreil.ly/25TWXOg).

A big thank you to Work-Bench (http://www.work-bench.com/) for hosting us this month.

About the Talk:

My goal is to create an environment for data science where you can spend your precious mental energy on the problem at hand, rather than fighting with a programming language. To this end, I've developed a number of packages that make individual parts of the process easier (ggplot2 (http://ggplot2.org/) for visualisation, dplyr (https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html) for data manipulation, tidyr (https://github.com/hadley/tidyr) for data tidying, ...). Recently I've been thinking more about how the pieces fit together. In the words of Hal Abelson (https://www.csail.mit.edu/user/1535), "No matter how complex and polished the individual operations are, it is often the quality of the glue that most directly determines the power of the system."

In this talk, I'll discuss the idea of the tidyverse (https://twitter.com/hashtag/tidyverse), a set of conventions that ties together disparate packages to provide a uniform interface for doing data science. Along the way you'll learn about tidy data, tibbles (https://github.com/hadley/tibble), list columns, pure functions, referential integrity, piping, and why ggplot2 (http://ggplot2.org/) should never have existed.

About Hadley:

Hadley is Chief Scientist at RStudio (https://www.rstudio.com/) and a member of the R Foundation (https://www.r-project.org/foundation/). He builds tools (both computational and cognitive) that make data science easier, faster, and more fun. His work includes packages for data science (ggplot2 (http://ggplot2.org/), dplyr (https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html), tidyr (https://github.com/hadley/tidyr)), data ingest (readr (https://github.com/hadley/readr), readxl (https://github.com/hadley/readxl), haven (https://github.com/hadley/haven)), and principled software development (roxygen2 (https://github.com/yihui/roxygen2), testthat (https://github.com/hadley/testthat), devtools (https://github.com/hadley/devtools)). He is also a writer, educator, and frequent speaker promoting the use of R for data science. Learn more on his homepage, http://hadley.nz (http://hadley.nz/).

Pizza starts at 6 and the talk at 7, after which we will go to the local bar.