For our second meetup this month, we are very excited to have Hadley Wickham presenting his new bigvis initiative.
About his talk:
R has a notorious reputation for not being able to deal with "big" data (and ggplot2 is a frequent culprit). Fortunately, this isn't an underlying problem with R, and it's something that we can fix with good programming practices and intelligent use of compiled code. In
this talk, I'll introduce a new package, bigvis, that aims to make it easier (and faster) to work with very large datasets.
Bigvis makes it possible to visualise [masked] million observations in just a few seconds. It is built around a pipeline of bin, summarise, smooth and visualise, and makes minimal sacrifices of flexibility to achieve fast performance. As well as discussing the visualisation challenges when you have tens of millions of observations, I'll discuss the performance challenges, and how C++ and Rcpp make it pleasurable to integrate compiled code into R.
For those of you who don't know about Hadley:
Hadley Wickham is Chief Scientist at RStudio and an adjunct Assistant Professor at Rice University. He's interested in building tools (both computational and cognitive) that make data preparation, visualization and analysis easier. His contributions to R include over 30 R
packages, for data analysis (ggplot2, plyr, reshape), making frustrating parts of R easier to use (lubridate for dates, stringr for strings, httr for accessing web APIs), and for streamlining the R package development process (roxygen2, testthat, devtools, profr).
Pizza starts at 6:15, Hadley will begin at 7, and we will head to the bar afterward.
Thank you again to Knewton for hosting.