Visualising high-dimensional data tutorial by Prof Di Cook
Details
Visualising high-dimensional data
This workshop will focus on parts A and B, and if time allows we will venture into high dimensions, part C:
A. Review basic visualisation and inference with graphics: This part covers making plots using the grammar of graphics and how this fits into statistical inference. We will use the packages ggplot2 and nullabor.
B. Plotting multiple dimensions in a single static plot, adding interaction: The building blocks for viewing high dimensions are generalised pairs plots and parallel coordinate plots, available in the R package GGally. We will discuss their many variations and options, and how to make these plots interactive with the plotly package.
C. Using dynamic plots (tours) to examine models in the data space, beyond 3D: This part will cover the use of tours to examine multivariate spaces, in relation to dimension reduction techniques like principal component analysis and t-SNE, and to supervised and unsupervised classification models. We will also examine high-dimension, low-sample-size problems. The tourr and spinifex packages will be used.
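For a flavour of what each part involves, here is a minimal sketch in R. It is not taken from the workshop materials: the built-in mtcars and iris data sets, the seed, and the choice of variables are assumptions made purely for illustration, and the four packages must already be installed.

```r
library(ggplot2)
library(nullabor)
library(GGally)
library(tourr)

# Part A: a lineup. The real scatterplot is hidden among null plots
# in which mpg has been permuted; can you pick out the real one?
set.seed(2024)  # arbitrary seed, only so the lineup is reproducible
d <- lineup(null_permute("mpg"), mtcars)
ggplot(d, aes(x = mpg, y = wt)) +
  geom_point() +
  facet_wrap(~ .sample)

# Part B: a generalised pairs plot and a parallel coordinate plot
# of the four numeric iris variables, coloured by species (column 5).
ggpairs(iris, columns = 1:4, mapping = aes(colour = Species))
pcp <- ggparcoord(iris, columns = 1:4, groupColumn = 5)
pcp

# Adding interaction: convert the static ggplot to a plotly widget.
plotly::ggplotly(pcp)

# Part C: a grand tour, an animated sequence of 2D projections
# of the 4D data (run in an interactive R session).
animate_xy(iris[, 1:4], col = iris$Species)
```

The lineup in part A is where graphics meets inference: if viewers can reliably pick the real data out of the null plots, that is evidence the pattern in it is not due to chance.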
The workshop is interactive: bring your laptop, work along with the instructor, and try the challenge exercises.
Materials are designed for an intermediate audience: users who are familiar with R, basic visualisation and tidyverse tools, and who would like to improve their knowledge of data visualisation.
Biography
Dianne Cook is Professor of Business Analytics at Monash University in Melbourne, Australia. She is a world leader in data visualisation, especially the visualisation of high-dimensional data using tours with low-dimensional projections, and projection pursuit. She is currently focussing on bridging the gap between exploratory graphics and statistical inference. Di is a Fellow of the American Statistical Association, was recently the editor of the Journal of Computational and Graphical Statistics, and has been elected as an Ordinary Member of the R Foundation. Several of her students have won the prestigious American Statistical Association John Chambers Software Award, including Hadley Wickham, Yihui Xie, Carson Sievert, and most recently, Monash student Earo Wang.
