Description
Introduce yourself to R and the powerful graphing library based on the Grammar of Graphics--ggplot2. Attendees will work in small teams to learn how to generate basic and advanced plots in ggplot2 to solve a variety of problems. The workshop will also review the fundamentals of data visualization to increase the readability and clarity of plots.
The workshop is open to all types of users, including those who are unfamiliar with R. We will mix some demonstration with small group-based projects. Basic principles of data visualization will also be emphasized alongside ggplot2 demonstrations to put the program into a larger context.
Audience
The workshop is targeted to individuals who are not familiar with ggplot2, including beginners who are new to the R software. Attendees will need to bring their own computer, on which we will install R and ggplot2--don't worry, both are open source and free.
Time, Location, & Signup
The workshop begins October 8 in the IMSA classroom at 1871, located on the 12th floor of The Merchandise Mart (222 W. Merchandise Mart Plaza). It includes four sessions (outline below) meeting on consecutive Mondays at 6pm. The IMSA room isn't available on October 22, so depending on the number of attendees, we will either meet in a smaller conference room that day or push the remaining schedule back a week.
There are only 30 seats available for this workshop due to the size limitations of the IMSA classroom. Interested attendees need to go to the CDVG meetup site to sign up for each of the four sessions.
Workshop Leaders
The workshop will be led by CDVG member Tom Schenk. Tom is a Senior Research Data Analyst at Northwestern University, Department of Medical Social Sciences. You can read more about Tom on his website. He also curates Data Nouveau--a collection of interesting data visualizations on the web.
Tom will be assisted by CDVG member Josh Doyle (who is relatively new to R & ggplot2 and will ask the dumb questions so others won't have to). We also expect to have some other experienced folks in the room to help out.
Workshop Outline
Introduction to R (October 8)
We will familiarize ourselves with the R environment through a gentle introduction to the basic functions. After installing R, we will import and inspect data sets while becoming familiar with R terminology. By the end of the class, we will produce basic descriptive summaries and plots of the data.
Introduction to ggplot2 (October 15)
We will begin to use the ggplot2 package to create basic, but handsome, univariate, bivariate, and time-series graphs. We will introduce the functions and terminology used in ggplot2. We will also explain the fundamentals of proper data visualization techniques and how they relate to the ggplot2 defaults.
Grammar of Graphics (October 22 or 29)
We will continue with more advanced features of ggplot2, including how they relate to Leland Wilkinson's Grammar of Graphics. We will show how to plot more than two variables in a single graph using colors, shapes, and sizes. We will also discuss how the human ability to perceive different shapes and colors should drive the choices we make in data visualization.
Plots for Publications (October 29 or November 5)
After learning how to make plots, we will learn how to customize graphs with custom colors, labels, and themes. We will emphasize how to create a customized look to be included in publications, including adding labels in diagrams to help readers.
Hi all, a couple afterthoughts re. topics from last night:
We saw that 1:10 + 1:2 is valid and "recycles" the shorter vector in order to perform the addition. Similarly, matrix(1:3, nrow = 6, ncol = 4) is valid, and recycles the provided content (which is c(1,2,3)) to fill the requested dimensions (6x4). "Recycling" is a pervasive concept in R.
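For anyone who wants to try the recycling behavior at home, here is a small sketch. The variable names (a, m, b) are just for illustration, and the last case, where the lengths do not divide evenly, is an extra that was not covered in class.

```r
# Vector recycling: the shorter operand is repeated to match the longer one.
a <- 1:10 + 1:2
print(a)  # 1:2 is reused five times: 1+1, 2+2, 3+1, 4+2, ...

# matrix() recycles its data to fill the requested shape, column by column.
m <- matrix(1:3, nrow = 6, ncol = 4)
print(dim(m))   # 6 rows, 4 columns
print(m[, 1])   # first column holds 1, 2, 3 twice over

# When the longer length is not a multiple of the shorter one, R still
# recycles but emits a warning (suppressed here to keep the output clean).
b <- suppressWarnings(1:10 + 1:3)
print(b)
```

The warning in the last case is worth watching for in real work, since it usually means the two vectors were not supposed to differ in length.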
Re. the equivalence of <- and =: it should be noted that = is valid for defining default parameter values in function definitions, while <- is not.
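A quick sketch of that difference, using a made-up function scale_by and a made-up variable f (neither is from the workshop). The second half, about what <- does inside a call, goes slightly beyond the point above but is closely related.

```r
# `=` inside the parameter list sets a default value.
scale_by <- function(x, factor = 2) {
  x * factor
}
print(scale_by(1:3))               # uses the default factor
print(scale_by(1:3, factor = 10))  # `=` in a call names the argument

# Writing the default with `<-` does not even parse; we check that by
# parsing the text instead of typing it directly.
parsed_ok <- tryCatch({
  parse(text = "function(x, k <- 2) x * k")
  TRUE
}, error = function(e) FALSE)
print(parsed_ok)

# In a call, `<-` is an ordinary assignment: it creates `f` in the calling
# environment and passes the value by position.
res <- scale_by(1:3, f <- 10)
print(f)
```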
I would add re. R and MATLAB that R is not particularly efficient, and that speaking generally MATLAB is for _computation_ while R is for _data_.
An example of using a logical vector to select vector elements, a topic we touched briefly, is:
v[v > 5]
where v is a numeric vector (or any vector type that supports the comparison). The expression inside the brackets evaluates to a logical vector equal in length to v, and only the elements where it is TRUE are kept.
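A minimal worked version, with sample values and names (v, keep, words) chosen only for illustration:

```r
v <- c(3, 8, 1, 12, 5, 7)

# The comparison alone gives a logical vector, one entry per element of v.
keep <- v > 5
print(keep)

# Indexing with that logical vector keeps the TRUE positions.
print(v[keep])   # same result as v[v > 5]

# The same idea works for other vector types, e.g. character vectors.
words <- c("apple", "fig", "banana")
long_words <- words[nchar(words) > 3]
print(long_words)
```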
Best, Daniel B
October 9
Introductory, basic, succinct and informative with good leadership by Josh and participation by the attendees.
October 8
Short example on logical_vectors -- in class:
Examine output from the following:
1&0
1&-1
(10:0)
(10:0)&1
(10:0)&&1
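For reference, a sketch of the same expressions with the results stored in variables (r1 to r4 are names made up here). One caveat, to the best of my knowledge: newer R releases (4.3.0 and later) treat && with a left-hand side longer than one element as an error, whereas older versions quietly used only the first element, so the last line is wrapped in tryCatch to run either way.

```r
r1 <- 1 & 0        # FALSE: zero counts as FALSE
r2 <- 1 & -1       # TRUE: any nonzero number counts as TRUE
r3 <- (10:0) & 1   # elementwise: TRUE for 10 down to 1, FALSE for the final 0

# `&&` expects single values. Depending on the R version this either uses
# only the first element or fails, so we catch the error case.
r4 <- tryCatch(
  suppressWarnings((10:0) && 1),
  error = function(e) "error in this R version"
)

print(r1)
print(r2)
print(r3)
print(r4)
```

The takeaway is that & works elementwise on whole vectors, while && is meant for single TRUE/FALSE conditions such as those in if statements.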
October 8