Manipulating DataFrames with Julia and R


Details
One of the most critical parts of any regression, classification, or visualization task is the pre-processing involved. Once the data has been properly transformed, exploratory data analysis and modeling become trivial in many cases.
The presentation follows the basic workflow popularized by Hadley Wickham (https://r4ds.had.co.nz/explore-intro.html) to perform data wrangling on DataFrames, akin to SQL queries in databases.
Julia and R provide complementary toolsets for both exploratory data analysis and modeling. The discussion will present examples where both are used in the same workflow, each contributing the strengths of its libraries: a single program processes the same dataframe, with Julia calling into R for the tasks that R libraries handle best.
Outline:
- split-apply-combine
- reshaping
- joins and merges
- missing data
- threads and distributed processing
- visualization
We will have a practical session after the presentation, in which we will analyze a dataset you bring (anonymized) or one available for download on the internet, to gain a deeper understanding of the processing involved.
