We're excited to host Jared Lander, Chief Data Scientist of Lander Analytics, the organizer of the New York Open Statistical Programming Meetup and the New York R Conference, and author of R for Everyone, to talk about parallel computing in R.
6:15-7: Food & networking
7-7:10: Kick-off and announcements
Everyone wants their code to run faster, and there are numerous ways to achieve this goal. We start by looking at the popular packages `dplyr`, `data.table` and `purrr` and their corresponding parallel implementations. We then turn our attention to writing simple C++ functions integrated into R, both sequentially and in parallel. We also build a `data.frame` aggregation function, starting sequentially and ending in parallel. Throughout this talk we see how to speed up code by running in parallel, locally and across nodes, in R and C++, all within the friendly confines of RStudio.
Jared Lander is the Chief Data Scientist of Lander Analytics, a data science consultancy based in New York City; the Organizer of the New York Open Statistical Programming Meetup and the New York R Conference; and an Adjunct Professor of Statistics at Columbia University. With a master's from Columbia University in statistics and a bachelor's from Muhlenberg College in mathematics, he has experience in both academic research and industry. His work for both large and small organizations ranges from music and fundraising to finance and humanitarian relief efforts.
He specializes in data management, multilevel models, machine learning, generalized linear models and statistical computing. He is the author of R for Everyone: Advanced Analytics and Graphics, a book about R programming geared toward data scientists and non-statisticians alike.