Parallelization of Simulations with the foreach Package and Missing Data in R

Details
Come out to our March Meetup to hear talks from two great R Ladies!
First up is Elizabeth Sweeney of Flatiron Health, a loyal R Ladies NYC member, who will talk about Parallelization of Simulations with the foreach R Package: Progression Free Survival Assessed Using Electronic Health Records. About the talk: In oncology, progression-free survival (PFS) is defined as the time to detection of disease progression or death, and is commonly captured in clinical trials as an endpoint to evaluate treatment efficacy. In trial settings, progression is assessed at regular intervals. In contrast, progression in electronic health record (EHR) data is captured at variable intervals, whenever patients see their oncologist. Therefore, to better understand how sensitive the EHR-derived real-world progression variable is to treatment effects, we have designed a set of simulations of progression in both the clinical trial and EHR settings at Flatiron Health. We simulate two major sources of error (variable timing of progression assessments and error from extracting data from the EHR), and measure the impact of these errors on our ability to detect a known treatment effect. To speed up the run time of these simulations, we use the foreach package. The foreach package provides a looping construct for executing R code repeatedly, with support for parallel execution. In this talk I will give an overview of the simulations with a focus on the use of the foreach package for parallelization.
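For readers new to foreach, here is a minimal sketch of how independent simulation replicates might be run in parallel. It is not the speaker's simulation code: the function simulate_once, its parameters (n, hazard_ratio, the baseline rate of 0.1), and the choice of the doParallel backend with two workers are all assumptions made for illustration.

```r
library(foreach)
library(doParallel)

# Hypothetical single replicate (illustration only): draw exponential
# event times for two arms and estimate the hazard ratio from the
# per-arm event rates (events divided by total follow-up time).
simulate_once <- function(seed, n = 200, hazard_ratio = 0.7) {
  set.seed(seed)                      # one seed per replicate, for reproducibility
  arm  <- rbinom(n, 1, 0.5)           # 1 = treated, 0 = control
  rate <- 0.1 * ifelse(arm == 1, hazard_ratio, 1)
  time <- rexp(n, rate = rate)
  rate_treated <- sum(arm == 1) / sum(time[arm == 1])
  rate_control <- sum(arm == 0) / sum(time[arm == 0])
  data.frame(seed = seed, est_hr = rate_treated / rate_control)
}

# Register a parallel backend (assumed here: 2 local worker processes).
cl <- makeCluster(2)
registerDoParallel(cl)

# Each iteration runs on a worker; .combine = rbind stacks the
# one-row data frames returned by the replicates into one data frame.
results <- foreach(s = 1:100, .combine = rbind) %dopar% simulate_once(s)

stopCluster(cl)

# Average estimated hazard ratio across replicates.
mean(results$est_hr)
```

Replacing %dopar% with %do% runs the same loop sequentially, which can be handy for debugging before scaling up the number of workers.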
Following Elizabeth is Mine Dogucu, founder of both R Ladies Columbus and R Ladies Sarasota, who will introduce missing data terminology and ways of handling missing data using R. Details: It is common for data scientists and researchers to encounter missing data. The term “missing” often carries a bad connotation, and missing data are usually considered a nuisance. Advances in missing data treatments, such as multiple imputation and maximum likelihood procedures, allow data analysts to obtain unbiased parameter estimates under certain circumstances.
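As a rough illustration of what multiple imputation can look like in R, the sketch below uses the mice package and its bundled nhanes example data. The package choice, the regression model, and the number of imputations are assumptions for illustration and are not taken from the talk.

```r
library(mice)

# nhanes ships with mice and has missing values in bmi, hyp, and chl.
# Create m = 5 imputed data sets (seed fixed for reproducibility).
imp <- mice(nhanes, m = 5, seed = 1, printFlag = FALSE)

# Fit the same (illustrative) regression model on each imputed data set.
fit <- with(imp, lm(bmi ~ age + chl))

# Pool the per-data-set estimates into a single set of estimates
# and standard errors using Rubin's rules.
pooled <- pool(fit)
summary(pooled)
```

The pooled standard errors reflect both the sampling variability within each imputed data set and the variability between imputations.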
About the speakers:
Elizabeth Sweeney is a senior quantitative scientist at Flatiron Health, where she works on the analysis of electronic health record (EHR) data in oncology. Previously she was a Rice Academy Postdoctoral Fellow in the Statistics department at Rice University. She did her PhD on methods for neuroimaging analysis in the Department of Biostatistics at the Johns Hopkins Bloomberg School of Public Health with Ciprian Crainiceanu and Taki Shinohara. She is an avid R user and co-taught a Coursera course on neuroimaging analysis in R called Neurohacking in R.
Mine Dogucu completed her PhD at the Ohio State University. During her graduate studies she founded R Ladies Columbus. In August 2017, she joined the Data Science Program at New College of Florida as Assistant Professor of Applied Statistics. She has been teaching statistics courses at both the undergraduate and graduate levels. She has also founded R Ladies Sarasota.
