
R workshop XX: Parallel Computing with R

Hosted By
Vivian Z.

Details

Contributed by

Yuan Huang, PM Intern of SupStat Inc., and

Tong He, Data Scientist of SupStat Inc.

Vivian Zhang, CTO of SupStat Inc, will deliver this workshop.

Content:

Our slides can be found at:

http://nycdatascience.com/slides/parallel_R/index.html#1

http://nycdatascience.com/slides/parallel_R/examples_general.html

http://nycdatascience.com/slides/parallel_R/example_crossvalidataion.html

http://nycdatascience.com/slides/parallel_R/example_web.html

We will go over the steps toward parallel computing:

1. Deciding whether the problem is parallelizable.

2. Tips to improve the efficiency of parallel computing.

3. Implementation in R.

We will discuss how to balance the load, how to reduce parallel overhead, how to make sure each node uses a different random number stream, and a few statistical models that can be parallelized.
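
The three concerns above can be sketched with the parallel package, which ships with R. This is only an illustration under stated assumptions: the 2-worker local socket cluster, the seed, and the task sizes are made up for the example and are not taken from the workshop slides.

```r
library(parallel)

# Assumption: a small local socket cluster with 2 workers.
cl <- makeCluster(2)

# Random numbers: give each worker its own L'Ecuyer-CMRG stream so that
# workers do not repeat each other's draws. The seed is arbitrary.
clusterSetRNGStream(cl, iseed = 123)
draws <- clusterEvalQ(cl, runif(1))   # one draw per worker

# Load balancing: tasks of uneven cost (hypothetical sizes) are handed
# out as workers become free instead of being split up front.
task_sizes <- c(2000, 10, 10, 10, 2000, 10)
lb_means <- parLapplyLB(cl, task_sizes, function(n) mean(rnorm(n)))

# Overhead: send one large chunk per worker rather than 1000 tiny tasks,
# so the communication cost is paid only a few times.
chunks <- splitIndices(1000, length(cl))
partial_sums <- parLapply(cl, chunks, function(idx) sum(sqrt(idx)))
total <- sum(unlist(partial_sums))

stopCluster(cl)
```

Here `parLapplyLB` trades some extra communication for better balance, while the chunked `parLapply` call does the opposite, so which one helps depends on how uneven the tasks are.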

We will also give an overview of:

1. Rmpi (R interface to MPI; flexible and powerful, but more complex)

2. snow (used today as the backend for the foreach package)

3. multicore (works only on a single node, on Linux-like machines)

4. parallel (hybrid package containing snow and multicore)

5. foreach (parallel backends: doSNOW / doMPI / doMC)
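
As a rough sketch of how foreach and a snow backend fit together, the loop below registers a doSNOW backend and runs a trivial computation on it. The 2-worker socket cluster and the toy loop body are assumptions chosen for illustration; the foreach and doSNOW packages must be installed.

```r
library(foreach)
library(doSNOW)   # also attaches snow

# Assumption: 2 local socket workers.
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)   # tell foreach to send %dopar% work to this cluster

# Each iteration runs on a worker; .combine = c collects the results
# into a vector, in iteration order by default.
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}

stopCluster(cl)
print(squares)
```

Because the loop body only talks to foreach, switching to another backend such as doMC or doMPI should mostly be a matter of changing the registration lines, not the loop itself.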

Finally, we will give examples using the foreach package:

1. Bootstrapping: calculating a confidence interval for the median

2. Random forest

3. Calculating pairwise distances

4. Cross-validation

5. Web scraper
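
To give a flavor of the first example, here is a minimal sketch of a bootstrap confidence interval for the median using foreach with a doSNOW backend. It is not the code from the slides: the simulated sample `x`, the number of replicates `B`, the 2-worker cluster, and the per-replicate seeding are all assumptions made for illustration.

```r
library(foreach)
library(doSNOW)

# Hypothetical data: 200 draws from an exponential distribution.
set.seed(1)
x <- rexp(200)

# Assumption: 2 local socket workers.
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

B <- 500   # number of bootstrap replicates (arbitrary choice)
boot_medians <- foreach(b = seq_len(B), .combine = c) %dopar% {
  # Simplification: seed each replicate separately so the run is
  # reproducible; dedicated parallel RNG streams are the sounder choice.
  set.seed(1000 + b)
  median(sample(x, replace = TRUE))
}

stopCluster(cl)

# Percentile bootstrap 95% confidence interval for the median.
ci <- quantile(boot_medians, probs = c(0.025, 0.975))
print(ci)
```

Each replicate is independent of the others, which is what makes bootstrapping a natural fit for `%dopar%`.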
AI Zero to Hero
On Deck
1400 Broadway, 25th Floor · New York, NY