doMC vs doSNOW vs doSMP vs doMPI: why aren't the various parallel backends for "foreach" functionally equivalent?

Date: 2020-12-13 09:17:42

I've got a few test pieces of code that I've been running on various machines, always with the same results. I thought the philosophy behind the various do... packages was that they could be used interchangeably as a backend for foreach's %dopar%. Why is this not the case?

For example, this code snippet works:

library(plyr)
library(doMC)
registerDoMC()
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE)

Each of these code snippets, however, fails:

library(plyr)
library(doSMP)
workers <- startWorkers(2)
registerDoSMP(workers)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE) 
stopWorkers(workers)

library(plyr)
library(snow)
library(doSNOW)
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE) 
stopCluster(cl)

library(plyr)
library(doMPI)
cl <- startMPIcluster(count = 2)
registerDoMPI(cl)
x <- data.frame(V= c("X", "Y", "X", "Y", "Z" ), Z = 1:5)
ddply(x, .(V), function(df) sum(df$Z),.parallel=TRUE) 
closeCluster(cl)

In all four cases, foreach(i = 1:3,.combine = "c") %dopar% {sqrt(i)} yields the exact same result, so I know I have the packages installed and working properly on each machine I've tested them on.

What is doMC doing differently from doSMP, doSNOW, and doMPI?

1 answer

#1


31  

doMC forks the current R process so it inherits all the existing variables. All the other do backends only pass on explicitly requested variables. Unfortunately I didn't realise that, and only tested with doMC - this is something I hope to fix in the next version of plyr.
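
To illustrate the difference, here is a minimal sketch of requesting variables explicitly on a non-forking backend, using foreach directly rather than plyr. The names offset and add_offset are hypothetical and only serve the illustration; the sketch assumes foreach, snow and doSNOW are installed, as in the question.

```r
library(foreach)
library(snow)
library(doSNOW)

# Socket workers are fresh R processes: they start with an empty workspace
# and do not inherit anything from the master session.
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

# Hypothetical objects that exist only in the master session.
offset <- 10
add_offset <- function(i) i + offset

# 'offset' is used only inside add_offset(), not in the loop body itself,
# so it is requested explicitly through .export instead of relying on
# foreach to detect it. (Assumption: depending on the foreach version this
# may produce an "already exporting" warning, which is harmless here.)
res <- foreach(i = 1:3, .combine = "c", .export = "offset") %dopar% {
  add_offset(i)
}

stopCluster(cl)
print(res)
```

With doMC none of this is needed, because the forked workers start as copies of the master process and already see offset. Packages are handled in a similar way on the non-forking backends: foreach's .packages argument names the packages that should be loaded on each worker.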
