Run a for loop in parallel in R

Time: 2022-11-17 13:48:47

I have a for loop that is something like this:

finalMatrix <- NULL # cbind() ignores NULL, so the first column attaches cleanly
for (i in 1:150000) {
   tempMatrix <- functionThatDoesSomething() # calling a function
   finalMatrix <- cbind(finalMatrix, tempMatrix)
}

Could you tell me how to make this run in parallel?

I tried this based on an example online, but am not sure if the syntax is correct. It also didn't increase the speed much.

finalMatrix = foreach(i=1:150000, .combine=cbind) %dopar%  {
   tempMatrix = {}
   tempMatrix = functionThatDoesSomething() #calling a function

   cbind(finalMatrix, tempMatrix)

}

1 solution

#1


33  

Thanks for your feedback. I did look up parallel after I posted this question.

Finally, after a few tries, I got it running. I have added the code below in case it is useful to others.

library(foreach)
library(doParallel)

# Set up a parallel backend that uses several processors
cores <- detectCores()
cl <- makeCluster(max(1, cores - 1)) # leave one core free so the computer is not overloaded
registerDoParallel(cl)

finalMatrix <- foreach(i = 1:150000, .combine = cbind) %dopar% {
   tempMatrix <- functionThatDoesSomething() # calling a function
   # do other things if you want

   tempMatrix # the returned value; .combine = cbind plays the role of finalMatrix = cbind(finalMatrix, tempMatrix)
}

# Stop the cluster
stopCluster(cl)
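
The question mentions that the foreach version "didn't increase the speed much". One possible reason (an assumption, since functionThatDoesSomething() is not shown) is that each iteration is cheap, so the cost of dispatching 150000 separate tasks outweighs the work itself. A minimal sketch of one common remedy, sending each worker a chunk of indices instead of a single index, might look like this. Here makeColumn is a hypothetical stand-in for functionThatDoesSomething(), and the loop length and worker count are shrunk for illustration.

```r
library(foreach)
library(doParallel)

# Hypothetical stand-in for functionThatDoesSomething(): returns a 2 x 1 matrix.
makeColumn <- function(i) matrix(c(i, i^2), ncol = 1)

n <- 1000          # small stand-in for 150000
nWorkers <- 2      # arbitrary choice for this demo

cl <- makeCluster(nWorkers)
registerDoParallel(cl)

# splitIndices() from the parallel package splits 1..n into contiguous chunks,
# one per worker.
chunks <- splitIndices(n, nWorkers)

finalMatrix <- foreach(idx = chunks, .combine = cbind) %dopar% {
  # Each task builds all the columns for its chunk in one go.
  do.call(cbind, lapply(idx, makeColumn))
}

stopCluster(cl)

dim(finalMatrix)   # 2 rows, n columns
```

Whether this helps depends on how expensive each call really is; if a single call already takes seconds, chunking is unlikely to change much.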

Note: if you allocate too many processes, you may get this error: Error in serialize(data, node$con) : error writing to connection
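
One way to avoid asking for too many processes is to compute the worker count defensively before calling makeCluster(). The sketch below is only an illustration: maxWorkers is a hypothetical cap chosen for the example, not a value recommended by foreach or doParallel.

```r
library(parallel)

# detectCores() can return NA on some platforms, so fall back to 1.
cores <- detectCores()
if (is.na(cores)) cores <- 1L

# Leave one core free, never drop below 1, and respect an assumed upper cap.
maxWorkers <- 4L   # hypothetical limit; tune for your machine
nWorkers <- max(1L, min(cores - 1L, maxWorkers))

nWorkers           # pass this to makeCluster()
```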

Note: if .combine in the foreach statement is rbind, the returned object is built by appending the output of each iteration row-wise.
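
To make the difference concrete, here is a small sketch that runs the same loop body with both combiners. The loop body, the two-worker cluster, and the names byRow and byCol are made up for the illustration.

```r
library(foreach)
library(doParallel)

cl <- makeCluster(2)   # two workers, an arbitrary choice for the demo
registerDoParallel(cl)

# Each iteration returns a named vector of length 2.
byRow <- foreach(i = 1:4, .combine = rbind) %dopar% {
  c(index = i, square = i^2)
}
byCol <- foreach(i = 1:4, .combine = cbind) %dopar% {
  c(index = i, square = i^2)
}

stopCluster(cl)

dim(byRow)   # one row per iteration: 4 x 2
dim(byCol)   # one column per iteration: 2 x 4
```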

Hope this is useful for folks trying out parallel processing in R for the first time like me.

References: http://www.r-bloggers.com/parallel-r-loops-for-windows-and-linux/ https://beckmw.wordpress.com/2014/01/21/a-brief-foray-into-parallel-processing-with-r/
