parSapply not finding objects in the global environment

Date: 2021-09-24 13:54:36

I am trying to run code on several cores (I tried both the snow and parallel packages). I have

cl <- makeCluster(2)
y  <- 1:10
sapply(1:5, function(x) x + y)  # Works
parSapply(cl, 1:5, function(x) x + y)

The last line returns the error:

Error in checkForRemoteErrors(val) : 
  2 nodes produced errors; first error: object 'y' not found

Clearly parSapply isn't finding y in the global environment. Any ways to get around this? Thanks.

2 answers

#1


22  

The nodes don't know about the y in the global environment on the master. You need to tell them somehow.

library(parallel)
cl <- makeCluster(2)
y  <- 1:10
# Option 1: add y to the function definition and pass it in the parSapply call
parSapply(cl, 1:5, function(x, y) x + y, y)
# Option 2: export y to the global environment of each node,
# then call your original code
clusterExport(cl, "y")
parSapply(cl, 1:5, function(x) x + y)
# shut down the workers when done
stopCluster(cl)
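
To see the effect of the export directly, you can ask each worker whether y exists in its own global environment before and after clusterExport. This is only a small sketch, assuming a fresh two-worker local cluster; the names before, after and res are made up for illustration.

```r
library(parallel)

cl <- makeCluster(2)
y  <- 1:10

# Ask every worker whether 'y' is defined in its own global environment.
# On freshly started workers this is expected to be FALSE for each node.
before <- unlist(clusterEvalQ(cl, exists("y", envir = globalenv(), inherits = FALSE)))

# Copy 'y' from the master's global environment to each worker.
clusterExport(cl, "y")

# The same check should now report TRUE for each node.
after <- unlist(clusterEvalQ(cl, exists("y", envir = globalenv(), inherits = FALSE)))

# With 'y' present on the workers, the original call runs without error.
res <- parSapply(cl, 1:5, function(x) x + y)

stopCluster(cl)
```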

#2


6  

It is worth mentioning that your example will work if parSapply is called from within a function, although the real issue is where the function function(x) x + y is created. For example, the following code works correctly:

library(parallel)
fun <- function(cl, y) {
  parSapply(cl, 1:5, function(x) x + y)
}
cl <- makeCluster(2)
fun(cl, 1:10)
stopCluster(cl)

This is because functions that are created inside other functions are serialized along with the local environment in which they were created, while functions created in the global environment are not serialized along with the global environment. This can be useful at times, but it can also lead to a variety of problems if you're not aware of the issue.
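
One such problem is that everything in the enclosing local environment is sent to the workers, whether the function uses it or not. The sketch below illustrates this with base R's serialize(), without starting a cluster; make_adder, big, f and g are hypothetical names chosen for the example.

```r
# A function factory: the returned closure keeps the factory's local
# environment alive, including the unused object 'big'.
make_adder <- function(y) {
  force(y)                 # evaluate y now so the closure stores its value
  big <- numeric(1e6)      # about 8 MB, never used by the returned function
  function(x) x + y
}

f <- make_adder(1:10)      # created inside a function
g <- function(x) x + y     # created in the global environment

# The closure's environment holds both 'big' and 'y'.
local_vars <- ls(environment(f))

# Serializing f drags its whole local environment along;
# serializing g only records a reference to the global environment.
size_local  <- length(serialize(f, NULL))
size_global <- length(serialize(g, NULL))
```

Here size_local should come out several orders of magnitude larger than size_global, so a closure created deep inside a function can make each parSapply call much more expensive than it looks.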
