For several efforts I'm involved in at the moment, I am running large datasets with numerous parameter combinations through a series of functions. The functions have a wrapper (so I can mclapply) for ease of operation on a cluster. However, I run into two major challenges.
a) My parameter combinations are large (think 20k to 100k). Sometimes particular combinations will fail (as a hypothetical scenario, survival is too high and mortality is too low, so the model never converges). It's difficult for me to suss out ahead of time exactly which combinations will fail (life would be easier if I could do that). But for now I have this type of setup:
failsafe <- failwith(NULL, my_wrapper_function)  # failwith() is from plyr
# This is what I run on the cluster
# Note that input_variables contains a list of variables in each list item
results <- mclapply(input_variables, failsafe, mc.cores = 72)
# On my local dual-core Mac I can't do that, so the equivalent would be:
results <- llply(input_variables, failsafe, .progress = 'text')
The skeleton for my wrapper function looks like this:
my_wrapper_function <- function(tlist) {
  run <- tryCatch(
    my_model(tlist$a, tlist$b, tlist$sA, tlist$Fec, m = NULL),
    error = function(e) NULL
  )
  ...
  return(run)
}
Is this the most efficient approach? If for some reason a particular combination of variables crashes the model, I need it to return NULL and carry on with the rest. However, I still find that this fails less than gracefully.
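One sketch of a slightly more graceful variant (illustrative only; `my_model()` and the `tlist` fields are the asker's hypothetical objects): instead of returning a bare NULL, the wrapper can return the error message, so failed parameter combinations can be identified and diagnosed after the run.

```r
# Illustrative sketch: capture the error message rather than discarding it.
my_wrapper_function <- function(tlist) {
  tryCatch(
    my_model(tlist$a, tlist$b, tlist$sA, tlist$Fec, m = NULL),
    error = function(e) structure(list(message = conditionMessage(e)),
                                  class = "model_failure")
  )
}

# Afterwards, failed combinations can be located with, e.g.:
# failed <- sapply(results, inherits, what = "model_failure")
```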
b) Sometimes a certain combination of inputs does not crash the model but takes too long to converge. I set a limit on computation time on my cluster (say 6 hours) so I don't waste resources on something that is stuck. How can I include a timeout such that, if a function call takes more than x time on a single list item, it moves on? Calculating the time spent is trivial, but a function mid-simulation can't be interrupted to check the time, right?
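As one possibility (not from the original thread), base R's `setTimeLimit()` imposes an elapsed-time budget on evaluation: once exceeded, it raises an error at the next interruptible point, which a `tryCatch()` can then absorb. A rough sketch, again using the asker's hypothetical `my_model()`:

```r
# Sketch using base R's setTimeLimit(); transient = TRUE makes the limit
# apply only to the current top-level computation.
run_with_budget <- function(tlist, seconds = 6 * 3600) {
  setTimeLimit(elapsed = seconds, transient = TRUE)
  on.exit(setTimeLimit(elapsed = Inf), add = TRUE)  # clear the limit afterwards
  tryCatch(
    my_model(tlist$a, tlist$b, tlist$sA, tlist$Fec, m = NULL),
    error = function(e) NULL  # includes "reached elapsed time limit" errors
  )
}
```

One caveat: the check only happens at interruptible points in R code, so a simulation stuck deep inside compiled C/Fortran code may not be interrupted until control returns to R.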
Any ideas, solutions or tricks are appreciated!
1 Answer
You may well be able to manage graceful exits upon timeout using a combination of tryCatch() and evalWithTimeout() (since renamed withTimeout()) from the R.utils package. See also this post, which presents similar code and unpacks it in a bit more detail.
require(R.utils)

myFun <- function(x) { Sys.sleep(x); x^2 }

## evalWithTimeout() times out evaluation after 3.1 seconds, and then
## tryCatch() handles the resulting error (of class "TimeoutException") with
## grace and aplomb.
myWrapperFunction <- function(i) {
  tryCatch(expr = evalWithTimeout(myFun(i), timeout = 3.1),
           TimeoutException = function(ex) "TimedOut")
}

sapply(1:5, myWrapperFunction)
# [1] "1"        "4"        "9"        "TimedOut" "TimedOut"
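Adapting this pattern to the original setup might look like the following sketch (illustrative only; `my_model()` and `input_variables` are the asker's hypothetical objects, and `withTimeout()` is the current name of R.utils' `evalWithTimeout()`):

```r
library(R.utils)
library(parallel)

# Combine the error failsafe from part (a) with the timeout from this answer.
failsafe_timed <- function(tlist) {
  tryCatch(
    withTimeout(
      my_model(tlist$a, tlist$b, tlist$sA, tlist$Fec, m = NULL),
      timeout = 6 * 3600  # the 6-hour cluster budget, in seconds
    ),
    TimeoutException = function(ex) NULL,  # took too long: move on
    error = function(e) NULL               # crashed: move on
  )
}

results <- mclapply(input_variables, failsafe_timed, mc.cores = 72)
```

The timeout shares the same caveat as setTimeLimit(), on which it is built: evaluation can only be interrupted at points where R checks for interrupts.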