R - tryCatch warning message is written into the data

Time: 2022-12-19 20:22:30

What I'm trying to achieve

I'm trying to write my own 'impute' function in R with a tryCatch statement which:

  1. Outputs a warning/error message containing the function name, so I can debug more easily.
  2. Raises a warning if the function runs OK but doesn't impute all the missing values.

ImputeVariables <- function(impute.var, impute.values, filter.var) {
    # Function to impute values.
    # impute.var    = variable with NAs
    # impute.values = the value(s) to replace NAs with; the value labels are levels
    # filter.var    = the variable to filter on
    # filter.levels = the categories of filter.var
    tryCatch({
        filter.levels <- names(impute.values)
        # Validation
        stopifnot(class(impute.var) == class(impute.values),
                  length(impute.values) > 0,
                  sum(is.na(impute.values)) == 0)
        # Impute values
        for (level in filter.levels) {
            impute.var[which(filter.var == level & is.na(impute.var))] <-
                impute.values[level]
        }
        # Check if all NAs removed. Throw warning if not.
        if (sum(is.na(impute.var)) > 0) {
            warning("Not all NAs removed")
        }
        # Return values
        return(impute.var)
    },
    error = function(err) print(paste0("ImputeValues: ", err)),
    warning = function(war) print(paste0("ImputeValues: ", war))
    )
}

impute.var and filter.var are vectors taken from a data.frame (vectors of ages and titles, e.g. 'Mr', 'Mrs'). impute.values is a vector of the same type as impute.var, but with labels taken from filter.var (i.e. it is of the form c('Mr' = 30, 'Mrs' = 25, ...)).

The problem

To check whether my validation was working, I supplied the function with a named vector of NAs:

ages <-   c(34, 22, NA, 17, 38, NA)
titles <- c("Mr", "Mr", "Mr", "Mrs", "Mrs", "Mrs")
ages.values <- c("Mr" = NA, "Mrs" = NA)

ages.new <- ImputeVariables(ages, ages.values, titles)

print(ages.new)

But it outputs the following:

 "ImputeValues: Error: class(impute.var) == class(impute.values) is not TRUE\n"
 "ImputeValues: Error: class(impute.var) == class(impute.values) is not TRUE\n"

Two lines appear because the message is printed once when the function is called and again by the following print(ages.new) statement, which means the message has ended up in ages.new (why?)

If I comment out the validation (the stopifnot function) then I just get:

"ImputeValues: simpleWarning in doTryCatch(return(expr), name, parentenv, handler): Not all NAs removed\n" 

What I'm asking

  1. Why does the tryCatch block behave this way?
  2. Is my validation and error-handling strategy optimal (obviously without the bug)?

Many thanks for your time.

Rob

2 Solutions

#1


Thanks Oliver.

The working code is now:

ImputeVariables <- function(impute.var, impute.values, filter.var) {
    # Function to impute values.
    # impute.var    = variable with NAs
    # impute.values = the value(s) to replace NAs with; the value labels are levels
    # filter.var    = the variable to filter on
    # filter.levels = the categories of filter.var
    tryCatch({
        filter.levels <- names(impute.values)
        # Validation
        stopifnot(class(impute.var) == class(impute.values),
                  length(impute.values) > 0,
                  sum(is.na(impute.values)) == 0)
        # Impute values
        for (level in filter.levels) {
            impute.var[which(filter.var == level & is.na(impute.var))] <-
                impute.values[level]
        }
        # Check if all NAs removed. Throw warning if not.
        if (sum(is.na(impute.var)) > 0) {
            warning("Not all NAs removed")
        }
        # Return values
        return(impute.var)
    },
    # Re-raise errors with the prefix, so they still stop execution.
    error = function(err) {
        stop("ImputeValues: ", conditionMessage(err), call. = FALSE)
    },
    # Report the warning as a message, then return the partially imputed vector.
    warning = function(war) {
        message("ImputeValues: ", conditionMessage(war))
        return(impute.var)
    }
    )
}
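
One caveat with the warning handler above: a tryCatch handler abandons whatever remained of the block, so returning impute.var works here only because warning() is the last step before return(). If more work followed the warning, a calling handler would be one alternative. The sketch below is not from this thread; WithPrefixedWarnings is a hypothetical helper name, and it uses base R's withCallingHandlers() and the "muffleWarning" restart.

```r
# Hypothetical helper (an assumption, not part of the original code):
# report each warning as a prefixed message, then carry on evaluating expr.
WithPrefixedWarnings <- function(expr, prefix) {
    withCallingHandlers(
        expr,
        warning = function(war) {
            message(prefix, conditionMessage(war))
            # Resume execution right after the warning() call.
            invokeRestart("muffleWarning")
        }
    )
}

value <- WithPrefixedWarnings({
    warning("Not all NAs removed")
    "computed after the warning"
}, prefix = "ImputeValues: ")

value
```

Here the block runs to completion after the message is emitted, so value holds the block's final expression rather than anything produced by the handler.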

#2


This is essentially two different problems. The first is that print() inside a function does more than write to the terminal: it also returns its argument (invisibly), and a function with no explicit return() call returns the value of the last expression it evaluated. As an example:

> foo <- function(){
     print("bar")
  }
> x <- foo()
[1] "bar"
> x
[1] "bar"

The call printed "bar" to the screen and also returned it, which is why x holds "bar". It was returned because print("bar") was the last expression evaluated in the function, so (lacking an explicit return() call) its value became the function's return value.

So, your code is (in sequence):

  1. Throwing an error;
  2. Not letting that error propagate normally, but instead passing it to tryCatch's error handler, where it is printed;
  3. Returning the handler's value, that is, the printed message, as the return value of the function, because the error means the return() statement is never reached.
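
The sequence above can be reproduced in a few lines. This is a minimal sketch with a hypothetical function, risky(), standing in for ImputeVariables; it relies only on two documented behaviours of base R: print() returns its argument invisibly, and tryCatch() returns the value of whichever handler ran.

```r
# print() writes to the console and invisibly returns its argument.
shown <- print("bar")

# Hypothetical stand-in for ImputeVariables: the error handler's value
# becomes the value of tryCatch(), and therefore of the function.
risky <- function() {
    tryCatch({
        stop("validation failed")
        "never reached"
    },
    error = function(err) print(paste0("risky: ", conditionMessage(err))))
}

result <- risky()   # the handler prints the message once
print(result)       # and printing the result shows the same message again
```

This is why the message appears twice in the question: once from the handler, and once because the message itself was assigned to ages.new.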

If you really want to continue processing the input values even when the stopifnot() conditions are not met, you don't want a stopifnot(): however you structure it, it is likely to prevent the return() call from running and cause odd behaviour. What I'd suggest instead is moving the conditional checks currently in stopifnot() outside the tryCatch, and putting them in a series of if() statements that throw warnings (not errors) when a check fails. tryCatch isn't really necessary in this situation.
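
As a rough sketch of that suggestion, the checks could look something like the following. The name ImputeWithChecks and the wording of the warnings are assumptions made for illustration; the replacement values 30 and 25 are the ones mentioned in the question.

```r
# Hypothetical rewrite without tryCatch: each check raises a warning
# (not an error), so the function always reaches its return value.
ImputeWithChecks <- function(impute.var, impute.values, filter.var) {
    if (!identical(class(impute.var), class(impute.values))) {
        warning("ImputeWithChecks: impute.var and impute.values differ in class")
    }
    if (length(impute.values) == 0) {
        warning("ImputeWithChecks: impute.values is empty")
    }
    if (anyNA(impute.values)) {
        warning("ImputeWithChecks: impute.values contains NAs")
    }

    # Impute values level by level.
    for (level in names(impute.values)) {
        fill <- which(filter.var == level & is.na(impute.var))
        impute.var[fill] <- impute.values[level]
    }

    if (anyNA(impute.var)) {
        warning("ImputeWithChecks: not all NAs removed")
    }
    impute.var
}

ages <- c(34, 22, NA, 17, 38, NA)
titles <- c("Mr", "Mr", "Mr", "Mrs", "Mrs", "Mrs")
ages.new <- ImputeWithChecks(ages, c("Mr" = 30, "Mrs" = 25), titles)
```

Because warning() does not interrupt the function, the imputed vector is returned in every case, and each warning text names the function it came from.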
