How to detect free variable names in R functions [duplicate]

This question already has an answer here:

making sure a function does not use a global variable [duplicate] 1 answer

Suppose I have a function:

f <- function() {
  x + 1
}

Here x is a free variable since its value is not defined within function f. Is there a way that I can obtain the variable name, say x, from a defined function, say f?

I am asking this while maintaining other people's old R code. It uses a lot of free variables, which makes debugging hard.

Any other suggestions are welcome as well.

2 Answers

#1

The codetools package has functions for this purpose, e.g. findGlobals:

library(codetools)
findGlobals(f, merge=FALSE)[['variables']]
# [1] "x"

If we redefine the function so that x is a named argument, no variables are returned.

f2 <- function(x){
  x+1
}
findGlobals(f2, merge=FALSE)[['variables']]
# character(0)
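
Since the question is about legacy code with many functions, the same call can be applied to every function in an environment. The following is only a sketch: `audit_free_vars` and the `legacy` environment are hypothetical names used for illustration, and it assumes the functions of interest are ordinary closures.

```r
library(codetools)

# Hypothetical helper: list the free variables of every closure in `env`.
audit_free_vars <- function(env = globalenv()) {
  objs <- mget(ls(env), envir = env)
  # keep ordinary R functions only (skip data objects and primitives)
  funs <- Filter(function(o) typeof(o) == "closure", objs)
  res <- lapply(funs, function(fn) findGlobals(fn, merge = FALSE)$variables)
  # report only functions that reference at least one free variable
  res[lengths(res) > 0]
}

# Illustrative environment standing in for the legacy code base
legacy <- new.env()
legacy$f  <- function() x + 1   # x is free
legacy$f2 <- function(x) x + 1  # x is an argument

audit_free_vars(legacy)
```

Only `f` should be reported here, because `f2` receives x as an argument.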

#2

This is a rough stab at it.

find_vars <- function(f, vars=list(found=character(), defined=names(formals(f)))) {
    if( is.function(f) ) {
        # function, begin search on body
        return(find_vars(body(f), vars))
    } else if (is.call(f) && deparse(f[[1]]) == "<-") {
        # assignment with <- operator
        if (is.recursive(f[[2]])) {
           if (is.call(f[[2]]) && deparse(f[[2]][[1]]) == "$") {
               # for `a$b <- value`, record the object being modified (`a`), not the `$` operator
               vars$defined <- unique( c(vars$defined, deparse(f[[2]][[2]])) )
           } else {
               warning(paste("unable to determine assignments variable in", deparse(f[[2]])))
           }
        } else {
            vars$defined <- unique( c(vars$defined, deparse(f[[2]])) )  
        }
        vars <- find_vars(f[[3]], vars)
    } else if (is.call(f) && deparse(f[[1]]) == "$") {
        # assume "b" is ok in a$b
        vars <- find_vars(f[[2]], vars)
    } else if (is.call(f) && deparse(f[[1]]) == "~") {
        #skip formulas
    } else if (is.recursive(f)) {
        # compound object, iterate through sub-parts
        v <- lapply(as.list(f)[-1], find_vars, vars)
        vars$defined <- unique( c(vars$defined, unlist(sapply(v, `[[`, "defined"))) )
        vars$found <- unique( c(vars$found, unlist(sapply(v, `[[`, "found"))) )
    } else if (is.name(f)) {
        # standard variable name/symbol
        vars$found <- unique( c(vars$found, deparse(f)))
    }
    vars
}

find_free <- function(f) {
    r <- find_vars(f)
    return(setdiff(r$found, r$defined))
}

Then you could use it like this:

f <- function() {
  z <- x + 1
  z
}
find_free(f)
# [1] "x"

I'm sure there are plenty of ways to get false positives, and I didn't add any special handling for functions that use non-standard evaluation. For example:

g <- function(df) {
  with(df, mpg + disp)
}
g(head(mtcars))
# [1] 181.0 181.0 130.8 279.4 378.7 243.1

but

find_free(g)
# [1] "mpg"  "disp"

I already put in a special branch for the $ operator and for formulas; you could add a similar branch for functions that use non-standard evaluation, such as with() or subset(). It depends on what your code ends up looking like.
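
One possible shape for such a branch is sketched below as a compact, standalone walker rather than a patch to find_vars. The names `nse_funs`, `used_symbols` and `find_free_nse` are hypothetical, the list of functions is illustrative and incomplete, and the sketch assumes the data argument comes first. It does not track assignments and does not handle empty arguments such as `x[, 1]`.

```r
# Functions whose later arguments are evaluated inside a data object
# (illustrative, incomplete list)
nse_funs <- c("with", "subset", "within", "transform")

# Collect symbols used as values in an expression, skipping everything
# after the first argument of a call to one of the functions above.
used_symbols <- function(e) {
  if (is.name(e)) return(as.character(e))
  if (!is.call(e)) return(character())   # constants contribute nothing
  fn <- e[[1]]
  args <- as.list(e)[-1]
  if (is.name(fn) && as.character(fn) %in% nse_funs) {
    # assume only the first (data) argument is evaluated normally
    args <- args[seq_len(min(1, length(args)))]
  }
  # a call in function position, e.g. pkg::fun(x), is walked as well
  fn_syms <- if (is.call(fn)) used_symbols(fn) else character()
  unique(c(fn_syms, unlist(lapply(args, used_symbols))))
}

# Symbols used in the body that are not formal arguments
find_free_nse <- function(f) setdiff(used_symbols(body(f)), names(formals(f)))

g <- function(df) {
  with(df, mpg + disp)
}
find_free_nse(g)
```

With this skip list, mpg and disp are no longer reported for g, while a genuinely free variable outside the with() call still would be.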

This assumes all assignment happens via the standard <- operator. Other ways of assigning variables (e.g., assign()) would go undetected. We also ignore the names of called functions. So if you call myfun(1), it will not report myfun as a free variable, even though it may be a "free function" defined elsewhere in the code.
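
If free functions matter too, codetools reports them separately when merge=FALSE. A small sketch, assuming no function named myfun is defined anywhere (`h` and `undefined_funs` are illustrative names):

```r
library(codetools)

h <- function() myfun(1)

globals <- findGlobals(h, merge = FALSE)
globals$functions   # names used in function position, including "myfun"

# Keep only the function names that cannot be resolved from h's environment
undefined_funs <- Filter(
  function(nm) !exists(nm, envir = environment(h), mode = "function"),
  globals$functions
)
undefined_funs
```

The exists() check only reflects what is defined at the time it runs, so it is best run after the legacy code has been sourced.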

So this may not be perfect, but it should act as a decent screen for potential problems.
