I have a function that I use to get a "quick look" at a data.frame... I deal with a lot of survey data and this acts as a quick tool to see what's what.
f.table <- function(x) {
if (is.factor(x[[1]])) {
frequency <- function(x) {
x <- round(length(x)/n, digits=2)
}
x <- na.omit(melt(x,c()))
x <- cast(x, variable ~ value, frequency)
x <- cbind(x,top2=x[,ncol(x)]+x[,ncol(x)-1], bottom=x[,2])
}
if (is.numeric(x[[1]])) {
frequency <- function(x) {
x[x > 1] <- 1
x[is.na(x)] <- 0
x <- round(sum(x)/n, digits=2)
}
x <- na.omit(melt(x))
x <- cast(x, variable ~ ., c(frequency, mean, sd, min, max))
x <- transform(x, variable=reorder(variable, frequency))
}
return(x)
}
What I find happens is that if I don't define "frequency" outside of the function, it returns wonky results for data frames with continuous variables. It doesn't seem to matter which definition I use outside of the function, so long as I do.
Try:
n <- 100
x <- data.frame(a=c(1:25),b=rnorm(100),c=rnorm(100))
x[x > 20] <- NA
Now pick either one of the frequency functions, paste it in at the top level, and try again:
frequency <- function(x) {
x <- round(length(x)/n, digits=2)
}
f.table(x)
Why is that?
2 Solutions
#1
1
Crucially, I think this is where your problem is. cast() is evaluating those functions without reference to the function it was called from. Inside cast() it evaluates fun.aggregate via funstofun and, although I don't really follow what it is doing, it is getting stats:::frequency and not your local one.
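To make the lookup issue concrete, here is a minimal base-R sketch. It is not the actual code of cast() or funstofun(); the helpers eval_in_own_frame() and eval_in_caller() are hypothetical stand-ins that I made up to show how a function that re-evaluates a captured expression in its own frame misses a definition local to its caller. It assumes no frequency is defined in the global environment.

```r
# Hypothetical helper: re-evaluates the captured expression in its OWN frame,
# so names are resolved from here outward, never in the caller's frame.
eval_in_own_frame <- function(funs, data) {
  fun_list <- eval(substitute(funs))
  vapply(fun_list, function(f) as.numeric(f(data)), numeric(1))
}

# Same helper, but the expression is evaluated in the caller's frame.
eval_in_caller <- function(funs, data) {
  fun_list <- eval(substitute(funs), parent.frame())
  vapply(fun_list, function(f) as.numeric(f(data)), numeric(1))
}

wrapper_own <- function(data) {
  frequency <- function(x) length(x)  # local definition, not seen by the helper
  eval_in_own_frame(c(frequency, mean), data)
}

wrapper_caller <- function(data) {
  frequency <- function(x) length(x)  # local definition, seen by the helper
  eval_in_caller(c(frequency, mean), data)
}

own_result    <- wrapper_own(c(2, 4, 6, 8))     # first element: stats::frequency -> 1
caller_result <- wrapper_caller(c(2, 4, 6, 8))  # first element: local frequency -> 4
```

If you define frequency at the top level before calling wrapper_own(), the first helper picks up that global definition instead, which would match what you are seeing.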
Hence my comment to your Q: what do you want the function to do? At the moment it would seem necessary to define a "frequency" function in the global environment so that cast() or funstofun() finds it. Give it a unique name, say .Frequency(), so that it is unlikely to clash with anything and is the only thing found. Without knowing what you want to do with the function (rather than what you thought the function [f.table] should do) it is a bit difficult to provide further guidance, but why not have .FrequencyNum() and .FrequencyFac() defined in the global workspace and rewrite the calls to cast in your f.table() wrapper to use the relevant one?
.FrequencyFac <- function(X, N) {
round(length(X)/N, digits=2)
}
.FrequencyNum <- function(X, N) {
X[X > 1] <- 1
X[is.na(X)] <- 0
round(sum(X)/N, digits=2)
}
f.table <- function(x, N) {
if (is.factor(x[[1]])) {
x <- na.omit(melt(x, c()))
x <- dcast(x, variable ~ value, .FrequencyFac, N = N)
x <- cbind(x,top2=x[,ncol(x)]+x[,ncol(x)-1], bottom=x[,2])
}
if (is.numeric(x[[1]])) {
x <- na.omit(melt(x))
x <- cast(x, variable ~ ., c(.FrequencyNum, mean, sd, min, max), N = N)
##x <- transform(x, variable=reorder(variable, frequency))
## left this out as I wanted to see what cast returned
}
return(x)
}
Which I thought would work, but it is not finding N, and it should be. So perhaps I am missing something here?
By the way, it is probably not a good idea to rely on a function that finds n (in your version) from outside the function. Always pass in the variables you need as arguments.
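One way to avoid both the global n and the extra N argument is to build the aggregation functions with N already bound, using a closure. This is only a sketch: make_frequency_fac() and make_frequency_num() are names I made up, and I have not tested whether cast() accepts the resulting functions inside a c(...) of several functions.

```r
# Hypothetical factories: each returns a one-argument function with N baked in.
make_frequency_fac <- function(N) {
  force(N)  # evaluate N now rather than lazily at first use
  function(X) round(length(X) / N, digits = 2)
}

make_frequency_num <- function(N) {
  force(N)
  function(X) {
    X[X > 1] <- 1       # cap values above 1 at 1
    X[is.na(X)] <- 0    # treat missing values as 0
    round(sum(X) / N, digits = 2)
  }
}

freq_fac <- make_frequency_fac(100)
freq_num <- make_frequency_num(4)

fac_value <- freq_fac(1:25)            # 25 / 100 = 0.25
num_value <- freq_num(c(0, 2, NA, 5))  # (0 + 1 + 0 + 1) / 4 = 0.5
```

The returned functions take a single argument, so nothing has to be forwarded through `...`, and they no longer depend on a variable named n existing in the workspace.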
#2
0
I don't have the package that contains melt, but there are a couple of potential issues I can see:
- Your frequency functions do not explicitly return anything (the value of the final assignment is returned only invisibly).
- It's generally bad practice to alter function inputs (x is both the input and the output).
- There is already a generic frequency function in the stats package in base R, which may cause issues with method dispatch (I'm not sure).
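On the first point, a small base-R sketch of the difference. The names freq_implicit() and freq_explicit() are made up for illustration. A function whose last expression is an assignment still returns that value, but invisibly, which is easy to misread as "returns nothing" at the console.

```r
# Last expression is an assignment: the value is returned, but invisibly.
freq_implicit <- function(x, n) {
  x <- round(length(x) / n, digits = 2)
}

# Last expression is the value itself: returned visibly.
freq_explicit <- function(x, n) {
  round(length(x) / n, digits = 2)
}

implicit_res <- withVisible(freq_implicit(1:25, 100))
explicit_res <- withVisible(freq_explicit(1:25, 100))

implicit_res$value    # 0.25
implicit_res$visible  # FALSE
explicit_res$visible  # TRUE
```

Both versions hand the same number to a caller such as cast(), so this is more a readability issue than the cause of the wrong results.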