I have a large data frame and want to tabulate all pairs of variables. table() and xtabs() both do this, but the problem is:
- xtabs() allows me to drop unused variable levels, which I need, but doesn't let me define the names of the dimensions
- table() allows me to define dimension names, but not to drop the unused levels.
The reason I need to define the dimension names is that all this happens inside a for loop (because I need to tabulate every variable against every other variable), which makes the automatically generated names meaningless. Below is a 'simple' example to show what I mean.
var.3 <- factor(rep(c("m","f","t"), c(5,5,2)))
df <- data.frame(var.1=rep(1:4, 1:4), var.2=rep(c("A","B"), 5), var3=var.3[1:10])
levels(df[,3]) # the "t" level is not in the df!
tabs.list <- list()
xtabs.list <- list()
for (i in 1:(ncol(df)-1)) {
  for (j in (i+1):ncol(df)) {
    tabs.list[[paste(sep=" ", colnames(df)[i], "by", colnames(df)[j])]] <-
      table(df[,i], df[,j], dnn=list(colnames(df)[i], colnames(df)[j]))
    xtabs.list[[paste(sep=" ", colnames(df)[i], "by", colnames(df)[j])]] <-
      xtabs(~df[,i]+df[,j], drop.unused.levels=TRUE)
  }
}
tabs.list
xtabs.list
# What I want:
for (i in seq_along(xtabs.list)) {
  names(dimnames(xtabs.list[[i]])) <- names(dimnames(tabs.list[[i]]))
}
xtabs.list
So the two functions for cross-classifying data each have one option I would like to use. Why can't I use both?
1 solution
#1
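It's pretty easy to "de-factorize" the arguments by wrapping them in as.character(). table() then builds its categories from the values actually present, so an unused factor level such as "t" never appears, and dnn still sets the dimension names.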
tabs.list <- list()
for (i in 1:(ncol(df)-1)) {
  for (j in (i+1):ncol(df)) {
    tabs.list[[paste(sep=" ", colnames(df)[i], "by", colnames(df)[j])]] <-
      table(as.character(df[,i]),
            as.character(df[,j]),
            dnn=list(colnames(df)[i], colnames(df)[j]))
  }
}
tabs.list
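One caveat with as.character() is that numeric columns get sorted as text (for example "10" before "2"). The sketch below is not from the original answer. It shows two alternatives that should keep both the column names and the dropped levels: table() on a two-column data frame passed through droplevels(), and xtabs() with a formula built by reformulate() plus a data argument. The names pairs, tabs.a and tabs.b are made up for illustration, and the formula variant assumes syntactic column names like the ones in the question.

```r
# Same toy data as in the question; level "t" of var3 is unused.
var.3 <- factor(rep(c("m", "f", "t"), c(5, 5, 2)))
df <- data.frame(var.1 = rep(1:4, 1:4),
                 var.2 = rep(c("A", "B"), 5),
                 var3  = var.3[1:10])

# All unordered pairs of column names, as a list of length-2 character vectors.
# (pairs, tabs.a and tabs.b are illustrative names, not from the original post.)
pairs <- combn(colnames(df), 2, simplify = FALSE)
names(pairs) <- vapply(pairs, paste, character(1), collapse = " by ")

# Option A: table() on a data frame takes the dimension names from the columns;
# droplevels() removes unused factor levels first.
tabs.a <- lapply(pairs, function(p) table(droplevels(df[, p])))

# Option B: xtabs() with a real formula and a data argument, so the dimension
# names are the variable names rather than "df[, i]" and "df[, j]".
tabs.b <- lapply(pairs, function(p) {
  xtabs(reformulate(p), data = df, drop.unused.levels = TRUE)
})

tabs.a[["var.1 by var3"]]
tabs.b[["var.1 by var3"]]
```

Both lists are named "var.1 by var.2", "var.1 by var3" and "var.2 by var3", so they can be indexed the same way as tabs.list in the loop version.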