R - xtabs() vs table(): dropping levels and defining variable names?

Time: 2022-07-05 14:56:52

I have a large data frame and want to cross-tabulate all pairs of variables. table() and xtabs() both do this, but the problem is:

  1. xtabs() lets me drop unused factor levels, which I need, but doesn't let me define the names of the dimensions.
  2. table() lets me define the dimension names, but doesn't drop the unused levels.

The reason I need to define the dimension names is that all of this happens inside a for loop (because I need to tabulate every variable against every other variable), and inside the loop the automatically generated names become meaningless. Below is a simple example to show what I mean.

var.3=factor(rep(c("m","f","t"), c(5,5,2)))
df <- data.frame(var.1=rep(1:4, 1:4), var.2=rep(c("A","B"), 5), var3=var.3[1:10])
levels(df[,3])           # the "t" level is not in the df!
tabs.list<- list()
xtabs.list<- list()
for (i in 1:(ncol(df)-1)){
  for (j in (i+1):ncol(df)) {
    tabs.list[[paste(sep=" ", colnames(df)[i], "by",colnames(df)[j])]] <-
      table(df[,i],df[,j], dnn=list(colnames(df)[i], colnames(df)[j]))
    xtabs.list[[paste(sep=" ", colnames(df)[i], "by",colnames(df)[j])]] <-
      xtabs(~df[,i]+df[,j], drop.unused.levels=TRUE)
  }
}
tabs.list
xtabs.list
# What I want:
for (i in 1:length(xtabs.list)){
  names(dimnames(xtabs.list[[i]])) <- names(dimnames(tabs.list[[i]]))
}
xtabs.list

So each of the two functions for cross-classifying data has one option I would like to use. Why can't I have both?

1 answer

#1


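It's pretty easy to "de-factorize" the arguments by wrapping them in as.character(). table() then sees only the values that actually occur, so the unused levels disappear while dnn still sets the dimension names.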

tabs.list <- list()
for (i in 1:(ncol(df)-1)){
  for (j in (i+1):ncol(df)) {
    tabs.list[[paste(sep=" ", colnames(df)[i], "by", colnames(df)[j])]] <-
      table(as.character(df[,i]),
            as.character(df[,j]),
            dnn=list(colnames(df)[i], colnames(df)[j]))
  }
}
tabs.list
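
To see why this works on a single pair of columns, here is a minimal sketch using the data from the question. The object names (`with_unused`, `via_character`, `via_droplevels`, `via_xtabs`) are made up for illustration. The last two variants are not part of the answer above; they are alternatives that, as far as I can tell, give the same result: `droplevels()` on the data frame before calling `table()`, and `xtabs()` with a formula built by `reformulate()` plus a `data=` argument, which takes the dimension names from the column names.

```r
# Rebuild the example data from the question.
var.3 <- factor(rep(c("m", "f", "t"), c(5, 5, 2)))
df <- data.frame(var.1 = rep(1:4, 1:4),
                 var.2 = rep(c("A", "B"), 5),
                 var3  = var.3[1:10])

# Plain table() on the factor keeps an empty column for the unused level "t".
with_unused <- table(df$var.2, df$var3, dnn = c("var.2", "var3"))

# as.character() throws away the stored levels, so table() builds its
# categories from the values that are actually present.
via_character <- table(df$var.2, as.character(df$var3),
                       dnn = c("var.2", "var3"))

# Alternative (assumption, not from the answer): drop unused levels first.
# Passing a data frame to table() uses the column names as dimension names.
via_droplevels <- table(droplevels(df)[, c("var.2", "var3")])

# Alternative (assumption, not from the answer): build the formula from the
# column names so xtabs() can name the dimensions itself.
via_xtabs <- xtabs(reformulate(c("var.2", "var3")), data = df,
                   drop.unused.levels = TRUE)

colnames(with_unused)
colnames(via_character)
names(dimnames(via_droplevels))
names(dimnames(via_xtabs))
```

One caveat with as.character(): numeric columns are then sorted as strings, so a value such as "10" would sort before "2". For columns where that matters, the droplevels() or xtabs() variants may be the safer choice.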
