Converting long format to wide format

Time: 2021-11-16 04:27:15

My data frame looks like this:

x <- data.frame(c("a","a","a","a","b","b","c","c","c","a","a"), c(1,2,3,4,1,2,1,2,3,1,2))
names(x) <- c("id","nr")

   id nr
1   a  1
2   a  2
3   a  3
4   a  4
5   b  1
6   b  2
7   c  1
8   c  2
9   c  3
10  a  1
11  a  2

I want to have something like this:

  id   1  2  3  4
   a   1  2  3  4
   a   1  2  NA NA
   b   1  2  NA NA
   c   1  2  3  NA

I already used dcast(x, id ~ nr, value.var ="nr") but I got the warning:

"Aggregation function missing: defaulting to length".

I understand that this is caused by non-unique rows: the same (id, nr) pair occurs more than once. I also tried adding a group column by hand, which gives me the result above. But is there a way to get it without having to create the groups manually?

x <- data.frame(c("a","a","a","a","b","b","c","c","c","a", "a"), 
c(1,1,1,1,1,1,1,1,1,2,2), c(1,2,3,4,1,2,1,2,3, 1, 2))
names(x) <- c("id", "group","nr")

dcast(x, id + group ~ nr, value.var = "nr")

1 solution

#1


You may need a grouping variable. Instead of creating it manually as shown in the example, we can use rleid and then dcast from the devel version of data.table, i.e. v1.9.5+. Instructions to install the devel version are here.

library(data.table)
dcast(setDT(x)[, gr:=rleid(id)], id+gr~nr, value.var='nr')[,gr:=NULL][]
#   id 1 2  3  4
#1:  a 1 2  3  4
#2:  a 1 2 NA NA
#3:  b 1 2 NA NA
#4:  c 1 2  3 NA
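
To see why rleid works as the grouping variable here, the following minimal sketch applies it to the id column on its own. It assumes an installed data.table version that exports rleid; the names ids and gr are only illustrative.

```r
library(data.table)

# The id column from the question, as a plain character vector
ids <- c("a","a","a","a","b","b","c","c","c","a","a")

# rleid() numbers each run of consecutive identical values,
# so the second block of "a" gets a different id than the first
gr <- rleid(ids)
print(gr)
# [1] 1 1 1 1 2 2 3 3 3 4 4
```

Because the two blocks of "a" get different run ids, id + gr identifies each output row uniquely and dcast no longer has to aggregate.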

Or, as @Arun mentioned in the comments, we can do this directly within dcast itself:

dcast(setDT(x), id + rleid(id) ~ nr, value.var = 'nr')[,id_1:= NULL]
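
If rleid is not available, the same run-based grouping can be approximated in base R with rle. This is a sketch of an alternative rather than part of the answer above; the column name gr and the variables runs, dup_before and dup_after are assumptions made for illustration.

```r
# Data from the question; stringsAsFactors = FALSE because rle() needs an atomic vector
x <- data.frame(
  id = c("a","a","a","a","b","b","c","c","c","a","a"),
  nr = c(1, 2, 3, 4, 1, 2, 1, 2, 3, 1, 2),
  stringsAsFactors = FALSE
)

# rle() finds runs of consecutive identical ids; repeating the run index
# by each run's length gives one group number per row
runs <- rle(x$id)
x$gr <- rep(seq_along(runs$lengths), runs$lengths)

# Without the group, (id, nr) pairs repeat, which is what makes dcast aggregate
dup_before <- anyDuplicated(x[, c("id", "nr")])

# With the group, every (id, gr, nr) combination is unique
dup_after <- anyDuplicated(x[, c("id", "gr", "nr")])
```

Here dup_before is non-zero and dup_after is 0, so a call such as dcast(x, id + gr ~ nr, value.var = "nr") should no longer fall back to length.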
