My data frame looks like this:
x <- data.frame(c("a","a","a","a","b","b","c","c","c","a", "a"), c(1,2,3,4,1,2,1,2,3, 1, 2))
names(x) <- c("id","nr")
   id nr
1   a  1
2   a  2
3   a  3
4   a  4
5   b  1
6   b  2
7   c  1
8   c  2
9   c  3
10  a  1
11  a  2
I want to have something like this:
id 1 2 3 4
a 1 2 3 4
a 1 2 NA NA
b 1 2 NA NA
c 1 2 3 NA
I already tried dcast(x, id ~ nr, value.var = "nr"), but I got the warning:
"Aggregation function missing: defaulting to length".
I understand that this happens because the rows are not unique: "a" appears in two separate runs, so some id/nr combinations occur more than once. Adding a group column by hand, as below, gives me the result above. Is there a way to get it without creating the groups manually?
x <- data.frame(c("a","a","a","a","b","b","c","c","c","a", "a"),
c(1,1,1,1,1,1,1,1,1,2,2), c(1,2,3,4,1,2,1,2,3, 1, 2))
names(x) <- c("id", "group","nr")
dcast(x, id + group ~ nr, value.var = "nr")
1 solution
#1
You may need a grouping variable. Instead of creating it manually as shown in the example, we can use rleid and then call dcast from the devel version of data.table, i.e. v1.9.5+. Instructions to install the devel version are here.
library(data.table)
dcast(setDT(x)[, gr:=rleid(id)], id+gr~nr, value.var='nr')[,gr:=NULL][]
# id 1 2 3 4
#1: a 1 2 3 4
#2: a 1 2 NA NA
#3: b 1 2 NA NA
#4: c 1 2 3 NA
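To see what the grouping variable looks like, here is a minimal sketch that prints the run ids rleid produces for the example data and rebuilds the same ids with base R's rle. The names run_dt and run_base are only illustrative, and stringsAsFactors = FALSE is set so that id is a plain character column.

```r
library(data.table)

# Example data from the question, with id kept as character
x <- data.frame(id = c("a","a","a","a","b","b","c","c","c","a","a"),
                nr = c(1,2,3,4,1,2,1,2,3,1,2),
                stringsAsFactors = FALSE)

# rleid() gives every run of consecutive equal values its own integer id
run_dt <- rleid(x$id)
print(run_dt)
# 1 1 1 1 2 2 3 3 3 4 4

# Base R equivalent: repeat each run's index by the length of that run
r <- rle(x$id)
run_base <- rep(seq_along(r$lengths), r$lengths)
print(identical(run_dt, run_base))
# TRUE
```

The second run of "a" gets id 4 rather than 1, which is what makes every id/group/nr combination unique, so dcast no longer has to aggregate.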
Or, as @Arun mentioned in the comments, we can do this directly within dcast itself:
dcast(setDT(x), id + rleid(id) ~ nr, value.var = 'nr')[,id_1:= NULL]