Unstacking a data frame in R

Time: 2022-02-07 22:55:40

I have a stacked data frame:

a <- c(1,1,1,1,2,2,3,3,3,3,3,4,4,4,4)  
b <- c(200,201,201,200,220,220,200,220,203,204,204,203,220,200,200)  
d <- c(500,500,500,500,500,501,501,501,501,501,502,502,502,502,502)  
f <- c("G","G","M","M", "G","G","M","M","M","G","M","G","M","G","G")  
df <- data.frame(a,d,b,f)

I use dcast from reshape2 to unstack the data as follows:

library(reshape2)
dcast(df, a + d + b ~ f)
      a     d   b G M  
1     1   500 200 1 1  
2     1   500 201 1 1  
3     2   500 220 1 0  
4     2   501 220 1 0  
5     3   501 200 0 1  
6     3   501 203 0 1  
7     3   501 204 1 0  
8     3   501 220 0 1  
9     3   502 204 0 1  
10    4   502 200 2 0  
11    4   502 203 1 0  
12    4   502 220 0 1

It defaults to length because I did not supply an aggregating function. What I would like to get instead is:

a   d   b col_1 col_2  
1 500 200    G     M  
1 500 201    G     M  
2 500 220    G    NA  
...and so on  

I want to "widen" (unstack) the data frame by transposing the values of column f for each a+d+b combination and appending them to the frame as new columns. Is there an elegant way to do this without looping over the combinations?

EDIT: Column f does not necessarily have just the two levels G and M. I just want columns col_1, col_2, col_3, ... that hold the transposed values of f for each unique a+d+b combination. I have done it with a for loop, but that is unwieldy on a large data set, and I am looking to make the code quicker.

1 solution

#1



dcast(df, a+d+b ~ f, fun.aggregate = function(x) as.character(x)[1])
#Using f as value column: use value.var to override.
#   a   d   b    G    M
#1  1 500 200    G    M
#2  1 500 201    G    M
#3  2 500 220    G <NA>
#4  2 501 220    G <NA>
#5  3 501 200 <NA>    M
#6  3 501 203 <NA>    M
#7  3 501 204    G <NA>
#8  3 501 220 <NA>    M
#9  3 502 204 <NA>    M
#10 4 502 200    G <NA>
#11 4 502 203    G <NA>
#12 4 502 220 <NA>    M
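
Because the aggregating function above keeps only the first element of each cell, any a+d+b group that holds the same f value more than once is collapsed to a single entry. As a hedged sketch (base R only, not part of the original answer; the helper column n and the names counts and dups are introduced here for illustration), the affected cells can be listed like this:

```r
# Sample data from the question
a <- c(1,1,1,1,2,2,3,3,3,3,3,4,4,4,4)
b <- c(200,201,201,200,220,220,200,220,203,204,204,203,220,200,200)
d <- c(500,500,500,500,500,501,501,501,501,501,502,502,502,502,502)
f <- c("G","G","M","M","G","G","M","M","M","G","M","G","M","G","G")
df <- data.frame(a, d, b, f)

# Count rows per a+d+b+f cell; n is a helper column of ones (illustrative name)
counts <- aggregate(n ~ a + d + b + f, data = transform(df, n = 1), FUN = sum)

# Cells with more than one row lose information when only x[1] is kept
dups <- counts[counts$n > 1, ]
print(dups)
```

In the sample data the only such cell is a = 4, d = 502, b = 200 with f = "G", which appears twice.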

Re comment: perhaps you want this then:

library(data.table)
dt = data.table(df)

dt[, lapply(1:3, function(i) as.character(f)[i]), by = list(a, d, b)]
#    a   d   b V1 V2 V3
# 1: 1 500 200  G  M NA
# 2: 1 500 201  G  M NA
# 3: 2 500 220  G NA NA
# 4: 2 501 220  G NA NA
# 5: 3 501 200  M NA NA
# 6: 3 501 220  M NA NA
# 7: 3 501 203  M NA NA
# 8: 3 501 204  G NA NA
# 9: 3 502 204  M NA NA
#10: 4 502 203  G NA NA
#11: 4 502 220  M NA NA
#12: 4 502 200  G  G NA
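
The call above hard-codes three output columns via 1:3. A minimal sketch of the same idea without a fixed column count, assuming base R's ave() and reshape() in place of data.table (the names idx and wide and the col_ prefix are illustrative, not from the original answer), might look like this:

```r
# Sample data from the question
a <- c(1,1,1,1,2,2,3,3,3,3,3,4,4,4,4)
b <- c(200,201,201,200,220,220,200,220,203,204,204,203,220,200,200)
d <- c(500,500,500,500,500,501,501,501,501,501,502,502,502,502,502)
f <- c("G","G","M","M","G","G","M","M","M","G","M","G","M","G","G")
df <- data.frame(a, d, b, f, stringsAsFactors = FALSE)

# Position of each row within its a+d+b group (1, 2, ...)
df$idx <- ave(seq_along(df$f), df$a, df$d, df$b, FUN = seq_along)

# One output column per position; shorter groups are padded with NA
wide <- reshape(df, idvar = c("a", "d", "b"), timevar = "idx",
                v.names = "f", direction = "wide", sep = "_")

# Rename f_1, f_2, ... to the col_1, col_2, ... naming asked for in the question
names(wide) <- sub("^f_", "col_", names(wide))
rownames(wide) <- NULL
print(wide)
```

The number of col_ columns then follows the largest group in the data (two for the sample data) instead of a constant, and no explicit loop over the combinations is written. Whether this is faster than the data.table version on a large data set has not been measured here.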
