I am trying to do something in R that is similar to what I would use the reshape package for, but not quite the same: I want to move some rows of a data frame into columns, but not all of them. For example, my data frame looks something like:
v1, v2, v3
info, time, 12:00
info, day, Monday
info, temperature, 70
data, 1, 2
data, 2, 2
data, 3, 1
data, 4, 1
data, 5, 3
I would like to transform it into something like:
v1, v2, v3, info_time, info_day, info_temperature
data, 1, 2, 12:00, Monday, 70
data, 2, 2, 12:00, Monday, 70
data, 3, 1, 12:00, Monday, 70
data, 4, 1, 12:00, Monday, 70
data, 5, 3, 12:00, Monday, 70
Is there an easy way to do this? Does the reshape package help here?
Thank you in advance for all your help!
Vincent
2 solutions
#1
1
Try
library(reshape2)
indx <- df$v1=='data'
res <- cbind(df[indx,],dcast(df[!indx,],v1~v2, value.var='v3'))[,-4]
row.names(res) <- NULL
colnames(res)[4:6] <- paste('info', colnames(res)[4:6], sep="_")
res
# v1 v2 v3 info_day info_temperature info_time
#1 data 1 2 Monday 70 12:00
#2 data 2 2 Monday 70 12:00
#3 data 3 1 Monday 70 12:00
#4 data 4 1 Monday 70 12:00
#5 data 5 3 Monday 70 12:00
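The cbind() call above relies on a base R behavior: when a one-row data frame is bound to a longer one, its values are repeated for every row. A minimal sketch of that behavior, using made-up toy values (the names long and wide are only illustrative and do not come from the answer):

```r
# Five "data" rows (toy values, not the asker's real data).
long <- data.frame(v2 = 1:5, v3 = c(2, 2, 1, 1, 3))

# A single row holding the already-widened "info" values.
wide <- data.frame(info_day = "Monday", info_time = "12:00")

# cbind() on data frames recycles the one-row frame to match the five rows.
res <- cbind(long, wide)
res
```

This only works when the longer row count is a whole multiple of the shorter one, which is always true for a single row.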
Or use dplyr/tidyr
library(dplyr)
library(tidyr)
cbind(df[indx,],
unite(df[!indx,], Var, v1, v2) %>%
mutate(id=1) %>%
spread(Var, v3)%>%
select(-id))
Or using base R
cbind(df[indx,],
reshape(transform(df[!indx,], v2= paste(v1, v2, sep="_")),
idvar='v1', timevar='v2', direction='wide')[,-1])
data
df <- structure(list(v1 = c("info", "info", "info", "data", "data",
"data", "data", "data"), v2 = c("time", "day", "temperature",
"1", "2", "3", "4", "5"), v3 = c("12:00", "Monday", "70", "2",
"2", "1", "1", "3")), .Names = c("v1", "v2", "v3"), class = "data.frame",
row.names = c(NA, -8L))
#2
0
A solution without external packages (using the df structure from Akrun):
df1 <- cbind(df[4:8,1:3],apply(df[1:3,3,drop=FALSE],1,function(x) rep(x,nrow(df)-3)))
colnames(df1)[4:6] <- paste("info",df[1:3,2], sep = "_")
df1
> df1
v1 v2 v3 info_time info_day info_temperature
4 data 1 2 12:00 Monday 70
5 data 2 2 12:00 Monday 70
6 data 3 1 12:00 Monday 70
7 data 4 1 12:00 Monday 70
8 data 5 3 12:00 Monday 70
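The positions 1:3 and 4:8 are hard-coded above. If the info rows might sit elsewhere, a variant that selects them by the value of v1 could look like the sketch below. It uses base R only; spread_info() is a hypothetical helper written for this illustration, not a function from any package, and it assumes there is at least one non-info row.

```r
# Same example data as in the answers above.
df <- data.frame(
  v1 = c("info", "info", "info", "data", "data", "data", "data", "data"),
  v2 = c("time", "day", "temperature", "1", "2", "3", "4", "5"),
  v3 = c("12:00", "Monday", "70", "2", "2", "1", "1", "3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn every row whose v1 equals `key` into a column.
spread_info <- function(d, key = "info") {
  is_info <- d$v1 == key
  out <- d[!is_info, , drop = FALSE]   # keep only the ordinary rows
  row.names(out) <- NULL
  for (i in which(is_info)) {
    # One new column per info row; the single value is recycled down the column.
    out[[paste(key, d$v2[i], sep = "_")]] <- d$v3[i]
  }
  out
}

df1 <- spread_info(df)
df1
```

The new columns appear in the order of the info rows in the input (info_time, info_day, info_temperature here), and all values stay character because v3 is a character column.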