Using aggregate, R creates a list Z that can be indexed in the form a$Z$`1.2`, where the first number references the corresponding element in X, and the second the corresponding element in Y. In addition, if X or Y has 10+ elements, the form changes to a$Z$`01.02` (and presumably `001.002` for 100+ elements).
Instead of indexing Z with the zero-padded positions of X and Y, how can I index with the actual X and Y values (e.g. a$Z$`52.60`)? That seems much more intuitive.
df = data.frame(X=c(50, 52, 50), Y=c(60, 60, 60), Z=c(4, 5, 6))
a = aggregate(Z ~ X + Y, df, c)
str(a)
'data.frame': 2 obs. of 3 variables:
$ X: num 50 52
$ Y: num 60 60
$ Z:List of 2
..$ 1.1: num 4 6
..$ 1.2: num 5
2 Answers
#1
You can easily do this after calling aggregate by renaming the elements of Z:
names(a$Z) <- paste(a$X, a$Y, sep=".")
Then check the result:
str(a)
'data.frame': 2 obs. of 3 variables:
$ X: num 50 52
$ Y: num 60 60
$ Z:List of 2
..$ 50.60: num 4 6
..$ 52.60: num 5
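As a minimal sketch of how the renamed list can then be indexed by value, using the df from the question (the variable name key is only illustrative):

```r
# Data and aggregation from the question
df <- data.frame(X = c(50, 52, 50), Y = c(60, 60, 60), Z = c(4, 5, 6))
a <- aggregate(Z ~ X + Y, df, c)

# Rename the list elements after the actual X and Y values
names(a$Z) <- paste(a$X, a$Y, sep = ".")

# Index with backticks (the name is not a syntactic R name) ...
a$Z$`52.60`
## [1] 5

# ... or with a string, which can be built programmatically
key <- paste(50, 60, sep = ".")
a$Z[[key]]
## [1] 4 6
```

One caveat, assuming X or Y may hold non-integer values: paste(50.5, 60, sep = ".") gives "50.5.60", which is ambiguous, so a different separator may be safer in that case.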
#2
1) tapply Try tapply instead:
ta <- tapply(df[[3]], df[-3], c)
ta[["50", "60"]]
## [1] 4 6
ta[["52", "60"]]
## [1] 5
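To see why character indexing works here, it may help to inspect what tapply returns: a list array with one dimension per grouping variable, whose dimnames are the actual X and Y values. The sketch below assumes the df from the question; lookup is a hypothetical helper, not part of base R.

```r
df <- data.frame(X = c(50, 52, 50), Y = c(60, 60, 60), Z = c(4, 5, 6))

# Same call as above, written with column names instead of positions
ta <- tapply(df$Z, df[c("X", "Y")], c)

dim(ta)       # one row per X value, one column per Y value
dimnames(ta)  # the labels are the X and Y values as character strings

# Hypothetical helper: accept numeric values and convert them to labels
lookup <- function(x, y) ta[[as.character(x), as.character(y)]]

lookup(50, 60)
## [1] 4 6
```

The conversion with as.character matters: a bare number inside [[ ]] would be treated as a position, not as a label.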
2) subset Consider not using aggregate at all and using subset to retrieve the values:
subset(df, X == 50 & Y == 60)$Z
## [1] 4 6
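The R documentation describes subset as a convenience function for interactive use, so inside a function the same lookup is often written with plain logical indexing. A minimal sketch, where z_for is a hypothetical helper name:

```r
df <- data.frame(X = c(50, 52, 50), Y = c(60, 60, 60), Z = c(4, 5, 6))

# Hypothetical helper: return the Z values for one (x, y) pair
z_for <- function(data, x, y) {
  data$Z[data$X == x & data$Y == y]
}

z_for(df, 50, 60)
## [1] 4 6
z_for(df, 52, 60)
## [1] 5
z_for(df, 51, 60)  # no matching rows: a zero-length numeric vector
```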
3) data.table Subsetting is even easier with data.table:
library(data.table)
dt <- data.table(df, key = "X,Y")
dt[.(50, 60), Z]
## [1] 4 6
Note: If you are not actually starting with the df shown in the question, but rather a is the result of a series of complex transformations, then we can recover df like this:
df <- tidyr::unnest(a, Z)
at which point any of the above could be used.