Accessing aggregated data by value instead of by index

Date: 2022-02-21 20:10:11

Using aggregate, R creates a list column Z that can be indexed in the form a$Z$`1.2`, where the first number refers to the corresponding element in X, and likewise for Y. In addition, if X or Y has 10+ elements, the form changes to a$Z$`01.02` (and presumably 001.002 for 100+ elements).

Instead of having to index Z with the zero-padded index values of X and Y, how can I index with the actual X and Y values (e.g. a$Z$`52.60`)? That seems much more intuitive.

df = data.frame(X=c(50, 52, 50), Y=c(60, 60, 60), Z=c(4, 5, 6))
a = aggregate(Z ~ X + Y, df, c)
str(a)

'data.frame':   2 obs. of  3 variables:
 $ X: num  50 52
 $ Y: num  60 60
 $ Z:List of 2
  ..$ 1.1: num  4 6
  ..$ 1.2: num 5

2 answers

#1


You can easily do this after calling aggregate:

names(a$Z) <- paste(a$X, a$Y, sep=".")

Then check the result:

str(a)
'data.frame':   2 obs. of  3 variables:
 $ X: num  50 52
 $ Y: num  60 60
 $ Z:List of 2
  ..$ 50.60: num  4 6
  ..$ 52.60: num 5
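
Once the list is renamed, the lookup asked for in the question works directly. The following is a minimal sketch that rebuilds the question's data; z_for is a hypothetical helper name introduced here for illustration, not something from base R, and it assumes the keys were built with the same paste call as above.

```r
# Rebuild the example from the question
df <- data.frame(X = c(50, 52, 50), Y = c(60, 60, 60), Z = c(4, 5, 6))
a <- aggregate(Z ~ X + Y, df, c)

# Name each list element after its actual X and Y values
names(a$Z) <- paste(a$X, a$Y, sep = ".")

# Backtick form, as asked for in the question
a$Z$`52.60`
## [1] 5

# Hypothetical helper: build the key from numeric values
z_for <- function(a, x, y) a$Z[[paste(x, y, sep = ".")]]
z_for(a, 50, 60)
## [1] 4 6
```

Building the key with paste in both places keeps the name format consistent, which matters if X or Y are not whole numbers.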

#2


1) Try tapply instead:

ta <- tapply(df[[3]], df[-3], c)

ta[["50", "60"]]
## [1] 4 6

ta[["52", "60"]]
## [1] 5
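
The keys are quoted above because tapply stores the group values as character dimnames; a bare number such as 50 would be treated as a position, not a value. A small sketch of a wrapper that accepts numeric values follows. ta_lookup is a hypothetical name used only for illustration.

```r
df <- data.frame(X = c(50, 52, 50), Y = c(60, 60, 60), Z = c(4, 5, 6))
ta <- tapply(df[[3]], df[-3], c)

# The group values become character dimnames of a list matrix
dimnames(ta)

# Hypothetical helper: convert numeric keys to character before indexing
ta_lookup <- function(ta, x, y) ta[[as.character(x), as.character(y)]]
ta_lookup(ta, 52, 60)
## [1] 5
```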

2) subset. Consider not using aggregate at all and use subset to retrieve the values directly:

subset(df, X == 50 & Y == 60)$Z
## [1] 4 6
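
The help page for subset describes it as a convenience function for interactive use and recommends standard subsetting such as [ for programming. If the lookup is going inside a function, an equivalent sketch with plain logical indexing could look like this; get_z is a hypothetical helper name.

```r
df <- data.frame(X = c(50, 52, 50), Y = c(60, 60, 60), Z = c(4, 5, 6))

# Hypothetical helper: select Z where both columns match the given values
get_z <- function(data, x, y) data$Z[data$X == x & data$Y == y]

get_z(df, 50, 60)
## [1] 4 6
```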

3) data.table. Subsetting is even easier with data.table:

library(data.table)

dt <- data.table(df, key = "X,Y")
dt[.(50, 60), Z]

## [1] 4 6

Note: If you are not actually starting with the df shown in the question, and a is instead the result of a series of complex transformations, then df can be recovered like this:

df <- tidyr::unnest(a, cols = Z)  # tidyr >= 1.0.0 expects cols to be named explicitly

at which point any of the above could be used.
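
If tidyr is not available, the same recovery can probably be done in base R by repeating each X and Y once per element of the corresponding list entry. This is a sketch under the assumption that a has the shape shown in the question; df2 is just an illustrative name, and the rows come back grouped rather than in their original order.

```r
df <- data.frame(X = c(50, 52, 50), Y = c(60, 60, 60), Z = c(4, 5, 6))
a <- aggregate(Z ~ X + Y, df, c)

# Number of values stored in each list element of Z
n <- lengths(a$Z, use.names = FALSE)

# Repeat the keys to match, then flatten the list column
df2 <- data.frame(
  X = rep(a$X, times = n),
  Y = rep(a$Y, times = n),
  Z = unlist(a$Z, use.names = FALSE)
)
```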
