aggregate - using multiple variables in a user-defined function

Time: 2021-08-16 16:46:53

I'm working with a large dataset and doing some calculation with the aggregate() function.

This time I need to group by two different columns, and the calculation needs a user-defined function that also uses two columns of the data.frame. That's where I'm stuck.

Here's an example data set:

    dat <- data.frame(Kat = c("a","b","c","a","c","b","a","c"),
                      Sex = c("M","F","F","F","M","M","F","M"),
                      Val1 = c(1,2,3,4,5,6,7,8)*10,
                      Val2 = c(2,6,3,3,1,4,7,4))

    > dat
    Kat Sex Val1 Val2
    a   M   10    2
    b   F   20    6
    c   F   30    3
    a   F   40    3
    c   M   50    1
    b   M   60    4
    a   F   70    7
    c   M   80    4

Example of the user-defined function:

    sum(Val1 * Val2)    # but grouped by Kat and Sex

I tried this:

    aggregate((dat$Val1),
              by = list(dat$Kat, dat$Sex),
              function(x, y = dat$Val2){sum(x*y)})

Output:

    Group.1 Group.2    x
    a       F          1710
    b       F           600
    c       F           900
    a       M           300
    b       M          1800
    c       M          2010

But my expected output would be:

    Group.1 Group.2    x
    a       F           610
    b       F           120
    c       F            90
    a       M            20
    b       M           240
    c       M           370

Is there any way to do this with aggregate()?

Thank you in advance!

1 Answer

As @jogo suggested:

    aggregate(Val1 * Val2 ~ Kat + Sex, FUN = sum, data = dat)
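One small follow-up on the formula version: the result column ends up with the unwieldy name `Val1 * Val2`. A possible way around that is to wrap the expression in `cbind()` and give it a name there. The sketch below rebuilds the question's `dat` so it runs on its own; the column name `x` is only an illustrative choice, picked to match the expected output in the question.

```r
# Example data from the question
dat <- data.frame(Kat = c("a","b","c","a","c","b","a","c"),
                  Sex = c("M","F","F","F","M","M","F","M"),
                  Val1 = c(1,2,3,4,5,6,7,8) * 10,
                  Val2 = c(2,6,3,3,1,4,7,4))

# cbind(x = ...) names the computed column; "x" is an illustrative name
res <- aggregate(cbind(x = Val1 * Val2) ~ Kat + Sex, data = dat, FUN = sum)
res
```

The product is evaluated row by row inside `dat` before grouping, so each group only sums its own rows.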

Or in tidyverse style:

    library(dplyr)
    dat %>%
      group_by(Kat, Sex) %>%
      summarize(sum(Val1 * Val2))

Or with data.table:

    library(data.table)
    setDT(dat)
    dat[ , sum(Val1 * Val2), by = list(Kat, Sex)]
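
As for why the attempt in the question gave the wrong numbers: `aggregate()` passes the function only the `Val1` values of one group at a time, while the default argument `y = dat$Val2` is still the full eight-element column. R then recycles the shorter group vector against all of `Val2`. A minimal sketch that reproduces the `a`/`F` cell by hand (the names `x_group`, `y_full`, `wrong` and `right` are just for illustration):

```r
# Example data from the question
dat <- data.frame(Kat = c("a","b","c","a","c","b","a","c"),
                  Sex = c("M","F","F","F","M","M","F","M"),
                  Val1 = c(1,2,3,4,5,6,7,8) * 10,
                  Val2 = c(2,6,3,3,1,4,7,4))

in_group <- dat$Kat == "a" & dat$Sex == "F"

# What aggregate() hands to FUN for this group: only the group's Val1 values
x_group <- dat$Val1[in_group]   # 40, 70

# What the default argument still refers to: the whole Val2 column
y_full <- dat$Val2

# x_group is recycled four times to match the length of y_full
wrong <- sum(x_group * y_full)              # 1710, as in the question's output

# Subsetting Val2 to the same rows gives the intended value
right <- sum(x_group * dat$Val2[in_group])  # 610, as in the expected output
```

Because 8 is a multiple of the group length 2, the recycling happens silently, without the usual length-mismatch warning.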
