Calculate the cumulative sum for each ID (group)

Given the following data frame:

df <- data.frame(id = rep(1:3, each = 5),
                 hour = rep(1:5, 3),
                 value = sample(1:15))

I want to add a cumulative sum column, computed within each id:

df
   id hour value csum
1   1    1     7    7
2   1    2     9   16
3   1    3    15   31
4   1    4    11   42
5   1    5    14   56
6   2    1    10   10
7   2    2     2   12
8   2    3     5   17
9   2    4     6   23
10  2    5     4   27
11  3    1     1    1
12  3    2    13   14
13  3    3     8   22
14  3    4     3   25
15  3    5    12   37

How can I do this efficiently? Thanks!

4 Solutions

#1

df$csum <- ave(df$value, df$id, FUN=cumsum)
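
For reference, ave() applies FUN to the values within each group defined by the grouping vector and returns the results in the original row order, so the assignment also works when the rows are not sorted by id. A minimal sketch on made-up data (the values below are hypothetical, chosen so the result is easy to check by hand):

```r
# Hypothetical data, deliberately not sorted by id
df <- data.frame(id    = c(1, 2, 1, 2, 1),
                 value = c(5, 10, 1, 20, 3))

# cumsum() runs separately within each id; results come back in row order
df$csum <- ave(df$value, df$id, FUN = cumsum)

print(df)
```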

#2

As another alternative, data.table's syntax is nice:

library(data.table)
DT <- data.table(df, key = "id")
DT[, csum := cumsum(value), by = key(DT)]

Or, more compactly:

library(data.table)
setDT(df)[, csum := cumsum(value), id][]

The above will:

Convert the data.frame to a data.table by reference
Calculate the cumulative sum of value grouped by id and assign it by reference
Print the result of the entire operation (that is what the trailing [] does)

"df" will now be a data.table with a "csum" column.

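The by-reference behaviour described above can be checked directly. The following is a minimal sketch on made-up data (the values are hypothetical, not the asker's random sample); it assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical data with known values
df <- data.frame(id = rep(1:2, each = 3),
                 value = c(1, 2, 3, 10, 20, 30))

setDT(df)                              # converts df in place; no assignment needed
df[, csum := cumsum(value), by = id]   # := adds the column to df itself

is.data.table(df)                      # df is now a data.table
df[]                                   # trailing [] forces the result to print
```

Because := modifies the table in place, take a copy() of the data.table first if an unmodified version needs to be kept.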

#3

Using the plyr package:

library(plyr)
ddply(df, .(id), transform, csum = cumsum(value))

#4

Using dplyr:

library(dplyr)
df %>% group_by(id) %>% mutate(csum = cumsum(value))
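
Note that this pipeline returns a new object and leaves df unchanged, and the result is still grouped by id. A minimal sketch, assuming dplyr is installed and using made-up values (not the asker's random sample), that stores the result and drops the grouping:

```r
library(dplyr)

# Hypothetical data with known values
df <- data.frame(id = rep(1:2, each = 3),
                 value = c(1, 2, 3, 10, 20, 30))

result <- df %>%
  group_by(id) %>%
  mutate(csum = cumsum(value)) %>%
  ungroup()          # drop the grouping so later verbs act on the whole table

result
```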
