Given this data frame:
df <- data.frame(id = rep(1:3, each = 5),
                 hour = rep(1:5, 3),
                 value = sample(1:15))
I want to add a cumulative sum column that restarts for each id:
df
   id hour value csum
1   1    1     7    7
2   1    2     9   16
3   1    3    15   31
4   1    4    11   42
5   1    5    14   56
6   2    1    10   10
7   2    2     2   12
8   2    3     5   17
9   2    4     6   23
10  2    5     4   27
11  3    1     1    1
12  3    2    13   14
13  3    3     8   22
14  3    4     3   25
15  3    5    12   37
How can I do this efficiently? Thanks!
4 Answers
#1
18
df$csum <- ave(df$value, df$id, FUN=cumsum)
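As a rough illustration of what ave() does here, the sketch below uses a small fixed data frame instead of the random sample, so the result can be checked by hand. The name `toy` and its values are assumptions made up for this example; they are not part of the question.

```r
# Hypothetical toy data, chosen so the running sums are easy to verify
toy <- data.frame(id = rep(1:2, each = 3),
                  value = c(1, 2, 3, 10, 20, 30))

# ave() splits value by id, applies FUN to each piece,
# and returns the results in the original row order
toy$csum <- ave(toy$value, toy$id, FUN = cumsum)

# Roughly equivalent manual version: split, cumsum per group, reassemble
manual <- unsplit(lapply(split(toy$value, toy$id), cumsum), toy$id)

toy$csum
# expected: 1 3 6 10 30 60
```

Because ave() keeps the original row order, the data does not need to be sorted by id first.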
#2
12
To add to the alternatives, data.table's syntax is nice:
library(data.table)
DT <- data.table(df, key = "id")
DT[, csum := cumsum(value), by = key(DT)]
Or, more compactly:
library(data.table)
setDT(df)[, csum := cumsum(value), id][]
The above will:
- Convert the data.frame to a data.table by reference
- Calculate the cumulative sum of value grouped by id and assign it by reference
- Print (the trailing [] there) the result of the entire operation
"df" will now be a data.table with a "csum" column.
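Since both setDT() and := modify their input in place, it may be worth taking a copy first when the original data.frame has to stay untouched. A minimal sketch, assuming the data.table package is installed; `toy` and `toy_dt` are hypothetical names used only for illustration:

```r
library(data.table)

# Hypothetical toy data, not the data from the question
toy <- data.frame(id = rep(1:2, each = 3),
                  value = c(1, 2, 3, 10, 20, 30))

# copy() makes a deep copy, so the in-place operations below
# do not touch the original data.frame
toy_dt <- copy(toy)
setDT(toy_dt)                              # converts toy_dt in place
toy_dt[, csum := cumsum(value), by = id]   # adds csum in place

is.data.table(toy_dt)    # TRUE: no reassignment was needed
is.data.table(toy)       # FALSE: the original is still a plain data.frame
"csum" %in% names(toy)   # FALSE: the original has no csum column
```

Without the copy() step, the compact one-liner above is usually the more memory-friendly choice, because no second copy of the data is made.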
#3
7
Using the plyr library:
library(plyr)
ddply(df,.(id),transform,csum=cumsum(value))
#4
2
Using dplyr:
require(dplyr)
df %>% group_by(id) %>% mutate(csum = cumsum(value))
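Note that the pipeline above returns a new object and leaves df itself unchanged, so the result has to be assigned to keep the column. A small sketch of that, assuming dplyr is installed; `toy` and `res` are hypothetical names, and the ungroup() step is an optional addition rather than part of the answer:

```r
library(dplyr)

# Hypothetical toy data, not the data from the question
toy <- data.frame(id = rep(1:2, each = 3),
                  value = c(1, 2, 3, 10, 20, 30))

# Assign the result; mutate() does not modify toy in place
res <- toy %>%
  group_by(id) %>%
  mutate(csum = cumsum(value)) %>%
  ungroup()   # drop the grouping so later verbs are not applied per id

res$csum
# expected: 1 3 6 10 30 60
```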