R cumulative sum by group with reset

Time: 2022-06-18 07:35:31

My problem is that I'm trying to find the cumulative sum of rainfall by season (DJF, MAM, JJA, SON) and by year (1926 - 2000), with the sum resetting to zero at the end of each season.

I have managed to do it by year alone using the code

rainfall$yearly.cumsum=unlist(tapply(rainfall$RR, rainfall$year, FUN=cumsum))

and tried to adapt it for seasons using

rainfall$seasonal.cumsum=unlist(tapply(rainfall$RR, .(season,year), transform, FUN=cumsum))

This returns the error

Error in unique.default(x, nmax = nmax) : 
unique() applies only to vectors

I also tried this:

rainfall$seasonal.cumsum=unlist(tapply(rainfall$RR, rainfall$season, FUN=cumsum))

which is more promising, as it does sum by season, but it does not reset when the season changes. That is, I think the code first sums DJF across every year, then moves on to MAM across every year, then JJA and finally SON, rather than summing DJF for one year, resetting, summing MAM for the same year, resetting, and so on.

Here is part of the data frame. Notice that yearly.cumsum is accumulating the values from the RR column, but seasonal.cumsum is not.

    DATE  year   month season RR   yearly.cumsum   seasonal.cumsum
 19260529 1926    05    MAM    0          2347            2518
 19260530 1926    05    MAM    0          2347            2518
 19260531 1926    05    MAM    9          2356            2530
 19260601 1926    06    JJA    0          2356            2530
 19260602 1926    06    JJA    3          2359            2530
 19260603 1926    06    JJA   71          2430            2530
 19260604 1926    06    JJA    0          2430            2530
 19260605 1926    06    JJA   48          2478            2534

I hope my question is clear enough!

Thanks.

3 Answers

#1


2  

Maybe you can try dplyr

library(dplyr)
rainfall %>% 
         group_by(season, year) %>%
         mutate(seasonal.cumsum=cumsum(RR))

#          DATE year month season RR yearly.cumsum seasonal.cumsum
#1 19260529 1926     5    MAM  0          2347               0
#2 19260530 1926     5    MAM  0          2347               0
#3 19260531 1926     5    MAM  9          2356               9
#4 19260601 1926     6    JJA  0          2356               0
#5 19260602 1926     6    JJA  3          2359               3
#6 19260603 1926     6    JJA 71          2430              74
#7 19260604 1926     6    JJA  0          2430              74
#8 19260605 1926     6    JJA 48          2478             122

Update

To keep consecutive months that cross the year boundary (December, January, February) in one group, you may try this (here the grouping year resets on March 1, which is where a new "year" starts)

 indx <- rainfall2$year-min(rainfall2$year) + rainfall2$month %in% c(1,2,12)
 indx1 <- cumsum(c(TRUE,diff(indx) <0))
 rainfall2$year2 <- indx1+ (min(rainfall2$year))

 res <-  rainfall2 %>%
                   group_by(season, year2) %>%
                   mutate(seasonal.cumsum=cumsum(RR))

 do.call(rbind,lapply(split(res, res$year2), head,2))
 #       DATE month year season  RR year2 seasonal.cumsum
 #1 19260504     5 1926    MAM  50  1927              50
 #2 19260505     5 1926    MAM  84  1927             134
 #3 19270301     3 1927    MAM  98  1928              98
 #4 19270302     3 1927    MAM 112  1928             210
 #5 19280301     3 1928    MAM  91  1929              91
 #6 19280302     3 1928    MAM  85  1929             176
 #7 19290301     3 1929    MAM  18  1930              18
 #8 19290302     3 1929    MAM 111  1930             129

Update2

If you need the year to reset on December 1

 indx <- rainfall2$year-min(rainfall2$year) + !rainfall2$month %in% c(1,2,12)
 indx1 <- cumsum(c(TRUE,diff(indx) <0))
 rainfall2$year2 <- indx1+ (min(rainfall2$year)-1)      

 res2 <- rainfall2 %>%
        group_by(season, year2) %>%
        mutate(seasonal.cumsum=cumsum(RR))

  do.call(rbind,lapply(split(res2, res2$year2), head,2))
  #        DATE month year season  RR year2 seasonal.cumsum
  #1 19260504     5 1926    MAM  50  1926              50
  #2 19260505     5 1926    MAM  84  1926             134
  #3 19261201    12 1926    DJF 120  1927             120
  #4 19261202    12 1926    DJF  26  1927             146
  #5 19271201    12 1927    DJF 112  1928             112
  #6 19271202    12 1927    DJF  78  1928             190
  #7 19281201    12 1928    DJF  96  1929              96
  #8 19281202    12 1928    DJF  26  1929             122
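
As a rough cross-check of the December 1 grouping, the same idea can be sketched more directly by assigning December to the following year. This is not the method used above, only an alternative under stated assumptions: the toy data frame `toy`, the column name `season_year`, and the constant rainfall of 1 per day are all hypothetical, and base R `ave()` stands in for the dplyr pipeline.

```r
# Hypothetical toy data: one row per day across a year boundary, RR = 1 every day
dates  <- seq(as.Date("1926-11-29"), as.Date("1927-03-02"), by = "1 day")
month  <- as.integer(format(dates, "%m"))
year   <- as.integer(format(dates, "%Y"))
season <- ifelse(month %in% c(12, 1, 2), "DJF",
          ifelse(month %in% 3:5, "MAM",
          ifelse(month %in% 6:8, "JJA", "SON")))
toy <- data.frame(DATE = dates, month, year, season, RR = 1L,
                  stringsAsFactors = FALSE)

# Assumption: December belongs to the DJF season of the following year,
# so Dec 1926, Jan 1927 and Feb 1927 all get season_year 1927
toy$season_year <- toy$year + (toy$month == 12)

# Cumulative sum within each (season, season_year) group, kept in row order
toy$seasonal.cumsum <- ave(toy$RR, toy$season, toy$season_year, FUN = cumsum)

# The DJF sum keeps running across New Year and resets on March 1
toy[toy$DATE %in% as.Date(c("1926-12-31", "1927-01-01", "1927-03-01")), ]
```

Because the group label is computed from `month` and `year` alone, this sketch does not depend on the rows being sorted when the label is built, although `cumsum` within each group still follows row order.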

Explanation

I think the steps are easier to follow on a small dataset

 set.seed(24)
 df <- data.frame(month=rep(rep(1:12,each=4),3), year=rep(1926:1928, each=12*4))

First, we use %in% to check which elements of the df$month column are among the months c(1,2,12). It returns a logical vector in which TRUE marks the elements that are 1, 2, or 12. The negation ! flips TRUE and FALSE, so here TRUE marks the months that are not 1, 2, or 12

head(!df$month %in% c(1,2,12), 15)
# [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
#[13]  TRUE  TRUE  TRUE
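
One detail worth checking here is operator precedence: in R, `!` binds more loosely than `%in%`, so `!x %in% y` is read as `!(x %in% y)`. A minimal sketch with a made-up `month` vector (the variable names are only for illustration):

```r
# Hypothetical months, not taken from the data above
month <- c(1, 2, 3, 11, 12)

is_winter  <- month %in% c(1, 2, 12)      # TRUE for months 1, 2 and 12
not_winter <- !month %in% c(1, 2, 12)     # parsed as !(month %in% c(1, 2, 12))

# Both spellings give the same logical vector
identical(not_winter, !is_winter)
```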

Next, we subtract the minimum year in the dataset from each year, which gives a zero-based year offset

df$year-min(df$year)
#[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#[38] 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
#[75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
#[112] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

If we add the above two, the TRUE/FALSE values in the first are coerced to integers (1/0) and we get

 indx <- df$year-min(df$year) + !df$month %in% c(1,2,12)
 indx
 #[1] 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 #[38] 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 #[75] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3
 #[112] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2

In the second step, we take diff, the difference between adjacent elements of indx, which returns a vector one element shorter than indx. We then check where the difference is < 0, that is, where indx drops. To make the lengths equal again, we prepend TRUE with c(TRUE,..), and the cumsum of the resulting logical vector increases by one at each drop

  head(diff(indx),55)
  #[1]  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 #[26]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 -1  0  0  0  1  0  0
 #[51]  0  0  0  0  0

  head(c(TRUE,diff(indx) <0), 55)
  #[1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  #[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  #[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
  #[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
  #[49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE

  head(cumsum(c(TRUE,diff(indx) <0)), 55)
  #[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  #[39] 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2

  indx1 <- cumsum(c(TRUE, diff(indx) <0))
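
The `cumsum(c(TRUE, diff(x) < 0))` idiom can be seen in isolation on a short vector. This is only an illustrative sketch; the values in `x` are made up and the names `drops` and `block` are hypothetical.

```r
# Hypothetical index that rises, drops, rises again, and drops once more
x <- c(0, 0, 1, 1, 0, 1, 2, 2, 1, 2)

# TRUE at the first element and wherever the value is lower than the one before
drops <- c(TRUE, diff(x) < 0)

# Each TRUE starts a new block, so the running count labels the blocks
block <- cumsum(drops)
block
```

Each drop in `x` opens a new block, which is how the December rows start a new grouping year in the steps above.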

From the previous step we get indx1, and adding the minimum year to it turns the block counter into year-like labels (subtract 1, as in Update2, if the first block should carry the first calendar year)

  head( indx1+ (min(df$year)),55)
  #[1] 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927
  #[16] 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927
  #[31] 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1927 1928
  #[46] 1928 1928 1928 1928 1928 1928 1928 1928 1928 1928

  indx2 <-  indx1+ (min(df$year))
  split(df, indx2) #to check the results

data

rainfall <- structure(list(DATE = c(19260529L, 19260530L, 19260531L, 19260601L, 
 19260602L, 19260603L, 19260604L, 19260605L), year = c(1926L, 
 1926L, 1926L, 1926L, 1926L, 1926L, 1926L, 1926L), month = c(5L, 
 5L, 5L, 6L, 6L, 6L, 6L, 6L), season = c("MAM", "MAM", "MAM", 
 "JJA", "JJA", "JJA", "JJA", "JJA"), RR = c(0L, 0L, 9L, 0L, 3L, 
 71L, 0L, 48L), yearly.cumsum = c(2347L, 2347L, 2356L, 2356L, 
 2359L, 2430L, 2430L, 2478L), seasonal.cumsum = c(2518L, 2518L, 
 2530L, 2530L, 2530L, 2530L, 2530L, 2534L)), .Names = c("DATE", 
 "year", "month", "season", "RR", "yearly.cumsum", "seasonal.cumsum"
 ), class = "data.frame", row.names = c(NA, -8L))

newdata

 DATE= format(seq(as.Date("1926-05-04"), length.out=1200, by='1 day'), '%Y%m%d')
 month <- as.numeric(substr(DATE,5,6))
 year <- as.numeric(substr(DATE,1,4))
 season <- ifelse(month %in% c(12,1,2), 'DJF', 
         ifelse(month %in% 3:5, 'MAM', ifelse(month %in% 6:8, 'JJA','SON')))
 set.seed(25)
 RR <- sample(0:120, 1200, replace=TRUE)

 rainfall2 <- data.frame(DATE, month, year, season, RR, stringsAsFactors=FALSE)

#2


2  

Try data.table:

> library(data.table)
> ddt = data.table(rainfall)
> ddt[,scumsum:=cumsum(RR),by=list(season,year)]
> ddt
       DATE year month season RR yearly.cumsum seasonal.cumsum scumsum
1: 19260529 1926     5    MAM  0          2347            2518       0
2: 19260530 1926     5    MAM  0          2347            2518       0
3: 19260531 1926     5    MAM  9          2356            2530       9
4: 19260601 1926     6    JJA  0          2356            2530       0
5: 19260602 1926     6    JJA  3          2359            2530       3
6: 19260603 1926     6    JJA 71          2430            2530      74
7: 19260604 1926     6    JJA  0          2430            2530      74
8: 19260605 1926     6    JJA 48          2478            2534     122

#3


1  

You can actually do it with tapply without creating yearly.cumsum (although I agree tapply behaves a bit awkwardly: it returns the groups in sorted order rather than row order, so rev is needed to restore the order for this sample)

transform(rainfall, 
          seasonal.cumsum = 
          unlist(rev(tapply(RR, list(season, year), FUN = cumsum))))
#       DATE year month season RR yearly.cumsum seasonal.cumsum
# 1 19260529 1926     5    MAM  0          2347               0
# 2 19260530 1926     5    MAM  0          2347               0
# 3 19260531 1926     5    MAM  9          2356               9
# 4 19260601 1926     6    JJA  0          2356               0
# 5 19260602 1926     6    JJA  3          2359               3
# 6 19260603 1926     6    JJA 71          2430              74
# 7 19260604 1926     6    JJA  0          2430              74
# 8 19260605 1926     6    JJA 48          2478             122
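
Since the `rev` trick depends on how the groups happen to sort in this sample, a base R alternative that keeps the original row order may be worth considering. The sketch below uses `ave()`, which is a swapped-in technique rather than part of the answer above; the 1927 rows are invented for illustration so that more than one year is present.

```r
# Small example: the 1926 values follow the sample data, the 1927 rows are hypothetical
rainfall <- data.frame(
  year   = c(1926L, 1926L, 1926L, 1926L, 1927L, 1927L),
  season = c("MAM", "MAM", "JJA", "JJA", "MAM", "MAM"),
  RR     = c(0L, 9L, 3L, 71L, 5L, 2L),
  stringsAsFactors = FALSE
)

# ave() applies FUN within each (season, year) group and writes the
# results back at the original row positions, so no reordering is needed
rainfall$seasonal.cumsum <- ave(rainfall$RR, rainfall$season, rainfall$year,
                                FUN = cumsum)
rainfall
```

The sum restarts whenever the season or the year changes, and the result lines up with the rows without any `unlist` or `rev` step.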
