R: Cumulative sum if the difference exceeds a threshold

I have a vector of numbers:

x <- c(0, 0, 31, 30, 0, 15, 16, 0, 35, 31)
#[1] 0 0 31 30 0 15 16 0 35 31

I would like to calculate its cumulative sum, but only under a special condition. Given:

month_vec <- seq(as.Date("2009-02-01"), length = 10, by = "1 month") - 1
day_vec   <- as.numeric(substr(month_vec, 9, 10))
# > day_vec
#[1] 31 28 31 30 31 30 31 31 30 31

I only want to cumsum(x) if the difference to the element before is greater than or equal to the corresponding value in day_vec.

The result should look like this:

my_custom_cumsum(x)
#[1] 0 0 31 61 0 15 16 0 35 66

Because x[4] is equal to day_vec[4], x[3] and x[4] are cumsummed. However, x[6] and x[7] are not cumsummed because they are smaller than the values at their respective positions in day_vec. But x[9] and x[10] should be cumsummed again. In other words: the cumsum should reset if the difference to the element before is smaller than the value in day_vec. Does anybody have an idea of how to solve this elegantly?

1 solution

#1

I would do this with a logical index used for subsetting. It should be TRUE for all elements of x that should be "cumsummed" and FALSE for the rest.

idx <- x >= day_vec
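
To see which positions the index picks out, it can help to print them. A quick check, assuming x holds the ten values printed in the question (0 0 31 30 0 15 16 0 35 31) and day_vec is built as shown there:

```r
# Assumed input: the ten values printed in the question
x <- c(0, 0, 31, 30, 0, 15, 16, 0, 35, 31)

# Last day of each month, January 2009 through October 2009
month_vec <- seq(as.Date("2009-02-01"), length.out = 10, by = "1 month") - 1
day_vec   <- as.numeric(format(month_vec, "%d"))

# TRUE where the element reaches the number of days in its month
idx <- x >= day_vec
which(idx)
#[1]  3  4  9 10
```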

Now you can use it to compute the cumsum and assign it to the correct elements in x:

x[idx] <- cumsum(x[idx])
x
#[1]  0  0 31 61  0 15 16
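
Note that cumsum(x[idx]) runs over all selected elements at once, so with the ten-element x printed in the question the total carries over from the first run into the second, and the last two elements would come out as 96 and 127 instead of 35 and 66. One possible way to restart the sum at each run is sketched below with rle() and ave(). This is only a sketch under that assumed input, and run_id and res are names made up for the illustration:

```r
# Assumed input: the ten values printed in the question
x       <- c(0, 0, 31, 30, 0, 15, 16, 0, 35, 31)
day_vec <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31)
idx     <- x >= day_vec

# Give every run of consecutive identical idx values its own id
r      <- rle(idx)
run_id <- rep(seq_along(r$lengths), r$lengths)

# Cumulate within each run, but keep the original value where idx is FALSE
res <- ifelse(idx, ave(x, run_id, FUN = cumsum), x)
res
#[1]  0  0 31 61  0 15 16  0 35 66
```

Grouping by run rather than by idx alone is what makes the sum start over at x[9].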