Pandas: cumulative sum of one column based on the value of another column

Time: 2022-02-01 11:09:53

I am trying to calculate some statistics from a pandas dataframe. It looks something like this:

id     value     conditional
1      10        0
2      20        0
3      30        1
1      15        1
3      5         0
1      10        1

So, I need to calculate the cumulative sum of the value column for each id, from top to bottom, counting only the rows where conditional is 1.

So, this should give me something like:

id     value     conditional   cumulative sum
1      10        0             0
2      20        0             0
3      30        1             30
1      15        1             15
3      5         0             30
1      10        1             25

So, the sum for id=1 includes only the 4th and 6th rows, where conditional=1; the value in the 1st row is not counted. How do I do this in pandas?

1 answer

#1

You can create a Series that is the product of value and conditional, and take its cumulative sum within each id group. Because conditional is 0 or 1 in this data, the product is zero for every row that should not count:

df['cumsum'] = (df['value']*df['conditional']).groupby(df['id']).cumsum()
df
Out: 
   id  value  conditional  cumsum
0   1     10            0       0
1   2     20            0       0
2   3     30            1      30
3   1     15            1      15
4   3      5            0      30
5   1     10            1      25
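
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and, as a variation that is not part of the answer above, masks the values with `Series.where` instead of multiplying. That may be a safer choice if conditional is not guaranteed to be exactly 0 or 1 (an assumption about your data, not something stated in the question).

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 1, 3, 1],
    "value": [10, 20, 30, 15, 5, 10],
    "conditional": [0, 0, 1, 1, 0, 1],
})

# Keep value where conditional == 1 and replace it with 0 elsewhere,
# so rows that should not count contribute nothing to the running total.
masked = df["value"].where(df["conditional"] == 1, 0)

# Running total of the masked values within each id, in row order.
df["cumsum"] = masked.groupby(df["id"]).cumsum()

print(df)
```

On the sample data this gives the same cumsum column as the multiplication approach.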
