I am trying to calculate some statistics from a pandas dataframe. It looks something like this:
id value conditional
1 10 0
2 20 0
3 30 1
1 15 1
3 5 0
1 10 1
So, I need to calculate the cumulative sum of the column value for each id, from top to bottom, but only when conditional is 1.
So, this should give me something like:
id value conditional cumulative sum
1 10 0 0
2 20 0 0
3 30 1 30
1 15 1 15
3 5 0 30
1 10 1 25
So, the sum for id=1 is taken only where conditional=1, in the 4th and 6th rows; the value in the 1st row is not counted. How do I do this in pandas?
1 solution
#1
7
You can create a Series that is the product of value and conditional, and take its cumulative sum within each id group. Because conditional is either 0 or 1, the product zeroes out the rows that should not count:
df['cumsum'] = (df['value']*df['conditional']).groupby(df['id']).cumsum()
df
Out:
id value conditional cumsum
0 1 10 0 0
1 2 20 0 0
2 3 30 1 30
3 1 15 1 15
4 3 5 0 30
5 1 10 1 25
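For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample in the question, and `Series.where` is swapped in for the multiplication as an assumption on my part: it gives the same result here, and should also cope with a conditional column that is not strictly 0/1. The name `masked` is just illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 1, 3, 1],
    "value": [10, 20, 30, 15, 5, 10],
    "conditional": [0, 0, 1, 1, 0, 1],
})

# Keep value where conditional == 1 and replace it with 0 elsewhere,
# so rows that fail the condition add nothing to the running total.
masked = df["value"].where(df["conditional"] == 1, 0)

# Accumulate within each id group, preserving the original row order.
df["cumsum"] = masked.groupby(df["id"]).cumsum()

print(df)
```

Grouping the masked Series by `df['id']` keeps the result aligned with the original index, so it can be assigned straight back as a new column.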