I want to apply a custom operation to one column after grouping by another column: group by a column to get the number of rows in each group, then divide another column's value by that count for every row in the group.
My Data Frame:
emp opp amount
0 a 1 10
1 b 1 10
2 c 2 30
3 b 2 30
4 d 2 30
My scenario:
- For opp=1, two employees worked (a, b), so the amount should be split as 10/2 = 5.
- For opp=2, three employees worked (c, b, d), so the amount should be split as 30/3 = 10.
Final Output DataFrame:
emp opp amount
0 a 1 5
1 b 1 5
2 c 2 10
3 b 2 10
4 d 2 10
What is the best way to do this?
2 Solutions
#1
4
df['amount'] = df.groupby('opp')['amount'].transform(lambda g: g/g.size)
df
# emp opp amount
# 0 a 1 5
# 1 b 1 5
# 2 c 2 10
# 3 b 2 10
# 4 d 2 10
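Or:
df['amount'] = df.groupby('opp', group_keys=False)['amount'].apply(lambda g: g/g.size)
does a similar thing. Passing group_keys=False keeps the original index so the result aligns with df; pandas 2.0 and later otherwise add the group key to the result's index.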
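The same division can also be written without a lambda by passing the string 'size' to transform, which broadcasts each group's row count back onto the original rows. This is a minimal sketch, assuming the question's DataFrame is rebuilt by hand; the construction code and the name group_size are illustrative and not from the original post.

```python
import pandas as pd

# Rebuild the DataFrame from the question (assumed construction)
df = pd.DataFrame({
    "emp": ["a", "b", "c", "b", "d"],
    "opp": [1, 1, 2, 2, 2],
    "amount": [10, 10, 30, 30, 30],
})

# Number of rows in each opp group, aligned to the original index
group_size = df.groupby("opp")["amount"].transform("size")

# Split each amount evenly across the rows of its group
df["amount"] = df["amount"] / group_size
print(df)
```

Because the result is a true division, the amount column becomes floating point.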
#2
3
You could try something like this:
# Number of non-null amounts per opp
df2 = df.groupby('opp').amount.count()
# .ix was removed in pandas 1.0; use label-based .loc instead
df.loc[:, 'calculated'] = df.apply(
    lambda row: row.amount / df2.loc[row.opp], axis=1)
df
Yields:
emp opp amount calculated
0 a 1 10 5
1 b 1 10 5
2 c 2 30 10
3 b 2 30 10
4 d 2 30 10