Computing values for each group of columns in Python using Pandas

Date: 2021-04-02 07:36:10

I have a DataFrame

Input

         A   B   C     D          
0      one  50   35  1.5  
1      two  30   40  2.0 
2      one  50   35  3.0 
3    three  40   35  3.5 
4      one  40   35  2.5

I need to apply a math function to column D and store the result in a new column E, but first I need to group by columns B and C. For example, the function would be applied to the values 1.5 and 3.0 for the pair (50, 35).

B   C   A     D
50  35  one   1.5
        one   3.0

40  35  three 3.5
        one   2.5

30  40  two   2.0

The values are calculated with a custom function that receives a NumPy array as input and returns an array of the same length.

Output

         A   B   C     D   E          
0      one  50   35  1.5   4.5
1      two  30   40  2.0   4.5
2      one  50   35  3.0   3.5
3    three  40   35  3.5   6.8
4      one  40   35  2.5   8.9

Can someone help me?

1 answer

#1



I believe you need GroupBy.transform, which returns a Series the same size as the original DataFrame:

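import pandas as pd

df = pd.DataFrame({'A': ['one', 'two', 'one', 'three', 'one'],
                   'B': [50, 30, 50, 40, 40],
                   'C': [35, 40, 35, 35, 35],
                   'D': [1.5, 2.0, 3.0, 3.5, 2.5]})

def func(x):
    # custom function, e.g. multiply all values in the group together
    return x.prod()

# transform broadcasts each group's result back onto the original rows
df['E'] = df.groupby(['B', 'C'])['D'].transform(func)
print(df)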
       A   B   C    D   E
0    one  50  35  1.5  4.50
1    two  30  40  2.0  2.00
2    one  50  35  3.0  4.50
3  three  40  35  3.5  8.75
4    one  40  35  2.5  8.75
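
The question mentions a custom function that takes a NumPy array and returns an array of the same length, whereas x.prod() above returns a single value per group. A minimal sketch of that case follows. The function center is a hypothetical stand-in for the real custom function, which the question does not show; it subtracts the group mean from each value.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": ["one", "two", "one", "three", "one"],
    "B": [50, 30, 50, 40, 40],
    "C": [35, 40, 35, 35, 35],
    "D": [1.5, 2.0, 3.0, 3.5, 2.5],
})

def center(values: np.ndarray) -> np.ndarray:
    # Hypothetical custom function (an assumption, not from the question):
    # takes a NumPy array and returns an array of the same length,
    # here each value minus the mean of its group.
    return values - values.mean()

# transform passes each group's D values as a Series; convert it to a
# NumPy array before calling the function. The returned array is aligned
# back to the original row positions of that group.
df["E"] = df.groupby(["B", "C"])["D"].transform(lambda s: center(s.to_numpy()))
print(df)
```

Because the function returns one value per input row, transform places each result in the row it came from instead of repeating a single group value.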
