How do I apply a function to every row of a dataframe?

Time: 2022-07-01 21:39:07

I am new to Python and I am not sure how to solve the following problem.

I have a function:

import math

def EOQ(D,p,ck,ch):
    Q = math.sqrt((2*D*ck)/(ch*p))
    return Q

Say I have the dataframe

df = pd.DataFrame({"D": [10,20,30], "p": [20, 30, 10]})

    D   p
0   10  20
1   20  30
2   30  10

ch=0.2
ck=5

Here ch and ck are numeric constants. Now I want to apply the formula to every row of the dataframe and store the result in an extra column 'Q'. An example (that does not work) would be:

df['Q']= map(lambda p, D: EOQ(D,p,ck,ch),df['p'], df['D']) 

(this only returns a 'map' object instead of the computed values)

I will need this type of processing more in my project and I hope to find something that works.

2 answers

#1


12  

As I don't know what Partmaster is, I have left it out of the function; the following should work:

import math

def EOQ(D,p,ck,ch):
    Q = math.sqrt((2*D*ck)/(ch*p))
    return Q
ch=0.2
ck=5
df['Q'] = df.apply(lambda row: EOQ(row['D'], row['p'], ck, ch), axis=1)
df
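
For reference, here is a self-contained sketch of the row-wise approach on the sample data from the question. It also shows one way the question's `map` attempt could be made to work: in Python 3, `map` returns a lazy iterator, so wrapping it in `list(...)` gives pandas actual values to store. The column name `Q_map` is made up for illustration only.

```python
import math
import pandas as pd

def EOQ(D, p, ck, ch):
    return math.sqrt((2 * D * ck) / (ch * p))

df = pd.DataFrame({"D": [10, 20, 30], "p": [20, 30, 10]})
ch = 0.2
ck = 5

# Row-wise: with axis=1, apply calls the lambda once per row
df["Q"] = df.apply(lambda row: EOQ(row["D"], row["p"], ck, ch), axis=1)

# The question's map idea, materialised into a list first
# (Q_map is a hypothetical column name used only for comparison)
df["Q_map"] = list(map(lambda p, D: EOQ(D, p, ck, ch), df["p"], df["D"]))

print(df)
```

Both columns should hold the same values; either form still calls the Python function once per row.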

If all you're doing is calculating the square root of some result, use np.sqrt instead. It is vectorised and will be significantly faster:

In [80]:
df['Q'] = np.sqrt((2*df['D']*ck)/(ch*df['p']))

df
Out[80]:
    D   p          Q
0  10  20   5.000000
1  20  30   5.773503
2  30  10  12.247449
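
To try the vectorised version outside an IPython session, a minimal standalone sketch on the question's sample data could look like this. Arithmetic on pandas Series is element-wise, so the whole column is computed in one expression without a Python-level loop.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"D": [10, 20, 30], "p": [20, 30, 10]})
ch = 0.2
ck = 5

# Each operator acts on whole columns; np.sqrt then runs over the resulting Series
df["Q"] = np.sqrt((2 * df["D"] * ck) / (ch * df["p"]))

print(df)
```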

Timings

For a 30k row df:

In [92]:

import math
ch=0.2
ck=5
def EOQ(D,p,ck,ch):
    Q = math.sqrt((2*D*ck)/(ch*p))
    return Q

%timeit np.sqrt((2*df['D']*ck)/(ch*df['p']))
%timeit df.apply(lambda row: EOQ(row['D'], row['p'], ck, ch), axis=1)
1000 loops, best of 3: 622 µs per loop
1 loops, best of 3: 1.19 s per loop

You can see that the np method is ~1900x faster.
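
The %timeit lines above are IPython magics. Outside IPython, a comparison along the same lines could be sketched with the standard timeit module. The 30k-row frame below (named `big` here) is filled with random integers as an assumption, since the answer does not say how its test frame was built, and the absolute numbers will vary by machine and library version.

```python
import math
import timeit

import numpy as np
import pandas as pd

def EOQ(D, p, ck, ch):
    return math.sqrt((2 * D * ck) / (ch * p))

ch = 0.2
ck = 5

# Hypothetical test data: 30,000 rows of positive integers
rng = np.random.default_rng(0)
big = pd.DataFrame({
    "D": rng.integers(1, 100, size=30_000),
    "p": rng.integers(1, 100, size=30_000),
})

def vectorised():
    return np.sqrt((2 * big["D"] * ck) / (ch * big["p"]))

def row_wise():
    return big.apply(lambda row: EOQ(row["D"], row["p"], ck, ch), axis=1)

# Average seconds per call; the slow variant is run only once
t_vec = timeit.timeit(vectorised, number=100) / 100
t_row = timeit.timeit(row_wise, number=1)

print(f"vectorised:     {t_vec * 1e6:.0f} µs per call")
print(f"row-wise apply: {t_row:.2f} s per call")
```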

#2


0  

I agree with EdChum's answer. A more general approach would be:

def RowWiseOperation(x):
    if x.ExistingColumn1 in x.ExistingColumn.split(','):
        return value1
    else:
        return value2

YourDataFrame['NewColumn'] = YourDataFrame.apply(RowWiseOperation, axis=1)
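
As a concrete, made-up instance of this pattern: assume a frame with a `tag` column and an `allowed` column holding a comma-separated string. Neither column, nor the names `frame` and `row_wise_operation`, comes from the original post. The function receives each row as a Series, so any Python logic can go inside it.

```python
import pandas as pd

# Hypothetical data, for illustration only
frame = pd.DataFrame({
    "tag": ["a", "x", "c"],
    "allowed": ["a,b", "a,b", "b,c"],
})

def row_wise_operation(row):
    # With axis=1, row is a Series holding one row's values
    if row["tag"] in row["allowed"].split(","):
        return "match"
    return "no match"

frame["result"] = frame.apply(row_wise_operation, axis=1)
print(frame)
```

This is flexible, but it runs at Python speed, so a vectorised expression is worth preferring whenever one exists.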
