I have the following dataframe:
我有以下数据帧:
Index_Date A B C D
===============================
2015-01-31 10 10 Nan 10
2015-02-01 2 3 Nan 22
2015-02-02 10 60 Nan 280
2015-02-03 10 100 Nan 250
Require:
Index_Date A B C D
===============================
2015-01-31 10 10 10 10
2015-02-01 2 3 23 22
2015-02-02 10 60 290 280
2015-02-03 10 100 3000 250
Column C
is derived for 2015-01-31
by taking value
of D
.
C列是根据D的值得出的2015-01-31。
Then I need to use the value
of C
for 2015-01-31
and multiply by the value
of A
on 2015-02-01
and add B
.
然后我需要使用C的值2015-01-31并乘以2015-02-01的A值并添加B.
I have attempted an apply
and a shift
using an if else
by this gives a key error.
我已尝试使用if else进行应用和移位,这会产生一个关键错误。
3 个解决方案
#1
14
First, create the derived value:
首先,创建派生值:
df.loc[0, 'C'] = df.loc[0, 'D']
Then iterate through the remaining rows and fill the calculated values:
然后迭代剩余的行并填充计算的值:
for i in range(1, len(df)):
df.loc[i, 'C'] = df.loc[i-1, 'C'] * df.loc[i, 'A'] + df.loc[i, 'B']
Index_Date A B C D
0 2015-01-31 10 10 10 10
1 2015-02-01 2 3 23 22
2 2015-02-02 10 60 290 280
#2
10
Given a column of numbers:
给出一列数字:
lst = []
cols = ['A']
for a in range(100, 105):
lst.append([a])
df = pd.DataFrame(lst, columns=cols, index=range(5))
df
A
0 100
1 101
2 102
3 103
4 104
You can reference the previous row with shift:
您可以使用shift引用上一行:
df['Change'] = df.A - df.A.shift(1)
df
A Change
0 100 NaN
1 101 1.0
2 102 1.0
3 103 1.0
4 104 1.0
#3
6
Applying the recursive function on numpy arrays will be faster than the current answer.
在numpy数组上应用递归函数将比当前答案更快。
df = pd.DataFrame(np.repeat(np.arange(2, 6),3).reshape(4,3), columns=['A', 'B', 'D'])
new = [df.D.values[0]]
for i in range(1, len(df.index)):
new.append(new[i-1]*df.A.values[i]+df.B.values[i])
df['C'] = new
Output
A B D C
0 1 1 1 1
1 2 2 2 4
2 3 3 3 15
3 4 4 4 64
4 5 5 5 325
#1
14
First, create the derived value:
首先,创建派生值:
df.loc[0, 'C'] = df.loc[0, 'D']
Then iterate through the remaining rows and fill the calculated values:
然后迭代剩余的行并填充计算的值:
for i in range(1, len(df)):
df.loc[i, 'C'] = df.loc[i-1, 'C'] * df.loc[i, 'A'] + df.loc[i, 'B']
Index_Date A B C D
0 2015-01-31 10 10 10 10
1 2015-02-01 2 3 23 22
2 2015-02-02 10 60 290 280
#2
10
Given a column of numbers:
给出一列数字:
lst = []
cols = ['A']
for a in range(100, 105):
lst.append([a])
df = pd.DataFrame(lst, columns=cols, index=range(5))
df
A
0 100
1 101
2 102
3 103
4 104
You can reference the previous row with shift:
您可以使用shift引用上一行:
df['Change'] = df.A - df.A.shift(1)
df
A Change
0 100 NaN
1 101 1.0
2 102 1.0
3 103 1.0
4 104 1.0
#3
6
Applying the recursive function on numpy arrays will be faster than the current answer.
在numpy数组上应用递归函数将比当前答案更快。
df = pd.DataFrame(np.repeat(np.arange(2, 6),3).reshape(4,3), columns=['A', 'B', 'D'])
new = [df.D.values[0]]
for i in range(1, len(df.index)):
new.append(new[i-1]*df.A.values[i]+df.B.values[i])
df['C'] = new
Output
A B D C
0 1 1 1 1
1 2 2 2 4
2 3 3 3 15
3 4 4 4 64
4 5 5 5 325