Insert rows based on column values in a pandas DataFrame

Time: 2020-11-29 21:23:41

Consider the following pandas DataFrame:

   date     action    price     
  20150101   buy       10
  20150102   buy       9
  20150103   sell      11
  20150104   sell      10
  20150105   buy       8
  20150106   sell      9

I want to add a row whenever 'sell' turns into 'buy'. Each inserted row should be a copy of the previous row, except that 'sell' is changed to 'buy', as follows:

   date     action    price     
  20150101   buy       10
  20150102   buy       9
  20150103   sell      11
  20150104   sell      10
**20150104   buy       10**
  20150105   buy       8
  20150106   sell      9
**20150106   buy       9**

Thanks for the help.

1 Solution

#1


You could identify the transition rows (each 'sell' row that is not immediately followed by another 'sell') using

mask = (df['action'] == 'sell') & (df['action'].shift(-1) != 'sell')
# In [229]: mask
# Out[229]: 
# 0    False
# 1    False
# 2    False
# 3     True
# 4    False
# 5     True
# Name: action, dtype: bool
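
To make the mask easier to follow, here is a small self-contained sketch that lays the shifted column next to the original. The variable names action and next_action are only illustrative; the data is the action column from the question.

```python
import pandas as pd

action = pd.Series(['buy', 'buy', 'sell', 'sell', 'buy', 'sell'], name='action')

# shift(-1) moves every value up one position, so row i sees the action of row i+1.
# The last row has no successor and receives NaN.
next_action = action.shift(-1)

# NaN != 'sell' evaluates to True, so a trailing 'sell' row is flagged as well.
mask = (action == 'sell') & (next_action != 'sell')

print(pd.DataFrame({'action': action, 'next_action': next_action, 'mask': mask}))
```

Note that the comparison flags a 'sell' followed by anything other than 'sell', which matches "sell turns into buy" only because the column holds just those two values.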

Then you could make a new DataFrame, consisting of the rows where mask is True:

new = df.loc[mask].copy()

Set the action to 'buy':

new['action'] = 'buy'
#        date action  price
# 3  20150104    buy     10
# 5  20150106    buy      9

Build a new DataFrame which concatenates df and new:

df = pd.concat([df, new])

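and sort by date. DataFrame.sort was deprecated in pandas 0.17 and later removed, so use sort_values; a stable sort (mergesort) keeps each inserted row right after the row it was copied from:

df = df.sort_values('date', kind='mergesort')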
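
Sorting by date assumes the dates are already in ascending order. If that might not hold, one possible alternative (not part of the original answer) relies on pd.concat keeping the original index labels, so that a stable sort on the index places each copy directly after its source row. A minimal sketch with made-up, unsorted dates:

```python
import pandas as pd

# Hypothetical data whose dates are NOT in ascending order.
df = pd.DataFrame({
    'date': [20150103, 20150101, 20150102],
    'action': ['sell', 'buy', 'sell'],
    'price': [11, 10, 9],
})

mask = (df['action'] == 'sell') & (df['action'].shift(-1) != 'sell')
new = df.loc[mask].copy()
new['action'] = 'buy'

# concat keeps the original labels, so each copy shares a label with its source row.
combined = pd.concat([df, new])

# mergesort is stable: for equal labels, the original row stays ahead of its copy.
result = combined.sort_index(kind='mergesort').reset_index(drop=True)
print(result)
```

This version leaves the original row order untouched, at the cost of depending on the index labels being unique before the concat.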

For example,

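import pandas as pd

# 'data' is a whitespace-delimited file holding the table from the question
df = pd.read_table('data', sep=r'\s+')
# flag each 'sell' row that is not followed by another 'sell'
mask = (df['action'] == 'sell') & (df['action'].shift(-1) != 'sell')
new = df.loc[mask].copy()
new['action'] = 'buy'
df = pd.concat([df, new])
# DataFrame.sort no longer exists; a stable sort keeps each copy after its source row
df = df.sort_values('date', kind='mergesort')
df = df.reset_index(drop=True)
print(df)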

yields

       date action  price
0  20150101    buy     10
1  20150102    buy      9
2  20150103   sell     11
3  20150104   sell     10
4  20150104    buy     10
5  20150105    buy      8
6  20150106   sell      9
7  20150106    buy      9
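
For readers without a 'data' file on disk, the same steps can be sketched with the table inlined through io.StringIO. The helper name add_buy_after_sell is hypothetical and only wraps the steps above; the sort uses sort_values with a stable kind, which assumes a pandas version where DataFrame.sort is no longer available.

```python
import io

import pandas as pd

# Inline copy of the question's table, standing in for the 'data' file.
raw = """\
date action price
20150101 buy 10
20150102 buy 9
20150103 sell 11
20150104 sell 10
20150105 buy 8
20150106 sell 9
"""
df = pd.read_csv(io.StringIO(raw), sep=r'\s+')


def add_buy_after_sell(frame):
    """Return a new frame with a 'buy' copy after the last row of each 'sell' run."""
    mask = (frame['action'] == 'sell') & (frame['action'].shift(-1) != 'sell')
    new = frame.loc[mask].copy()
    new['action'] = 'buy'
    out = pd.concat([frame, new])
    # mergesort is stable, so rows with equal dates keep original-then-copy order.
    out = out.sort_values('date', kind='mergesort')
    return out.reset_index(drop=True)


result = add_buy_after_sell(df)
print(result)
```

Because the helper works on a copy of the selected rows and returns a new frame, the input df is left unchanged.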
