I have a dataframe of data that I am trying to append to another dataframe. I have tried various approaches with .append(), and none of them has worked. I can print each row as it comes out of iterrows(), but I cannot get it into the target dataframe. Below are the two ways I tried to solve the issue: one raises an error, and the other doesn't populate the dataframe with anything.
The workflow I am trying to build creates a dataframe from a file that contains the transaction history of customer orders. I want only a single record per order, and I am going to add other logic that updates the order details based on updates in the history. By the end of the script, there should be a single record for each order, holding the end state of that order after iterating through the history file.
import pandas as pd

class order_manager(object):
    """Manages the current state of orders."""
    def __init__(self, dataF, desc=None):
        self.data = dataF  # transaction history, one row per event
        print(type(dataF))
        # empty frame that should end up holding one record per order
        self.orderData = pd.DataFrame(data=None, columns=desc)

    def add_data(self, df):
        for i, row in self.data.iterrows():
            print('row ' + str(row))
            print(type(row))
            # Attempt 1: this line raises an error
            df.append(self.data[i], ignore_index=True)
            # Attempt 2: this line doesn't append anything to the dataframe
            df.append(row, ignore_index=True)

# body and header are defined elsewhere (not shown)
test = order_manager(body, header)
test.add_data(test.orderData)
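The two symptoms described above can be reproduced in isolation. This is a minimal sketch, not the original script: the `history` frame and its column names are made up for illustration, and `pd.concat` stands in for `DataFrame.append` (which newer pandas releases no longer provide) because both return a new frame rather than changing the one they are called on.

```python
import pandas as pd

# Hypothetical stand-in for the history data; column names are assumptions.
history = pd.DataFrame({"order_id": [101, 102], "status": ["NEW", "NEW"]})
orders = history.iloc[:1].copy()  # starts with one row

# Symptom 1: history[0] looks for a COLUMN labelled 0, not the first row,
# so it raises KeyError when no such column exists.
try:
    history[0]
    column_lookup_failed = False
except KeyError:
    column_lookup_failed = True

# Symptom 2: appending builds and returns a NEW frame. If the result is
# not assigned back, the original frame stays exactly as it was.
second_row = history.iloc[[1]]
pd.concat([orders, second_row], ignore_index=True)  # result discarded
rows_after_discard = len(orders)

orders = pd.concat([orders, second_row], ignore_index=True)  # result kept
rows_after_assign = len(orders)

print(column_lookup_failed, rows_after_discard, rows_after_assign)
```

Under these assumptions, the row count only changes once the returned frame is assigned back to `orders`.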
1 Answer
Use .loc to enlarge the current df. See the example below.
import pandas as pd
import numpy as np
date_rng = pd.date_range('2015-01-01', periods=200, freq='D')
df1 = pd.DataFrame(np.random.randn(100, 3), columns='A B C'.split(), index=date_rng[:100])
Out[410]:
                 A       B       C
2015-01-01  0.2799  0.4416 -0.7474
2015-01-02 -0.4983  0.1490 -0.2599
2015-01-03  0.4101  1.2622 -1.8081
2015-01-04  1.1976 -0.7410  0.4221
2015-01-05  1.3311  1.0399  2.2701
...            ...     ...     ...
2015-04-06 -0.0432  0.6131 -0.0216
2015-04-07  0.4224 -1.1565  2.2285
2015-04-08  0.0663  1.2994  2.0322
2015-04-09  0.1958 -0.4412  0.3924
2015-04-10  0.1622  1.7603  1.4525
[100 rows x 3 columns]
df2 = pd.DataFrame(np.random.randn(100, 3), columns='A B C'.split(), index=date_rng[100:])
Out[411]:
                 A       B       C
2015-04-11  1.1196 -1.9627  0.6615
2015-04-12 -0.0098  1.7655  0.0447
2015-04-13 -1.7318 -2.0296  0.8384
2015-04-14 -1.5472 -1.7220 -0.3166
2015-04-15  2.5058  0.6487  1.0994
...            ...     ...     ...
2015-07-15 -1.4803  2.1703 -1.9391
2015-07-16 -1.7595 -1.7647 -1.0622
2015-07-17  1.7900  0.2280 -1.8797
2015-07-18  0.7909 -0.4999  0.3848
2015-07-19  1.2243  0.4681 -1.2323
[100 rows x 3 columns]
# to move one row from df2 to df1, use .loc to enlarge df1
# this modifies df1 in place, whereas pd.concat and DataFrame.append return a new frame
df1.loc[df2.index[0]] = df2.iloc[0]
Out[413]:
                 A       B       C
2015-01-01  0.2799  0.4416 -0.7474
2015-01-02 -0.4983  0.1490 -0.2599
2015-01-03  0.4101  1.2622 -1.8081
2015-01-04  1.1976 -0.7410  0.4221
2015-01-05  1.3311  1.0399  2.2701
...            ...     ...     ...
2015-04-07  0.4224 -1.1565  2.2285
2015-04-08  0.0663  1.2994  2.0322
2015-04-09  0.1958 -0.4412  0.3924
2015-04-10  0.1622  1.7603  1.4525
2015-04-11  1.1196 -1.9627  0.6615
[101 rows x 3 columns]
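Applied to the order-history workflow from the question, the same .loc idea might look like the sketch below. It assumes the target frame is indexed by an order id, so that a new id adds a row and a repeated id overwrites the existing one. The `history` data and the column names `order_id`, `status`, and `qty` are hypothetical; the real file layout is not shown in the question.

```python
import pandas as pd

# Hypothetical transaction history: several events per order.
history = pd.DataFrame(
    {
        "order_id": [101, 102, 101, 103, 102],
        "status": ["NEW", "NEW", "SHIPPED", "NEW", "CANCELLED"],
        "qty": [5, 2, 5, 1, 0],
    }
)

# Target frame: one row per order, with the order id as the index label.
orders = pd.DataFrame(columns=["status", "qty"])

for _, event in history.iterrows():
    # .loc with an unseen label enlarges the frame by one row;
    # .loc with an existing label overwrites that row in place.
    orders.loc[event["order_id"]] = [event["status"], event["qty"]]

print(orders)
```

With this sample data the loop ends with three rows, where order 101 is SHIPPED and order 102 is CANCELLED. For a very large history file, collecting the latest state per order in a plain dict and building the frame once at the end may scale better; that is a general observation rather than something measured here.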