Putting many Python pandas dataframes into one Excel worksheet

Time: 2022-07-03 22:54:41

It is quite easy to add many pandas dataframes to an Excel workbook as long as each one goes to a different worksheet. It is somewhat tricky, however, to get many dataframes into a single worksheet using the built-in df.to_excel functionality.

# Creating Excel Writer Object from Pandas
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
workbook = writer.book
worksheet = workbook.add_worksheet('Validation')
df.to_excel(writer, sheet_name='Validation', startrow=0, startcol=0)
another_df.to_excel(writer, sheet_name='Validation', startrow=20, startcol=0)

The above code won't work. It fails with this error:

 Sheetname 'Validation', with case ignored, is already in use.

After some experimenting, I found a way to make it work.

writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')  # Creating Excel Writer Object from Pandas
workbook = writer.book
df.to_excel(writer, sheet_name='Validation', startrow=0, startcol=0)
another_df.to_excel(writer, sheet_name='Validation', startrow=20, startcol=0)

This works. My purpose in posting this question is twofold. First, I hope it helps anyone who is trying to put many dataframes into a single Excel worksheet.

Second, can someone help me understand the difference between those two blocks of code? They look pretty much the same to me, except that the first block creates a worksheet called "Validation" in advance while the second does not. I get that part.

What I don't understand is why that should make any difference. Even if I don't create the worksheet in advance, this line, the one right before the last,

 df.to_excel(writer,sheet_name='Validation',startrow=0 , startcol=0)  

will create a worksheet anyway. Consequently, by the time we reach the last line of code, the worksheet "Validation" already exists in the second block as well. So my question is basically: why does the second block of code work while the first doesn't?

Please also share if there is another way to put many dataframes into Excel using the built-in df.to_excel functionality!

3 solutions

#1


9  

To create the Worksheet in advance, you need to add the created sheet to the sheets dict:

writer.sheets['Validation'] = worksheet

Using your original code:

# Creating Excel Writer Object from Pandas
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
workbook = writer.book
worksheet = workbook.add_worksheet('Validation')
writer.sheets['Validation'] = worksheet  # register the pre-created sheet with the writer
df.to_excel(writer, sheet_name='Validation', startrow=0, startcol=0)
another_df.to_excel(writer, sheet_name='Validation', startrow=20, startcol=0)
writer.close()  # write the file to disk

Explanation

If we look at the pandas function to_excel, it uses the writer's write_cells function:

excel_writer.write_cells(formatted_cells, sheet_name, startrow=startrow, startcol=startcol)

So looking at the write_cells function for xlsxwriter:

def write_cells(self, cells, sheet_name=None, startrow=0, startcol=0):
    # Write the frame cells using xlsxwriter.
    sheet_name = self._get_sheet_name(sheet_name)
    if sheet_name in self.sheets:
        wks = self.sheets[sheet_name]
    else:
        wks = self.book.add_worksheet(sheet_name)
        self.sheets[sheet_name] = wks

Here we can see that it checks for sheet_name in self.sheets, and so it needs to be added there as well.
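
The branch above can be illustrated without pandas at all. The following is only a toy model: ToyBook and ToyWriter are hypothetical stand-ins written for this sketch, not the real xlsxwriter or pandas classes, and the duplicate check raises a plain ValueError rather than the library's own exception. It shows why a sheet created on the book but missing from the sheets dict triggers the "already in use" error, while a registered one is reused.

```python
class ToyBook:
    """Hypothetical stand-in for the workbook: rejects duplicate sheet names."""

    def __init__(self):
        self.names = []

    def add_worksheet(self, name):
        # Case-insensitive duplicate check, mirroring the error text in the question.
        if name.lower() in (n.lower() for n in self.names):
            raise ValueError(f"Sheetname '{name}', with case ignored, is already in use.")
        self.names.append(name)
        return f"<worksheet {name}>"


class ToyWriter:
    """Hypothetical stand-in for the writer: tracks the sheets it knows in a dict."""

    def __init__(self):
        self.book = ToyBook()
        self.sheets = {}

    def write_cells(self, sheet_name):
        # Same branch as the excerpt: reuse a known sheet, otherwise create one.
        if sheet_name in self.sheets:
            return self.sheets[sheet_name]
        wks = self.book.add_worksheet(sheet_name)
        self.sheets[sheet_name] = wks
        return wks


def try_write(writer, sheet_name):
    """Return 'ok' or the error message so the two cases can be compared."""
    try:
        writer.write_cells(sheet_name)
        return "ok"
    except ValueError as exc:
        return str(exc)


# Case 1: sheet created directly on the book, sheets dict not updated.
w1 = ToyWriter()
w1.book.add_worksheet("Validation")
result_unregistered = try_write(w1, "Validation")

# Case 2: sheet created on the book and registered in the sheets dict.
w2 = ToyWriter()
w2.sheets["Validation"] = w2.book.add_worksheet("Validation")
result_registered = try_write(w2, "Validation")

print(result_unregistered)
print(result_registered)
```

In case 1 the writer does not find the name in its dict, so it asks the book to create the sheet a second time and the book refuses. In case 2 the lookup succeeds and no second creation is attempted.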

#2


14  

user3817518: "Please also share if there is another way to put many dataframes into excel using the built-in df.to_excel functionality !!"

Here's my attempt:

Easy way to put together a lot of dataframes on just one sheet or across multiple tabs. Let me know if this works!

To test, just run the sample dataframes and then the second and third portions of code.

Sample dataframes

import pandas as pd
import numpy as np

# Sample dataframes    
randn = np.random.randn
df = pd.DataFrame(randn(15, 20))
df1 = pd.DataFrame(randn(10, 5))
df2 = pd.DataFrame(randn(5, 10))

Put multiple dataframes into one xlsx sheet

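# function
def multiple_dfs(df_list, sheets, file_name, spaces):
    # Stack every dataframe in df_list on one sheet, with `spaces` blank rows between them
    writer = pd.ExcelWriter(file_name, engine='xlsxwriter')
    row = 0
    for dataframe in df_list:
        dataframe.to_excel(writer, sheet_name=sheets, startrow=row, startcol=0)
        row = row + len(dataframe.index) + spaces + 1  # +1 for the header row
    writer.close()  # writer.save() was removed in pandas 2.0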

# list of dataframes
dfs = [df,df1,df2]

# run function
multiple_dfs(dfs, 'Validation', 'test1.xlsx', 1)
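
The row arithmetic in multiple_dfs can be checked on its own. This is a minimal sketch, assuming the default single header row that to_excel writes; start_rows is a hypothetical helper introduced here for illustration, and it only reproduces the startrow bookkeeping without writing any file.

```python
def start_rows(frame_lengths, spaces):
    """Return the startrow each dataframe would get, given the row count of each one."""
    rows = []
    row = 0
    for n_rows in frame_lengths:
        rows.append(row)
        # Header row (1) + data rows + blank spacer rows, as in multiple_dfs.
        row = row + n_rows + spaces + 1
    return rows


# Row counts of the sample dataframes df, df1 and df2 above.
offsets = start_rows([15, 10, 5], spaces=1)
print(offsets)  # [0, 17, 29]
```

The first frame fills rows 0 to 15 (header plus 15 data rows), row 16 stays blank, and the second frame starts at row 17.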

Put multiple dataframes across separate tabs/sheets

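# function
def dfs_tabs(df_list, sheet_list, file_name):
    # Write each dataframe to its own sheet, pairing dataframes and sheet names in order
    writer = pd.ExcelWriter(file_name, engine='xlsxwriter')
    for dataframe, sheet in zip(df_list, sheet_list):
        dataframe.to_excel(writer, sheet_name=sheet, startrow=0, startcol=0)
    writer.close()  # writer.save() was removed in pandas 2.0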

# list of dataframes and sheet names
dfs = [df, df1, df2]
sheets = ['df','df1','df2']    

# run function
dfs_tabs(dfs, sheets, 'multi-test.xlsx')
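
One thing to keep in mind with dfs_tabs is that zip stops at the shorter of its two inputs, so a missing sheet name silently drops a dataframe. The sketch below is one possible guard, not part of the answer; pair_sheets is a hypothetical helper, and plain strings stand in for the dataframes so it runs without pandas.

```python
def pair_sheets(df_list, sheet_list):
    """Pair dataframes with sheet names, refusing lists of different lengths."""
    if len(df_list) != len(sheet_list):
        raise ValueError(
            f"{len(df_list)} dataframes but {len(sheet_list)} sheet names"
        )
    return list(zip(df_list, sheet_list))


# Strings stand in for dataframes here.
pairs = pair_sheets(["df", "df1", "df2"], ["first", "second", "third"])
print(pairs)

# What a bare zip does with a short sheet list: the third dataframe is dropped.
truncated = list(zip(["df", "df1", "df2"], ["first", "second"]))
print(truncated)
```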

#3


0  

I would be more inclined to concatenate the dataframes first and then turn that dataframe into an excel format. To put two dataframes together side-by-side (as opposed to one above the other) do this:

writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')  # Creating Excel Writer Object from Pandas
new_df = pd.concat([df, another_df], axis=1)  # join side by side, aligned on the index
new_df.to_excel(writer, sheet_name='Validation', startrow=0, startcol=0)
writer.close()  # write the file to disk
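
To see what the concatenated frame looks like before it is written, here is a small sketch with made-up data (the column names a, b and c are assumptions for illustration only). pd.concat with axis=1 aligns on the index, so when the frames have different lengths the shorter one is padded with NaN.

```python
import pandas as pd

# Two small frames of different lengths, sharing the default integer index.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
another_df = pd.DataFrame({"c": [7, 8]})

# Side-by-side join: rows are matched by index label, missing cells become NaN.
new_df = pd.concat([df, another_df], axis=1)

print(new_df.shape)
print(new_df)
```

With the default na_rep of to_excel, those NaN cells are written as empty cells in the sheet.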
