How can I use Python to extract data from multiple text files into Excel? (one file's data per sheet)

Time: 2021-11-02 03:27:56

So far, this is the code I have to read from the text files and export to Excel:

import glob

data = {}
for infile in glob.glob("*.txt"):
    with open(infile) as inf:
        data[infile] = [l[:-1] for l in inf] 

with open("summary.xls", "w") as outf:
    outf.write("\t".join(data.keys()) + "\n")
    for sublst in zip(*data.values()):
        outf.write("\t".join(sublst) + "\n")

The goal was to read all of the text files in a specific folder.

However, when I run it, Excel gives me the following error:

"File cannot be opened because: Invalid at the top level of the document. Line 1, Position 1. outputgooderr.txt outputbaderr.txt. fixed_inv.txt"

Note: outputgooderr.txt, outputbaderr.txt, and fixed_inv.txt are the names of the text files I wish to export to Excel, one file per sheet.

When I only have one file for the program to read, it is able to extract the data. Unfortunately, this is not what I would like since I have multiple files.

Please let me know of any ways I can fix this. I am very much a beginner in programming in general and would appreciate any advice! Thank you.

1 Solution

#1


If you don't mind the output Excel file being a .xlsx rather than a .xls, I'd recommend using some of the features of pandas, in particular pandas.read_csv() and DataFrame.to_excel().

I've provided a fully reproducible example of how you might go about doing this. Please note that the three lines right after the imports create 2 .txt files for the test.

import glob
import os

import numpy as np
import pandas as pd

# Create a sample DataFrame and save it as test_1.txt and test_2.txt in the current directory.
# Feel free to remove the next 3 lines if you want to test with the files in your own directory.
df = pd.DataFrame(np.random.randn(10, 3), columns=list('ABC'))
df.to_csv('test_1.txt', index=False)
df.to_csv('test_2.txt', index=False)

# All file names matching the pattern *.txt in the current directory (sorted for a stable sheet order)
txt_list = sorted(glob.glob("*.txt"))

# Using the writer as a context manager saves the workbook when the block ends.
# (writer.save() is no longer available in recent pandas versions.)
with pd.ExcelWriter('summary.xlsx', engine='xlsxwriter') as writer:
    for infile in txt_list:
        # The file name without '.txt' becomes the sheet name (Excel allows at most 31 characters)
        sheet_name = os.path.splitext(os.path.basename(infile))[0][:31]
        df = pd.read_csv(infile)  # assumes comma-separated data with a header row
        df.to_excel(writer, sheet_name=sheet_name, index=False)
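
One caveat: the script in the question only writes tab-separated text and names it summary.xls, so it never produces a real workbook with sheets. Also, read_csv() expects comma-separated columns, and files such as outputgooderr.txt may just be free-form lines. Below is a minimal sketch for that case, assuming every line of a text file should simply become one row in a single column. The function names (sheet_name_for, load_text_files, write_workbook) and the column name "line" are my own choices for illustration, not part of pandas.

```python
import glob
import os

import pandas as pd


def sheet_name_for(path, used):
    """Build a unique sheet name from a file name (Excel caps names at 31 characters)."""
    base = os.path.splitext(os.path.basename(path))[0][:31]
    name, n = base, 1
    while name in used:
        # Truncated names can collide, so append a counter and keep the 31-character limit
        suffix = "_%d" % n
        name = base[:31 - len(suffix)] + suffix
        n += 1
    used.add(name)
    return name


def load_text_files(folder, pattern="*.txt"):
    """Return {sheet name: one-column DataFrame holding the file's lines}."""
    frames = {}
    used = set()
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        with open(path) as inf:
            lines = inf.read().splitlines()  # drops the line endings
        frames[sheet_name_for(path, used)] = pd.DataFrame({"line": lines})
    return frames


def write_workbook(frames, out_path):
    """Write each DataFrame to its own sheet of one .xlsx workbook."""
    # The context manager saves the workbook when the block exits.
    # With no engine given, pandas picks an installed one based on the extension.
    with pd.ExcelWriter(out_path) as writer:
        for name, frame in frames.items():
            frame.to_excel(writer, sheet_name=name, index=False)
```

Calling write_workbook(load_text_files("."), "summary.xlsx") should then give one sheet per .txt file in the current directory, provided an Excel engine such as xlsxwriter or openpyxl is installed. If your files do have a regular delimiter, swapping the DataFrame construction for pd.read_csv(path, sep=...) would split them into columns instead.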
