How do I recursively traverse all subdirectories and read files?

Date: 2021-01-14 03:47:16

I have a root-ish directory containing multiple subdirectories, each of which contains a file named data.txt. I would like to write a script that takes the "root" directory as an argument, walks through all of the subdirectories, reads every "data.txt" it finds, and writes something from each data.txt file to an output file.

Here's a snippet of my code:

import os
import sys
rootdir = sys.argv[1]

with open('output.txt','w') as fout:
    for root, subFolders, files in os.walk(rootdir):
        for file in files:
            if (file == 'data.txt'):
                #print file
                with open(file,'r') as fin:
                    for lines in fin:
                        dosomething()

I've tested the dosomething() part and confirmed that it works when I run it on a single file. I've also confirmed that if I tell the script to print the file name instead (the commented-out line), it prints 'data.txt'.

Right now, if I run it, Python gives me this error:

File "recursive.py", line 11, in <module>
    with open(file,'r') as fin:
IOError: [Errno 2] No such file or directory: 'data.txt'

I'm not sure why it can't find it -- after all, it prints out data.txt if I uncomment the 'print file' line. What am I doing incorrectly?

2 answers

#1


52  

You need to open the file by its full path. Your file variable is just a bare filename with no directory component, so open() looks for it in the current working directory. The root variable holds the directory that contains it:

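import os
import sys

rootdir = sys.argv[1]

with open('output.txt', 'w') as fout:
    for root, subFolders, files in os.walk(rootdir):
        if 'data.txt' in files:
            # root is the directory being visited; join it with the filename
            with open(os.path.join(root, 'data.txt'), 'r') as fin:
                for line in fin:
                    dosomething()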
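
For a version that can be run end to end, here is a minimal sketch of the same idea. The function name collect_data_files and its parameters are hypothetical, and copying each line unchanged merely stands in for dosomething(), whose actual behaviour the question does not show.

```python
import os


def collect_data_files(rootdir, out_path, target="data.txt"):
    """Copy every line of each `target` file under rootdir into out_path.

    Hypothetical helper: the plain line copy stands in for dosomething().
    Returns the list of files that were read.
    """
    found = []
    with open(out_path, "w") as fout:
        for root, dirs, files in os.walk(rootdir):
            if target in files:
                # os.walk yields bare filenames; prefix the directory being visited.
                path = os.path.join(root, target)
                found.append(path)
                with open(path, "r") as fin:
                    for line in fin:
                        fout.write(line)
    return found
```

Calling collect_data_files(sys.argv[1], 'output.txt') would reproduce the script from the question, assuming the output file is not itself named data.txt inside the tree being walked.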

#2


0  

[os.path.join(dirpath, filename) for dirpath, dirnames, filenames in os.walk(rootdir) 
                                 for filename in filenames]

A functional approach that collects every file path in the tree with a list comprehension is shorter, cleaner, and more Pythonic.

You can wrap os.path.join(dirpath, filename) in any function that processes the files you get, or save the list of paths for further processing.
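
As a sketch of that last point, the comprehension can be wrapped in a small function and its result filtered down to the files of interest. The names list_files and data_files are hypothetical, chosen only for illustration.

```python
import os


def list_files(rootdir):
    """Return the path of every file under rootdir (hypothetical helper)."""
    return [os.path.join(dirpath, filename)
            for dirpath, dirnames, filenames in os.walk(rootdir)
            for filename in filenames]


def data_files(rootdir, target="data.txt"):
    """Keep only the paths whose final component equals target."""
    return [p for p in list_files(rootdir) if os.path.basename(p) == target]
```

Each returned path already includes its directory, so it can be passed straight to open() without hitting the "No such file or directory" error from the question.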
