In Python, how do I iterate over one iterator and then over another?

Time: 2022-02-08 07:12:40

I'd like to iterate over two different iterators, one after the other, something like this:

file1 = open('file1', 'r')
file2 = open('file2', 'r')
for item in one_then_another(file1, file2):
    print(item)

I'd expect this to print all the lines of file1, then all the lines of file2.

I'd like something generic, as the iterators might not be files; this is just an example. I know I could do this with:

for item in list(file1) + list(file2):

but this reads both files into memory, which I'd prefer to avoid.

3 solutions

#1


89  

Use itertools.chain:

from itertools import chain
for line in chain(file1, file2):
    pass
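
To connect this back to the question, here is a minimal sketch of how the hypothetical one_then_another helper could be written on top of itertools.chain. The helper name comes from the question; the sample data is made up for illustration, and in-memory iterators stand in for the files.

```python
from itertools import chain


def one_then_another(*iterables):
    """Lazily yield every item of each iterable in turn, without building a list."""
    return chain.from_iterable(iterables)


# Works for any iterables, not just files.
first = iter([1, 2, 3])
second = (c for c in "ab")
result = list(one_then_another(first, second))
print(result)  # [1, 2, 3, 'a', 'b']
```

Because chain only pulls the next item when asked, nothing is read from the second iterator until the first is exhausted.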

The fileinput module also provides a similar feature:

import fileinput
for line in fileinput.input(['file1', 'file2']):
    pass
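
As a rough, self-contained illustration of the fileinput approach, the sketch below writes two throwaway files and reads them back as one stream. The temporary directory, file contents, and variable names are assumptions made for the example; it also assumes Python 3, where fileinput.input can be used as a context manager.

```python
import fileinput
import os
import tempfile

# Create two small throwaway files so the example can run anywhere.
tmpdir = tempfile.mkdtemp()
paths = []
for name, text in (("file1", "a\nb\n"), ("file2", "c\n")):
    path = os.path.join(tmpdir, name)
    with open(path, "w") as f:
        f.write(text)
    paths.append(path)

lines = []
with fileinput.input(files=paths) as stream:
    for line in stream:
        # stream.filename() reports which file the current line came from.
        lines.append((os.path.basename(stream.filename()), line.rstrip("\n")))

print(lines)  # [('file1', 'a'), ('file1', 'b'), ('file2', 'c')]
```

Unlike chain, fileinput takes file names rather than open file objects, so it only fits the case where the iterators really are files.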

#2


17  

You can also do it with a simple generator expression:

for line in (l for f in (file1, file2) for l in f):
    pass  # do something with line

With this method you can specify a condition in the expression itself:

for line in (l for f in (file1, file2) for l in f if 'text' in l):
    pass  # do something with line which contains 'text'

The example above is equivalent to this generator function with explicit loops:

def genlinewithtext(*files):
    for file in files:
        for line in file:
            if 'text' in line:
                yield line

for line in genlinewithtext(file1, file2):
    pass  # do something with line which contains 'text'
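
The same idea can be generalized so the condition is passed in rather than hard-coded. This is only a sketch: chain_filtered and its predicate parameter are hypothetical names introduced here, and plain lists of strings stand in for the files.

```python
def chain_filtered(iterables, predicate=None):
    """Lazily yield items from each iterable in order, optionally filtered."""
    for iterable in iterables:
        for item in iterable:
            # With no predicate, every item passes through unchanged.
            if predicate is None or predicate(item):
                yield item


first = iter(["some text\n", "skip me\n"])
second = iter(["more text\n"])
matches = list(chain_filtered([first, second], lambda s: "text" in s))
print(matches)  # ['some text\n', 'more text\n']
```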

#3


7  

I think the most Pythonic approach to this particular file problem is the fileinput module, since with open you need either more involved context managers or explicit error handling. I'm going to start with Ashwini's example and add a few things. First, it's better to open with the U flag for universal newlines support (assuming your Python is compiled with it, and most are); r is the default mode, but explicit is better than implicit. If you're working with other people, it's best to accept files in whatever newline format they give you. Note that this advice applies to Python 2 and older Python 3 releases: in Python 3, text mode already handles universal newlines by default, and the U flag was deprecated and then removed in Python 3.11.

import fileinput

# 'rU' requires Python 2 or Python 3 before 3.11
for line in fileinput.input(['file1', 'file2'], mode='rU'):
    pass

This is also usable on the command line: called without a file list, fileinput.input reads the files named in sys.argv[1:] (or standard input if none are given):

import fileinput

for line in fileinput.input(mode='rU'):
    pass

And you would pass the files in your shell like this:

$ python myscript.py file1 file2
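
For comparison, the open-based route this answer alludes to does not have to be very involved. The following is a sketch only, assuming Python 3.3+ for yield from; lines_from is a hypothetical helper name, and the temporary files exist just to make the example runnable.

```python
import os
import tempfile


def lines_from(paths):
    """Yield the lines of each file in order, opening one file at a time."""
    for path in paths:
        # Each file is closed once it is exhausted or the generator is closed.
        with open(path) as f:
            yield from f


# Throwaway files so the example is self-contained.
tmpdir = tempfile.mkdtemp()
paths = []
for name, text in (("file1", "a\nb\n"), ("file2", "c\n")):
    path = os.path.join(tmpdir, name)
    with open(path, "w") as f:
        f.write(text)
    paths.append(path)

collected = list(lines_from(paths))
print(collected)  # ['a\n', 'b\n', 'c\n']
```

This keeps at most one file open at a time, whereas opening file1 and file2 up front and chaining them leaves the caller responsible for closing both.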
