Python multiprocessing, ValueError: I/O operation on closed file

时间:2021-05-15 20:27:14

I'm having a problem with the Python multiprocessing package. Below is a simple example code that illustrates my problem.


import multiprocessing as mp

def test_file(f):
    f.write("Testing...\n")
    print(f.name)
    return None

if __name__ == "__main__":
    f = open("test.txt", 'w')
    # The open file object is passed to the child process here,
    # which is what triggers the error below.
    proc = mp.Process(target=test_file, args=[f])
    proc.start()
    proc.join()

When I run this, I get the following error.


Process Process-1:
Traceback (most recent call last):
  File "C:\Python27\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "C:\Python27\lib\multiprocessing\process.py", line 114, in run
    self.target(*self._args, **self._kwargs)
  File "C:\Users\Ray\Google Drive\Programming\Python\tests\follow_test.py", line 24, in test_file
    f.write("Testing...\n")
ValueError: I/O operation on closed file
Press any key to continue . . .

It seems that somehow the file handle is 'lost' during the creation of the new process. Could someone please explain what's going on?


1 Answer

#1

I had similar issues in the past. I'm not sure whether it happens within the multiprocessing module or whether open sets the close-on-exec flag by default, but I know for sure that file handles opened in the main process are closed in the multiprocessing children.


The obvious workaround is to pass the filename as a parameter to the child process's initializer function and open it once within each child (if using a pool), or to pass it as a parameter to the target function and open/close it on each invocation. The former requires the use of a global to store the file handle (not a good thing) - unless someone can show me how to avoid that :) - and the latter can incur a performance hit (but can be used with multiprocessing.Process directly).


Example of the former:


import multiprocessing

filehandle = None  # module-level handle, populated once per worker

def child_init(filename):
    # Runs once in each pool worker: open the file there, not in the parent.
    global filehandle
    filehandle = open(filename, 'a')  # mode is illustrative

def child_target(text):
    # Use the handle that child_init opened in this worker.
    filehandle.write(text)
    filehandle.flush()

if __name__ == '__main__':
    filename = 'test.txt'  # some code which defines filename (illustrative)
    pool = multiprocessing.Pool(processes=1, initializer=child_init,
                                initargs=[filename])
    pool.apply(child_target, args=('Testing...\n',))
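
And a minimal sketch of the latter, reusing the question's test.txt and message (both illustrative): the target function receives only the filename, so the open/close happens entirely inside the child and no file object ever crosses the process boundary.

import multiprocessing

def child_target(filename, text):
    # Open and close inside the child on every call; only the
    # filename (a plain string) crosses the process boundary.
    with open(filename, 'a') as f:
        f.write(text)

if __name__ == '__main__':
    proc = multiprocessing.Process(target=child_target,
                                   args=('test.txt', 'Testing...\n'))
    proc.start()
    proc.join()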
