I have a question about sharing a resource that holds a file handle between processes. Here is my test code:
from multiprocessing import Process,Lock,freeze_support,Queue
import tempfile
#from cStringIO import StringIO

class File():
    def __init__(self):
        self.temp = tempfile.TemporaryFile()
        #print self.temp

    def read(self):
        print "reading!!!"
        s = "huanghao is a good boy !!"
        print >> self.temp,s
        self.temp.seek(0,0)
        f_content = self.temp.read()
        print f_content

class MyProcess(Process):
    def __init__(self,queue,*args,**kwargs):
        Process.__init__(self,*args,**kwargs)
        self.queue = queue

    def run(self):
        print "ready to get the file object"
        self.queue.get().read()
        print "file object got"
        file.read()

if __name__ == "__main__":
    freeze_support()
    queue = Queue()
    file = File()
    queue.put(file)
    print "file just put"
    p = MyProcess(queue)
    p.start()
Then I get a KeyError like the one below:
file just put
ready to get the file object
Process MyProcess-1:
Traceback (most recent call last):
  File "D:\Python26\lib\multiprocessing\process.py", line 231, in _bootstrap
    self.run()
  File "E:\tmp\mpt.py", line 35, in run
    self.queue.get().read()
  File "D:\Python26\lib\multiprocessing\queues.py", line 91, in get
    res = self._recv()
  File "D:\Python26\lib\tempfile.py", line 375, in __getattr__
    file = self.__dict__['file']
KeyError: 'file'
I think that when I put the File() object into the queue, the object gets serialized, and since a file handle cannot be serialized, I get the KeyError.
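One way to check the serialization hypothesis in isolation is to pickle a temporary file directly, since multiprocessing.Queue pickles whatever is put on it. This is a minimal sketch that assumes Python 3 rather than the Python 2.6 used in the question; the helper name try_pickle_tempfile is made up for illustration. On Python 3 the failure should surface at pickling time instead of as the KeyError seen on the receiving side.

```python
import pickle
import tempfile


def try_pickle_tempfile():
    """Return the name of the exception raised when pickling an open temp file.

    Hypothetical helper for illustration; returns None if pickling succeeds.
    """
    with tempfile.TemporaryFile() as temp:
        try:
            # Queue.put() pickles its argument; here we pickle the file directly.
            pickle.dumps(temp)
        except (TypeError, pickle.PicklingError) as exc:
            return type(exc).__name__
    return None


if __name__ == "__main__":
    print("pickling an open temporary file raised:", try_pickle_tempfile())
```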
Does anyone have any idea about this? If I want to share objects that have a file handle attribute, what should I do?
1 Solution
#1
I have to object (at length, this won't just fit in a comment ;-) to @Mark's repeated assertion that file handles just can't be "passed around between running processes". This is simply not true in real, modern operating systems, such as, oh, say, Unix (free BSD variants, MacOSX, and Linux included; hmmm, I wonder what OSs are left out of this list...? ;-). sendmsg of course can do it (on a "Unix socket", by sending an SCM_RIGHTS control message).
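To make the sendmsg point concrete, here is a rough sketch of passing an open file descriptor to a child process over a Unix-domain socket with SCM_RIGHTS. It assumes a Unix system and Python 3.3 or later (for socket.sendmsg and socket.recvmsg), not the Python 2.6 on Windows from the question; send_fd, recv_fd and demo_fd_passing are illustrative names, not part of multiprocessing. The child is forked before the temporary file is opened, so it can only reach the file through the descriptor it receives.

```python
import array
import os
import socket
import tempfile


def send_fd(sock, fd):
    """Send one open file descriptor over a Unix-domain socket via SCM_RIGHTS."""
    fds = array.array("i", [fd])
    # At least one byte of ordinary data has to accompany the control message.
    sock.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock):
    """Receive one file descriptor that was sent with send_fd()."""
    fds = array.array("i")
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing bytes that do not form a whole int.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
            return fds[0]
    raise RuntimeError("no file descriptor received")


def demo_fd_passing():
    parent_sock, child_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    pid = os.fork()
    if pid == 0:
        # Child: it never opened the file itself, it only receives the descriptor.
        parent_sock.close()
        try:
            fd = recv_fd(child_sock)
            with os.fdopen(fd, "rb") as received:
                # The file offset is shared with the sender, so rewind first.
                received.seek(0)
                child_sock.sendall(received.read())
        finally:
            os._exit(0)
    # Parent: open the file only after forking, then hand the descriptor over.
    child_sock.close()
    with tempfile.TemporaryFile() as temp:
        temp.write(b"huanghao is a good boy !!")
        temp.flush()
        send_fd(parent_sock, temp.fileno())
        reply = b""
        while True:
            chunk = parent_sock.recv(1024)
            if not chunk:
                break
            reply += chunk
    parent_sock.close()
    os.waitpid(pid, 0)
    return reply


if __name__ == "__main__":
    print(demo_fd_passing())
```

Note that both processes end up sharing one file offset, which is exactly the kind of thing that invites the race conditions discussed in this answer.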
Now the poor, valuable multiprocessing module is fully right not to exploit this feature (even assuming there might be black magic to implement it on Windows too): most developers would no doubt misuse it anyway (having multiple processes access the same open file concurrently and running into race conditions). The only proper way to use it is for a process that has exclusive rights to open certain files to pass the opened file handles to another process that runs with reduced privileges, and then never use those handles itself again. There's no way to enforce that in the multiprocessing module, anyway.
Back to @Andy's original question: unless he's going to work on Linux only (AND with local processes only, too) and is willing to play dirty tricks with the /proc filesystem, he's going to have to define his application-level needs more sharply and serialize file objects accordingly. Most files have a path (or can be made to have one: path-less files are pretty rare, actually non-existent on Windows I believe) and thus can be serialized via it; many others are small enough to serialize by sending their content over; etc, etc.
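As one possible reading of "serialize via the path", the sketch below wraps a file in a class whose pickled state is the path and current position rather than the handle, and reopens the file when unpickled. It assumes Python 3, and PathBackedFile and demo_path_roundtrip are hypothetical names invented for this illustration. The pickle round trip stands in for what a multiprocessing.Queue does between put() and get(), and it only works if the receiver can open the same path.

```python
import os
import pickle
import tempfile


class PathBackedFile:
    """Hypothetical wrapper that pickles a file's path and position, not its handle."""

    def __init__(self, path, mode="r+b"):
        self.path = path
        self.mode = mode
        self._fh = open(path, mode)

    def __getstate__(self):
        # Flush so a receiver reopening the path sees everything written so far.
        self._fh.flush()
        return {"path": self.path, "mode": self.mode, "pos": self._fh.tell()}

    def __setstate__(self, state):
        self.path = state["path"]
        self.mode = state["mode"]
        # Reopen by path on the receiving side and restore the position.
        self._fh = open(self.path, self.mode)
        self._fh.seek(state["pos"])

    def write(self, data):
        self._fh.write(data)

    def read_all(self):
        self._fh.flush()
        self._fh.seek(0)
        return self._fh.read()

    def close(self):
        self._fh.close()


def demo_path_roundtrip():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        original = PathBackedFile(path)
        original.write(b"huanghao is a good boy !!")
        payload = pickle.dumps(original)  # roughly what Queue.put() does
        clone = pickle.loads(payload)     # roughly what Queue.get() does
        try:
            return clone.read_all()
        finally:
            clone.close()
            original.close()
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo_path_roundtrip())
```

For small files, the other option mentioned above is simpler still: read the content into a bytes object and put that on the queue instead of any file-like object.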