I am trying to share some strings between processes using the sharedctypes part of the multiprocessing module.
TL;DR: I wish to put my strings into a sharedctypes array, like so:
import ctypes
from multiprocessing.sharedctypes import Array

Array(ctypes.c_char, ['a string', 'another string'])
More Information:
The docs have this note:
"Note that an array of ctypes.c_char has value and raw attributes which allow one to use it to store and retrieve strings."
Using c_char alone:
import ctypes
from multiprocessing.sharedctypes import Array

Array(ctypes.c_char, ['a string', 'another string'])
I get a type error, which makes sense:
TypeError: one character bytes, bytearray or integer expected
This can (kind of) work by splitting the string into bytes (which also makes sense):
import ctypes
from multiprocessing.sharedctypes import Array

Array(ctypes.c_char, [b's', b't', b'r', b'i', b'n', b'g'])
But this is not very convenient for storing large lists of strings.
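(As a side note, since a bytes object is itself a sequence of single bytes, one string at a time can at least be passed in encoded form instead of being spelled out byte by byte; this is just my own shortcut and still doesn't cover a list of strings:)

import ctypes
from multiprocessing.sharedctypes import Array

# A bytes object iterates as single bytes, so it can initialize the array directly
Array(ctypes.c_char, 'a string'.encode())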
However, when I tried using the value and raw attributes shown in the docs and mentioned in that note, there is still no magic:
Array(ctypes.c_char.value, ['string'])
gives this error:
TypeError: unsupported operand type(s) for *: 'getset_descriptor' and 'int'
and raw gives this:
Array(ctypes.c_char.raw, ['string'])
AttributeError: type object 'c_char' has no attribute 'raw'
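(My guess now is that the note refers to attributes of the array object itself, not of the c_char type, along these lines; this is only my reading of it, and it still only holds a single byte string rather than a list of strings:)

import ctypes
from multiprocessing.sharedctypes import Array

arr = Array(ctypes.c_char, b'a string')  # initialize from a bytes object
print(arr.value)  # bytes up to the first NUL byte: b'a string'
print(arr.raw)    # the whole underlying buffer: b'a string'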
I have also tried using the c_wchar_p type, which in the table of primitive C compatible data types (found in the docs) corresponds directly to a string:
Array(ctypes.c_wchar_p, ['string'])
This CRASHES Python: no error is reported and the process simply exits with code 0.
Why can't sharedctypes arrays hold pointers like the c_wchar_p type? Any other solution or advice on how to store strings in a sharedctypes array is very welcome!
Update - This code occasionally works (most of the time Python stops working, but occasionally I get strings back, although they are mostly gibberish). The comments mention it working fine on Windows.
from multiprocessing import Process, Lock
from multiprocessing.sharedctypes import Value, Array
import ctypes

def print_strings(S):
    """Print strings in the C array"""
    print([a for a in S])

if __name__ == '__main__':
    lock = Lock()
    string_array = Array(ctypes.c_wchar_p, ['string'])
    q = Process(target=print_strings, args=(string_array,))
    q.start()
    q.join()
Update 2
This is the gibberish I get:
['汣獵癩汥⁹景椠瑮搠祴数\u2e73ਊ††敓\u2065汁潳\u200a†ⴠⴭⴭⴭਭ††捳灩\u2e79灳捥慩\u2e6c癩\u202c捳灩\u2e79灳捥慩\u2e6c癩\u0a65\u200a†丠瑯獥\u200a†ⴠⴭⴭ\u200a†圠\u2065獵\u2065桴\u2065污潧楲桴\u206d異汢獩敨\u2064祢䌠敬獮慨⁷ㅛ彝愠摮爠晥牥湥散\u2064祢\u200a†䄠牢浡睯瑩⁺湡\u2064瑓来湵嬠崲\u2c5f映牯眠楨档琠敨映湵瑣潩\u206e潤慭湩椠ੳ††慰瑲瑩潩敮\u2064湩潴琠敨琠潷椠瑮牥慶獬嬠ⰰ崸愠摮⠠ⰸ湩⥦\u202c湡\u2064桃扥獹敨\u0a76††潰祬潮業污攠灸湡楳湯\u2073牡\u2065浥汰祯摥椠\u206e慥档椠瑮牥慶\u2e6c删汥瑡癩\u2065牥潲\u2072湯\u200a†琠敨搠浯楡\u206eせ㌬崰甠楳杮䤠䕅⁅牡瑩浨瑥捩椠\u2073潤畣敭瑮摥嬠崳\u205f獡栠癡湩\u2067\u0a61††数歡漠\u2066⸵攸ㄭ‶楷桴愠\u206e浲\u2073景ㄠ㐮ⵥ㘱⠠\u206e‽〳〰⤰ਮ\u200a†删晥牥湥散ੳ††ⴭⴭⴭⴭⴭ\u200a†⸠\u202eㅛ⁝\u2e43圠\u202e汃湥桳睡\u202c䌢敨祢桳癥猠牥敩\u2073潦\u2072慭桴浥瑡捩污映湵瑣潩獮Ⱒ椠੮†††††⨠慎楴湯污倠票楳慣\u206c慌潢慲潴祲䴠瑡敨慭楴慣\u206c慔汢獥Ⱚ瘠汯\u202eⰵ䰠湯潤㩮\u200a†††††效\u2072慍敪瑳❹\u2073瑓瑡潩敮祲传晦捩ⱥㄠ㘹⸲\u200a†⸠\u202e㉛⁝\u2e4d䄠牢浡睯瑩⁺湡\u2064\u2e49䄠\u202e瑓来湵\u202c䠪湡扤潯\u206b景䴠瑡敨慭楴慣੬†††††䘠湵瑣潩獮Ⱚㄠ琰\u2068牰湩楴杮\u202c敎⁷潙歲›潄敶Ⱳㄠ㘹ⰴ瀠\u2e70㌠㤷ਮ†††††栠瑴㩰⼯睷\u2e77慭桴献畦挮⽡捾浢愯湡獤瀯条彥㜳⸹瑨੭††⸮嬠崳栠瑴㩰⼯潫敢敳牡档挮慰\u2e6e牯⽧瑨潤獣䴯瑡\u2d68敃桰獥䴯瑡⽨敃桰獥栮浴੬\u200a†䔠慸灭敬ੳ††ⴭⴭⴭⴭ\u200a†㸠㸾渠\u2e70ど嬨⸰⥝\u200a†愠牲祡ㄨ〮\u0a29††㸾‾灮椮⠰せⰮㄠ\u202e\u202b樲⥝\u200a†愠牲祡嬨ㄠ〮〰〰〰⬰⸰\u206a†††Ⱐ†⸰㠱㠷㌵㌷〫㘮㘴㘱㐹樴⥝ਊ††', 'ਊ††敓\u2065汁潳\u200a†ⴠⴭⴭⴭਭ††捳灩\u2e79灳捥慩\u2e6c癩\u202c捳灩\u2e79灳捥慩\u2e6c癩\u0a65\u200a†丠瑯獥\u200a†ⴠⴭⴭ\u200a†圠\u2065獵\u2065桴\u2065污潧楲桴\u206d異汢獩敨\u2064祢䌠敬獮慨⁷ㅛ彝愠摮爠晥牥湥散\u2064祢\u200a†䄠牢浡睯瑩⁺湡\u2064瑓来湵嬠崲\u2c5f映牯眠楨档琠敨映湵瑣潩\u206e潤慭湩椠ੳ††慰瑲瑩潩敮\u2064湩潴琠敨琠潷椠瑮牥慶獬嬠ⰰ崸愠摮⠠ⰸ湩⥦\u202c湡\u2064桃扥獹敨\u0a76††潰祬潮業污攠灸湡楳湯\u2073牡\u2065浥汰祯摥椠\u206e慥档椠瑮牥慶\u2e6c删汥瑡癩\u2065牥潲\u2072湯\u200a†琠敨搠浯楡\u206eせ㌬崰甠楳杮䤠䕅⁅牡瑩浨瑥捩椠\u2073潤畣敭瑮摥嬠崳\u205f獡栠癡湩\u2067\u0a61††数歡漠\u2066⸵攸ㄭ‶楷桴愠\u206e浲\u2073景ㄠ㐮ⵥ㘱⠠\u206e‽〳〰⤰ਮ\u200a†删晥牥湥散ੳ††ⴭⴭⴭⴭⴭ\u200a†⸠\u202eㅛ⁝\u2e43圠\u202e汃湥桳睡\u202c䌢敨祢桳癥猠牥敩\u2073潦\u2072慭桴浥瑡捩污映湵瑣潩獮Ⱒ椠੮†††††⨠慎楴湯污倠票楳慣\u206c慌潢慲潴祲䴠瑡敨慭楴慣\u206c慔汢獥Ⱚ瘠汯\u202eⰵ䰠湯潤㩮\u200a†††††效\u2072慍敪瑳❹\u2073瑓瑡潩敮祲传晦捩ⱥㄠ㘹⸲\u200a†⸠\u202e㉛⁝\u2e4d䄠牢浡睯瑩⁺湡\u2064\u2e49䄠\u202e瑓来湵\u202c䠪湡扤潯\u206b景䴠瑡敨慭楴慣੬†††††䘠湵瑣潩獮Ⱚㄠ琰\u2068牰湩楴杮\u202c敎⁷潙歲›潄敶Ⱳㄠ㘹ⰴ瀠\u2e70㌠㤷ਮ†††††栠瑴㩰⼯睷\u2e77慭桴献畦挮⽡捾浢愯湡獤瀯条彥㜳⸹瑨੭††⸮嬠崳栠瑴㩰⼯潫敢敳牡档挮慰\u2e6e牯⽧瑨潤獣䴯瑡\u2d68敃桰獥䴯瑡⽨敃桰獥栮浴੬\u200a†䔠慸灭敬ੳ††ⴭⴭⴭⴭ\u200a†㸠㸾渠\u2e70ど嬨⸰⥝\u200a†愠牲祡ㄨ〮\u0a29††㸾‾灮椮⠰せⰮㄠ\u202e\u202b樲⥝\u200a†愠牲祡嬨ㄠ〮〰〰〰⬰⸰\u206a†††Ⱐ†⸰㠱㠷㌵㌷〫㘮㘴㘱㐹樴⥝ਊ††']
(yes, that apparently all came from 'string'; don't ask me how)
2 Answers
#1
2
The problem that you are having is mentioned in the documentation:
Note: Although it is possible to store a pointer in shared memory remember that this will refer to a location in the address space of a specific process. However, the pointer is quite likely to be invalid in the context of a second process and trying to dereference the pointer from the second process may cause a crash.
This means that storing pointers (like strings) is not going to work, because only the address will get to the child process, and that address will no longer be valid there (hence the segmentation fault). Consider, for example, this alternative, where all the strings are concatenated into one array and another array with the lengths is passed too (you can tweak it to your convenience):
from multiprocessing import Process, Lock
from multiprocessing.sharedctypes import Value, Array
import ctypes

def print_strings(S, S_len):
    """Print strings in the C array"""
    received_strings = []
    start = 0
    for length in S_len:
        received_strings.append(S[start:start + length])
        start += length
    print("received strings:", received_strings)

if __name__ == '__main__':
    lock = Lock()
    my_strings = ['string1', 'str2']
    my_strings_len = [len(s) for s in my_strings]
    string_array = Array(ctypes.c_wchar, ''.join(my_strings))
    string_len_array = Array(ctypes.c_uint, my_strings_len)
    q = Process(target=print_strings, args=(string_array, string_len_array))
    q.start()
    q.join()
Output:
received strings: ['string1', 'str2']
About addresses in subprocesses:
This is a bit off topic for the question, but it was too long to put into a comment. Honestly, this starts to get out of my depth; take a look at eryksun's comments below for more informed insights, but here's my understanding anyway. On Unix(-like systems), a new process created through fork has the same memory and (virtual) addresses as the parent process, but if you then exec some program that's not the case anymore; I don't know if Python's multiprocessing runs an exec or not on Unix (note: see eryksun's comment for more on this and set_start_method), but in any case I wouldn't assume there is any guarantee that any address in the Python-managed memory pool stays the same. On Windows, CreateProcess makes a new process from an executable that in principle has nothing in common with the parent. I don't think even shared libraries used by multiple processes (.so/.dll) need to be at the same address on either platform. I don't think sharing (virtual) addresses between processes even makes sense when using shared memory since, if I recall correctly (and I may not), shared memory blocks are mapped to arbitrary virtual addresses in each process. So my impression is that there is no good reason (or "good and obvious", at least) to share addresses with a subprocess (of course, pointer types in ctypes are still useful to talk to native libraries within the same process).
As I said, I'm not 100% confident in this, but I think the general idea goes like that.
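If you want to see what actually happens with the addresses, here is a small experiment (just a sketch of mine; what you observe will depend on the platform and on the start method, e.g. fork vs spawn via set_start_method):

from multiprocessing import Process
from multiprocessing.sharedctypes import Array
import ctypes

def report(shared):
    # Address at which the shared block is mapped in the child process
    print("child :", hex(ctypes.addressof(shared.get_obj())))

if __name__ == '__main__':
    shared = Array(ctypes.c_char, b'string')
    # Address at which the shared block is mapped in the parent process
    print("parent:", hex(ctypes.addressof(shared.get_obj())))
    p = Process(target=report, args=(shared,))
    p.start()
    p.join()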
#2
1
Additional example getting .raw and .value to work. Per the documentation, they only work for Array(ctypes.c_char, ...):
from multiprocessing import Process
from multiprocessing.sharedctypes import Value, Array
import ctypes

def print_strings(s):
    """Print strings in the C array"""
    print(s.value)
    print(len(s))
    s[len(s) - 1] = b'x'

if __name__ == '__main__':
    string_array = Array(ctypes.c_char, b'string')
    q = Process(target=print_strings, args=(string_array,))
    q.start()
    q.join()
    print(string_array.raw)
Output showing that shared buffer was modified:
b'string'
6
b'strinx'
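If you need to share several strings rather than one, a possible extension of the same idea (just a sketch, assuming none of the strings contain a NUL byte themselves) is to join their encoded forms with b'\x00' and split on it in the child:

from multiprocessing import Process
from multiprocessing.sharedctypes import Array
import ctypes

def print_strings(S):
    """Split the shared buffer on the NUL separator and decode each piece"""
    print([part.decode() for part in S.raw.split(b'\x00')])

if __name__ == '__main__':
    strings = ['a string', 'another string']
    # Encode the strings and join them with a NUL byte as separator
    joined = b'\x00'.join(s.encode() for s in strings)
    string_array = Array(ctypes.c_char, joined)
    q = Process(target=print_strings, args=(string_array,))
    q.start()
    q.join()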