Is there any reason to use malloc over PyMem_Malloc?

Date: 2021-05-04 23:16:21

I'm reading the documentation for Memory Management in Python C extensions, and as far as I can tell, there doesn't really seem to be much reason to use malloc rather than PyMem_Malloc. Say I want to allocate an array that isn't to be exposed to Python source code and will be stored in an object that will be garbage collected. Is there any reason to use malloc?

3 Answers

#1


6  

EDIT: Corrected places where PyMem_Malloc and PyObject_Malloc were mixed up; they are two different calls.

Without the PYMALLOC_DEBUG macro activated, PyMem_Malloc is an alias of libc's malloc(), with one special case: calling PyMem_Malloc to allocate zero bytes will return a non-NULL pointer, while malloc(0) may return either NULL or a non-NULL pointer, depending on the implementation (source code reference):

/* malloc. Note that nbytes==0 tries to return a non-NULL pointer,
 * distinct from all other currently live pointers. This may not
 * be possible.
 */

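The zero-byte behaviour can be poked at from a running interpreter via ctypes, which can call the exported C-API symbols directly (a CPython-specific sketch; ctypes.pythonapi is a PyDLL, so the GIL stays held during the call, as the PyMem_ family requires):

```python
import ctypes

api = ctypes.pythonapi
# Declare the C signatures so ctypes marshals arguments correctly.
api.PyMem_Malloc.restype = ctypes.c_void_p
api.PyMem_Malloc.argtypes = [ctypes.c_size_t]
api.PyMem_Free.argtypes = [ctypes.c_void_p]

p = api.PyMem_Malloc(0)   # zero-byte request
print(p is not None)      # a non-NULL pointer comes back (ctypes maps NULL to None)
api.PyMem_Free(p)         # release with the matching PyMem_Free
```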
Also, there is an advisory note on the pymem.h header file:

Never mix calls to PyMem_ with calls to the platform malloc/realloc/ calloc/free. For example, on Windows different DLLs may end up using different heaps, and if you use PyMem_Malloc you'll get the memory from the heap used by the Python DLL; it could be a disaster if you free()'ed that directly in your own extension. Using PyMem_Free instead ensures Python can return the memory to the proper heap. As another example, in PYMALLOC_DEBUG mode, Python wraps all calls to all PyMem_ and PyObject_ memory functions in special debugging wrappers that add additional debugging info to dynamic memory blocks. The system routines have no idea what to do with that stuff, and the Python wrappers have no idea what to do with raw blocks obtained directly by the system routines then.

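In practice the rule boils down to: resize and release a block only with the PyMem_ call that matches the one that allocated it. A minimal sketch through ctypes (CPython-specific; a real extension would make these calls directly in C):

```python
import ctypes

api = ctypes.pythonapi
api.PyMem_Malloc.restype = ctypes.c_void_p
api.PyMem_Malloc.argtypes = [ctypes.c_size_t]
api.PyMem_Realloc.restype = ctypes.c_void_p
api.PyMem_Realloc.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
api.PyMem_Free.argtypes = [ctypes.c_void_p]

buf = api.PyMem_Malloc(16)            # allocated with PyMem_Malloc...
ctypes.memmove(buf, b"hello", 5)      # write into the block
buf = api.PyMem_Realloc(buf, 32)      # ...so grow it with PyMem_Realloc...
data = ctypes.string_at(buf, 5)
api.PyMem_Free(buf)                   # ...and release it with PyMem_Free,
                                      # never with the platform free()
print(data)  # b'hello'
```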
Then, there are some Python-specific tunings inside PyObject_Malloc, a function used not only for C extensions but for all the dynamic allocations made while running a Python program, like 100*234, str(100) or 10 + 4j:

>>> id(10 + 4j)
139721697591440
>>> id(10 + 4j)
139721697591504
>>> id(10 + 4j)
139721697591440

The previous complex() instances are small objects allocated on a dedicated pool.

Small-object (<256 bytes) allocation with PyObject_Malloc is quite efficient since it is done from pools of 8-byte-aligned blocks, with one pool for each block size. There are also Pages and Arenas for bigger allocations.

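CPython can dump the live state of these pools and arenas from a running interpreter through an implementation-detail hook (CPython-only; the per-size-class table is written to stderr):

```python
import sys

# Dumps pymalloc statistics -- arenas in use, pools per size class,
# and block usage counts -- to stderr, and returns None.
result = sys._debugmallocstats()
print(result)  # None; the pool/arena table itself goes to stderr
```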
This comment in the source code explains how the PyObject_Malloc call is optimized:

/*
 * The basic blocks are ordered by decreasing execution frequency,
 * which minimizes the number of jumps in the most common cases,
 * improves branching prediction and instruction scheduling (small
 * block allocations typically result in a couple of instructions).
 * Unless the optimizer reorders everything, being too smart...
 */

Pools, Pages and Arenas are optimizations intended to reduce external memory fragmentation in long-running Python programs.

Check out the source code for the full detailed documentation on Python's memory internals.

#2


7  

It's perfectly OK for extensions to allocate memory with malloc, or other system allocators. That's normal and inevitable for many types of modules--most modules that wrap other libraries, which themselves know nothing about Python, will cause native allocations when they happen within that library. (Some libraries allow you to control allocation enough to prevent this; most do not.)

There's a serious drawback to using PyMem_Malloc: you need to hold the GIL when using it. Native libraries often want to release the GIL when doing CPU-intensive calculations or making any calls that might block, like I/O. Needing to lock the GIL before allocations can be somewhere between very inconvenient and a performance problem.

Using Python's wrappers for memory allocation lets Python's memory-debugging code track your allocations. With tools like Valgrind available, though, I doubt the real-world value of that.

You'll need to use these functions if an API requires it; for example, if an API is passed a pointer that must be allocated with these functions, so it can be freed with them. Barring an explicit reason like that for using them, I stick with normal allocation.

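One concrete case of an API that requires the matching allocator is PyOS_double_to_string, whose returned buffer the C-API docs say must be released with PyMem_Free. Exercised here through ctypes for illustration (CPython-specific; an extension would call it directly in C):

```python
import ctypes

api = ctypes.pythonapi
api.PyOS_double_to_string.restype = ctypes.c_void_p  # keep the raw pointer
api.PyOS_double_to_string.argtypes = [
    ctypes.c_double,                  # value to convert
    ctypes.c_char,                    # format code, e.g. b"r" for repr
    ctypes.c_int,                     # precision (must be 0 for "r")
    ctypes.c_int,                     # flags
    ctypes.POINTER(ctypes.c_int),     # out: value type classification
]
api.PyMem_Free.argtypes = [ctypes.c_void_p]

kind = ctypes.c_int()
p = api.PyOS_double_to_string(3.25, b"r", 0, 0, ctypes.byref(kind))
text = ctypes.string_at(p).decode("ascii")
api.PyMem_Free(p)   # must be PyMem_Free, not the platform free()

print(text)  # 3.25
```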
#3


1  

From my experience writing MATLAB .mex functions, I think the biggest determining factor in whether you use malloc or not is portability. Say you have a header file that performs a load of useful functions using internal C data types only (no Python object interaction needed, so no problem using malloc), and you suddenly realise you want to port that header file to a different codebase that has nothing to do with Python whatsoever (maybe it's a project written purely in C). Using malloc would obviously be the much more portable solution.

But for your code that is purely a Python extension, my initial reaction would be to expect the native C function to perform faster. I have no evidence to back this up :)
