In C, it's simple for a library to allow the user to customize memory allocation by using global function pointers to a function that should behave similarly to malloc() and to a function that should behave similarly to free(). SQLite, for example, uses this approach.
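For readers who haven't seen the C-style pattern, here is a minimal sketch of it, written as C++17 so it matches the rest of the question. Every name in it (lib::set_allocator, lib::allocate, counting_malloc and so on) is hypothetical and made up for illustration; this is not SQLite's actual API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

namespace lib {

// Hypothetical hook types with the same contracts as malloc() and free().
using MallocFn = void* (*)(std::size_t);
using FreeFn = void (*)(void*);

namespace detail {
// Default hooks simply forward to the C runtime.
inline void* default_malloc(std::size_t n) { return std::malloc(n); }
inline void default_free(void* p) { std::free(p); }

// The global function pointers the whole library allocates through.
inline MallocFn g_malloc = default_malloc;
inline FreeFn g_free = default_free;
} // namespace detail

// Meant to be called once, before the library allocates anything.
inline void set_allocator(MallocFn m, FreeFn f) {
    assert(m && f);
    detail::g_malloc = m;
    detail::g_free = f;
}

// Library code calls these instead of malloc()/free() directly.
inline void* allocate(std::size_t n) { return detail::g_malloc(n); }
inline void deallocate(void* p) { detail::g_free(p); }

} // namespace lib

// Example client-side hooks that count live allocations.
inline int g_live = 0;

inline void* counting_malloc(std::size_t n) {
    void* p = std::malloc(n);
    if (p) ++g_live;
    return p;
}

inline void counting_free(void* p) {
    if (p) --g_live;
    std::free(p);
}
```

A client would call lib::set_allocator(counting_malloc, counting_free) at startup, and every later lib::allocate/lib::deallocate pair would then show up in g_live.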
C++ complicates things a bit because allocation and initialization are usually fused. Essentially, we want the effect of overriding operator new and operator delete for a single library only, but there's no way to actually do that (I'm fairly certain, though not quite 100%).
How should this be done in C++?
Here's a first stab at something that replicates some of the semantics of new expressions with a function Lib::make<T>.
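The linked code is not reproduced here, so the following is only a rough sketch of what a function like Lib::make<T> could look like, not the original implementation. The hooks lib_malloc and lib_free, the live_blocks counter, and the companion Lib::destroy are all assumed names. The sketch handles one detail that a plain malloc-plus-placement-new misses: if the constructor throws, the memory is released again, as a real new expression would do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>

namespace Lib {

// Assumed library hooks; a real library would let the client replace them.
// live_blocks exists only so the example can be checked.
inline int live_blocks = 0;

inline void* lib_malloc(std::size_t n) {
    void* p = std::malloc(n);
    if (p) ++live_blocks;
    return p;
}

inline void lib_free(void* p) {
    if (p) --live_blocks;
    std::free(p);
}

// Roughly `new T(args...)`: allocate, construct in place, and give the
// memory back if the constructor throws.
template <class T, class... Args>
T* make(Args&&... args) {
    static_assert(alignof(T) <= alignof(std::max_align_t),
                  "over-aligned types would need an aligned allocation hook");
    void* mem = lib_malloc(sizeof(T));
    if (!mem) throw std::bad_alloc();
    try {
        return ::new (mem) T(std::forward<Args>(args)...);
    } catch (...) {
        lib_free(mem);
        throw;
    }
}

// Roughly `delete p`: run the destructor, then release the memory.
// Caveat: p must point to the complete object that make<T> created.
template <class T>
void destroy(T* p) noexcept {
    if (!p) return;
    p->~T();
    lib_free(p);
}

} // namespace Lib

// Small types used to exercise the sketch.
struct Point {
    int x;
    int y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

struct Thrower {
    Thrower() { throw 42; }
};
```

Unlike delete through a base pointer with a virtual destructor, Lib::destroy here has no way to recover the start of the allocation from a base-class pointer, which is one of the places where imitating new/delete gets awkward.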
I don't know if this is so useful, but just for fun, here's a more complicated version that also tries to replicate the semantics of new[] expressions.
This is a goal oriented question so I'm not necessarily looking for code review. If there's some better way to do this just say so and ignore the links.
(By "allocator" I only mean something that allocates memory. I'm not referring to the STL allocator concept or even necessarily allocating memory for containers.)
Why this might be desirable:
Here's a blog post from a Mozilla dev arguing that libraries should do this. He gives a few examples of C libraries that allow the library user to customize allocation for the library. I checked out the source code for one of the examples, SQLite, and see that this feature is also used internally for testing via fault injection. I'm not writing anything that needs to be as bulletproof as SQLite but it still seems like a sensible idea. If nothing else, it allows client code to figure out, "Which library is hogging my memory and when?".
1 Answer
Simple answer: don't use C++. Sorry, joke.
But if you want to take this kind of absolute control over memory management in C++, across library/module boundaries and in a completely generalized way, you can be in for some terrible grief. I'd suggest that most people look harder for reasons not to do it than for ways to do it.
I've gone through many iterations of this same basic idea over the years (actually decades), from naively overloading operator new/new[]/delete/delete[] at a global level, to linker-based solutions, to platform-specific solutions, and I'm actually at the point you want to reach: I have a system that allows me to see the amount of memory allocated per plugin. But I didn't get there through the kind of generalized approach that you are after (and that I originally wanted as well).
C++ complicates things a bit because allocation and initialization are usually fused.
I would offer a slight twist on this statement: C++ complicates things because initialization and allocation are usually fused. All I did was swap the order, but the most complicating part is not that allocation wants to initialize; it is that initialization often wants to allocate.
Take this basic example:
struct Foo
{
    std::vector<Bar> stuff;
};
In this case, we can easily allocate Foo through a custom memory allocator:
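void* mem = custom_malloc(sizeof(Foo));
Foo* foo = new (mem) Foo;  // placement new: construct Foo in the custom block
...
foo->~Foo();               // run the destructor explicitly
custom_free(foo);          // then hand the memory back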
... and of course we can wrap this all we like to conform to RAII, achieve exception-safety, etc.
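One way to do that wrapping, as a sketch only: a std::unique_ptr with a deleter that repeats the destructor-then-free sequence above. custom_malloc and custom_free are stubbed out with malloc/free here, and custom_ptr, make_custom, CustomDelete, Widget and the g_outstanding counter are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <utility>

// Stand-ins for custom_malloc/custom_free; the counter is only there so the
// example can be checked.
inline int g_outstanding = 0;

inline void* custom_malloc(std::size_t n) {
    void* p = std::malloc(n);
    if (p) ++g_outstanding;
    return p;
}

inline void custom_free(void* p) {
    if (p) --g_outstanding;
    std::free(p);
}

// Deleter mirroring the manual sequence: run the destructor, then free.
struct CustomDelete {
    template <class T>
    void operator()(T* p) const noexcept {
        p->~T();
        custom_free(p);
    }
};

template <class T>
using custom_ptr = std::unique_ptr<T, CustomDelete>;

// Allocates through custom_malloc, constructs in place, and frees the block
// again if the constructor throws.
template <class T, class... Args>
custom_ptr<T> make_custom(Args&&... args) {
    void* mem = custom_malloc(sizeof(T));
    if (!mem) throw std::bad_alloc();
    try {
        return custom_ptr<T>(::new (mem) T(std::forward<Args>(args)...));
    } catch (...) {
        custom_free(mem);
        throw;
    }
}

// Trivial type used to exercise the wrapper.
struct Widget {
    int value = 7;
};
```

With this in place, `auto w = make_custom<Widget>();` releases the block through custom_free when w goes out of scope, including during stack unwinding.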
Except now the problem cascades. That stuff member, being a std::vector, will want to use std::allocator, and now we have a second problem to solve. We could instantiate std::vector with our own allocator, and if you need runtime information passed to the allocator, you can write Foo's constructors to pass that information along with the allocator to the vector constructor.
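To make that second step concrete, here is a sketch of a minimal C++11-style allocator that carries a pointer to runtime state, plus a Foo constructor that hands it to the vector. MemStats and TrackingAllocator are hypothetical names, and the Bar shown is a placeholder with no allocations of its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical runtime state the allocator reports to.
struct MemStats {
    std::size_t bytes_in_use = 0;
};

// Minimal allocator carrying a pointer to that state.
template <class T>
struct TrackingAllocator {
    using value_type = T;

    MemStats* stats;

    explicit TrackingAllocator(MemStats* s) noexcept : stats(s) {}

    // Converting constructor so containers can rebind the allocator.
    template <class U>
    TrackingAllocator(const TrackingAllocator<U>& other) noexcept
        : stats(other.stats) {}

    T* allocate(std::size_t n) {
        // Request at least one byte so a null result always means failure.
        void* p = std::malloc(n ? n * sizeof(T) : 1);
        if (!p) throw std::bad_alloc();
        stats->bytes_in_use += n * sizeof(T);
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t n) noexcept {
        stats->bytes_in_use -= n * sizeof(T);
        std::free(p);
    }
};

// Allocators compare equal when they report to the same MemStats.
template <class T, class U>
bool operator==(const TrackingAllocator<T>& a,
                const TrackingAllocator<U>& b) noexcept {
    return a.stats == b.stats;
}

template <class T, class U>
bool operator!=(const TrackingAllocator<T>& a,
                const TrackingAllocator<U>& b) noexcept {
    return !(a == b);
}

// Placeholder element type.
struct Bar {
    int payload = 0;
};

struct Foo {
    std::vector<Bar, TrackingAllocator<Bar>> stuff;

    // The constructor forwards the runtime state to the vector's allocator.
    explicit Foo(MemStats* stats) : stuff(TrackingAllocator<Bar>(stats)) {}
};
```

This only covers the vector's own buffer. Anything a real Bar allocates internally still goes wherever Bar decides.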
But what about Bar? Its constructor may also want to allocate memory for a variety of disparate objects, and so the problem cascades and cascades and cascades.
Given the difficulty of this problem, the alternative generalized solutions I've tried, and the grief they caused when porting, I've settled on a completely de-generalized, somewhat pragmatic approach.
The solution I settled on is to effectively reinvent the entire C and C++ standard library. Disgusting, I know, but I had a bit more of an excuse to do it in my case. The product I'm working on is effectively an engine and software development kit, designed to allow people to write plugins for it using any compiler, C runtime, C++ standard library implementation, and build settings they desire. Allowing things like vectors, sets, or maps to be passed through these central APIs in an ABI-compatible way required rolling our own standard-compliant containers, in addition to a lot of C standard functions.
The entire implementation of this devkit then revolves around these allocation functions:
EP_API void* ep_malloc(int lib_id, int size);
EP_API void ep_free(int lib_id, void* mem);
... and the entirety of the SDK revolves around these two, including memory pools and "sub-allocators".
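The answer does not show how these two functions are implemented, so the following is purely a guess at one simple way to get per-library accounting behind the same signatures. The size header in front of each block, the fixed-size counter table, kMaxLibs, and ep_bytes_in_use are all assumptions of this sketch; the EP_API export macro is left out, and nothing here is thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Assumption: library ids are small integers in [0, kMaxLibs).
constexpr int kMaxLibs = 64;
inline std::size_t g_bytes_by_lib[kMaxLibs] = {};

// Header stored in front of each block so ep_free can recover the size.
// The union pads it to max_align_t so the returned pointer stays aligned.
union BlockHeader {
    std::size_t size;
    std::max_align_t align_pad;
};

inline void* ep_malloc(int lib_id, int size) {
    if (lib_id < 0 || lib_id >= kMaxLibs || size < 0) return nullptr;
    const std::size_t bytes = static_cast<std::size_t>(size);
    auto* header =
        static_cast<BlockHeader*>(std::malloc(sizeof(BlockHeader) + bytes));
    if (!header) return nullptr;
    header->size = bytes;
    g_bytes_by_lib[lib_id] += bytes;  // charge the requesting library
    return header + 1;                // user memory starts after the header
}

inline void ep_free(int lib_id, void* mem) {
    if (!mem || lib_id < 0 || lib_id >= kMaxLibs) return;
    BlockHeader* header = static_cast<BlockHeader*>(mem) - 1;
    g_bytes_by_lib[lib_id] -= header->size;  // credit the library back
    std::free(header);
}

// Hypothetical query used to build a per-library memory breakdown.
inline std::size_t ep_bytes_in_use(int lib_id) {
    return g_bytes_by_lib[lib_id];
}
```

Memory pools and sub-allocators built on top of a pair like this would inherit the accounting for free, which is presumably why everything in such an SDK is routed through just two functions.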
For third-party libraries outside of our control, we're just SOL. Some of those libraries have equally ambitious things they want to do with their memory management, and trying to override that would just lead to all kinds of clashes and open up all kinds of cans of worms. There are also very low-level drivers, when using things like OGL, that want to allocate a lot of system memory, and we can't do anything about that.
Yet I've found this solution to work well enough to answer the basic question "who/what is hogging up all this memory?" very quickly: a question which is often much more difficult to answer than a similar one related to clock cycles (for which we can just fire up any profiler). It only applies to code under our control, using this SDK, but we can get a very thorough memory breakdown on a per-module basis. We can also set artificial caps on memory use to make sure that out-of-memory errors are actually being handled correctly, without actually trying to exhaust all contiguous pages available in the system.
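The cap idea is easy to illustrate in isolation. This is a sketch under my own assumptions, not the SDK's code: MemoryBudget, budget_alloc, budget_free and load_buffer are hypothetical names, and the caller passes the size back on release to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical budget: a byte cap plus a running total.
struct MemoryBudget {
    std::size_t cap;
    std::size_t used = 0;
};

// Fails with nullptr once a request would push usage past the cap,
// simulating an out-of-memory condition long before the system runs dry.
inline void* budget_alloc(MemoryBudget& budget, std::size_t size) {
    if (size > budget.cap - budget.used) return nullptr;
    void* p = std::malloc(size);
    if (p) budget.used += size;
    return p;
}

inline void budget_free(MemoryBudget& budget, void* p, std::size_t size) {
    if (!p) return;
    budget.used -= size;
    std::free(p);
}

// Example code under test: it has to report failure instead of crashing.
inline bool load_buffer(MemoryBudget& budget, std::size_t size) {
    void* p = budget_alloc(budget, size);
    if (!p) return false;  // the error path the cap lets us exercise
    budget_free(budget, p, size);
    return true;
}
```

Setting the cap to something small, such as `MemoryBudget budget{1024};`, makes load_buffer(budget, 4096) take its failure path deterministically, which is much easier to test than a genuine allocation failure.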
So in my case this problem was solved via policy: by building a uniform coding standard and a central library conforming to it that's used throughout the codebase (and by third parties writing plugins for our system). It's probably not the answer you are looking for, but this ended up being the most practical solution we've found yet.