什么时候垃圾收集比手动内存管理更快?

时间:2022-06-06 20:44:22

Under what circumstances is garbage collection more efficient than manual memory management? (Here manual could mean using malloc and free as in C, or the cleaner RAII and smart pointer techniques popularized by C++)

在什么情况下垃圾收集比手动内存管理更有效? (这里的手册可能意味着在C中使用malloc和free,或者使用C ++推广的更清晰的RAII和智能指针技术)

I like how garbage collection can remove some accidental complexity from writing software, but I was even more pleased at how RAII and smart pointers can eliminate that complexity while also working on resources other than memory, being deterministic, and providing performance guarantees and being more efficient overall. So I thought I could safely ignore garbage collection. However, I've noticed that people have been saying that garbage collection is faster than the tight resource management used in C++, such as when there is a lot of extra memory available.

我喜欢垃圾收集如何通过编写软件来消除一些偶然的复杂性,但我更加高兴RAII和智能指针如何消除这种复杂性,同时还可以处理内存以外的资源,确定性,提供性能保证和更高效总体。所以我认为我可以安全地忽略垃圾收集。但是,我注意到人们一直在说垃圾收集比C ++中使用的紧密资源管理更快,例如当有大量额外内存可用时。

So when exactly can garbage collection outperform manual memory management? I like RAII and smart pointers but would happy to accept garbage collection as another tool if it is faster.

那么什么时候垃圾收集完全可以胜过手动内存管理呢?我喜欢RAII和智能指针,但如果速度更快,很乐意接受垃圾收集作为另一种工具。

3 个解决方案

#1


14  

Never, and I can prove it.

从来没有,我可以证明这一点。

First, let's assume we're using the best algorithms in either case. Using a sub-optimal algorithm can prove anything.

首先,让我们假设我们在任何一种情况下都使用最好的算法。使用次优算法可以证明任何事情。

Secondly, let's assume the best GC algorithm uses times T0...Tn to decide whether memory allocation i should be freed at a certain moment. The total then is Σ(Ti)

其次,让我们假设最好的GC算法使用时间T0 ... Tn来决定是否应该在某个时刻释放内存分配。那么总数是Σ(Ti)

Now, an equivalent manual memory management algorithm exists that uses the same GC algorithm, but only considers memory blocks which have been manually marked to be freed. Therefore, the running time is Σ(δiTi) (with δi=0 when block i wasn't freed, and =1 when it was).

现在,存在使用相同GC算法的等效手动内存管理算法,但仅考虑已手动标记为已释放的内存块。因此,运行时间是Σ(δiTi)(当块i未被释放时δi= 0,当它被释放时= 1)。

Obviously, Σ(δiTi) ≤ Σ(Ti) : there's a manual algorithm that's strictly not worse than a GC algorithm.

显然,Σ(δiTi)≤Σ(Ti):有一种手动算法,严格来说不比GC算法差。

How does that contrast with other answers?

这与其他答案形成鲜明对比?

  • "With RAII, you have to deallocate every time you allocate." - No, that's a mistaken assumption. We should compare either batched or non-batched operations. GC would be far worse if you'd also do a GC run every time something went out of scope.
  • “使用RAII,你必须在每次分配时解除分配。” - 不,这是一个错误的假设。我们应该比较批量或非批量操作。如果你在每次超出范围时也进行GC运行,GC会更糟糕。

  • "Improved locality" - well, unless you discount the fact that non-GC languages can use the stack far more often, and that stack has an excellent locality of reference.
  • “改进的局部性” - 好吧,除非你忽略了非GC语言可以更频繁地使用堆栈的事实,并且该堆栈具有极好的引用局部性。

  • "low odds for leaks" - that's entirely true, but we were discussing runtime performance. (BTW: good to point out that it's low but non-zero odds).
  • “泄漏的可能性很低” - 这完全正确,但我们正在讨论运行时性能。 (顺便说一句:很好地指出它很低但非零赔率)。

#2


10  

GC advantages:

  • a GC allocates simply by incrementing a pointer, heap allocators must take counter-measures to avoid heap fragmentation
  • 一个GC只是通过递增指针来分配,堆分配器必须采取反措施来避免堆碎片

  • a GC improves cpu cache locality, a big deal on modern processors
  • GC改进了cpu缓存局部性,这对现代处理器来说很重要

  • a GC requires no extra code to release memory, low odds for leaks
  • GC不需要额外的代码来释放内存,泄漏的可能性很低

  • a generational GG can collect garbage concurrently, making memory management close to free on a multi-core cpu.
  • 世代GG可以同时收集垃圾,使内存管理在多核cpu上接近免费。

GC disadvantages:

  • difficult to make efficient in a language that treats pointers as first class types
  • 难以在将指针视为第一类类型的语言中提高效率

  • uses more virtual memory space due to collection latency
  • 由于收集延迟,使用更多虚拟内存空间

  • leaky abstraction for operating system resources other than memory
  • 内存以外的操作系统资源的漏洞抽象

  • can cause pauses in program operation in some cases while garbage is being collected.
  • 在收集垃圾的某些情况下,可能会导致程序操作暂停。

It is a slamdunk for perf, a GC beats a heap allocator easily without effort. The Chinese Dictionary programming contest between Rico Mariani and Raymond Chen is often quoted, overview is here. Raymond's C++ program eventually won but only after rewriting it several times and giving up on the standard C++ library.

对于perf来说,它是一个灌篮,GC不费力地轻松击败堆分配器。 Rico Mariani和Raymond Chen之间的汉语词典编程比赛经常引用,概述就在这里。雷蒙德的C ++程序最终赢了,但只是在重写了几次并放弃了标准的C ++库之后。

#3


2  

There's one particular scenario I know in which GC pointer is much faster than a smart pointer (reference counted pointer) when both are implemented in traditional C++ (i.e. not C++/CLR as we won't have the same luxury as to Compact the memory after Mark-Sweep, and we're trying to compare apples to apples here as much as we can) assuming GC uses the same underlying heap memory manager. It is when your object assignment time significantly overwhelms object creation time.

我知道有一个特殊的场景,当GC指针比传统的C ++(即不是C ++ / CLR)实现时,GC指针比智能指针(引用计数指针)快得多,因为我们不会像以后压缩内存一样奢侈Mark-Sweep,我们正试图在这里尽可能多地比较苹果和苹果)假设GC使用相同的底层堆内存管理器。当你的对象分配时间明显超过对象创建时间时。

Examples include sorting, array reversal etc.

例子包括排序,阵列反转等。

See here for more info on my test with a GC framework I implemented using traditional C++ vs reference counted pointers. Outcome of the array reversal test was GcString was 16.5 times faster than ref counted String.

有关我使用传统C ++与引用计数指针实现的GC框架测试的更多信息,请参见此处。阵列反转测试的结果是GcString比ref计数字符串快16.5倍。

This could be attributed to the painfully slow bus lock semantics in a reference counted pointer (unless you're targeting purely single threaded apps, locks are required for thread safety). From my experience of implementing a high performance precise GC in traditional C++, I can say that there are more opportunities for optimizations with a GC vs 'manual' (as per OP's definition of manual) memory management (i.e. smart pointers).

这可能归因于引用计数指针中缓慢的总线锁定语义(除非您的目标是纯单线程应用程序,因此线程安全需要锁定)。根据我在传统C ++中实现高性能精确GC的经验,我可以说有更多机会通过GC与'手动'(根据OP的手册定义)内存管理(即智能指针)进行优化。

#1


14  

Never, and I can prove it.

从来没有,我可以证明这一点。

First, let's assume we're using the best algorithms in either case. Using a sub-optimal algorithm can prove anything.

首先,让我们假设我们在任何一种情况下都使用最好的算法。使用次优算法可以证明任何事情。

Secondly, let's assume the best GC algorithm uses times T0...Tn to decide whether memory allocation i should be freed at a certain moment. The total then is Σ(Ti)

其次,让我们假设最好的GC算法使用时间T0 ... Tn来决定是否应该在某个时刻释放内存分配。那么总数是Σ(Ti)

Now, an equivalent manual memory management algorithm exists that uses the same GC algorithm, but only considers memory blocks which have been manually marked to be freed. Therefore, the running time is Σ(δiTi) (with δi=0 when block i wasn't freed, and =1 when it was).

现在,存在使用相同GC算法的等效手动内存管理算法,但仅考虑已手动标记为已释放的内存块。因此,运行时间是Σ(δiTi)(当块i未被释放时δi= 0,当它被释放时= 1)。

Obviously, Σ(δiTi) ≤ Σ(Ti) : there's a manual algorithm that's strictly not worse than a GC algorithm.

显然,Σ(δiTi)≤Σ(Ti):有一种手动算法,严格来说不比GC算法差。

How does that contrast with other answers?

这与其他答案形成鲜明对比?

  • "With RAII, you have to deallocate every time you allocate." - No, that's a mistaken assumption. We should compare either batched or non-batched operations. GC would be far worse if you'd also do a GC run every time something went out of scope.
  • “使用RAII,你必须在每次分配时解除分配。” - 不,这是一个错误的假设。我们应该比较批量或非批量操作。如果你在每次超出范围时也进行GC运行,GC会更糟糕。

  • "Improved locality" - well, unless you discount the fact that non-GC languages can use the stack far more often, and that stack has an excellent locality of reference.
  • “改进的局部性” - 好吧,除非你忽略了非GC语言可以更频繁地使用堆栈的事实,并且该堆栈具有极好的引用局部性。

  • "low odds for leaks" - that's entirely true, but we were discussing runtime performance. (BTW: good to point out that it's low but non-zero odds).
  • “泄漏的可能性很低” - 这完全正确,但我们正在讨论运行时性能。 (顺便说一句:很好地指出它很低但非零赔率)。

#2


10  

GC advantages:

  • a GC allocates simply by incrementing a pointer, heap allocators must take counter-measures to avoid heap fragmentation
  • 一个GC只是通过递增指针来分配,堆分配器必须采取反措施来避免堆碎片

  • a GC improves cpu cache locality, a big deal on modern processors
  • GC改进了cpu缓存局部性,这对现代处理器来说很重要

  • a GC requires no extra code to release memory, low odds for leaks
  • GC不需要额外的代码来释放内存,泄漏的可能性很低

  • a generational GG can collect garbage concurrently, making memory management close to free on a multi-core cpu.
  • 世代GG可以同时收集垃圾,使内存管理在多核cpu上接近免费。

GC disadvantages:

  • difficult to make efficient in a language that treats pointers as first class types
  • 难以在将指针视为第一类类型的语言中提高效率

  • uses more virtual memory space due to collection latency
  • 由于收集延迟,使用更多虚拟内存空间

  • leaky abstraction for operating system resources other than memory
  • 内存以外的操作系统资源的漏洞抽象

  • can cause pauses in program operation in some cases while garbage is being collected.
  • 在收集垃圾的某些情况下,可能会导致程序操作暂停。

It is a slamdunk for perf, a GC beats a heap allocator easily without effort. The Chinese Dictionary programming contest between Rico Mariani and Raymond Chen is often quoted, overview is here. Raymond's C++ program eventually won but only after rewriting it several times and giving up on the standard C++ library.

对于perf来说,它是一个灌篮,GC不费力地轻松击败堆分配器。 Rico Mariani和Raymond Chen之间的汉语词典编程比赛经常引用,概述就在这里。雷蒙德的C ++程序最终赢了,但只是在重写了几次并放弃了标准的C ++库之后。

#3


2  

There's one particular scenario I know in which GC pointer is much faster than a smart pointer (reference counted pointer) when both are implemented in traditional C++ (i.e. not C++/CLR as we won't have the same luxury as to Compact the memory after Mark-Sweep, and we're trying to compare apples to apples here as much as we can) assuming GC uses the same underlying heap memory manager. It is when your object assignment time significantly overwhelms object creation time.

我知道有一个特殊的场景,当GC指针比传统的C ++(即不是C ++ / CLR)实现时,GC指针比智能指针(引用计数指针)快得多,因为我们不会像以后压缩内存一样奢侈Mark-Sweep,我们正试图在这里尽可能多地比较苹果和苹果)假设GC使用相同的底层堆内存管理器。当你的对象分配时间明显超过对象创建时间时。

Examples include sorting, array reversal etc.

例子包括排序,阵列反转等。

See here for more info on my test with a GC framework I implemented using traditional C++ vs reference counted pointers. Outcome of the array reversal test was GcString was 16.5 times faster than ref counted String.

有关我使用传统C ++与引用计数指针实现的GC框架测试的更多信息,请参见此处。阵列反转测试的结果是GcString比ref计数字符串快16.5倍。

This could be attributed to the painfully slow bus lock semantics in a reference counted pointer (unless you're targeting purely single threaded apps, locks are required for thread safety). From my experience of implementing a high performance precise GC in traditional C++, I can say that there are more opportunities for optimizations with a GC vs 'manual' (as per OP's definition of manual) memory management (i.e. smart pointers).

这可能归因于引用计数指针中缓慢的总线锁定语义(除非您的目标是纯单线程应用程序,因此线程安全需要锁定)。根据我在传统C ++中实现高性能精确GC的经验,我可以说有更多机会通过GC与'手动'(根据OP的手册定义)内存管理(即智能指针)进行优化。