Consider this snippet:
l = []
while 1
  l << 'a random 369-characterish string'
end
^C
# ran this for maybe 4 seconds, and it had 27 million entries in l. memory
# usage was 1.6 GB.
l = nil
# no change in memory usage
GC.start
# memory usage drops a relatively small amount, from 1.6 GB to 1.39 GB.
I am pushing millions of elements into/through Ruby's data structures and having some serious memory issues. This example demonstrates that even when no reference to an object remains, Ruby will not let [most of] it go, even after an explicit call to GC.start.
The objects I am using in real life push millions of elements into a hash in total, but the hash is used as a temporary lookup table and is zeroed out after some loop has completed. The memory from this lookup table, however, apparently never gets released, and this slows my application horrendously and progressively because the GC has millions of defunct objects to analyze on each cycle. I am working on a workaround with the sparsehash gem, but this doesn't seem like an intractable problem that the Ruby runtime should choke on. The references are clearly deleted, and the objects should clearly be collected and disposed of. Can anyone help me figure out why this is not happening?
I have tried l.delete_if { |x| true } on the suggestion of a user in #ruby on freenode, but this was really slow and also never seemed to cause an appreciable release of memory.
Using ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-linux].
EDIT:
For comparison, here is a run in python3:
l = []
while 1:
    l.append('a random 369-characterish string')
^C
# 31,216,082 elements; 246M memory usage.
l = []
# memory usage drops to 8K (0% of system total)
Tests on python2 show nearly identical results.
I'm not sure if this is sufficient to consider this an implementation deficiency in MRI or if it can just be chalked up to different approaches to GC. Either way, it seems like Python is better suited to use cases that push millions of elements in total through the data structures and periodically zero the structures out (as one may do for a temporary lookup table).
It really does seem like this should be a simple one. :\
1 Answer
#1
Kind of hacky, but you can try forking the operation off as a separate process. The child process works on its own copy of the parent's address space; when it terminates, all the memory it allocated is returned to the operating system.
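The forking idea can be sketched roughly as follows. This is only a minimal illustration, not a tested production pattern: the helper name run_in_child is made up for this example, it assumes a platform where fork is available (so not Windows), and it assumes the result handed back to the parent is small and can be serialized with Marshal. Error handling for a failing child is left out.

```ruby
# Hypothetical helper: run a memory-hungry block in a forked child and
# send only its (small) result back to the parent through a pipe.
def run_in_child
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    result = yield                      # the big allocations happen here, in the child
    writer.write(Marshal.dump(result))  # serialize just the final result
    writer.close
    exit!(0)                            # leave immediately, skipping at_exit handlers
  end

  writer.close
  payload = reader.read                 # returns once the child closes its end
  reader.close
  Process.wait(pid)                     # reap the child; its memory goes back to the OS
  Marshal.load(payload)
end

# Usage: build a large temporary lookup table in the child and keep only a summary.
count = run_in_child do
  lookup = {}
  200_000.times { |i| lookup[i] = 'x' * 369 }
  lookup.count { |key, _value| key.even? }
end
puts count
```

The parent never holds the lookup table itself, so its own heap does not grow; the trade-off is the cost of forking and of serializing whatever has to cross the pipe.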
Ruby might not be releasing memory back to the kernel, as @Sergio Tulentsev pointed out in the comments.
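One rough way to check this is to compare what Ruby considers live with what the operating system reports for the process. The sketch below is an assumption-laden illustration: the helper names (live_strings, rss_kb, report, fill) are invented here, the resident set size is read from /proc/self/status and so is only available on Linux, and the numbers printed will vary by Ruby version and allocator.

```ruby
# Number of String objects currently occupying slots in Ruby's heap.
def live_strings
  ObjectSpace.count_objects[:T_STRING]
end

# Resident set size in kB as seen by the kernel (Linux only; nil elsewhere).
def rss_kb
  File.read('/proc/self/status')[/^VmRSS:\s+(\d+)\s+kB/, 1].to_i
rescue SystemCallError
  nil
end

def report(label)
  puts format('%-10s live strings: %9d   RSS: %s kB', label, live_strings, rss_kb.inspect)
end

# Allocate inside a method so the array is unreachable once the method returns.
def fill
  l = []
  100_000.times { l << 'x' * 369 }
  report('filled')
  nil
end

report('start')
fill
report('dropped')
GC.start
report('after GC')
```

If the live string count falls sharply after GC.start while the RSS figure stays high, the objects were collected but the freed memory was kept by the process rather than handed back to the kernel.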
This Ruby/Unix mailing list conversation describes this in detail: Avoiding system calls
Also, this blog post describes forking as a solution for memory management in Rails: Saving memory in Ruby on Rails with fork() and copy-on-write. That said, I don't think Ruby's garbage collector will be copy-on-write friendly until Ruby 2 comes out.