What's the most efficient way to iterate through an entire table using Datamapper?
If I do this, does Datamapper try to pull the entire result set into memory before performing the iteration? Assume, for the sake of argument, that I have millions of records and that this is infeasible:
Author.all.each do |a|
  puts a.title
end
Is there a way that I can tell Datamapper to load the results in chunks? Is it smart enough to know to do this automatically?
3 Answers
#1
2
Datamapper will run just one SQL query for the example above, so it will have to keep the whole result set in memory.
I think you should use some sort of pagination if your collection is big. Using dm-pagination you could do something like:
PAGE_SIZE = 20
pager = Author.page(:per_page => PAGE_SIZE).pager # This will run a count query
(1..pager.total_pages).each do |page_number|
  Author.page(:per_page => PAGE_SIZE, :page => page_number).each do |a|
    puts a.title
  end
end
You can play around with different values for PAGE_SIZE to find a good trade-off between the number of SQL queries and memory usage.
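If you need this in more than one place, the same paging loop can be wrapped in a small helper. This is just a sketch: it assumes dm-pagination is loaded, and the method name each_page and the page-size values are my own choices, not part of the gem:

require 'dm-pagination'

# Hypothetical helper: yields every record of the given model, loading
# one page of results at a time instead of the whole table.
def each_page(model, per_page = 100)
  pager = model.page(:per_page => per_page).pager # runs a COUNT query
  (1..pager.total_pages).each do |page_number|
    model.page(:per_page => per_page, :page => page_number).each do |record|
      yield record
    end
  end
end

each_page(Author, 500) { |a| puts a.title }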
#2
4
Thanks, Nicolas, I actually came up with a similar solution. I've accepted your answer since it makes use of Datamapper's dm-pagination system, but I'm wondering if this would work equally well (or worse):
CHUNK = 20
offset = 0
while (authors = Author.slice(offset, CHUNK)) && !authors.empty?
  authors.each do |a|
    # do something with a
  end
  offset += CHUNK
end
#3
2
What you want is the dm-chunked_query plugin (example from the docs):
require 'dm-chunked_query'

MyModel.each_chunk(20) do |chunk|
  chunk.each do |resource|
    # ...
  end
end
This will allow you to iterate over all the records in the model, in chunks of 20 records at a time.
EDIT: the example above had an extra #each after #each_chunk, and it was unnecessary. The gem author updated the README example, and I changed the above code to match.
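For the Author model from the question, the same call would look something like this (just a sketch; the chunk size of 100 is an arbitrary choice):

require 'dm-chunked_query'

# Walk the whole authors table 100 rows at a time; only one chunk
# of records is loaded per iteration.
Author.each_chunk(100) do |chunk|
  chunk.each do |a|
    puts a.title
  end
end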