I have this function.
Let's say I have millions of posts.
How can I optimize this function?
def fun
  Post.all.each do |post|
    if post.user.present?
      post.active = true
    else
      post.active = false
    end
    post.save
  end
end
I'd like to do this in fewer lines and with better performance, because this is not a very good approach.
3 Answers
#1
3
Here's another option that does it in two queries without any raw SQL (just plain ol' Rails):
Post.where(user_id: nil).update_all(active: false)
Post.where.not(user_id: nil).update_all(active: true)
And, believe it or not, in my test this actually ran slightly faster in the database than doing it in one query that uses an expression, active = (user_id IS NOT NULL), to populate active.
Here are the speed results from testing on a table with only 20,000 records:
# Single (expression-based) query
<Benchmark::Tms:0x00007fd251a52780 @cstime=0.0, @cutime=0.0, @label="", @real=2.3656239999982063, @stime=0.0, @total=0.009999999999999787, @utime=0.009999999999999787>
# Two (purely column-based) queries
<Benchmark::Tms:0x00007fd2518c36d0 @cstime=0.0, @cutime=0.0, @label="", @real=2.309347999995225, @stime=0.0, @total=0.0, @utime=0.0>
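For anyone reproducing these numbers: the objects above are what Ruby's standard Benchmark.measure returns. Here is a minimal sketch of how such a comparison could be set up. The two lambdas are hypothetical stand-ins that just burn CPU; in a Rails console they would wrap the update_all calls instead.

```ruby
require "benchmark"

# Hypothetical stand-ins for the two strategies. Replace the bodies with the
# real update_all calls when running inside a Rails console.
single_query = -> { 100_000.times { |i| i * i } }
two_queries  = -> { 2.times { 50_000.times { |i| i * i } } }

# Benchmark.measure runs the block once and returns a Benchmark::Tms object.
results = {
  "single query" => Benchmark.measure { single_query.call },
  "two queries"  => Benchmark.measure { two_queries.call }
}

results.each do |label, tms|
  # tms.real is wall-clock time; utime and stime are CPU time of this process.
  puts format("%-13s real=%.4fs utime=%.4fs stime=%.4fs",
              label, tms.real, tms.utime, tms.stime)
end
```

The figure to compare is real (wall-clock time), since it includes the time spent waiting on the database; utime and stime only count CPU time in the Ruby process, which is why they are close to zero above. The two real values in the results differ by roughly 0.06 s, and each comes from a single run, so it is probably worth repeating the measurement a few times before drawing conclusions.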
#2
5
This should do the trick - and it's FAST...
Post.update_all("active = (user_id IS NOT NULL)")
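With millions of rows, a single UPDATE can hold locks for a long time, so one possible variation is to run the same expression in batches. This is only a sketch: backfill_active and the batch size are names and values I made up for illustration, while in_batches and update_all are the regular ActiveRecord methods (in_batches needs Rails 5 or later).

```ruby
# Hypothetical helper: applies the expression-based update batch by batch.
# `model` is expected to be an ActiveRecord model class such as Post.
def backfill_active(model, batch_size: 10_000)
  updated = 0
  # in_batches yields one relation per batch of primary keys.
  model.in_batches(of: batch_size) do |batch|
    # update_all issues a single UPDATE per batch, skips callbacks and
    # validations, and returns the number of affected rows.
    updated += batch.update_all("active = (user_id IS NOT NULL)")
  end
  updated
end
```

In a Rails console this would be called as backfill_active(Post). It issues more statements than the one-liner, but each one touches fewer rows.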
#3
2
# Note: this only sets the TRUE side; rows where user_id IS NULL keep their current value.
Post.connection.execute \
  "UPDATE posts SET active = TRUE WHERE user_id IS NOT NULL"
The proper approach would be to remove the active field from the database and implement a Ruby getter in the Post class:
def active
  user.present?
end
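To show how the derived getter behaves, here is a small plain-Ruby sketch. The Struct is a hypothetical stand-in for the ActiveRecord model (no database involved), and it uses !user.nil? because present? comes from ActiveSupport and is not available in plain Ruby.

```ruby
# Hypothetical stand-in for the Post model: active is computed from user
# on every call instead of being stored in a column.
Post = Struct.new(:title, :user) do
  def active
    !user.nil?
  end

  # Predicate-style alias, matching the usual Ruby convention for booleans.
  alias_method :active?, :active
end

with_user = Post.new("Hello", "alice")
orphan    = Post.new("Draft", nil)

puts with_user.active   # true
puts orphan.active?     # false
```

The value can never go stale, since nothing is stored. The trade-off is that active is no longer a column, so a query like Post.where(active: true) would have to be written against user_id instead.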