Difference between size, length, and count in complex ActiveRecord queries

Time: 2023-01-28 18:29:14
[10] pry(main)> r.respondents.select(:name).uniq.size

(1.1ms)  SELECT DISTINCT COUNT("respondents"."name") FROM "respondents"
INNER JOIN "values" ON "respondents"."id" = "values"."respondent_id" WHERE
"values"."round_id" = 37
=> 495

[11] pry(main)> r.respondents.select(:name).uniq.length

Respondent Load (1.1ms)  SELECT DISTINCT name FROM "respondents"
INNER JOIN "values" ON "respondents"."id" = "values"."respondent_id" WHERE
"values"."round_id" = 37
=> 6

Why the difference in what each query returns?

3 Answers

#1


6  

.count #=> always runs a SELECT COUNT query against the database

.size #=> if the collection has already been loaded, returns the length of the in-memory array; otherwise runs the SELECT COUNT query

.length #=> always loads the collection, then returns the length of the in-memory array
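
To make the three rules above concrete, here is a minimal sketch in plain Ruby. ToyRelation is a hypothetical stand-in written for this illustration, not ActiveRecord itself; it only mimics the loaded?/count/size/length logic and logs the SQL a real relation would roughly issue.

```ruby
# Toy stand-in for a lazy relation. This is NOT ActiveRecord; the class name
# and the logged SQL strings are assumptions made for illustration only.
class ToyRelation
  attr_reader :queries

  def initialize(rows)
    @rows = rows      # pretend this is the database table
    @records = nil    # nothing loaded into memory yet
    @queries = []     # log of simulated SQL statements
  end

  def loaded?
    !@records.nil?
  end

  # Simulates fetching all rows, at most once; returns the relation itself.
  def load
    unless loaded?
      @queries << "SELECT *"
      @records = @rows.dup
    end
    self
  end

  # Always asks the "database", even when the records are already in memory.
  def count
    @queries << "SELECT COUNT(*)"
    @rows.length
  end

  # Always loads first, then measures the in-memory array.
  def length
    load
    @records.length
  end

  # Uses the in-memory array if present, otherwise falls back to count.
  def size
    loaded? ? @records.length : count
  end
end

relation = ToyRelation.new(%w[a b c])
relation.size    # not loaded yet, so a COUNT is issued
relation.length  # loads the rows
relation.size    # already loaded, no new query
relation.count   # issues a COUNT again
p relation.queries
# prints ["SELECT COUNT(*)", "SELECT *", "SELECT COUNT(*)"]
```

In this sketch, size is the only method of the three whose cost depends on what happened earlier, which is why it can return a different value from length when the COUNT query and the loaded rows disagree.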

#2


1  

r.respondents.select(:name).uniq returns an ActiveRecord::Relation object, which overrides size.

See: http://api.rubyonrails.org/classes/ActiveRecord/Relation.html#method-i-size

Calling size on such an object checks to see if the object is "loaded."

# Returns size of the records.
def size
  loaded? ? @records.length : count
end

If it is "loaded", it returns the length of the @records array. Otherwise, it calls count, which, without arguments, will "return a count of all the rows for the model."

So why this behavior? An AR::Relation is only "loaded" if either to_a or explain is called on it first:

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/relation.rb

What loading does is described in a comment above the load method:

# Causes the records to be loaded from the database if they have not
# been loaded already. You can use this if for some reason you need
# to explicitly load some records before actually using them. The
# return value is the relation itself, not the records.
#
#   Post.where(published: true).load # => #<ActiveRecord::Relation>
def load
  unless loaded?
    # We monitor here the entire execution rather than individual SELECTs
    # because from the point of view of the user fetching the records of a
    # relation is a single unit of work. You want to know if this call takes
    # too long, not if the individual queries take too long.
    #
    # It could be the case that none of the queries involved surpass the
    # threshold, and at the same time the sum of them all does. The user
    # should get a query plan logged in that case.
    logging_query_plan { exec_queries }
  end

  self
end

So size on an unloaded relation asks the database for a count, whereas length loads the records and counts the resulting array. The two can disagree when, as in the question, the COUNT query generated for size is not equivalent to counting the rows that length loads.
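
The two SQL statements logged in the question differ in where DISTINCT is applied: SELECT DISTINCT COUNT(name) counts every joined row with a non-NULL name and then de-duplicates the single resulting number, while SELECT DISTINCT name de-duplicates the names before Ruby counts them. A rough sketch of that difference using plain arrays follows; the sample names and both method names are hypothetical and only model the SQL semantics.

```ruby
# Hypothetical data: one name per joined row, as the INNER JOIN would produce.
JOINED_NAMES = %w[ann bob ann cid bob ann].freeze

# Models SELECT DISTINCT COUNT(name): COUNT runs first (ignoring NULLs) and
# yields one number; DISTINCT then de-duplicates that one-row result, which
# changes nothing.
def distinct_of_count(names)
  [names.compact.length].uniq.first
end

# Models SELECT DISTINCT name followed by Array#length on the loaded rows.
def count_of_distinct(names)
  names.uniq.length
end

puts distinct_of_count(JOINED_NAMES)  # prints 6
puts count_of_distinct(JOINED_NAMES)  # prints 3
```

Under this reading, the 495 in the question would be the number of joined rows and the 6 the number of distinct names.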

#3


0  

While upgrading from Rails 3.2 to 4.1, it seems that AR::Relation#size behaves differently. Previously it returned the number of "rows", whereas (in my case) it now returns a Hash. Switching to #count seems to give the same result as #size did in 3.2. I'm being a bit vague here because tests run in 'rails console' on 4.1 did not give the same results as running via 'rails server' on 4.1.
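
One situation in which a count returns a Hash rather than an Integer is a grouped relation: in ActiveRecord, count on a relation with a group clause returns a Hash keyed by the grouped value, and size on an unloaded relation delegates to count. Whether that was the cause in the case described above is not stated, so treat this as an assumption. The sketch below only imitates the shape of the two results in plain Ruby; ROWS and both method names are hypothetical.

```ruby
# Hypothetical rows standing in for database records.
ROWS = [
  { name: "ann", round_id: 37 },
  { name: "bob", round_id: 37 },
  { name: "ann", round_id: 38 }
].freeze

# Ungrouped count: a single Integer.
def plain_count(rows)
  rows.length
end

# Grouped count: a Hash mapping each key value to its number of rows,
# similar in shape to what a grouped count returns.
def grouped_count(rows, key)
  rows.each_with_object(Hash.new(0)) { |row, counts| counts[row[key]] += 1 }
end

puts plain_count(ROWS).class          # Integer
puts grouped_count(ROWS, :name).class # Hash
```

If code written for 3.2 expects an Integer, checking the class of the returned value is a quick way to see which of the two shapes it is getting.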
