How to avoid multiple queries with :include in Rails?

Date: 2020-12-12 04:17:08

If I do this

post = Post.find_by_id(post_id, :include => :comments)

two queries are performed (one for the post data and another for the post's comments). Then, when I call post.comments, no further query is performed because the data is already cached.

Is there a way to do just one query and still access the comments via post.comments?

2 Answers

#1


32  

No, there is not. This is the intended behavior of :include, since the JOIN approach ultimately comes out to be inefficient.

For example, consider the following scenario: the Post model has 3 fields that you need to select, 2 fields for Comment, and this particular post has 100 comments. Rails could run a single JOIN query along the lines of:

SELECT post.id, post.title, post.author_id, comment.id, comment.body
FROM posts AS post
INNER JOIN comments AS comment ON comment.post_id = post.id
WHERE post.id = 1

This would return the following table of results:

 post.id | post.title | post.author_id | comment.id | comment.body
---------+------------+----------------+------------+--------------
       1 | Hello!     |              1 |          1 | First!
       1 | Hello!     |              1 |          2 | Second!
       1 | Hello!     |              1 |          3 | Third!
       1 | Hello!     |              1 |          4 | Fourth!
...96 more...

You can see the problem already. The single-query JOIN approach, though it returns the data you need, returns it redundantly. When the database server sends the result set to Rails, it will send the post's ID, title, and author ID 100 times each. Now, suppose that the Post had 10 fields you were interested in, 8 of which were text blocks. Eww. That's a lot of data. Transferring data from the database to Rails does take work on both sides, both in CPU cycles and RAM, so minimizing that data transfer is important for making the app run faster and leaner.
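
To put rough numbers on the scenario above, here is a small plain-Ruby sketch. It only counts column values sent from the database, ignoring byte sizes and per-query overhead, and the method name `values_transferred` is made up for illustration.

```ruby
# Counts how many column values cross the wire for one post and its comments.
# Hypothetical helper, not part of Rails.
def values_transferred(post_columns:, comment_columns:, comment_count:)
  {
    # Single JOIN: every row repeats the post columns next to one comment.
    join: comment_count * (post_columns + comment_columns),
    # Two queries: the post row is sent once, then one row per comment.
    separate: post_columns + comment_count * comment_columns
  }
end

small = values_transferred(post_columns: 3, comment_columns: 2, comment_count: 100)
wide  = values_transferred(post_columns: 10, comment_columns: 2, comment_count: 100)

puts "3 post fields:  JOIN #{small[:join]} values, separate #{small[:separate]} values"
puts "10 post fields: JOIN #{wide[:join]} values, separate #{wide[:separate]} values"
```

The gap grows with every extra post column, because the JOIN repeats each of them once per comment.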

The Rails devs crunched the numbers and found that most applications run better with multiple queries that each fetch a piece of data only once than with one query that can become hugely redundant.

Of course, there comes a time in every developer's life when a join is necessary in order to run complex conditions, and that can be achieved by replacing :include with :joins. For prefetching relationships, however, the approach Rails takes in :include is much better for performance.
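
The two-query prefetching idea can be sketched in plain Ruby, without Rails. This is only an illustration of the technique, not Rails's actual implementation: the in-memory arrays stand in for the posts and comments tables, and every name here (`posts_table`, `comments_table`, `query_posts`, `query_comments_for`, `load_posts_with_comments`) is hypothetical.

```ruby
# In-memory stand-ins for the two tables (made-up data).
def posts_table
  [
    { id: 1, title: "Hello!", author_id: 1 },
    { id: 2, title: "Again",  author_id: 1 }
  ]
end

def comments_table
  [
    { id: 1, post_id: 1, body: "First!" },
    { id: 2, post_id: 1, body: "Second!" },
    { id: 3, post_id: 2, body: "Hi" }
  ]
end

# Each call to one of these helpers stands for one SQL query.
def query_posts(ids)
  posts_table.select { |post| ids.include?(post[:id]) }
end

def query_comments_for(post_ids)
  comments_table.select { |comment| post_ids.include?(comment[:post_id]) }
end

# Query 1 loads the posts; query 2 loads all of their comments at once.
# The comments are then grouped by post_id and attached in memory,
# so reading post[:comments] later needs no further query.
def load_posts_with_comments(ids)
  posts = query_posts(ids)
  comments = query_comments_for(posts.map { |post| post[:id] })
  comments_by_post = comments.group_by { |comment| comment[:post_id] }
  posts.map { |post| post.merge(comments: comments_by_post.fetch(post[:id], [])) }
end

post = load_posts_with_comments([1]).first
puts post[:comments].map { |comment| comment[:body] }.inspect
```

Note that the number of queries stays at two no matter how many posts are loaded, and each post row is fetched exactly once.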

#2


5  

If you take advantage of how eagerly loaded associations behave when a condition references the associated table, you'll get a single (and efficient) query.

Here is an example:

  • Say you have the following model (where :user is the foreign reference):

    class Item < ActiveRecord::Base
      attr_accessible :name, :user_id
      belongs_to :user
    end
    
  • Then executing this (note: the where part is crucial, as it tricks Rails into producing that single query):

    @items = Item.includes(:user).where("users.id IS NOT NULL").all
    

    will result in a single SQL query (the syntax below is that of PostgreSQL):

    SELECT "items"."id" AS t0_r0, "items"."user_id" AS t0_r1, 
            "items"."name" AS t0_r2, "items"."created_at" AS t0_r3,
            "items"."updated_at" AS t0_r4, "users"."id" AS t1_r0, 
            "users"."email" AS t1_r1, "users"."created_at" AS t1_r4, 
            "users"."updated_at" AS t1_r5 
    FROM "items" 
    LEFT OUTER JOIN "users" ON "users"."id" = "items"."user_id" 
    WHERE (users.id IS NOT NULL)
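
The t0_rN / t1_rN aliases in the query above let the flat result rows be split back into an item and its user. The following plain-Ruby sketch shows one way that folding could work. It is an assumption-laden illustration, not Rails's internal code: the sample rows, the reduced column set, and the helper names `split_row` and `build_items` are all hypothetical.

```ruby
# Splits one flat joined row into item attributes and user attributes.
# t0_* columns belong to items, t1_* columns belong to users.
def split_row(row)
  item = { id: row["t0_r0"], user_id: row["t0_r1"], name: row["t0_r2"] }
  user = { id: row["t1_r0"], email: row["t1_r1"] }
  [item, user]
end

# Builds one item per row and shares a single user object between
# items that point at the same user id.
def build_items(rows)
  users_by_id = {}
  rows.map do |row|
    item, user = split_row(row)
    # A LEFT OUTER JOIN yields NULL user columns when no user matches.
    shared_user = user[:id].nil? ? nil : (users_by_id[user[:id]] ||= user)
    item.merge(user: shared_user)
  end
end

# Made-up rows shaped like the result of the aliased query.
rows = [
  { "t0_r0" => 1, "t0_r1" => 7, "t0_r2" => "Lamp", "t1_r0" => 7, "t1_r1" => "a@example.com" },
  { "t0_r0" => 2, "t0_r1" => 7, "t0_r2" => "Desk", "t1_r0" => 7, "t1_r1" => "a@example.com" }
]

build_items(rows).each do |item|
  puts "#{item[:name]} belongs to #{item[:user][:email]}"
end
```

Both sample rows carry the same user columns, which is the redundancy the first answer describes; for a belongs_to association it is usually modest because each item row brings along only one user.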
