How to debug SQL queries with activerecord-sqlserver-adapter in the console

Time: 2021-02-05 20:22:55

If anyone has a couple of free hours (or days) to help me optimise a few calls and would like to be paid for it (I can offer 150 USD an hour), I would really appreciate your help. I'm getting desperate :)

I've got some SQL queries that are quite slow:

Panel Load (1075.7ms)  EXEC sp_executesql N'SELECT [panels].* FROM [panels] WHERE [panels].[agglo_code_id] = @0 AND [panels].[environment_id] = @1 AND [panels].[product_id] = @2 AND (NOT EXISTS(SELECT 1 FROM campaign_search_panels WHERE campaign_search_panels.panel_id = panels.panel_id AND campaign_search_panels.campaign_id = 32)) AND (NOT EXISTS(SELECT 1 FROM "AIDAAU_Avails" WHERE "AIDAAU_Avails"."PanelID" = panels.panel_uid AND "AIDAAU_Avails"."TillDate" >= ''08-21-2017'' AND "AIDAAU_Avails"."FromDate" <= ''09-03-2017''))', N'@0 int, @1 int, @2 int', @0 = 24, @1 = 14, @2 = 25  [["agglo_code_id", 24], ["environment_id", "14"], ["product_id", "25"]]

I am trying to figure out how to debug this but I can't quite get it right. I would like to run an EXPLAIN on it; however, I can't access the db directly via a SQL client because it is locked down to the server's IP, so I am trying to do it via the Rails console on the server.
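
One way to get an estimated plan from the console is to toggle SQL Server's SET SHOWPLAN_ALL around the statement on the same connection. The following is a minimal sketch, not something confirmed against this setup: `estimated_plan` is a hypothetical helper name, and it assumes the login is allowed to use SHOWPLAN and that the connection object responds to `execute` and `select_all` as ActiveRecord connections do.

```ruby
# Hypothetical helper: returns the estimated plan rows for `sql` using the
# given connection. The statement itself is not executed while SHOWPLAN_ALL
# is ON; SQL Server returns plan rows instead.
def estimated_plan(connection, sql)
  # SET SHOWPLAN_ALL has to be sent as its own batch.
  connection.execute('SET SHOWPLAN_ALL ON')
  connection.select_all(sql).to_a
ensure
  # Always switch the session back, even if the statement raised.
  connection.execute('SET SHOWPLAN_ALL OFF')
end

# In the Rails console (left commented so the snippet runs without a database):
# rows = estimated_plan(ActiveRecord::Base.connection, 'SELECT [panels].* FROM [panels]')
# rows.each { |row| puts row['StmtText'] }
```

The `ensure` block matters here: if SHOWPLAN_ALL stays ON, later statements on that connection would return plans instead of running.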

I can do the following (not sure why it runs two queries):

irb(main):049:0> ActiveRecord::Base.connection.execute('SELECT [panels].* FROM [panels] WHERE [panels].[agglo_code_id] = 24 AND [panels].[environment_id] = 14 AND [panels].[product_id] = 25 AND (NOT EXISTS(SELECT 1 FROM campaign_search_panels WHERE campaign_search_panels.panel_id = panels.panel_id AND campaign_search_panels.campaign_id = 32)) AND (NOT EXISTS(SELECT 1 FROM "AIDAAU_Avails" WHERE "AIDAAU_Avails"."PanelID" = panels.panel_uid AND "AIDAAU_Avails"."TillDate" >= ''08-21-2017'' AND "AIDAAU_Avails"."FromDate" <= ''09-03-2017''))')
   (47.3ms)  SELECT [panels].* FROM [panels] WHERE [panels].[agglo_code_id] = 24 AND [panels].[environment_id] = 14 AND [panels].[product_id] = 25 AND (NOT EXISTS(SELECT 1 FROM campaign_search_panels WHERE campaign_search_panels.panel_id = panels.panel_id AND campaign_search_panels.campaign_id = 32)) AND (NOT EXISTS(SELECT 1 FROM "AIDAAU_Avails" WHERE "AIDAAU_Avails"."PanelID" = panels.panel_uid AND "AIDAAU_Avails"."TillDate" >= 08-21-2017 AND "AIDAAU_Avails"."FromDate" <= 09-03-2017))
   (47.3ms)  SELECT [panels].* FROM [panels] WHERE [panels].[agglo_code_id] = 24 AND [panels].[environment_id] = 14 AND [panels].[product_id] = 25 AND (NOT EXISTS(SELECT 1 FROM campaign_search_panels WHERE campaign_search_panels.panel_id = panels.panel_id AND campaign_search_panels.campaign_id = 32)) AND (NOT EXISTS(SELECT 1 FROM "AIDAAU_Avails" WHERE "AIDAAU_Avails"."PanelID" = panels.panel_uid AND "AIDAAU_Avails"."TillDate" >= 08-21-2017 AND "AIDAAU_Avails"."FromDate" <= 09-03-2017))
=> 1143
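
Note that the logged statement above shows the dates without quotes (`>= 08-21-2017`). That is consistent with how Ruby treats `''` inside a single-quoted literal: it closes one string and opens the next, and adjacent literals are concatenated. SQL Server would then most likely parse `08-21-2017` as integer subtraction rather than a date literal, so this run is probably not the same query as the slow one. A small sketch of the effect, with a heredoc as one way to keep the quotes (the variable names are only for illustration):

```ruby
# Adjacent string literals are concatenated by Ruby, so '' inside a
# single-quoted literal silently drops the quotes.
lost = 'TillDate >= ''08-21-2017'''

# A heredoc keeps the single quotes that SQL Server needs around the date.
kept = <<~SQL.strip
  TillDate >= '08-21-2017'
SQL

puts lost  # TillDate >= 08-21-2017
puts kept  # TillDate >= '08-21-2017'
```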

The direct query is much faster than the original Panel Load above. Is that because I have replaced all the scalar variables, or is there another reason? Is there any way I can run the query exactly as Rails runs it? i.e.:

query = <<-SQL 
  EXEC sp_executesql N'SELECT [panels].* FROM [panels] WHERE [panels].[agglo_code_id] = @0 AND [panels].[environment_id] = @1 AND [panels].[product_id] = @2 AND (NOT EXISTS(SELECT 1 FROM campaign_search_panels WHERE campaign_search_panels.panel_id = panels.panel_id AND campaign_search_panels.campaign_id = 32)) AND (NOT EXISTS(SELECT 1 FROM "AIDAAU_Avails" WHERE "AIDAAU_Avails"."PanelID" = panels.panel_uid AND "AIDAAU_Avails"."TillDate" >= ''08-21-2017'' AND "AIDAAU_Avails"."FromDate" <= ''09-03-2017''))', N'@0 int, @1 int, @2 int', @0 = 24, @1 = 14, @2 = 25  [["agglo_code_id", 24], ["environment_id", "14"], ["product_id", "25"]]
SQL
ActiveRecord::Base.connection.execute(query)

ActiveRecord::StatementInvalid: TinyTds::Error: Incorrect syntax near '["agglo_code_id", 24'.:
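
The error points at the trailing `[["agglo_code_id", 24], ...]` list. That part looks like the bind values Rails appends to its log line rather than T-SQL, so it should not be sent to the server. A sketch of the same statement without it follows; the line breaks inside the heredoc are added only for readability, and the `execute` call is the one from the question, left commented so the snippet runs without a database.

```ruby
# The statement as logged, minus the trailing bind list from the Rails log.
# The doubled single quotes are T-SQL escaping inside the N'...' literal and
# are preserved as-is by the heredoc.
query = <<~SQL
  EXEC sp_executesql N'SELECT [panels].* FROM [panels]
    WHERE [panels].[agglo_code_id] = @0
      AND [panels].[environment_id] = @1
      AND [panels].[product_id] = @2
      AND (NOT EXISTS(SELECT 1 FROM campaign_search_panels
                      WHERE campaign_search_panels.panel_id = panels.panel_id
                        AND campaign_search_panels.campaign_id = 32))
      AND (NOT EXISTS(SELECT 1 FROM "AIDAAU_Avails"
                      WHERE "AIDAAU_Avails"."PanelID" = panels.panel_uid
                        AND "AIDAAU_Avails"."TillDate" >= ''08-21-2017''
                        AND "AIDAAU_Avails"."FromDate" <= ''09-03-2017''))',
    N'@0 int, @1 int, @2 int', @0 = 24, @1 = 14, @2 = 25
SQL

# In the Rails console:
# ActiveRecord::Base.connection.execute(query)
```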

Any ideas how it can be improved?

1 Solution

#1


Without the execution plan, it will be extremely difficult to diagnose the exact performance problem. However, just by looking at your SQL, I see a huge red flag that is likely the cause.

SELECT
    [panels].*
FROM [panels]
WHERE
    [panels].[agglo_code_id] = @0
AND
    [panels].[environment_id] = @1
AND
    [panels].[product_id] = @2
AND
    (
        NOT EXISTS( SELECT 1
                    FROM campaign_search_panels
                    WHERE
                        campaign_search_panels.panel_id = panels.panel_id
                    AND
                        campaign_search_panels.campaign_id = 32)
    )
AND
    (
        NOT EXISTS( SELECT 1
                    FROM AIDAAU_Avails
                    WHERE
                        AIDAAU_Avails.PanelID = panels.panel_uid
                    AND
                        AIDAAU_Avails.TillDate >= '08-21-2017'
                    AND
                        AIDAAU_Avails.FromDate <= '09-03-2017')
    )

When I extracted your dynamic SQL and made it pretty, I discovered two things that can cause performance problems. First, you have a SELECT *, which will grab every column from the table regardless of whether you need it. You could be slowing yourself down by fetching far more data than you actually need.

The second thing, which is my huge red flag, is that you have two NOT EXISTS clauses that run subqueries. Depending on the amount of data in the three tables, this can be a very expensive operation. For every record returned by your main query, each of the NOT EXISTS subqueries may need to be evaluated. That means if the main query returns 100 rows, up to 200 additional subquery evaluations may be needed to satisfy your WHERE clause.

To fix this, you should be able to replace those NOT EXISTS clauses with two LEFT JOINs. I can guess at how to do it, but without data to work with I can't be certain, and I don't want to give you something that makes things worse.
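
For illustration only, the general shape of such a rewrite is an anti-join: LEFT JOIN each table with its filters in the ON clause, then keep only rows where the joined key IS NULL. The sketch below is untested against the real data, the aliases `p`, `csp`, and `av` are hypothetical, and the parameter values are inlined from the question. It would be worth comparing its plan and timing with the original before adopting it.

```ruby
# Untested sketch: the two NOT EXISTS clauses rewritten as LEFT JOIN anti-joins.
# Table and column names come from the question; aliases are made up here.
left_join_sql = <<~SQL
  SELECT p.*
  FROM panels AS p
  LEFT JOIN campaign_search_panels AS csp
         ON csp.panel_id = p.panel_id
        AND csp.campaign_id = 32
  LEFT JOIN AIDAAU_Avails AS av
         ON av.PanelID = p.panel_uid
        AND av.TillDate >= '08-21-2017'
        AND av.FromDate <= '09-03-2017'
  WHERE p.agglo_code_id = 24
    AND p.environment_id = 14
    AND p.product_id = 25
    AND csp.panel_id IS NULL  -- no matching campaign_search_panels row
    AND av.PanelID IS NULL    -- no overlapping AIDAAU_Avails row
SQL

puts left_join_sql
```

The date and campaign filters have to stay in the ON clauses; moving them into WHERE would turn the LEFT JOINs into inner joins and change the result.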

To give you an idea of the performance difference, I had a query doing something similar. It took 36 hours to run due to the size of the data. I replaced the subqueries with JOINs and had it running in less than an hour.

