We're seeing strange behavior when running two versions of a query on SQL Server 2005:
version A:
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId WHERE listcontacts.listid = 1234
ORDER BY name ASC
version B:
DECLARE @Id AS INT;
SET @Id = 1234;
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC
Both queries return 1000 rows; version A takes on average 15s; version B on average takes 4s. Could anyone help us understand the difference in execution times of these two versions of SQL?
If we invoke this query via named parameters using NHibernate, we see the following query via SQL Server profiler:
EXEC sp_executesql N'SELECT otherattributes.* FROM listcontacts JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId WHERE listcontacts.listid = @id ORDER BY name ASC',
N'@id INT',
@id=1234;
...and this tends to perform as badly as version A.
4 Answers
#1
2
Try taking a look at the execution plan for your query. It should tell you more about how your query is executed.
#2
2
I've not seen the execution plans, but I strongly suspect that they differ between the two cases. In case A the optimiser knows the value that you are using for the list id (1234) and, using a combination of the distribution statistics and the indexes, chooses a plan tailored to that value.
In the second case, the optimiser is not able to sniff the value of the id and so produces a plan that would be acceptable for any passed-in list id. By acceptable I do not mean optimal.
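One way to check this rather than guess is to run both versions with I/O and timing statistics switched on and compare the numbers reported in the Messages tab. A minimal sketch, reusing the tables and the list id from the question:

```sql
-- Report logical reads and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO

-- Version A: literal value, visible to the optimiser at compile time.
SELECT otherattributes.*
FROM listcontacts
JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = 1234
ORDER BY name ASC;
GO

-- Version B: local variable, whose value is not known when the batch is compiled.
DECLARE @Id INT;
SET @Id = 1234;
SELECT otherattributes.*
FROM listcontacts
JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC;
GO

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```

A large difference in logical reads between the two versions would suggest that the plans really are different, which the execution plans can then confirm.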
So what can you do to improve the situation? There are several alternatives here:
1) Create a stored procedure to perform the query as below:
CREATE PROCEDURE Foo @Id INT AS
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC
GO
This allows the optimiser to sniff the value of the input parameter when it is passed in and produce an appropriate execution plan for the first execution. Unfortunately, it will cache that plan for later reuse, so unless you generally call the sproc with similarly selective values, this may not help you much.
2) Create a stored procedure as above, but specify WITH RECOMPILE. This ensures that the stored procedure is recompiled each time it is executed and hence produces a new plan optimised for each input value.
3) Add OPTION (RECOMPILE) to the end of the SQL statement. This forces recompilation of the statement each time it runs, so the optimiser can optimise for the actual input value.
4) Add OPTION (OPTIMIZE FOR (@Id = 1234)) to the end of the SQL statement. This will cause the plan that gets cached to be optimised for this specific input value. Great if this is a highly common value, or most common values are similarly selective, but not so great if the distribution of selectivity is more widely spread.
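For reference, options 2 to 4 might look roughly like this. It is a sketch only: the procedure name FooRecompile is made up for illustration, and the tables and list id are taken from the question.

```sql
-- Option 2: the whole procedure is recompiled on every call.
-- (FooRecompile is a hypothetical name.)
CREATE PROCEDURE FooRecompile @Id INT
WITH RECOMPILE
AS
SELECT otherattributes.*
FROM listcontacts
JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC;
GO

-- Option 3: only this statement is recompiled each time it runs.
DECLARE @Id INT;
SET @Id = 1234;
SELECT otherattributes.*
FROM listcontacts
JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC
OPTION (RECOMPILE);
GO

-- Option 4: the cached plan is built as if @Id were always 1234.
DECLARE @Id INT;
SET @Id = 1234;
SELECT otherattributes.*
FROM listcontacts
JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC
OPTION (OPTIMIZE FOR (@Id = 1234));
GO
```

The recompile options trade extra compilation cost on every execution for a plan that fits the current value, so they tend to suit queries that are expensive to run but called relatively rarely.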
#3
0
It's possible that, instead of casting 1234 once to the same type as listcontacts.listid and then comparing it with each row, the server is casting the value in each row to the type of 1234. The first requires just one cast; the second needs a cast per row (probably on far more than 1000 rows, possibly every row in the table). I'm not sure what type that constant will be interpreted as, but it may be 'numeric' rather than 'int'.
If this is the cause, the second version is faster because it's forcing 1234 to be interpreted as an int and thus removing the need to cast the value in every row.
However, as the previous poster suggests, the query plan shown in SQL Server Management Studio may indicate an alternative explanation.
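This theory can be tested directly by comparing the declared type of the column with the type the server gives the literal. A small sketch, assuming the table and column names from the question:

```sql
-- Declared type of the column being filtered on.
SELECT DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'listcontacts'
  AND COLUMN_NAME = 'listid';

-- Type that SQL Server assigns to the literal 1234.
SELECT SQL_VARIANT_PROPERTY(1234, 'BaseType') AS literal_type;
```

If the two types match, a per-row cast is unlikely to be the cause. If they differ, a CONVERT_IMPLICIT applied to listid in the execution plan would support this explanation.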
#4
0
The best way to see what is happening is to compare the execution plans; everything else is speculation based on the limited details presented in the question.
To see the execution plan, go into SQL Server Management Studio and run SET SHOWPLAN_XML ON, then run query version A. The query will not execute, but its execution plan will be displayed as XML. Then run query version B and look at its execution plan. If you still can't tell the difference or solve the problem, post both execution plans and someone here will explain them.
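Put together, the session might look like the sketch below. Note that SET SHOWPLAN_XML has to be the only statement in its batch, hence the GO separators.

```sql
-- Return estimated plans as XML instead of executing the statements.
SET SHOWPLAN_XML ON;
GO

-- Version A: literal value.
SELECT otherattributes.*
FROM listcontacts
JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = 1234
ORDER BY name ASC;
GO

-- Version B: local variable.
DECLARE @Id INT;
SET @Id = 1234;
SELECT otherattributes.*
FROM listcontacts
JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC;
GO

-- Switch back to normal execution.
SET SHOWPLAN_XML OFF;
GO
```

These are estimated plans. To capture the plan actually used, along with actual row counts, SET STATISTICS XML ON can be used in the same way, in which case the queries do run.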