Today i stumbled upon an interesting performance problem with a stored procedure running on Sql Server 2005 SP2 in a db running on compatible level of 80 (SQL2000).
今天,我在兼容级别为80(SQL2000)的数据库中运行在Sql Server 2005 SP2上的存储过程偶然发现了一个有趣的性能问题。
The proc runs about 8 Minutes and the execution plan shows the usage of an index with an actual row count of 1.339.241.423 which is about factor 1000 higher than the "real" actual rowcount of the table itself which is 1.144.640 as shown correctly by estimated row count. So the actual row count given by the query plan optimizer is definitly wrong!
proc运行大约8分钟,执行计划显示索引的使用,实际行数为1.339.241.423,比表本身的“实际”实际行数(1.144.640)高出约1000倍,如正确显示估计行数。因此查询计划优化器给出的实际行数肯定是错误的!
Interestingly enough, when i copy the procs parameter values inside the proc to local variables and than use the local variables in the actual query, everything works fine - the proc runs 18 seconds and the execution plan shows the right actual row count.
有趣的是,当我将proc中的procs参数值复制到局部变量而不是在实际查询中使用局部变量时,一切正常 - proc运行18秒,执行计划显示正确的实际行数。
EDIT: As suggested by TrickyNixon, this seems to be a sign of the parameter sniffing problem. But actually, i get in both cases exact the same execution plan. Same indices are beeing used in the same order. The only difference i see is the way to high actual row count on the PK_ED_Transitions index when directly using the parametervalues.
编辑:正如TrickyNixon所建议的那样,这似乎是参数嗅探问题的一个标志。但实际上,我在两种情况下都完全相同的执行计划。相同的指数以相同的顺序使用。我看到的唯一区别是直接使用参数值时PK_ED_Transitions索引的实际行数高的方法。
I have done dbcc dbreindex and UPDATE STATISTICS already without any success. dbcc show_statistics shows good data for the index, too.
我已经完成了dbcc dbreindex和UPDATE STATISTICS,但没有任何成功。 dbcc show_statistics也显示索引的良好数据。
The proc is created WITH RECOMPILE so every time it runs a new execution plan is getting compiled.
proc是使用RECOMPILE创建的,因此每次运行时都会编译新的执行计划。
To be more specific - this one runs fast:
更具体一点 - 这个速度很快:
CREATE Proc [dbo].[myProc](
@Param datetime
)
WITH RECOMPILE
as
set nocount on
declare @local datetime
set @local = @Param
select
some columns
from
table1
where
column = @local
group by
some other columns
And this version runs terribly slow, but produces exactly the same execution plan (besides the too high actual row count on an used index):
这个版本的运行速度非常慢,但产生的执行计划完全相同(除了使用过的索引的实际行数太高):
CREATE Proc [dbo].[myProc](
@Param datetime
)
WITH RECOMPILE
as
set nocount on
select
some columns
from
table1
where
column = @Param
group by
some other columns
Any ideas? Anybody out there who knows where Sql Server gets the actual row count value from when calculating query plans?
有任何想法吗?谁知道Sql Server在计算查询计划时从哪里获取实际行计数值?
Update: I tried the query on another server woth copat mode set to 90 (Sql2005). Its the same behavior. I think i will open up an ms support call, because this looks to me like a bug.
更新:我在另一台服务器上尝试了查询,将copat模式设置为90(Sql2005)。它的行为相同。我想我会打开ms支持电话,因为这对我来说就像一个bug。
5 个解决方案
#1
5
Ok, finally i got to it myself.
好的,最后我自己做到了。
The two query plans are different in a small detail which i missed at first. the slow one uses a nested loops operator to join two subqueries together. And that results in the high number at current row count on the index scan operator which is simply the result of multiplicating the number of rows of input a with number of rows of input b.
这两个查询计划在一个小细节上是不同的,我最初错过了。慢速使用嵌套循环运算符将两个子查询连接在一起。并且这导致索引扫描运算符上当前行计数的高数量,这简单地是将输入a的行数与输入b的行数相乘的结果。
I still don't know why the optimizer decides to use the nested loops instead of a hash match which runs 1000 timer faster in this case, but i could handle my problem by creating a new index, so that the engine does an index seek statt instead of an index scan under the nested loops.
我仍然不知道为什么优化器决定使用嵌套循环而不是在这种情况下运行1000个计时器的哈希匹配,但我可以通过创建新索引来处理我的问题,以便引擎执行索引查找statt而不是嵌套循环下的索引扫描。
#2
2
When you're checking execution plans of the stored proc against the copy/paste query, are you using the estimated plans or the actual plans? Make sure to click Query, Include Execution Plan, and then run each query. Compare those plans and see what the differences are.
当您根据复制/粘贴查询检查存储过程的执行计划时,您使用的是估计计划还是实际计划?确保单击“查询”,“包括执行计划”,然后运行每个查询。比较这些计划,看看有什么不同。
#3
1
It sounds like a case of Parameter Sniffing. Here's an excellent explanation along with possible solutions: I Smell a Parameter!
这听起来像参数嗅探的情况。这是一个很好的解释以及可能的解决方案:我闻到一个参数!
Here's another * thread that addresses it: Parameter Sniffing (or Spoofing) in SQL Server
这是另一个解决它的*线程:SQL Server中的参数嗅探(或欺骗)
#4
0
To me it still sounds as if the statistics were incorrect. Rebuilding the indexes does not necessarily update them.
对我来说,听起来好像统计数据不正确。重建索引不一定会更新它们。
Have you already tried an explicit UPDATE STATISTICS
for the affected tables?
您是否已为受影响的表尝试过显式的UPDATE STATISTICS?
#5
0
Have you run sp_spaceused to check if SQL Server's got the right summary for that table? I believe in SQL 2000 the engine used to use that sort of metadata when building execution plans. We used to have to run DBCC UPDATEUSAGE weekly to update the metadata on some of the rapidly changing tables, as SQL Server was choosing the wrong indexes due to the incorrect row count data.
您是否运行sp_spaceused以检查SQL Server是否获得该表的正确摘要?我相信在SQL 2000中,引擎在构建执行计划时曾经使用过这种元数据。我们过去必须每周运行DBCC UPDATEUSAGE来更新一些快速更改的表的元数据,因为SQL Server由于错误的行计数数据而选择了错误的索引。
You're running SQL 2005, and BOL says that in 2005 you shouldn't have to run UpdateUsage anymore, but since you're in 2000 compat mode you might find that it is still required.
你正在运行SQL 2005,并且BOL说在2005年你不应该再运行UpdateUsage了,但是因为你处于2000 compat模式,你可能会发现它仍然是必需的。
#1
5
Ok, finally i got to it myself.
好的,最后我自己做到了。
The two query plans are different in a small detail which i missed at first. the slow one uses a nested loops operator to join two subqueries together. And that results in the high number at current row count on the index scan operator which is simply the result of multiplicating the number of rows of input a with number of rows of input b.
这两个查询计划在一个小细节上是不同的,我最初错过了。慢速使用嵌套循环运算符将两个子查询连接在一起。并且这导致索引扫描运算符上当前行计数的高数量,这简单地是将输入a的行数与输入b的行数相乘的结果。
I still don't know why the optimizer decides to use the nested loops instead of a hash match which runs 1000 timer faster in this case, but i could handle my problem by creating a new index, so that the engine does an index seek statt instead of an index scan under the nested loops.
我仍然不知道为什么优化器决定使用嵌套循环而不是在这种情况下运行1000个计时器的哈希匹配,但我可以通过创建新索引来处理我的问题,以便引擎执行索引查找statt而不是嵌套循环下的索引扫描。
#2
2
When you're checking execution plans of the stored proc against the copy/paste query, are you using the estimated plans or the actual plans? Make sure to click Query, Include Execution Plan, and then run each query. Compare those plans and see what the differences are.
当您根据复制/粘贴查询检查存储过程的执行计划时,您使用的是估计计划还是实际计划?确保单击“查询”,“包括执行计划”,然后运行每个查询。比较这些计划,看看有什么不同。
#3
1
It sounds like a case of Parameter Sniffing. Here's an excellent explanation along with possible solutions: I Smell a Parameter!
这听起来像参数嗅探的情况。这是一个很好的解释以及可能的解决方案:我闻到一个参数!
Here's another * thread that addresses it: Parameter Sniffing (or Spoofing) in SQL Server
这是另一个解决它的*线程:SQL Server中的参数嗅探(或欺骗)
#4
0
To me it still sounds as if the statistics were incorrect. Rebuilding the indexes does not necessarily update them.
对我来说,听起来好像统计数据不正确。重建索引不一定会更新它们。
Have you already tried an explicit UPDATE STATISTICS
for the affected tables?
您是否已为受影响的表尝试过显式的UPDATE STATISTICS?
#5
0
Have you run sp_spaceused to check if SQL Server's got the right summary for that table? I believe in SQL 2000 the engine used to use that sort of metadata when building execution plans. We used to have to run DBCC UPDATEUSAGE weekly to update the metadata on some of the rapidly changing tables, as SQL Server was choosing the wrong indexes due to the incorrect row count data.
您是否运行sp_spaceused以检查SQL Server是否获得该表的正确摘要?我相信在SQL 2000中,引擎在构建执行计划时曾经使用过这种元数据。我们过去必须每周运行DBCC UPDATEUSAGE来更新一些快速更改的表的元数据,因为SQL Server由于错误的行计数数据而选择了错误的索引。
You're running SQL 2005, and BOL says that in 2005 you shouldn't have to run UpdateUsage anymore, but since you're in 2000 compat mode you might find that it is still required.
你正在运行SQL 2005,并且BOL说在2005年你不应该再运行UpdateUsage了,但是因为你处于2000 compat模式,你可能会发现它仍然是必需的。