T-SQL parameter sniffing: recompiling the execution plan

Time: 2021-07-15 08:51:24

I have the following SQL command:

exec sp_executesql N'SELECT TOP (10) * FROM mytableView WHERE ([Name]) LIKE (''%'' + 
  (@Value0) + ''%'') ORDER BY [Id] DESC',N'@Value0 varchar(5)',@Value0='value'

This SQL command takes nearly 22 seconds to execute. I found that this happens because of parameter sniffing. If I add option(recompile) to the end of the SQL command, it runs fast: Management Studio shows 0 seconds.

exec sp_executesql N'SELECT TOP (10) * FROM mytableView WHERE ([Name]) LIKE 
  (''%'' + (@Value0) + ''%'') ORDER BY [Id] DESC 
    option(recompile)',N'@Value0 varchar(5)',@Value0='value'

Is it possible to recompile/recreate/erase/update the execution plan so that my SQL command runs fast without option(recompile)?

I have tried the following:

  • UPDATE STATISTICS
  • sp_recompile
  • DBCC FREEPROCCACHE
  • DBCC UPDATEUSAGE (0)
  • DBCC FREESYSTEMCACHE ('ALL')
  • ALTER INDEX and REBUILD WITH

Unfortunately, none of these actions helped.

1 solution

#1


You could try the OPTIMIZE FOR UNKNOWN hint instead of RECOMPILE:

exec sp_executesql N'SELECT TOP (10) *
                     FROM mytableView
                     WHERE ([Name]) LIKE (''%'' + (@Value0) + ''%'')
                     ORDER BY [Id] DESC
                     option(OPTIMIZE FOR UNKNOWN);',
                   N'@Value0 varchar(5)',
                   @Value0 = 'value';

The MSDN page for Query Hints states that OPTIMIZE FOR UNKNOWN:

Instructs the query optimizer to use statistical data instead of the initial values for all local variables when the query is compiled and optimized, including parameters created with forced parameterization.

This hint instructs the optimizer to use the total number of rows in the specified table divided by the number of distinct values in the specified column (i.e. the average number of rows per value) as the row estimate, instead of using the statistics for any particular value. As pointed out by @GarethD in a comment below: since this may benefit some queries and hurt others, it needs to be tested to see whether the overall gain is a net saving compared with the cost of doing the RECOMPILE. For more details check out: How OPTIMIZE FOR UNKNOWN Works.
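
To get a feel for the "average rows per value" figure described above, you can compute it by hand. This is only a rough sketch: it assumes [Name] is a type that COUNT(DISTINCT ...) accepts, and dbo.MyBaseTable / IX_MyBaseTable_Name are hypothetical names standing in for the table behind mytableView and a statistics object (or index) on [Name]. Note also that this average applies most directly to equality predicates; for a LIKE predicate the optimizer may fall back on a different guess, so treat the number as an illustration rather than the exact estimate you will see in the plan.

```sql
-- Average rows per distinct [Name] value: total rows / distinct values.
SELECT COUNT(*)               AS TotalRows,
       COUNT(DISTINCT [Name]) AS DistinctNames,
       COUNT(*) * 1.0
         / NULLIF(COUNT(DISTINCT [Name]), 0) AS AvgRowsPerValue  -- NULLIF avoids divide-by-zero on an empty table
FROM   mytableView;

-- The density the optimizer keeps in statistics ("All density" is roughly
-- 1 / number of distinct values). Both names below are hypothetical.
DBCC SHOW_STATISTICS ('dbo.MyBaseTable', IX_MyBaseTable_Name) WITH DENSITY_VECTOR;
```

Multiplying "All density" by the table's row count should give approximately the same AvgRowsPerValue as the first query.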

For completeness: depending on the distribution of the data and the values that are passed in, if there is a particular value whose distribution is fairly representative of most of the values that could be passed in (even if it is wildly different from some values that will never be passed in), then you can target that value by using OPTIMIZE FOR (@Value0 = 'representative value') rather than OPTIMIZE FOR UNKNOWN.
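
Applied to the query from the question, that might look like the following. The value 'abc' is purely a placeholder for whatever you determine to be representative in your data; the query is still executed with the value actually passed in, and only the plan is compiled as if 'abc' had been supplied.

```sql
EXEC sp_executesql
     N'SELECT TOP (10) *
       FROM mytableView
       WHERE ([Name]) LIKE (''%'' + (@Value0) + ''%'')
       ORDER BY [Id] DESC
       OPTION (OPTIMIZE FOR (@Value0 = ''abc''));',  -- ''abc'' is a hypothetical representative value
     N'@Value0 varchar(5)',
     @Value0 = 'value';                              -- the real runtime value is unaffected by the hint
```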

Please note that this query hint is only needed for queries that have:

  • parameters supplied by variables
  • field(s) in question that do not have a fairly even distribution of values (and hence different values passed in via the variable could generate different plans)

The following scenarios were identified in comments below and do not all require this hint, so here is how to address each situation:

  • select top 80 * from table order by id desc
    There is no variable being passed in here so no query hint needed.

  • select top 80 * from table where id < @lastid order by id desc
    There is a variable being passed in here, but the [id] field, by its very nature, is evenly distributed, even if sparse due to some deletes, hence no query hint needed (or at least should not be needed).

  • SELECT TOP (10) * FROM mytableView WHERE ([Name]) LIKE (''%'' + (@Value0) + ''%'') ORDER BY [Id] DESC
    There is a variable being passed in here, and used in such a way that there could be no indication of consistent numbers of matching rows for different values, especially due to not being able to use an index as a result of the leading %. THIS is a good opportunity for the OPTION (OPTIMIZE FOR UNKNOWN) hint as discussed above.

  • If there is a situation where a variable is passed in that has a greatly varying distribution of matching rows, but not many possible values to get passed in, and the values that are passed in are re-used frequently, then those can be concatenated (after doing a REPLACE(@var, '''', '''''') on it) directly into the Dynamic SQL. This allows for each of those values to have their own separate yet reusable query plan. Other variables should be sent in as parameters as usual.

    For example, a lookup value for [StatusID] will only have a few possible values and they will get reused frequently but each particular value can match a vastly different number of rows. In that case, something like the following will allow for separate execution plans that do not need either the RECOMPILE or OPTIMIZE FOR UNKNOWN hints as each execution plan will be optimized for that particular value:

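    -- @StatusID (passed in as a string) and @OtherField are assumed to be
    -- input parameters of the enclosing stored procedure or batch.
    -- Reject anything that is not a valid INT before concatenating it.
    IF (TRY_CONVERT(INT, @StatusID) IS NULL)
    BEGIN
       ;THROW 50505, '@StatusID was not a valid INT', 55;
    END;

    DECLARE @SQL NVARCHAR(MAX);
    -- Embed @StatusID as a literal so that each value gets its own cached plan.
    SET @SQL = N'SELECT TOP (10) * FROM myTableView WHERE [StatusID] = '
               + REPLACE(@StatusID, N'''', N'''''') -- really only needed for strings
               + N' AND [OtherField] = @OtherFieldVal;';

    -- @OtherFieldVal stays a regular parameter, as usual.
    EXEC sp_executesql
                   @SQL,
                   N'@OtherFieldVal VARCHAR(50)',
                   @OtherFieldVal = @OtherField;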
    

    Assuming two different values of @StatusID are passed in (e.g. 1 and 2), there will be two execution plans cached matching the following queries:

    • SELECT TOP (10) * FROM myTableView WHERE [StatusID] = 1 AND [OtherField] = @OtherFieldVal;

    • SELECT TOP (10) * FROM myTableView WHERE [StatusID] = 2 AND [OtherField] = @OtherFieldVal;
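
If you want to confirm that the separate plans really are cached and reused, one option is to look in the plan cache. A rough sketch, assuming you have VIEW SERVER STATE permission; the LIKE filter is just an illustrative way of finding the statements from the example above and may need adjusting for your actual query text:

```sql
-- List cached plans whose text matches the dynamic SQL built above.
-- Statements run through sp_executesql normally show up with objtype = 'Prepared'.
SELECT cp.usecounts,      -- how many times this plan has been reused
       cp.objtype,
       st.[text]          -- parameter list followed by the statement text
FROM   sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE  st.[text] LIKE N'%myTableView%StatusID%'
  AND  st.[text] NOT LIKE N'%dm_exec_cached_plans%';  -- exclude this diagnostic query itself
```

You would expect one row per distinct @StatusID value that has been executed, with usecounts increasing as the same value is passed in again.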
