如何解决SQL Server - 基于参数的内联表值函数执行计划变化？

Here is the situation:
I have a table value function with a datetime parameter ,lest's say tdf(p_date) , that filters about two million rows selecting those with column date smaller than p_date and computes some aggregate values on other columns.
It works great but if p_date is a custom scalar value function (returning the end of day in my case) the execution plan is altered an the query goes from 1 sec to 1 minute execution time.

下面是这样的情况:我有一个带有datetime参数的表值函数,最好是tdf(p_date),它可以过滤大约200万行,选择列日期小于p_date的行,并计算其他列的某些聚合值。它工作得很好但是如果p_date是一个自定义标量值函数(在我的情况下返回结束日期),则执行计划会被更改,查询的执行时间从1秒到1分钟。

A proof of concept table - 1K products, 2M rows:

概念证明表 - 1K产品,2M行:

CREATE TABLE [dbo].[POC](
    [Date] [datetime] NOT NULL,
    [idProduct] [int] NOT NULL,
    [Quantity] [int] NOT NULL
) ON [PRIMARY]

The inline table value function:

内联表值函数:

CREATE FUNCTION tdf (@p_date datetime)
RETURNS TABLE 
AS
RETURN 
(
    SELECT idProduct, SUM(Quantity) AS TotalQuantity,
         max(Date) as LastDate
    FROM POC
    WHERE (Date < @p_date)
    GROUP BY idProduct
)

The scalar value function:

标量值函数:

CREATE FUNCTION [dbo].[EndOfDay] (@date datetime)
RETURNS datetime
AS
BEGIN
    DECLARE @res datetime
    SET @res=dateadd(second, -1,
         dateadd(day, 1, 
             dateadd(ms, -datepart(ms, @date),
                 dateadd(ss, -datepart(ss, @date),
                    dateadd(mi,- datepart(mi,@date),
                         dateadd(hh, -datepart(hh, @date), @date))))))
    RETURN @res
END

Query 1 - Working great

查询1 - 工作得很好

SELECT * FROM [dbo].[tdf] (getdate())

The end of execution plan: Stream Aggregate Cost 13% <--- Clustered Index Scan Cost 86%

执行计划结束:Stream Aggregate Cost 13%<--- Clustered Index Scan Cost 86%

Query 2 - Not so great

查询2 - 不太好

SELECT * FROM [dbo].[tdf] (dbo.EndOfDay(getdate()))

The end of execution plan: Stream Aggregate Cost 4% <--- Filter Cost 12% <--- Clustered Index Scan Cost 86%

执行计划结束:Stream Aggregate Cost 4%<--- Filter Cost 12%<--- Clustered Index Scan Cost 86%

2 个解决方案

#1

The overhead is your scalar function.

开销是你的标量函数。

The TVF here is expanded like an inline macro so

这里的TVF像内联宏一样扩展

SELECT * FROM [dbo].[tdf] (getdate())

becomes

SELECT     idProduct, SUM(Quantity) AS TotalQuantity, max(Date) as LastDate
    FROM         POC
    WHERE     Date < getdate()
    GROUP BY idProduct

When you use end of day scalar function, SQL can not evaluate the EOD(GETDATE()) as a constant. I can't find my article quickly on how SQL evaluates this stuff, sorry.

当您使用日结束标量函数时,SQL无法将EOD(GETDATE())计算为常量。抱歉,我无法快速找到关于SQL如何评估这些内容的文章。

I guess that it's being evaluated for each row, not upfront as you want.

我猜它正在为每一行进行评估,而不是根据需要进行评估。

I'd calulate the EOD statement separately:

我将单独记录EOD声明:

DECLARE @eod datetime;
SET @eod = dbo.EndOfDay(getdate());
SELECT * FROM [dbo].[tdf] (@eod)

I'd also use this for the EOD function:

我也将它用于EOD功能:

DATEADD(second, -1, DATEADD(day, 1, (DATEDIFF(day, 0, @date))))

EDIT: Other question I answered

编辑:我回答的其他问题

#2

You can rewrite EndOfDay as an inline UDF too, and use nested inline UDFs. Examples:

您也可以将EndOfDay重写为内联UDF,并使用嵌套的内联UDF。例子:

Many nested inline UDFs are very fast

许多嵌套的内联UDF非常快

Calculating third Wednesday of the month with inline UDFs

使用内联UDF计算当月的第三个星期三

#1