SQL Server GROUP BY on a nullable column

Time: 2021-09-11 22:41:59

I have a situation in SQL Server (with a legacy DB) that I can't explain.

I have a table A (about 2 million rows) with a column CODE that allows NULL. Only a few rows (fewer than 10) have a NULL CODE. When I run the query:

select code, sum(C1)
from A
-- where code is not null
group by code;

It runs forever. But when I uncomment the WHERE clause, it takes around 1.5 s (still too slow, right?).

Could anyone help me by pointing out the possible causes of this situation?

Execution plan added: [image: execution plan]

2 answers

#1


As a general rule, some database engines do not store NULL values in a conventional index, in which case an index on code alone could not help your WHERE condition. Note that SQL Server itself does store NULL keys in its B-tree indexes, so check the execution plan to see which index, if any, is actually used.

If C1 (which I assume is NOT NULL) is included in the index, things are different, because all the tuples (code = NULL, C1 = some value) can and will be indexed. According to your question there are only a few of them, so SQL Server can get a considerable speedup by just returning the rows for these tuples from the index.
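An index that carries C1 alongside code, as described above, might look like the following. This is only a sketch: the index name IX_A_code_C1 and the dbo schema are assumptions, while the table A and the columns code and C1 come from the question.

```sql
-- Hypothetical index name and schema; A, code and C1 are from the question.
-- code is the key column, C1 is carried in the leaf level (covering index).
CREATE NONCLUSTERED INDEX IX_A_code_C1
    ON dbo.A (code)
    INCLUDE (C1);

-- With both columns available in the index, the optimizer may answer
-- the aggregate from the index alone, reading rows in code order.
SELECT code, SUM(C1) AS total_c1
FROM dbo.A
GROUP BY code;
```

Whether the optimizer actually picks this index depends on the data and statistics, so it is worth comparing the execution plans before and after creating it.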

#2


First, a few words about performance. There are several options in your case.

Indexed view:

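-- Drop the schema-bound view first, otherwise DROP TABLE fails on a re-run.
IF OBJECT_ID('dbo.v', 'V') IS NOT NULL
    DROP VIEW dbo.v
GO

IF OBJECT_ID('dbo.t', 'U') IS NOT NULL
    DROP TABLE dbo.t
GO

CREATE TABLE dbo.t (
    ID INT IDENTITY PRIMARY KEY,
    Code VARCHAR(10) NULL,
    [Status] INT NULL
)
GO

-- CREATE (not ALTER): the view does not exist yet.
-- An indexed view with GROUP BY must include COUNT_BIG(*), and SUM must be
-- over a non-nullable expression, hence ISNULL([Status], 0).
CREATE VIEW dbo.v
WITH SCHEMABINDING
AS
    SELECT Code, [Status] = SUM(ISNULL([Status], 0)), Cnt = COUNT_BIG(*)
    FROM dbo.t
    WHERE Code IS NOT NULL
    GROUP BY Code
GO

-- The unique clustered index is what materializes the view.
CREATE UNIQUE CLUSTERED INDEX ix ON dbo.v (Code)
GO

SELECT Code, [Status]
FROM dbo.v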
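Whether the optimizer reads the view's clustered index on its own depends on the edition. As the SQL Server documentation describes it, automatic matching of indexed views is an Enterprise-edition feature, and other editions need the NOEXPAND table hint. A minimal sketch, reusing dbo.v from the example above:

```sql
-- NOEXPAND asks the optimizer to read the view's own index
-- instead of expanding the view into its base-table query.
SELECT Code, [Status]
FROM dbo.v WITH (NOEXPAND);
```

If the plan still shows a scan of dbo.t without the hint, that is the likely reason.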

Filtered index:

CREATE NONCLUSTERED INDEX ix ON dbo.t (Code)
    INCLUDE ([Status])
    WHERE Code IS NOT NULL
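To check whether the filtered index helps, the aggregate can be run with the same predicate as the index filter while I/O and timing statistics are switched on. This is a sketch against the demo table dbo.t above; the alias StatusTotal is made up for illustration.

```sql
-- Report logical reads and CPU/elapsed time in the Messages tab.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- The WHERE clause repeats the index filter (Code IS NOT NULL),
-- which allows the optimizer to match the filtered index.
SELECT Code, SUM([Status]) AS StatusTotal
FROM dbo.t
WHERE Code IS NOT NULL
GROUP BY Code;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Running the same query without the WHERE clause gives a baseline to compare against, since the filtered index cannot serve rows where Code is NULL.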

I'll wait for your second execution plan.
