SQL Server Performance: Non-clustered Index + INCLUDE columns vs. Clustered Index - equivalent?

Time: 2020-12-18 02:47:21

Hello SQL Server engine experts; please share with us a bit of your insight...

As I understand it, INCLUDE columns on a non-clustered index allow additional, non-key data to be stored with the index pages.

I am well aware of the performance benefit a clustered index has over a non-clustered index, simply because the engine takes one fewer step during retrieval to arrive at the data on disk.

However, since INCLUDE columns live in a non-clustered index, can the following query be expected to have essentially the same performance across scenarios 1 and 2 since all columns could be retrieved from the index pages in scenario 2 rather than ever resorting to the table data pages?

QUERY

SELECT A, B, C FROM TBL ORDER BY A

SCENARIO 1

CREATE CLUSTERED INDEX IX1 ON TBL (A, B, C);

SCENARIO 2

CREATE NONCLUSTERED INDEX IX1 ON TBL (A) INCLUDE (B, C);

2 answers

#1


3  

For this example you may actually get better performance with the non-clustered index. But, it really depends on additional information you haven't provided. Here are some thoughts.

SQL Server stores information in 8KB pages; this includes both data and indexes. If your table only contains columns A, B and C, then the table data and the non-clustered index will occupy approximately the same number of pages. But if the table has more columns, the data will need more pages, while the number of index pages stays the same.
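
One way to check this on your own table is to compare the page counts per index. The following is only a sketch: it assumes the table lives in the dbo schema (not stated in the question) and that you have permission to read the dynamic management views.

```sql
-- Pages and rows per index on dbo.TBL (the dbo schema is an assumption).
-- The clustered index row represents the table data itself;
-- non-clustered indexes appear as separate rows.
SELECT i.name                  AS index_name,
       i.type_desc             AS index_type,
       SUM(ps.used_page_count) AS used_pages,
       SUM(ps.row_count)       AS total_rows
FROM sys.indexes AS i
JOIN sys.dm_db_partition_stats AS ps
  ON ps.object_id = i.object_id
 AND ps.index_id  = i.index_id
WHERE i.object_id = OBJECT_ID(N'dbo.TBL')
GROUP BY i.name, i.type_desc
ORDER BY used_pages DESC;
```

If the table has columns beyond A, B and C, the clustered row should show more used pages than the covering non-clustered index for the same number of rows.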

So, in a table with more columns than your query needs, the query will work better with the non-clustered covering index (an index that contains every column the query needs). It will have to process fewer pages to return the results you want.
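
To see the difference in pages read, you could run the query against each index with I/O statistics turned on. This is a rough sketch, not a benchmark: it assumes both indexes exist on the same table at once, which means they need different names (the question uses IX1 for both). The names IX_TBL_Clustered and IX_TBL_Covering and the dbo schema are hypothetical.

```sql
-- Report logical reads (pages touched) for each statement in the Messages output.
SET STATISTICS IO ON;

-- Force the clustered index (hypothetical name).
SELECT A, B, C
FROM dbo.TBL WITH (INDEX(IX_TBL_Clustered))
ORDER BY A;

-- Force the covering non-clustered index (hypothetical name).
SELECT A, B, C
FROM dbo.TBL WITH (INDEX(IX_TBL_Covering))
ORDER BY A;

SET STATISTICS IO OFF;
```

The index hints are there only to make the comparison explicit; without them the optimizer picks whichever index it estimates to be cheaper.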

Of course, performance differences may not be seen until you get a very large number of rows.

#2


5  

Indeed, a non-clustered index whose INCLUDE columns cover the query can play exactly the same role as a clustered index. The cost comes at update time: the more columns an index includes, the more often that index has to be updated when a column value changes in the base table (the clustered index). Also, with more included columns the amount of stored data increases: the database becomes larger, and this can complicate maintenance operations.
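
If you want to observe the write cost, the index operational statistics show how many leaf-level modifications each index has absorbed. A minimal sketch, again assuming the table is dbo.TBL; the counters are cumulative and are reset when the table's metadata leaves the cache (for example after a restart), so compare values taken before and after a test workload.

```sql
-- Leaf-level write activity per index on dbo.TBL (schema name assumed).
SELECT i.name                    AS index_name,
       i.type_desc               AS index_type,
       SUM(os.leaf_insert_count) AS leaf_inserts,
       SUM(os.leaf_update_count) AS leaf_updates,
       SUM(os.leaf_delete_count) AS leaf_deletes
FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID(N'dbo.TBL'), NULL, NULL) AS os
JOIN sys.indexes AS i
  ON i.object_id = os.object_id
 AND i.index_id  = os.index_id
GROUP BY i.name, i.type_desc;
```

An UPDATE that changes B or C should increase the counters on the covering index as well as on the clustered index, which is the extra cost described above.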

In the end, it is a balance you have to find between the covering value of additional indexes and more included columns on one side, and the cost of updates and increased data size on the other.
