SQL Server query performance - removing the need for Hash Match (Inner Join)

I have the following query, which is doing very little and is an example of the kind of joins I am doing throughout the system.

select t1.PrimaryKeyId, t1.AdditionalColumnId
from TableOne t1
    join TableTwo t2 on t1.ForeignKeyId = t2.PrimaryKeyId
    join TableThree t3 on t1.PrimaryKeyId = t3.ForeignKeyId
    join TableFour t4 on t3.ForeignKeyId = t4.PrimaryKeyId
    join TableFive t5 on t4.ForeignKeyId = t5.PrimaryKeyId
where 
    t1.StatusId = 1
    and t5.TypeId = 68

There are indexes on all the join columns; however, the performance is not great. Inspecting the query plan reveals a lot of Hash Match (Inner Join) operators where I would really like to see Nested Loops joins.

The number of records in each table is as follows:

select count(*) from TableOne

= 64393

select count(*) from TableTwo

= 87245

select count(*) from TableThree

= 97141

select count(*) from TableFour

= 116480

select count(*) from TableFive

= 62

What is the best way in which to improve the performance of this type of query?

2 solutions

#1

First thoughts:

Change to EXISTS (changes the equi-join to a semi-join)
You need indexes on t1.StatusId and t5.TypeId, with t1.AdditionalColumnId as an INCLUDE column

I wouldn't worry about your join method yet...

Personally, I've never used a JOIN hint. Hints only work for the data, indexes and statistics you have at that point in time. As these change, your JOIN hint limits the optimiser.

select t1.PrimaryKeyId, t1.AdditionalColumnId
from
    TableOne t1
where 
    t1.StatusId = 1
    AND EXISTS (SELECT *
        FROM
          TableThree t3
          join TableFour t4 on t3.ForeignKeyId = t4.PrimaryKeyId
          join TableFive t5 on t4.ForeignKeyId = t5.PrimaryKeyId
        WHERE
          t1.PrimaryKeyId = t3.ForeignKeyId
          AND
          t5.TypeId = 68)
    AND EXISTS (SELECT *
        FROM
          TableTwo t2
        WHERE
          t1.ForeignKeyId = t2.PrimaryKeyId)

Index for TableOne, one of:

(StatusId, ForeignKeyId) INCLUDE (AdditionalColumnId)
(ForeignKeyId, StatusId) INCLUDE (AdditionalColumnId)

Index for TableFive: probably (TypeId, PrimaryKeyId)
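
As a rough sketch, the index suggestions above could be written as follows. The index names are made up for illustration, and this assumes PrimaryKeyId is the clustered key on each table, so it is carried in the nonclustered index automatically. Pick only one of the two TableOne variants after checking the plan.

```sql
-- Hypothetical index names; table and column names are taken from the question.
-- Variant with the filter column leading (StatusId = 1 is an equality predicate).
CREATE NONCLUSTERED INDEX IX_TableOne_StatusId_ForeignKeyId
    ON TableOne (StatusId, ForeignKeyId)
    INCLUDE (AdditionalColumnId);

-- Supports the TypeId = 68 filter and the join back to TableFour.
CREATE NONCLUSTERED INDEX IX_TableFive_TypeId_PrimaryKeyId
    ON TableFive (TypeId, PrimaryKeyId);
```

Whether the optimiser actually picks these up depends on your data distribution, so compare the execution plan before and after.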

Edit: updated JOINS and EXISTS to match question fixes

#2

SQL Server is pretty good at optimizing queries, but it's also conservative: it optimizes queries for the worst case. A loop join typically results in an index lookup and a bookmark lookup for every row. Because loop joins degrade dramatically on large sets, SQL Server is hesitant to use them unless it's sure about the number of rows.
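
Row estimates come from statistics, so before forcing anything it may be worth making sure they are current. This is only a sketch, and it assumes stale statistics could be part of the problem, which nothing in the question confirms:

```sql
-- Refresh statistics on the tables from the question.
-- FULLSCAN reads every row, which is cheap at these table sizes.
UPDATE STATISTICS TableOne WITH FULLSCAN;
UPDATE STATISTICS TableTwo WITH FULLSCAN;
UPDATE STATISTICS TableThree WITH FULLSCAN;
UPDATE STATISTICS TableFour WITH FULLSCAN;
UPDATE STATISTICS TableFive WITH FULLSCAN;

-- Report logical reads and CPU/elapsed time for the statements that follow,
-- so the query can be compared before and after.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
```

After that, re-run the query with the actual execution plan enabled and compare the estimated and actual row counts on each join.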

You can use the FORCESEEK table hint to force an index seek:

inner join TableTwo t2 with (FORCESEEK) on t1.ForeignKeyId = t2.PrimaryKeyId

Alternatively, you can force a loop join with the loop keyword:

inner LOOP join TableTwo t2 on t1.ForeignKeyId = t2.PrimaryKeyId
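
Note that a join hint written in the FROM clause also makes SQL Server keep the join order as written. If you only want to experiment, a query-level hint is a less intrusive way to do it. Here is a sketch using the query from the question, meant for comparing plans and not as a recommended final form:

```sql
-- Experiment only: ask for nested loops on every join in the query,
-- while leaving the join order up to the optimizer.
select t1.PrimaryKeyId, t1.AdditionalColumnId
from TableOne t1
    join TableTwo t2 on t1.ForeignKeyId = t2.PrimaryKeyId
    join TableThree t3 on t1.PrimaryKeyId = t3.ForeignKeyId
    join TableFour t4 on t3.ForeignKeyId = t4.PrimaryKeyId
    join TableFive t5 on t4.ForeignKeyId = t5.PrimaryKeyId
where
    t1.StatusId = 1
    and t5.TypeId = 68
option (LOOP JOIN);
```

If the hinted version is not clearly faster on your data, that suggests the hash joins were a reasonable choice in the first place.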

Query hints limit SQL Server's freedom, so it can no longer adapt to changed circumstances. It's best practice to avoid query hints unless there is a business need that cannot be met without them.
