SQL question: does the order of WHERE clauses make a difference?

Date: 2022-02-26 22:31:44

From a performance standpoint, does the order of my SQL WHERE statements make a difference?

For instance

SELECT ... FROM ...
WHERE a > 1
AND b < 2

Would that be any faster/slower than

SELECT ... FROM ...
WHERE b < 2
AND a > 1

Let's also assume that I know in advance that a > 1 will narrow the result set the most.

Also, does the order of my WHERE conditions matter if I'm joining two or more tables?

6 answers

#1


19  

In theory, there is no difference.

Occasionally, especially with the simpler optimizers, the query plan differs depending on the order of the conditions in the WHERE clause. There's a moderately strong argument that such differences are symptomatic of a bug.
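
One way to check this on a particular engine is to ask it for the plan of both orderings and compare them. The following is only a sketch, assuming SQLite driven through Python's standard sqlite3 module; the table `t`, the index `idx_t_a`, and the sample data are made up for illustration, and other engines have their own EXPLAIN syntax and output.

```python
import sqlite3

# Hypothetical schema and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.execute("CREATE INDEX idx_t_a ON t (a)")
conn.executemany(
    "INSERT INTO t (a, b) VALUES (?, ?)",
    [(i % 50, i % 7) for i in range(1000)],
)


def plan(sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query_a_first = "SELECT a, b FROM t WHERE a > 1 AND b < 2"
query_b_first = "SELECT a, b FROM t WHERE b < 2 AND a > 1"

plan_a_first = plan(query_a_first)
plan_b_first = plan(query_b_first)

# Print both plans so they can be compared by eye as well.
print(plan_a_first)
print(plan_b_first)
print("same plan:", plan_a_first == plan_b_first)
```

If the two plans ever differ on your engine, that is the kind of order sensitivity described above, and it is worth measuring both forms before drawing conclusions.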

Similar comments apply to join order, too. The order of the joins should not matter - for joins of the same type. Clearly, whether a table Table2 is inner joined or outer joined to another table Table1 does matter - and it matters whether it is Table1 LEFT JOIN Table2 or Table1 RIGHT JOIN Table2 or Table1 FULL JOIN Table2. But for a series of INNER JOIN operations, the sequencing should not matter. The processing order may be forced, to some extent, if you are dealing with a chain of joins.

Clarifying (again) - consider:

(Table1 AS t1 JOIN Table2 AS t2 ON t1.pkcol = t2.fkcol) AS j1
JOIN
(Table3 AS t3 JOIN Table4 AS t4 ON t3.pkcol = t4.fkcol) AS j2
ON j1.somecol = j2.anothercol

The way it is written, clearly the programmer expects the joins on (t1, t2) and (t3, t4) to be executed before the join on (j1, j2), but the optimizer may be able to do the joins differently. For example, if j1.somecol derives from Table1 and j2.anothercol derives from Table4, the optimizer may be able to choose the join on Table1.SomeCol = Table4.AnotherCol over either of the other joins. This sort of issue can be influenced by the filter conditions in the WHERE clause, and by the presence or absence of appropriate indexes on the various tables. This is where statistics can play a big part in the way the optimizer generates the query plan.
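
To see which table an engine actually chooses to drive an inner join, you can write the same join in both orders and inspect the reported plans. This is a rough sketch, assuming SQLite through Python's sqlite3 module; the `customers` and `orders` tables and their contents are hypothetical and are not the Table1 to Table4 of the example above. The plans are printed rather than asserted because they depend on the engine version, the indexes, and the statistics available.

```python
import sqlite3

# Hypothetical tables and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
""")
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(i, f"customer {i}") for i in range(1, 11)],
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 10 + 1, i) for i in range(500)],
)

# The same inner join, written with the tables in either order.
customers_first = """
    SELECT c.name, o.total
    FROM customers AS c JOIN orders AS o ON o.customer_id = c.id
    WHERE o.total > 490
"""
orders_first = """
    SELECT c.name, o.total
    FROM orders AS o JOIN customers AS c ON o.customer_id = c.id
    WHERE o.total > 490
"""


def plan(sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# The engine decides which table drives the join; print both plans to compare.
for label, sql in (("customers first", customers_first), ("orders first", orders_first)):
    print(label, plan(sql))

# Whatever the plans are, an inner join must return the same rows either way.
rows_customers_first = sorted(conn.execute(customers_first).fetchall())
rows_orders_first = sorted(conn.execute(orders_first).fetchall())
print("same rows:", rows_customers_first == rows_orders_first)
```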

#2


11  

No, it doesn't. Most modern SQL servers include a query optimizer that looks into all the plausible (*) ways of resolving a query. Whereas older servers may take hints from the order of the conditions within the SELECT statement, newer servers do not.

The order of the JOINs, on the other hand, still matters to a greater extent.

Edit: Do see Jonathan Leffler's response, which provides additional detail, in particular regarding the order of JOINs. Thank you, Jonathan!

Edit: ( * ) Plausible vs. Possible: As pointed out by Erikkalen, the optimizer does not look into all of the possible ways. Thanks to [pretty good] heuristics coded in its logic, it only evaluates the plausible plans, on the basis of the statistics it keeps for the underlying indexes. For each plan it considers, an overall cost is estimated (or only partially estimated, when the partial cost already exceeds the overall cost of another plan [pruning]), and that's how the plan actually used is eventually selected. While the general principles used by SQL query optimizers are well known, the intricacies of their implementation introduce many different twists and turns.
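
The cost estimation and pruning described above can be illustrated with a toy model. This is a deliberately simplified sketch and not how any particular server implements its optimizer: the row counts in `ROWS`, the uniform `SELECTIVITY_DIVISOR`, and the nested-loop cost formula are all assumptions chosen to keep the arithmetic easy to follow.

```python
from itertools import permutations

# Hypothetical statistics: estimated row counts per table.
ROWS = {"t1": 1000, "t2": 10, "t3": 100}
# Assumed: every join keeps 1 in 100 of the combined rows.
SELECTIVITY_DIVISOR = 100


def plan_cost(order, best_so_far):
    """Cost of a left-deep join order, or None if the plan is pruned."""
    cost = 0
    intermediate = ROWS[order[0]]
    for table in order[1:]:
        # Nested-loop model: compare every intermediate row with every table row.
        cost += intermediate * ROWS[table]
        if cost >= best_so_far:
            # The partial cost already matches or exceeds the best complete plan.
            return None
        intermediate = intermediate * ROWS[table] // SELECTIVITY_DIVISOR
    return cost


def choose_plan(tables):
    """Try every join order, keeping the cheapest and counting pruned plans."""
    best_order, best_cost, pruned = None, float("inf"), 0
    for order in permutations(tables):
        cost = plan_cost(order, best_cost)
        if cost is None:
            pruned += 1
        else:
            best_order, best_cost = order, cost
    return best_order, best_cost, pruned


best_order, best_cost, pruned = choose_plan(("t1", "t2", "t3"))
print(best_order, best_cost, pruned)
```

Note that the chosen order depends only on the estimated sizes, not on the order in which the tables were listed, which is the point the answer makes about WHERE conditions.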

#3


6  

See below and follow the link (long article but worth the read):

SQL Server Transact-SQL WHERE

If a WHERE clause includes multiple expressions, there is generally no performance benefit gained by ordering the various expressions in any particular order. This is because the SQL Server Query Optimizer does this for you, saving you the effort. There are a few exceptions to this, which are discussed on this web site. [7.0, 2000, 2005] Added 1-24-2006

#4


2  

No. The optimiser decides in which order to filter results, based upon current statistics.
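
What those statistics look like varies by engine. As one concrete illustration, assuming SQLite through Python's sqlite3 module (the table `t`, the index `idx_t_a`, and the data are hypothetical), running ANALYZE stores per-index statistics in the `sqlite_stat1` table, which the planner can then consult:

```python
import sqlite3

# Hypothetical schema and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.execute("CREATE INDEX idx_t_a ON t (a)")
conn.executemany(
    "INSERT INTO t (a, b) VALUES (?, ?)",
    [(i % 100, i % 2) for i in range(1000)],
)

# ANALYZE gathers statistics and stores them in the sqlite_stat1 table.
conn.execute("ANALYZE")

# Each row holds the table name, the index name, and a statistics string
# that begins with the estimated number of rows in the index.
stats = conn.execute(
    "SELECT tbl, idx, stat FROM sqlite_stat1 WHERE idx = 'idx_t_a'"
).fetchall()
print(stats)
```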

#5


2  

It depends on the DBMS. SQL itself does not say anything about how a query should execute. It is up to the specific implementation.

If your DBMS had the very simplistic model of interpreting the query sequentially, then putting a > 1 first in your example would (obviously) be faster, because the DBMS would make two passes, of which the second would go through a much smaller result set.
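
That simplistic two-pass model is easy to simulate. The sketch below is purely illustrative and does not describe any real DBMS: the data set and the `sequential_filter` helper are hypothetical, built so that a > 1 keeps 20 of 1000 rows while b < 2 keeps 200.

```python
# Hypothetical data: 1000 rows of (a, b); only every 50th row has a > 1.
rows = [(2 if i % 50 == 0 else 0, i % 10) for i in range(1000)]


def a_gt_1(row):
    return row[0] > 1


def b_lt_2(row):
    return row[1] < 2


def sequential_filter(rows, predicates):
    """Apply each predicate in its own pass, counting how many rows are checked."""
    checks = 0
    current = rows
    for predicate in predicates:
        checks += len(current)
        current = [row for row in current if predicate(row)]
    return current, checks


result_a_first, checks_a_first = sequential_filter(rows, [a_gt_1, b_lt_2])
result_b_first, checks_b_first = sequential_filter(rows, [b_lt_2, a_gt_1])

print("a > 1 first:", checks_a_first, "row checks")
print("b < 2 first:", checks_b_first, "row checks")
print("same result:", result_a_first == result_b_first)
```

Both orders return the same rows; only the amount of work differs, and only because this toy evaluator follows the written order instead of choosing one itself.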

#6


0  

If the columns come from the same table and the query is as simple as your example, then no, it doesn't make a difference. As queries get more complicated and join more tables, it can.
