INNER JOIN condition in the WHERE clause or the ON clause?

I mistyped a query today, but it still worked and gave the intended result. I meant to run this query:

SELECT e.id FROM employees e JOIN users u ON u.email=e.email WHERE u.id='139840'

but I accidentally ran this query:

SELECT e.id FROM employees e JOIN users u ON u.email=e.email AND u.id='139840'

(note the AND instead of WHERE in the last clause)

and both returned the correct employee id from the user id.

What is the difference between these 2 queries? Does the second form only join members of the 2 tables meeting the criteria, whereas the first one would join the entire table, and then run the query? Is one more or less efficient than the other? Is it something else I am missing?

Thanks!

5 solutions

#1

For inner joins like this, the two forms are logically equivalent. However, you can run into situations where a condition in the join clause means something different from the same condition in the where clause.

As a simple illustration, imagine you do a left join like so:

select x.id
from x
       left join y
         on x.id = y.id
;

Here we're taking all the rows from x, regardless of whether there is a matching id in y. Now let's say our join condition grows - we're not just looking for matches in y based on the id but also on id_type.

select x.id
from x
       left join y
         on x.id = y.id
         and y.id_type = 'some type'
;

Again this gives all the rows in x regardless of whether there is a matching (id, id_type) in y.

This is very different, though:

select x.id
from x
       left join y
         on x.id = y.id
where y.id_type = 'some type'
;

In this situation, we're picking all the rows of x and trying to match them to rows from y. For rows of x that have no match in y, y.id_type will be null. The comparison y.id_type = 'some type' then evaluates to null rather than true, so those unmatched rows are discarded, which effectively turns this into an inner join.
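
The three left-join variants above can be compared side by side. The following is a minimal sketch using Python's built-in sqlite3 module; the tables x and y come from the answer, but their contents are made-up sample data, and SQLite stands in for whatever engine you actually use.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not from the thread).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (id INTEGER PRIMARY KEY);
    CREATE TABLE y (id INTEGER, id_type TEXT);
    INSERT INTO x (id) VALUES (1), (2), (3);
    INSERT INTO y (id, id_type) VALUES (1, 'some type'), (2, 'other type');
""")


def ids(sql):
    """Run a query and return the first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


# Left join on id only: every row of x survives.
plain_left = ids("SELECT x.id FROM x LEFT JOIN y ON x.id = y.id")

# Extra condition in ON: still every row of x, unmatched rows get NULLs from y.
cond_in_on = ids(
    "SELECT x.id FROM x LEFT JOIN y "
    "ON x.id = y.id AND y.id_type = 'some type'"
)

# Same condition in WHERE: rows whose y.id_type is NULL or different are dropped.
cond_in_where = ids(
    "SELECT x.id FROM x LEFT JOIN y ON x.id = y.id "
    "WHERE y.id_type = 'some type'"
)

print(plain_left)     # [1, 2, 3]
print(cond_in_on)     # [1, 2, 3]
print(cond_in_where)  # [1]
```

With this sample data, only the WHERE variant loses the rows of x that have no matching 'some type' row in y.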

Long story short: for inner joins it doesn't matter where the conditions go, but for outer joins it can change the result.

#2

The optimizer will treat them the same. You can do an EXPLAIN to prove it to yourself.

Therefore, write the one that is clearer.

SELECT e.id
FROM employees e JOIN users u ON u.email=e.email
WHERE u.id='139840'
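
As a rough way to try the EXPLAIN check suggested above, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration under assumptions: the table definitions and rows are hypothetical, the id is compared as an integer literal rather than the quoted string in the question, and SQLite's EXPLAIN QUERY PLAN is used in place of MySQL's EXPLAIN, so the plan output will look different on other engines.

```python
import sqlite3

# Hypothetical schema and rows; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, email TEXT);
    INSERT INTO users (id, email)
        VALUES (139840, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO employees (id, email)
        VALUES (10, 'a@example.com'), (11, 'b@example.com');
""")

where_form = (
    "SELECT e.id FROM employees e JOIN users u "
    "ON u.email = e.email WHERE u.id = 139840"
)
on_form = (
    "SELECT e.id FROM employees e JOIN users u "
    "ON u.email = e.email AND u.id = 139840"
)


def run(sql):
    """Return the selected ids as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


print(run(where_form), run(on_form))  # [10] [10]

# Print both plans so they can be compared by eye.
for label, sql in (("WHERE form", where_form), ("ON form", on_form)):
    print(label)
    for step in plan(sql):
        print("  ", step)
```

Both forms return the same employee id here; whether the plans match on your own engine and data is something to confirm by running its EXPLAIN directly.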

#3

In the case of an INNER JOIN, the two queries are semantically the same, meaning they are guaranteed to have the same results. If you were using an OUTER join, the meaning of the two queries could be very different, with different results.

Performance-wise, I would expect that these two queries would result in the same execution plan. However, the query engine might surprise you. The only way to know is to view the execution plans for the two queries.

#4

If it were an outer join instead of inner, you'd get unintended results, but when using an inner join it makes no real difference whether you use additional join criteria instead of a WHERE clause.

Performance-wise they are most likely identical, but I can't be certain.

#5

I brought this up with my colleagues on our team at work. This response is a bit SQL Server centered rather than MySQL. However, the optimizers in SQL Server and MySQL should operate in similar ways.

Some thoughts: Essentially, if you have to add a WHERE, additional table scans are done to verify equality for each condition. (The cost goes up by orders of magnitude with an AND over a dataset; with an OR, the decision is made at the first true condition.) If you have a single id lookup, as in the example given, it is extremely quick. Conversely, if you have to find all of the records that belong to a company or department, it becomes less clear-cut, as you may have many matching records. If you can apply the equals condition, it is far more effective when working with an AuditLog or EventLog table that has zillions of rows. You would not really see large benefits from this on small tables (around 200,000 rows or so).

From: Allesandro Alpi http://suxstellino.wordpress.com/2013/01/07/sql-server-logical-query-processing-summary/

From: Itzik Ben-Gan http://tsql.solidq.com/books/insidetsql2008/Logical%20Query%20Processing%20Poster.pdf
