为什么使用JOIN而不是内部查询

时间:2021-03-06 16:45:13

I find myself unwilling to push to using JOIN when I can easily solve the same problem by using an inner query:

当我可以通过使用内部查询轻松解决相同问题时,我发现自己不愿意使用JOIN:

e.g.

例如

SELECT COLUMN1, ( SELECT COLUMN1 FROM TABLE2 WHERE TABLE2.ID = TABLE1.TABLE2ID ) AS COLUMN2 FROM TABLE1;

SELECT COLUMN1,(SELECT COLUMN1 FROM TABLE2 WHERE TABLE2.ID = TABLE1.TABLE2ID)AS COLUMN2 FROM TABLE1;

My question is, is this a bad programming practice? I find it easier to read and maintain as opposed to a join.

我的问题是,这是一个糟糕的编程习惯吗?我发现与连接相比,它更容易阅读和维护。

UPDATE

UPDATE

I want to add that there's some great feedback in here which in essence is pushing be back to using JOIN. I am finding myself less and less involved with using TSQL directly these days as of a result of ORM solutions (LINQ to SQL, NHibernate, etc.), but when I do it's things like correlated subqueries which I find are easier to type out linearly.

我想补充一点,这里有一些很好的反馈,实质上就是推回使用JOIN。由于ORM解决方案(LINQ to SQL,NHibernate等)的结果,我发现自己现在越来越少地直接使用TSQL,但是当我这样做时,我发现相关子查询之类的内容更容易线性输入。

8 个解决方案

#1


8  

Personally, I find this incredibly difficult to read. It isn't the structure a SQL developer expects. By using JOIN, you are keeping all of your table sources in a single spot instead of spreading it throughout your query.

就个人而言,我发现这非常难以阅读。它不是SQL开发人员所期望的结构。通过使用JOIN,您可以将所有表源保留在一个位置,而不是在整个查询中传播它。

What happens if you need to have three or four joins? Putting all of those into the SELECT clause is going to get hairy.

如果您需要三个或四个连接会发生什么?将所有这些放入SELECT子句将变得毛茸茸。

#2


6  

A join is usually faster than a correlated subquery as it acts on the set of rows rather than one row at a time. I would never let this code go to my production server.

连接通常比相关子查询更快,因为它一次作用于行集而不是一行。我永远不会让这些代码进入我的生产服务器。

And I find a join much much easier to read and maintain.

我发现连接更容易阅读和维护。

#3


6  

If you needed more than one column from the second table, then you would require two subqueries. This typically would not perform as well as a join.

如果您需要第二个表中的多个列,则需要两个子查询。这通常不会像连接一样好。

#4


4  

This is not equivalent to JOIN.

这不等于JOIN。

If you have multiple rows in TABLE2 for each row in TABLE1, you won't get them. For each row in TABLE1 you get one row output so you can't get multiple from TABLE2.

如果TABLE2中的每一行都包含TABLE1中的每一行,则不会获得它们。对于TABLE1中的每一行,您将获得一行输出,因此您无法从TABLE2中获取多个行。

This is why I'd use "JOIN": to make sure I get the data I wanted...

这就是我使用“JOIN”的原因:确保我得到我想要的数据......

After your update: I rarely use correlation except with EXISTS...

更新后:我很少使用相关性,除了EXISTS ...

#5


4  

The query you use was often used as a replacement for a LEFT JOIN for the engines that lacked it (most notably, PostgreSQL before 7.2)

您使用的查询通常用作替换缺少它的引擎的LEFT JOIN(最值得注意的是7.2之前的PostgreSQL)

This approach has some serious drawbacks:

这种方法有一些严重的缺点:

  1. It may fail if TABLE2.ID is not UNIQUE

    如果TABLE2.ID不是UNIQUE,它可能会失败

  2. Some engines will not be able to use anything else than NESTED LOOPS for this query

    对于此查询,某些引擎将无法使用除NESTED LOOPS之外的任何内容

  3. If you need to select more than one column, you will need to write the subquery several times

    如果需要选择多个列,则需要多次编写子查询

If your engine supports LEFT JOIN, use the LEFT JOIN.

如果您的引擎支持LEFT JOIN,请使用LEFT JOIN。

In MySQL, however, there are some cases when an aggregate function in a select-level subquery can be more efficient than that in a LEFT JOIN with a GROUP BY.

但是,在MySQL中,有些情况下,选择级子查询中的聚合函数比具有GROUP BY的LEFT JOIN中的聚合函数更有效。

See this article in my blog for the examples:

有关示例,请参阅我的博客中的这篇文章:

#6


1  

This is not a bad programming practice at all IMO, it is a little bit ugly though. It can actually be a performance boost in situations where the sub-select is from a very large table while you are expecting a very small result set (you have to consider indexes and platform, 2000 having a different optimizer and all from 2005). Here is how I format it to be easier to read.

这对所有IMO来说都不是一个糟糕的编程实践,但它有点难看。实际上,在子选择来自非常大的表而您期望非常小的结果集(您必须考虑索引和平台,2000具有不同的优化器以及所有来自2005)的情况下,它实际上可以是性能提升。这是我如何格式化它以便于阅读。

select
  column1
  [column2] = (subselect...)
from
  table1

Edit: This of course assumes that your subselect will only return one value, if not it could be returning you bad results. See gbn's response.

编辑:这当然假设您的子选择只返回一个值,如果不是,它可能会返回错误的结果。见gbn的回应。

#7


0  

it makes it a lot easier to use other types of joins (left outer, cross, etc) because the syntax for those in subquery terms is less than ideal for readability

它使得使用其他类型的连接(左外部,交叉等)变得更加容易,因为子查询术语中的语法不太适合可读性

#8


0  

At the end of the day, the goal when writing code, beyond functional requirements, is to make the intent of your code clear to a reader. If you use a JOIN, the intent is obvious. If you use a subquery in the manner you describe, it begs the question of why you did it that way. What were you trying to achieve that a JOIN would not have accomplished? In short, you waste the reader's time in trying to determine if the author was solving some problem in an ingenious fashion or if they were writing the code after a hard night of drinking.

在一天结束时,编写代码的目标超出了功能要求,目的是使代码的意图对读者清楚。如果你使用JOIN,意图是显而易见的。如果你以你描述的方式使用子查询,那么就会出现为什么你这样做的问题。你想要实现JOIN不会实现的目标是什么?简而言之,你浪费了读者的时间来试图确定作者是否以巧妙的方式解决了一些问题,或者他们是否在经过一夜辛苦的饮酒后编写代码。

#1


8  

Personally, I find this incredibly difficult to read. It isn't the structure a SQL developer expects. By using JOIN, you are keeping all of your table sources in a single spot instead of spreading it throughout your query.

就个人而言,我发现这非常难以阅读。它不是SQL开发人员所期望的结构。通过使用JOIN,您可以将所有表源保留在一个位置,而不是在整个查询中传播它。

What happens if you need to have three or four joins? Putting all of those into the SELECT clause is going to get hairy.

如果您需要三个或四个连接会发生什么?将所有这些放入SELECT子句将变得毛茸茸。

#2


6  

A join is usually faster than a correlated subquery as it acts on the set of rows rather than one row at a time. I would never let this code go to my production server.

连接通常比相关子查询更快,因为它一次作用于行集而不是一行。我永远不会让这些代码进入我的生产服务器。

And I find a join much much easier to read and maintain.

我发现连接更容易阅读和维护。

#3


6  

If you needed more than one column from the second table, then you would require two subqueries. This typically would not perform as well as a join.

如果您需要第二个表中的多个列,则需要两个子查询。这通常不会像连接一样好。

#4


4  

This is not equivalent to JOIN.

这不等于JOIN。

If you have multiple rows in TABLE2 for each row in TABLE1, you won't get them. For each row in TABLE1 you get one row output so you can't get multiple from TABLE2.

如果TABLE2中的每一行都包含TABLE1中的每一行,则不会获得它们。对于TABLE1中的每一行,您将获得一行输出,因此您无法从TABLE2中获取多个行。

This is why I'd use "JOIN": to make sure I get the data I wanted...

这就是我使用“JOIN”的原因:确保我得到我想要的数据......

After your update: I rarely use correlation except with EXISTS...

更新后:我很少使用相关性,除了EXISTS ...

#5


4  

The query you use was often used as a replacement for a LEFT JOIN for the engines that lacked it (most notably, PostgreSQL before 7.2)

您使用的查询通常用作替换缺少它的引擎的LEFT JOIN(最值得注意的是7.2之前的PostgreSQL)

This approach has some serious drawbacks:

这种方法有一些严重的缺点:

  1. It may fail if TABLE2.ID is not UNIQUE

    如果TABLE2.ID不是UNIQUE,它可能会失败

  2. Some engines will not be able to use anything else than NESTED LOOPS for this query

    对于此查询,某些引擎将无法使用除NESTED LOOPS之外的任何内容

  3. If you need to select more than one column, you will need to write the subquery several times

    如果需要选择多个列,则需要多次编写子查询

If your engine supports LEFT JOIN, use the LEFT JOIN.

如果您的引擎支持LEFT JOIN,请使用LEFT JOIN。

In MySQL, however, there are some cases when an aggregate function in a select-level subquery can be more efficient than that in a LEFT JOIN with a GROUP BY.

但是,在MySQL中,有些情况下,选择级子查询中的聚合函数比具有GROUP BY的LEFT JOIN中的聚合函数更有效。

See this article in my blog for the examples:

有关示例,请参阅我的博客中的这篇文章:

#6


1  

This is not a bad programming practice at all IMO, it is a little bit ugly though. It can actually be a performance boost in situations where the sub-select is from a very large table while you are expecting a very small result set (you have to consider indexes and platform, 2000 having a different optimizer and all from 2005). Here is how I format it to be easier to read.

这对所有IMO来说都不是一个糟糕的编程实践,但它有点难看。实际上,在子选择来自非常大的表而您期望非常小的结果集(您必须考虑索引和平台,2000具有不同的优化器以及所有来自2005)的情况下,它实际上可以是性能提升。这是我如何格式化它以便于阅读。

select
  column1
  [column2] = (subselect...)
from
  table1

Edit: This of course assumes that your subselect will only return one value, if not it could be returning you bad results. See gbn's response.

编辑:这当然假设您的子选择只返回一个值,如果不是,它可能会返回错误的结果。见gbn的回应。

#7


0  

it makes it a lot easier to use other types of joins (left outer, cross, etc) because the syntax for those in subquery terms is less than ideal for readability

它使得使用其他类型的连接(左外部,交叉等)变得更加容易,因为子查询术语中的语法不太适合可读性

#8


0  

At the end of the day, the goal when writing code, beyond functional requirements, is to make the intent of your code clear to a reader. If you use a JOIN, the intent is obvious. If you use a subquery in the manner you describe, it begs the question of why you did it that way. What were you trying to achieve that a JOIN would not have accomplished? In short, you waste the reader's time in trying to determine if the author was solving some problem in an ingenious fashion or if they were writing the code after a hard night of drinking.

在一天结束时,编写代码的目标超出了功能要求,目的是使代码的意图对读者清楚。如果你使用JOIN,意图是显而易见的。如果你以你描述的方式使用子查询,那么就会出现为什么你这样做的问题。你想要实现JOIN不会实现的目标是什么?简而言之,你浪费了读者的时间来试图确定作者是否以巧妙的方式解决了一些问题,或者他们是否在经过一夜辛苦的饮酒后编写代码。