I have an SQL question, related to this and this question (but different). Basically I want to know how I can avoid a nested query.
Let's say I have a huge table of jobs (jobs) executed by a company over its history. These jobs are characterized by year, month, location and the code of the tool used for the job. Additionally, I have a table of tools (tools) that translates tool codes to tool descriptions and holds further data about each tool. Now they want a website where they can select year, month, location and tool from dropdown boxes, after which the matching jobs are displayed. I want to fill the last dropdown with only the relevant tools matching the preceding selection of year, month and location, so I wrote the following nested query:
SELECT c.tool_code, t.tool_description
FROM (
    SELECT DISTINCT j.tool_code
    FROM jobs AS j
    WHERE j.year = ....
      AND j.month = ....
      AND j.location = ....
) AS c
LEFT JOIN tools AS t
    ON c.tool_code = t.tool_code
ORDER BY c.tool_code ASC
I resorted to this nested query because it was much faster than performing a JOIN on the complete database and selecting from that. It got my query time down a lot. But as I have recently read that MySQL nested queries should be avoided at all cost, I am wondering whether I am wrong in this approach. Should I rewrite my query differently? And how?
2 solutions
#1
No, you shouldn't; your query is fine.
Just create an index on jobs (year, month, location, tool_code) and tools (tool_code) so that the INDEX FOR GROUP-BY can be used.
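As a rough sketch, the two indexes could be created like this. The table and column names come from the question; the index names are hypothetical, and the second statement is only an assumption about the schema, since tools.tool_code may well already be the primary key.

```sql
-- Hypothetical index name. The column order matters: the three equality
-- columns come first and the DISTINCT column (tool_code) comes last.
CREATE INDEX ix_jobs_year_month_location_tool
    ON jobs (year, month, location, tool_code);

-- Only needed if tool_code is not already the primary key of tools
-- or otherwise indexed (an assumption about the schema).
CREATE INDEX ix_tools_tool_code
    ON tools (tool_code);
```

With the first index in place, running EXPLAIN on the inner SELECT DISTINCT should show whether the optimizer answers it from the index alone.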
The article you provided describes subquery predicates (IN (SELECT ...)), not nested queries (SELECT FROM (SELECT ...)).
Even with the subqueries, the article is wrong: while MySQL is not able to optimize all subqueries, it deals with IN (SELECT …) predicates just fine.
I don't know why the author chose to put DISTINCT here:
SELECT id, name, price
FROM widgets
WHERE id IN
    (
    SELECT DISTINCT widgetId
    FROM widgetOrders
    )
or why they think this will help performance, but given that widgetId is indexed, MySQL will just transform this query:
SELECT id, name, price
FROM widgets
WHERE id IN
    (
    SELECT widgetId
    FROM widgetOrders
    )
into an index_subquery.
Essentially, this is just like an EXISTS clause: the inner subquery will be executed once per widgets row, with an additional predicate added:
SELECT NULL
FROM widgetOrders
WHERE widgetId = widgets.id
and stop on the first match in widgetOrders.
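Written out by hand, the behaviour described above corresponds roughly to the following EXISTS form. This is only a sketch of the equivalent query, not the literal statement the optimizer produces; it uses the same widgets and widgetOrders tables as the article's example.

```sql
-- Hand-written EXISTS equivalent of the IN (SELECT ...) predicate.
-- The correlated subquery is probed once per widgets row and can stop
-- at the first matching widgetOrders row, so no DISTINCT is needed.
SELECT id, name, price
FROM widgets
WHERE EXISTS
    (
    SELECT NULL
    FROM widgetOrders
    WHERE widgetId = widgets.id
    );
```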
This query:
SELECT DISTINCT w.id, w.name, w.price
FROM widgets w
INNER JOIN widgetOrders o
    ON w.id = o.widgetId
will have to use a temporary table to get rid of the duplicates and will be much slower.
#2
You could avoid the subquery by using GROUP BY, but if the subquery performs better, keep it.
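A GROUP BY version might look something like the sketch below. It assumes tool_code is unique in tools, so grouping on both selected columns gives the same rows as the nested query; the ? markers are prepared-statement placeholders standing in for the year, month and location values that the question leaves elided.

```sql
-- Sketch of a GROUP BY rewrite without the derived table.
-- Assumes tools.tool_code is unique; ? marks the three selected values.
SELECT j.tool_code, t.tool_description
FROM jobs AS j
LEFT JOIN tools AS t
    ON j.tool_code = t.tool_code
WHERE j.year = ?
  AND j.month = ?
  AND j.location = ?
GROUP BY j.tool_code, t.tool_description
ORDER BY j.tool_code ASC;
```

Note that this form joins every matching jobs row to tools before collapsing the duplicates, which may be why the nested query was faster for the asker.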
Why do you use a LEFT JOIN instead of a JOIN to join tools?