It is common to use a SELECT within a SELECT to reduce the number of queries, but in my testing this leads to a slow query (which is obviously bad for MySQL performance). I had a simple query:
SELECT something
FROM posts
WHERE id IN (
SELECT tag_map.id
FROM tag_map
INNER JOIN tags
ON tags.tag_id=tag_map.tag_id
WHERE tag IN ('tag1', 'tag2', 'tag3', 'tag4', 'tag5', 'tag6')
)
This results in a slow query: a query time of 3-4 s, a lock time of about 0.000090 s, and about 200 rows examined.
If I split the SELECT queries, each of them is quite fast, but this increases the number of queries, which is not good under high concurrency.
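For reference, the split version would look roughly like the sketch below. It assumes the tag column lives in the tags table, and the literal ids in the second query are hypothetical placeholders for whatever the first query returns:

```sql
-- Step 1: look up the ids mapped to the wanted tags.
SELECT tag_map.id
FROM tag_map
INNER JOIN tags ON tags.tag_id = tag_map.tag_id
WHERE tags.tag IN ('tag1', 'tag2', 'tag3', 'tag4', 'tag5', 'tag6');

-- Step 2: the application pastes the ids returned by step 1 into a second query.
-- The ids 101, 102, 103 are hypothetical placeholders, not real data.
SELECT something
FROM posts
WHERE id IN (101, 102, 103);
```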
Is this the usual situation, or is something wrong with my code?
3 solutions
#1
13
In MySQL, a subquery like this is typically handled as a "correlated" (dependent) subquery. The inner SELECT does not actually reference the outer one, but older versions of the optimizer (before 5.6) rewrite IN (subquery) so that the inner query is evaluated against each row of the outer SELECT. The outcome is that your inner query is executed once per row, which is very slow.
You should refactor this query; whether you join twice or use two queries is mostly irrelevant. Joining twice would give you:
SELECT something
FROM posts
INNER JOIN tag_map ON tag_map.id = posts.id
INNER JOIN tags ON tags.tag_id = tag_map.tag_id
WHERE tags.tag IN ('tag1', ...)
For more information, see the MySQL manual on converting subqueries to JOINs.
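One caveat with the JOIN form: a post that matches several of the listed tags comes back once per matching tag, whereas the IN version returns each posts row at most once. A minimal sketch of one way to keep the original behaviour, assuming tag_map.id holds the post id (as the original IN condition implies), that posts.id is unique, and that returning posts.id alongside something is acceptable:

```sql
-- DISTINCT collapses the repeated rows produced when one post
-- matches more than one of the listed tags.
SELECT DISTINCT posts.id, posts.something
FROM posts
INNER JOIN tag_map ON tag_map.id = posts.id
INNER JOIN tags ON tags.tag_id = tag_map.tag_id
WHERE tags.tag IN ('tag1', 'tag2', 'tag3', 'tag4', 'tag5', 'tag6');
```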
Tip: EXPLAIN SELECT will show you how the optimizer plans to handle your query. If you see DEPENDENT SUBQUERY, you should refactor; these are mega-slow.
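As a sketch of how that check might look for the query from the question (table and column names are taken from the question as posted):

```sql
-- Prefix the query with EXPLAIN to display its execution plan.
EXPLAIN
SELECT something
FROM posts
WHERE id IN (
    SELECT tag_map.id
    FROM tag_map
    INNER JOIN tags ON tags.tag_id = tag_map.tag_id
    WHERE tag IN ('tag1', 'tag2', 'tag3', 'tag4', 'tag5', 'tag6')
);
```

In the output, look at the select_type column: a value of DEPENDENT SUBQUERY on the rows for the inner query is the case described above. The actual plan depends on the MySQL version, table sizes, and indexes, so no sample output is shown here.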
#2
2
You could improve it by using the following:
SELECT something
FROM posts
INNER JOIN tag_map ON tag_map.id = posts.id
INNER JOIN tags
ON tags.tag_id=tag_map.tag_id
WHERE <tablename>.tag IN ('tag1', 'tag2', 'tag3', 'tag4', 'tag5', 'tag6')
Just make sure you select only what you need and do not use *. Also state which table contains the tag column, so you can substitute it for <tablename>.
#3
1
A join filters the results. The first join keeps the rows that satisfy the first ON condition, and the second join then applies the second ON condition to produce the final result.
SELECT something
FROM posts
INNER JOIN tag_map ON tag_map.id = posts.id
INNER JOIN tags ON tags.tag_id = tag_map.tag_id AND tags.tag IN ('tag1', 'tag2', 'tag3', 'tag4', 'tag5', 'tag6');
You can see these discussions on Stack Overflow:
Question 1, Question 2
A join helps to decrease time complexity and increases the stability of the server.
Information on converting subqueries to joins:
link1 link2 link3