复杂的MySQL查询仍然使用filesort,尽管存在索引

时间:2022-09-17 23:26:48

I have a Joomla table with thousands of rows of content (approx 3million). I'm having a bit of trouble rewriting the database queries to be as fast as possible when querying the tables.

我有一个包含数千行内容(约300万)的Joomla表。我在查询表时尽可能快地重写数据库查询时遇到了一些麻烦。

Here is my full query:

这是我的完整查询:

SELECT cc.title AS category, a.id, a.title, a.alias, a.title_alias, a.introtext, a.fulltext, a.sectionid, a.state, a.catid, a.created, a.created_by, a.created_by_alias, a.modified, a.modified_by, a.checked_out, a.checked_out_time, a.publish_up, a.publish_down, a.attribs, a.hits, a.images, a.urls, a.ordering, a.metakey, a.metadesc, a.access, CASE WHEN CHAR_LENGTH(a.alias) THEN CONCAT_WS(":", a.id, a.alias) ELSE a.id END AS slug, CASE WHEN CHAR_LENGTH(cc.alias) THEN CONCAT_WS(":", cc.id, cc.alias) ELSE cc.id END AS catslug, CHAR_LENGTH( a.`fulltext` ) AS readmore, u.name AS author, u.usertype, g.name AS groups, u.email AS author_email
FROM j15_content AS a
LEFT JOIN j15_categories AS cc
ON a.catid = cc.id
LEFT JOIN j15_users AS u
ON u.id = a.created_by
LEFT JOIN j15_groups AS g
ON a.access = g.id
WHERE 1
AND a.access <= 0
AND a.catid = 108
AND a.state = 1
AND ( publish_up = '0000-00-00 00:00:00' OR publish_up <= '2012-02-08 00:16:26' )
AND ( publish_down = '0000-00-00 00:00:00' OR publish_down >= '2012-02-08 00:16:26' )
ORDER BY a.title, a.created DESC
LIMIT 0, 10

Here is the output from an EXPLAIN:

以下是EXPLAIN的输出:

 +----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-----------------------------+
| id | select_type | table | type   | possible_keys                                         | key       | key_len | ref                       | rows    | Extra                       |
+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-----------------------------+
|  1 | SIMPLE      | a     | ref    | idx_access,idx_state,idx_catid,idx_access_state_catid | idx_catid | 4       | const                     | 3108187 | Using where; Using filesort |
|  1 | SIMPLE      | cc    | const  | PRIMARY                                               | PRIMARY   | 4       | const                     |       1 |                             |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY                                               | PRIMARY   | 4       | database.a.created_by     |       1 |                             |
|  1 | SIMPLE      | g     | eq_ref | PRIMARY                                               | PRIMARY   | 1       | database.a.access         |       1 |                             |
+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-----------------------------+

And to show what indexes exist, SHOW INDEX FROM j15_content:

并显示存在哪些索引,SHOW INDEX FROM j15_content:

+-------------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table       | Non_unique | Key_name               | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| j15_content |          0 | PRIMARY                |            1 | id          | A         |     3228356 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_section            |            1 | sectionid   | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_access             |            1 | access      | A         |           1 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_checkout           |            1 | checked_out | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_state              |            1 | state       | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_catid              |            1 | catid       | A         |           6 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_createdby          |            1 | created_by  | A         |           1 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | title                  |            1 | title       | A         |      201772 |        4 | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_access_state_catid |            1 | access      | A         |           1 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_access_state_catid |            2 | state       | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_access_state_catid |            3 | catid       | A         |           7 |     NULL | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_title_created      |            1 | title       | A         |     3228356 |        8 | NULL   |      | BTREE      |         |
| j15_content |          1 | idx_title_created      |            2 | created     | A         |     3228356 |     NULL | NULL   |      | BTREE      |         |
+-------------+------------+------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

As you can see there are a few pieces of data being taken from the database. Now I have tested by simplifying the query that the real issue lies with the ORDER BY clause. Without ordering the results, the query is quite responsive, here is an explanation:

如您所见,从数据库中获取了一些数据。现在我通过简化查询来测试真正的问题在于ORDER BY子句。如果没有对结果进行排序,查询就会非常敏感,这里有一个解释:

+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-------------+
| id | select_type | table | type   | possible_keys                                         | key       | key_len | ref                       | rows    | Extra       |
+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-------------+
|  1 | SIMPLE      | a     | ref    | idx_access,idx_state,idx_catid,idx_access_state_catid | idx_catid | 4       | const                     | 3108187 | Using where |
|  1 | SIMPLE      | cc    | const  | PRIMARY                                               | PRIMARY   | 4       | const                     |       1 |             |
|  1 | SIMPLE      | u     | eq_ref | PRIMARY                                               | PRIMARY   | 4       | database.a.created_by     |       1 |             |
|  1 | SIMPLE      | g     | eq_ref | PRIMARY                                               | PRIMARY   | 1       | database.a.access         |       1 |             |
+----+-------------+-------+--------+-------------------------------------------------------+-----------+---------+---------------------------+---------+-------------+

As you can see it's the fatal filesort that's killing the server. With this many rows, I'm doing my best to optimize everything through indexes but something still isn't right with this. Any input would be greatly appreciated.

正如你所看到的那样,致命的文件夹正在杀死服务器。有了这么多行,我正在尽最大努力通过索引来优化所有内容,但有些东西仍然不正确。任何投入将不胜感激。

Tried using FORCE INDEX to no avail:

尝试使用FORCE INDEX无济于事:

explain     SELECT cc.title AS category, a.id, a.title, a.alias, a.title_alias, a.introtext, a.fulltext, a.sectionid, a.state, a.catid, a.created, a.created_by, a.created_by_alias, a.modified, a.modified_by, a.checked_out, a.checked_out_time, a.publish_up, a.publish_down, a.attribs, a.hits, a.images, a.urls, a.ordering, a.metakey, a.metadesc, a.access, CASE WHEN CHAR_LENGTH(a.alias) THEN CONCAT_WS(":", a.id, a.alias) ELSE a.id END AS slug, CASE WHEN CHAR_LENGTH(cc.alias) THEN CONCAT_WS(":", cc.id, cc.alias) ELSE cc.id END AS catslug, CHAR_LENGTH( a.`fulltext` ) AS readmore, u.name AS author, u.usertype, g.name AS groups, u.email AS author_email
    ->     FROM bak_content AS a
    ->     FORCE INDEX (idx_title_created)
    ->     LEFT JOIN bak_categories AS cc
    ->     ON a.catid = cc.id
    ->     LEFT JOIN bak_users AS u
    ->     ON u.id = a.created_by
    ->     LEFT JOIN bak_groups AS g
    ->     ON a.access = g.id
    ->     WHERE 1
    ->     AND a.access <= 0
    ->     AND a.catid = 108
    ->     AND a.state = 1
    ->     AND ( publish_up = '0000-00-00 00:00:00' OR publish_up <= '2012-02-08
    ->     AND ( publish_down = '0000-00-00 00:00:00' OR publish_down >= '2012-0
    ->     ORDER BY a.title, a.created DESC
    ->     LIMIT 0, 10;

Produces:

生产:

+----+-------------+-------+--------+---------------+---------+---------+-------
| id | select_type | table | type   | possible_keys | key     | key_len | ref
+----+-------------+-------+--------+---------------+---------+---------+-------
|  1 | SIMPLE      | a     | ALL    | NULL          | NULL    | NULL    | NULL
|  1 | SIMPLE      | cc    | const  | PRIMARY       | PRIMARY | 4       | const
|  1 | SIMPLE      | u     | eq_ref | PRIMARY       | PRIMARY | 4       | database
|  1 | SIMPLE      | g     | eq_ref | PRIMARY       | PRIMARY | 1       | database
+----+-------------+-------+--------+---------------+---------+---------+-------

6 个解决方案

#1


5  

AFAIK this can't be reasonably solved using an index, hints or restructuring of the query itself.

使用索引,提示或重构查询本身无法合理地解决AFAIK问题。

The reason this is slow is the fact that it requires a filesort of 2M rows which does actually take a long time. If you zoom in on the order by it's specified as ORDER BY a.title, a.created DESC. The problem is the combination of sorting on more than 1 column and having a DESC part. Mysql does not support descending indexes (the keyword DESC is supported in the CREATE INDEX statement but only for future use).

这个问题很慢的原因是它需要一个2M行的文件排,这实际上需要很长时间。如果您通过指定为ORDER BY a.title,a.created DESC来放大订单。问题是在多于1列上进行排序和具有DESC部分的组合。 Mysql不支持降序索引(CREATE INDEX语句支持关键字DESC,但仅供将来使用)。

The suggested workaround is to create an extra column 'reverse_created' that gets automatically populated in such a way that your query can use ORDER BY a.title, a.reverse_created. So you fill it with max_time - created_time. Then create an index on that combination and (if needed) specify that index as a hint.

建议的解决方法是创建一个额外的列'reverse_created',它会自动填充,以便查询可以使用ORDER BY a.title,a.reverse_created。所以你用max_time填充它 - created_time。然后在该组合上创建索引,并(如果需要)将该索引指定为提示。

There are a couple of really good blog articles about this topic that explain this a lot better and with examples:

有一些关于这个主题的非常好的博客文章可以更好地解释这个问题,并举例说明:

-Update- You should be able to do a quick test on this by removing the "DESC" part from the order by in your query. The results will be functionally wrong but it should use the existing index you have (or otherwise the force should work).

-Update-您应该能够通过在查询中删除订单中的“DESC”部分来对此进行快速测试。结果将在功能上错误,但它应该使用您拥有的现有索引(否则强制应该工作)。

#2


0  

Sometimes MySQL has trouble finding the proper index. You can resolve this by hinting to the proper index.

有时MySQL无法找到合适的索引。您可以通过提示正确的索引来解决此问题。

Hint syntax: http://dev.mysql.com/doc/refman/4.1/en/index-hints.html

提示语法:http://dev.mysql.com/doc/refman/4.1/en/index-hints.html

Make sure you have the right index and tune it's performance by experimenting.

确保您拥有正确的索引并通过试验调整其性能。

Cheers!

干杯!

#3


0  

Can you try this variation:

你可以尝试这种变化:

SELECT cc.title AS category, ...
FROM 
    ( SELECT *
      FROM j15_content AS a 
               USE INDEX (title)             --- with and without the hint
      WHERE 1
        AND a.access <= 0
        AND a.catid = 108
        AND a.state = 1
        AND ( publish_up = '0000-00-00 00:00:00' 
           OR publish_up <= '2012-02-08 00:16:26' )
        AND ( publish_down = '0000-00-00 00:00:00' 
           OR publish_down >= '2012-02-08 00:16:26' )
      ORDER BY a.title, a.created DESC
      LIMIT 0, 10
    ) AS a
  LEFT JOIN j15_categories AS cc
    ON a.catid = cc.id
  LEFT JOIN j15_users AS u
    ON u.id = a.created_by
  LEFT JOIN j15_groups AS g
    ON a.access = g.id

An index on (catid, state, title) would be even better I think.

我认为(catid,state,title)的指数会更好。

#4


0  

Perhaps trying this might help:

也许尝试这可能有所帮助:

CREATE INDEX idx_catid_title_created ON j15_content (catid,title(8),created);
DROP INDEX idx_catid ON j15_content;

#5


0  

Have you tried increasing these values tmp_table_size and max_heap_table_size:

您是否尝试过增加这些值tmp_table_size和max_heap_table_size:

There is a short explanation here and also links to the detail of each of them.

这里有一个简短的解释,也链接到每个细节。

Hope this helps!

希望这可以帮助!

#6


0  

I hope this is syntactically correct

我希望这在语法上是正确的

SELECT
    cc.title AS category,
    a.id, a.title, a.alias, a.title_alias,
    a.introtext, a.fulltext, a.sectionid,
    a.state, a.catid, a.created, a.created_by,
    a.created_by_alias, a.modified, a.modified_by,
    a.checked_out, a.checked_out_time,
    a.publish_up, a.publish_down, a.attribs,
    a.hits, a.images, a.urls, a.ordering, a.metakey,
    a.metadesc, a.access,
    CASE WHEN CHAR_LENGTH(a.alias) THEN CONCAT_WS(":", a.id, a.alias) ELSE a.id END AS slug,
    CASE WHEN CHAR_LENGTH(cc.alias) THEN CONCAT_WS(":", cc.id, cc.alias) ELSE cc.id END AS catslug, CHAR_LENGTH( a.`fulltext` ) AS readmore,
    u.name AS author, u.usertype, g.name AS groups, u.email AS author_email 
FROM
(
    SELECT aa.*
    FROM 
    (
        SELECT id FROM 
        FROM j15_content
        WHERE catid=108 AND state=1
        AND a.access <= 0 
        AND (publish_up   = '0000-00-00 00:00:00' OR   publish_up <= '2012-02-08 00:16:26')
        AND (publish_down = '0000-00-00 00:00:00' OR publish_down >= '2012-02-08 00:16:26')
        ORDER BY title,created DESC
        LIMIT 0,10
    ) needed_keys
    LEFT JOIN j15_content aa USING (id)
) a
LEFT JOIN j15_categories AS cc ON a.catid = cc.id 
LEFT JOIN j15_users AS u ON a.created_by = u.id
LEFT JOIN j15_groups AS g ON a.access = g.id;

You will need a supporting index for subquery needed_keys

您将需要子查询needed_keys的支持索引

ALTER TABLE j15_content ADD INDEX subquery_ndx (catid,state,access,title,created);

Give it a Try !!!

试一试 !!!

#1


5  

AFAIK this can't be reasonably solved using an index, hints or restructuring of the query itself.

使用索引,提示或重构查询本身无法合理地解决AFAIK问题。

The reason this is slow is the fact that it requires a filesort of 2M rows which does actually take a long time. If you zoom in on the order by it's specified as ORDER BY a.title, a.created DESC. The problem is the combination of sorting on more than 1 column and having a DESC part. Mysql does not support descending indexes (the keyword DESC is supported in the CREATE INDEX statement but only for future use).

这个问题很慢的原因是它需要一个2M行的文件排,这实际上需要很长时间。如果您通过指定为ORDER BY a.title,a.created DESC来放大订单。问题是在多于1列上进行排序和具有DESC部分的组合。 Mysql不支持降序索引(CREATE INDEX语句支持关键字DESC,但仅供将来使用)。

The suggested workaround is to create an extra column 'reverse_created' that gets automatically populated in such a way that your query can use ORDER BY a.title, a.reverse_created. So you fill it with max_time - created_time. Then create an index on that combination and (if needed) specify that index as a hint.

建议的解决方法是创建一个额外的列'reverse_created',它会自动填充,以便查询可以使用ORDER BY a.title,a.reverse_created。所以你用max_time填充它 - created_time。然后在该组合上创建索引,并(如果需要)将该索引指定为提示。

There are a couple of really good blog articles about this topic that explain this a lot better and with examples:

有一些关于这个主题的非常好的博客文章可以更好地解释这个问题,并举例说明:

-Update- You should be able to do a quick test on this by removing the "DESC" part from the order by in your query. The results will be functionally wrong but it should use the existing index you have (or otherwise the force should work).

-Update-您应该能够通过在查询中删除订单中的“DESC”部分来对此进行快速测试。结果将在功能上错误,但它应该使用您拥有的现有索引(否则强制应该工作)。

#2


0  

Sometimes MySQL has trouble finding the proper index. You can resolve this by hinting to the proper index.

有时MySQL无法找到合适的索引。您可以通过提示正确的索引来解决此问题。

Hint syntax: http://dev.mysql.com/doc/refman/4.1/en/index-hints.html

提示语法:http://dev.mysql.com/doc/refman/4.1/en/index-hints.html

Make sure you have the right index and tune it's performance by experimenting.

确保您拥有正确的索引并通过试验调整其性能。

Cheers!

干杯!

#3


0  

Can you try this variation:

你可以尝试这种变化:

SELECT cc.title AS category, ...
FROM 
    ( SELECT *
      FROM j15_content AS a 
               USE INDEX (title)             --- with and without the hint
      WHERE 1
        AND a.access <= 0
        AND a.catid = 108
        AND a.state = 1
        AND ( publish_up = '0000-00-00 00:00:00' 
           OR publish_up <= '2012-02-08 00:16:26' )
        AND ( publish_down = '0000-00-00 00:00:00' 
           OR publish_down >= '2012-02-08 00:16:26' )
      ORDER BY a.title, a.created DESC
      LIMIT 0, 10
    ) AS a
  LEFT JOIN j15_categories AS cc
    ON a.catid = cc.id
  LEFT JOIN j15_users AS u
    ON u.id = a.created_by
  LEFT JOIN j15_groups AS g
    ON a.access = g.id

An index on (catid, state, title) would be even better I think.

我认为(catid,state,title)的指数会更好。

#4


0  

Perhaps trying this might help:

也许尝试这可能有所帮助:

CREATE INDEX idx_catid_title_created ON j15_content (catid,title(8),created);
DROP INDEX idx_catid ON j15_content;

#5


0  

Have you tried increasing these values tmp_table_size and max_heap_table_size:

您是否尝试过增加这些值tmp_table_size和max_heap_table_size:

There is a short explanation here and also links to the detail of each of them.

这里有一个简短的解释,也链接到每个细节。

Hope this helps!

希望这可以帮助!

#6


0  

I hope this is syntactically correct

我希望这在语法上是正确的

SELECT
    cc.title AS category,
    a.id, a.title, a.alias, a.title_alias,
    a.introtext, a.fulltext, a.sectionid,
    a.state, a.catid, a.created, a.created_by,
    a.created_by_alias, a.modified, a.modified_by,
    a.checked_out, a.checked_out_time,
    a.publish_up, a.publish_down, a.attribs,
    a.hits, a.images, a.urls, a.ordering, a.metakey,
    a.metadesc, a.access,
    CASE WHEN CHAR_LENGTH(a.alias) THEN CONCAT_WS(":", a.id, a.alias) ELSE a.id END AS slug,
    CASE WHEN CHAR_LENGTH(cc.alias) THEN CONCAT_WS(":", cc.id, cc.alias) ELSE cc.id END AS catslug, CHAR_LENGTH( a.`fulltext` ) AS readmore,
    u.name AS author, u.usertype, g.name AS groups, u.email AS author_email 
FROM
(
    SELECT aa.*
    FROM 
    (
        SELECT id FROM 
        FROM j15_content
        WHERE catid=108 AND state=1
        AND a.access <= 0 
        AND (publish_up   = '0000-00-00 00:00:00' OR   publish_up <= '2012-02-08 00:16:26')
        AND (publish_down = '0000-00-00 00:00:00' OR publish_down >= '2012-02-08 00:16:26')
        ORDER BY title,created DESC
        LIMIT 0,10
    ) needed_keys
    LEFT JOIN j15_content aa USING (id)
) a
LEFT JOIN j15_categories AS cc ON a.catid = cc.id 
LEFT JOIN j15_users AS u ON a.created_by = u.id
LEFT JOIN j15_groups AS g ON a.access = g.id;

You will need a supporting index for subquery needed_keys

您将需要子查询needed_keys的支持索引

ALTER TABLE j15_content ADD INDEX subquery_ndx (catid,state,access,title,created);

Give it a Try !!!

试一试 !!!