在大型表MySQL上优化ORDER BY查询

时间:2022-02-23 01:08:01

I have a browse category query that im trying to optimize. Im ending up with Using temporary; Using filesort in the explain and the query is slow on a category with say 60,000 rows. If i remove the Order By clauses the query runs very fast .05 seconds to do 60,000 rows. With the Order By clauses its very slow around 5 seconds. Parts contains some 500,000 rows as does Parts_Category.

我有一个浏览类别查询,我试图优化。我最后使用临时;在说明和查询中使用filesort在类别为60,000行的类型上很慢。如果我删除Order By子句,查询运行速度非常快.05秒可以执行60,000行。使用Order By子句非常慢,大约5秒钟。部件包含大约500,000行,Parts_Category也是如此。

I have a group index on Parts (status, level, warehouse, updated) called sort_index

我在零件(状态,级别,仓库,更新)上有一个名为sort_index的组索引

At the top of the explain I have | ALL | Using temporary; Using filesort

在解释的顶部我有|所有|使用临时;使用filesort

All the other indexes are showing OK. Can somebody please tell me what the problem might be? Im out of ideas. Maybe i should rearrange this query so i can get better performance perhaps?

所有其他索引都显示正常。有人可以告诉我问题可能是什么?我没有想法。也许我应该重新安排这个查询,以便我可以获得更好的表现?

query.

SELECT Parts.*, Image.type, Image.width, Image.height,
(SELECT name FROM Location_State WHERE id = Parts.state_id) AS state, 
(SELECT name FROM Location_Region WHERE id = Parts.region_id) AS region, 
(SELECT start_date FROM Promotion WHERE id = Parts.promotion_id) AS promotion_start_date, 
(SELECT end_date FROM Promotion WHERE id = Parts.promotion_id) AS promotion_end_date 
FROM ( SELECT parts_id FROM Parts_Category WHERE Parts_Category.category_id = '40' 
UNION SELECT parts_id FROM Parts_Category WHERE Parts_Category.main_category_id = '40') cid 
LEFT JOIN Image ON Parts.image_id = Image.id
JOIN Parts ON Parts.id = cid.parts_id AND Parts.status = 'A'
ORDER BY Parts.level DESC, Parts.warehouse DESC, Parts.updated DESC LIMIT 0, 15
Table structure for table Parts

Field   Type    Null    Default
id  int(11) No  auto_increment
image_id    int(11) Yes 0
gallery_id  int(11) Yes 0
image_count int(3)  Yes 0
promotion_id    int(11) Yes 0
country_id  int(11) Yes NULL
state_id    int(11) Yes NULL
region_id   int(11) Yes NULL
city_id int(11) Yes NULL
area_id int(11) Yes NULL
updated datetime    Yes 0000-00-00 00:00:00
entered datetime    Yes 0000-00-00 00:00:00
renewal_date    date    Yes 0000-00-00
discount_id varchar(10) Yes NULL
title           varchar(100)    Yes 
search_title    varchar(255)    Yes 
warehouse   varchar(50) Yes 
url varchar(255)    Yes 
display_url varchar(255)    Yes 
friendly_url    varchar(100)    Yes NULL
description varchar(255)    Yes 
keywords    varchar(1000)   Yes NULL
attachment_file varchar(255)    Yes 
attachment_caption  varchar(255)    Yes 
status  char(1) Yes 
level   tinyint(3)  Yes 0
worldwide   tinyint(1)  Yes 0
random_number   int(11) Yes NULL
reminder    tinyint(4)  Yes NULL
category_search varchar(1000)   Yes 
video_snippet   varchar(1000)   Yes 
importID    int(11) Yes 0

Indexes

PRIMARY             518623       id
random_number INDEX 32201   random_number
country_id  INDEX   1       country_id
state_id    INDEX   8       state_id
region_id   INDEX   5       region_id
renewal_date    INDEX   1       renewal_date
worldwide   INDEX   1       worldwide
friendly_url    INDEX   518623      friendly_url
promotion_id    INDEX   1       promotion_id
city_id         INDEX   1   city_id
area_id     INDEX   1       area_id
zip_code    INDEX   2790        zip_code
importID    INDEX   518623      importID
image_id    INDEX   10          image_id

--------------
index_browse_category   INDEX   52 
level
status
warehouse   
updated
-----------------
keywords    FULLTEXT    1 
description
keywords
category_search 


Parts_Category

id              int(11)         No   auto_increment     
parts_id        int(11)             No  0       
category_id         int(11)             No  0       
main_category_id    int(10)             No  0   

Index

PRIMARY          PRIMARY    519330           id
category_id          INDEX  519330          category_id
parts_id
main_category_id     INDEX  519330              main_category_id
parts_id





2 个解决方案

#1


39  

Try rewriting your query as this:

尝试重写您的查询,如下所示:

SELECT  p.*, i.type, i.width, i.height,
        (SELECT name FROM Location_State WHERE id = p.state_id) AS state, 
        (SELECT name FROM Location_Region WHERE id = p.region_id) AS region, 
        (SELECT start_date FROM Promotion WHERE id = p.promotion_id) AS promotion_start_date, 
        (SELECT end_date FROM Promotion WHERE id = p.promotion_id) AS promotion_end_date 
FROM    parts p
LEFT JOIN
        image i
ON      i.id = p.image_id
WHERE   EXISTS (
        SELECT  NULL
        FROM    Parts_Category pc
        WHERE   pc.category_id = '40'
                AND pc.parts_id = p.id
        UNION ALL
        SELECT  NULL
        FROM    Parts_Category pc
        WHERE   pc.main_category_id = '40'
                AND pc.parts_id = p.id
        )
        AND p.status = 'A'
ORDER BY
        p.status DESC, p.level DESC, p.warehouse DESC, p.updated DESC
LIMIT   15

You need the following indexes for this to work efficiently:

您需要以下索引才能有效地工作:

parts (status, level, warehouse, updated) -- this one you have
parts_category (category_id, parts_id)
parts_category (main_category_id, parts_id)

Update:

I just created the tables as this:

我刚刚创建了这样的表:

DROP TABLE IF EXISTS `test`.`image`;
CREATE TABLE  `test`.`image` (
  `id` int(11) NOT NULL,
  `type` int(11) NOT NULL,
  `width` int(11) NOT NULL,
  `height` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

DROP TABLE IF EXISTS `test`.`location_region`;
CREATE TABLE  `test`.`location_region` (
  `id` int(11) NOT NULL,
  `name` varchar(20) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

DROP TABLE IF EXISTS `test`.`location_state`;
CREATE TABLE  `test`.`location_state` (
  `id` int(11) NOT NULL,
  `name` varchar(20) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

DROP TABLE IF EXISTS `test`.`parts`;
CREATE TABLE  `test`.`parts` (
  `id` int(11) NOT NULL,
  `status` char(1) NOT NULL,
  `level` int(11) NOT NULL,
  `warehouse` int(11) NOT NULL,
  `updated` int(11) NOT NULL,
  `state_id` int(11) NOT NULL,
  `region_id` int(11) NOT NULL,
  `promotion_id` int(11) NOT NULL,
  `image_id` int(11) NOT NULL DEFAULT '1',
  PRIMARY KEY (`id`),
  KEY `status` (`status`,`level`,`warehouse`,`updated`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

DROP TABLE IF EXISTS `test`.`parts_category`;
CREATE TABLE  `test`.`parts_category` (
  `id` int(11) NOT NULL,
  `parts_id` int(11) NOT NULL,
  `category_id` int(11) NOT NULL,
  `main_category_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `ix_pc_cat_parts` (`category_id`,`parts_id`),
  KEY `ix_pc_main_parts` (`main_category_id`,`parts_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

DROP TABLE IF EXISTS `test`.`promotion`;
CREATE TABLE  `test`.`promotion` (
  `id` int(11) NOT NULL,
  `start_date` datetime NOT NULL,
  `end_date` datetime NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

and filled them with sample data:

并填写样本数据:

INSERT
INTO    parts
SELECT  id,
        CASE WHEN RAND() < 0.1 THEN 'A' ELSE 'B' END,
        RAND() * 100,
        RAND() * 100,
        RAND() * 100,
        RAND() * 50,
        RAND() * 50,
        RAND() * 50,
        RAND() * 50
FROM    t_source
LIMIT 500000;
INSERT
INTO    parts_category
SELECT  id,
        id,
        RAND() * 100,
        RAND() * 100
FROM    t_source
LIMIT 500000;
INSERT
INTO    location_state
SELECT  id, CONCAT('State ', id)
FROM    t_source
LIMIT 1000;
INSERT
INTO    location_region
SELECT  id, CONCAT('Region ', id)
FROM    t_source
LIMIT 1000;
INSERT
INTO    promotion
SELECT  id,
        '2009-07-22' - INTERVAL RAND() * 5 - 20 DAY,
        '2009-07-22' - INTERVAL RAND() * 5 DAY
FROM    t_source
LIMIT 1000;

The query above runs for 30 milliseconds and yields the following plan:

上面的查询运行30毫秒并产生以下计划:

1, 'PRIMARY', 'p', 'ref', 'status', 'status', '3', 'const', 107408, 'Using where'
1, 'PRIMARY', 'i', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.image_id', 1, ''
6, 'DEPENDENT SUBQUERY', 'pc', 'ref', 'ix_pc_cat_parts', 'ix_pc_cat_parts', '8', 'const,test.p.id', 1, 'Using index'
7, 'DEPENDENT UNION', 'pc', 'ref', 'ix_pc_main_parts', 'ix_pc_main_parts', '8', 'const,test.p.id', 1, 'Using index'
, 'UNION RESULT', '<union6,7>', 'ALL', '', '', '', '', , ''
5, 'DEPENDENT SUBQUERY', 'Promotion', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.promotion_id', 1, ''
4, 'DEPENDENT SUBQUERY', 'Promotion', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.promotion_id', 1, ''
3, 'DEPENDENT SUBQUERY', 'Location_Region', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.region_id', 1, ''
2, 'DEPENDENT SUBQUERY', 'Location_State', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.state_id', 1, ''

As you can see, no temporary, no filesort, everything's very fast.

正如你所看到的,没有临时的,没有文件存储,一切都非常快。

To help you anymore, I just need to see how your tables are defined.

为了帮助你,我只需要看看你的表是如何定义的。

#2


1  

John, the problem is your query is constructed such that it is selecting from a derived table. The derived table cannot benefit from your indices. Try updating your query as follows:

John,问题是你的查询被构造成从派生表中选择。派生表无法从您的索引中受益。尝试更新您的查询,如下所示:

SELECT
  Parts.*,
  Image.type, Image.width, Image.height,
  Location_State.name AS state,
  Location_Region.name AS region,
  Promotion.start_date AS promotion_start_date,
  Promotion.end_date AS promotion_end_date
FROM Parts
LEFT JOIN Image ON Parts.image_id = Image.id
LEFT JOIN Location_State ON Parts.state_id = Location_State.id
LEFT JOIN Location_Region ON Parts.state_id = Location_Region.id
LEFT JOIN Promotion ON Parts.promotion_id = Promotion.id
INNER JOIN Parts_Category ON (Parts_Category.category_id = 40 OR Parts_Category.main_category_id = 40)
WHERE Parts.status = 'A'
GROUP BY Parts.id
ORDER BY Parts.level DESC, Parts.warehouse DESC, Parts.updated DESC LIMIT 0, 15

Note, if you don't need LEFT JOIN's for your Location_State, Location_Region, Promotion tables, then use an INNER JOIN instead. It will likely perform better.

注意,如果您的Location_State,Location_Region,Promotion表不需要LEFT JOIN,那么请使用INNER JOIN。它可能会表现得更好。

To further assist in optimizing this query, please provide the following:

为进一步协助优化此查询,请提供以下信息:

SHOW CREATE TABLE Parts;

If the rewritten query I provided works the same as your example (it should), then also provide:

如果我提供的重写查询与您的示例(它应该)相同,那么还提供:

EXPLAIN <my query here>\G

#1


39  

Try rewriting your query as this:

尝试重写您的查询,如下所示:

SELECT  p.*, i.type, i.width, i.height,
        (SELECT name FROM Location_State WHERE id = p.state_id) AS state, 
        (SELECT name FROM Location_Region WHERE id = p.region_id) AS region, 
        (SELECT start_date FROM Promotion WHERE id = p.promotion_id) AS promotion_start_date, 
        (SELECT end_date FROM Promotion WHERE id = p.promotion_id) AS promotion_end_date 
FROM    parts p
LEFT JOIN
        image i
ON      i.id = p.image_id
WHERE   EXISTS (
        SELECT  NULL
        FROM    Parts_Category pc
        WHERE   pc.category_id = '40'
                AND pc.parts_id = p.id
        UNION ALL
        SELECT  NULL
        FROM    Parts_Category pc
        WHERE   pc.main_category_id = '40'
                AND pc.parts_id = p.id
        )
        AND p.status = 'A'
ORDER BY
        p.status DESC, p.level DESC, p.warehouse DESC, p.updated DESC
LIMIT   15

You need the following indexes for this to work efficiently:

您需要以下索引才能有效地工作:

parts (status, level, warehouse, updated) -- this one you have
parts_category (category_id, parts_id)
parts_category (main_category_id, parts_id)

Update:

I just created the tables as this:

我刚刚创建了这样的表:

DROP TABLE IF EXISTS `test`.`image`;
CREATE TABLE  `test`.`image` (
  `id` int(11) NOT NULL,
  `type` int(11) NOT NULL,
  `width` int(11) NOT NULL,
  `height` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

DROP TABLE IF EXISTS `test`.`location_region`;
CREATE TABLE  `test`.`location_region` (
  `id` int(11) NOT NULL,
  `name` varchar(20) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

DROP TABLE IF EXISTS `test`.`location_state`;
CREATE TABLE  `test`.`location_state` (
  `id` int(11) NOT NULL,
  `name` varchar(20) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

DROP TABLE IF EXISTS `test`.`parts`;
CREATE TABLE  `test`.`parts` (
  `id` int(11) NOT NULL,
  `status` char(1) NOT NULL,
  `level` int(11) NOT NULL,
  `warehouse` int(11) NOT NULL,
  `updated` int(11) NOT NULL,
  `state_id` int(11) NOT NULL,
  `region_id` int(11) NOT NULL,
  `promotion_id` int(11) NOT NULL,
  `image_id` int(11) NOT NULL DEFAULT '1',
  PRIMARY KEY (`id`),
  KEY `status` (`status`,`level`,`warehouse`,`updated`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

DROP TABLE IF EXISTS `test`.`parts_category`;
CREATE TABLE  `test`.`parts_category` (
  `id` int(11) NOT NULL,
  `parts_id` int(11) NOT NULL,
  `category_id` int(11) NOT NULL,
  `main_category_id` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `ix_pc_cat_parts` (`category_id`,`parts_id`),
  KEY `ix_pc_main_parts` (`main_category_id`,`parts_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

DROP TABLE IF EXISTS `test`.`promotion`;
CREATE TABLE  `test`.`promotion` (
  `id` int(11) NOT NULL,
  `start_date` datetime NOT NULL,
  `end_date` datetime NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

and filled them with sample data:

并填写样本数据:

INSERT
INTO    parts
SELECT  id,
        CASE WHEN RAND() < 0.1 THEN 'A' ELSE 'B' END,
        RAND() * 100,
        RAND() * 100,
        RAND() * 100,
        RAND() * 50,
        RAND() * 50,
        RAND() * 50,
        RAND() * 50
FROM    t_source
LIMIT 500000;
INSERT
INTO    parts_category
SELECT  id,
        id,
        RAND() * 100,
        RAND() * 100
FROM    t_source
LIMIT 500000;
INSERT
INTO    location_state
SELECT  id, CONCAT('State ', id)
FROM    t_source
LIMIT 1000;
INSERT
INTO    location_region
SELECT  id, CONCAT('Region ', id)
FROM    t_source
LIMIT 1000;
INSERT
INTO    promotion
SELECT  id,
        '2009-07-22' - INTERVAL RAND() * 5 - 20 DAY,
        '2009-07-22' - INTERVAL RAND() * 5 DAY
FROM    t_source
LIMIT 1000;

The query above runs for 30 milliseconds and yields the following plan:

上面的查询运行30毫秒并产生以下计划:

1, 'PRIMARY', 'p', 'ref', 'status', 'status', '3', 'const', 107408, 'Using where'
1, 'PRIMARY', 'i', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.image_id', 1, ''
6, 'DEPENDENT SUBQUERY', 'pc', 'ref', 'ix_pc_cat_parts', 'ix_pc_cat_parts', '8', 'const,test.p.id', 1, 'Using index'
7, 'DEPENDENT UNION', 'pc', 'ref', 'ix_pc_main_parts', 'ix_pc_main_parts', '8', 'const,test.p.id', 1, 'Using index'
, 'UNION RESULT', '<union6,7>', 'ALL', '', '', '', '', , ''
5, 'DEPENDENT SUBQUERY', 'Promotion', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.promotion_id', 1, ''
4, 'DEPENDENT SUBQUERY', 'Promotion', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.promotion_id', 1, ''
3, 'DEPENDENT SUBQUERY', 'Location_Region', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.region_id', 1, ''
2, 'DEPENDENT SUBQUERY', 'Location_State', 'eq_ref', 'PRIMARY', 'PRIMARY', '4', 'test.p.state_id', 1, ''

As you can see, no temporary, no filesort, everything's very fast.

正如你所看到的,没有临时的,没有文件存储,一切都非常快。

To help you anymore, I just need to see how your tables are defined.

为了帮助你,我只需要看看你的表是如何定义的。

#2


1  

John, the problem is your query is constructed such that it is selecting from a derived table. The derived table cannot benefit from your indices. Try updating your query as follows:

John,问题是你的查询被构造成从派生表中选择。派生表无法从您的索引中受益。尝试更新您的查询,如下所示:

SELECT
  Parts.*,
  Image.type, Image.width, Image.height,
  Location_State.name AS state,
  Location_Region.name AS region,
  Promotion.start_date AS promotion_start_date,
  Promotion.end_date AS promotion_end_date
FROM Parts
LEFT JOIN Image ON Parts.image_id = Image.id
LEFT JOIN Location_State ON Parts.state_id = Location_State.id
LEFT JOIN Location_Region ON Parts.state_id = Location_Region.id
LEFT JOIN Promotion ON Parts.promotion_id = Promotion.id
INNER JOIN Parts_Category ON (Parts_Category.category_id = 40 OR Parts_Category.main_category_id = 40)
WHERE Parts.status = 'A'
GROUP BY Parts.id
ORDER BY Parts.level DESC, Parts.warehouse DESC, Parts.updated DESC LIMIT 0, 15

Note, if you don't need LEFT JOIN's for your Location_State, Location_Region, Promotion tables, then use an INNER JOIN instead. It will likely perform better.

注意,如果您的Location_State,Location_Region,Promotion表不需要LEFT JOIN,那么请使用INNER JOIN。它可能会表现得更好。

To further assist in optimizing this query, please provide the following:

为进一步协助优化此查询,请提供以下信息:

SHOW CREATE TABLE Parts;

If the rewritten query I provided works the same as your example (it should), then also provide:

如果我提供的重写查询与您的示例(它应该)相同,那么还提供:

EXPLAIN <my query here>\G