How can I optimize this single query from a single large-ish table (~75M rows)?
SELECT log_id
FROM score
WHERE class_id IN (17, 395)
ORDER BY date_reverse
LIMIT 10000;
I pull the most recent 10k records for a particular set of classes so that I can quickly know if they already exist or not during a larger import script.
I think I've indexed appropriately, but this query takes anywhere from 5 to 50 seconds!
Let me know if you need anything else.
EXPLAIN
SELECT log_id
FROM score
WHERE class_id IN (17, 395)
ORDER BY date_reverse
LIMIT 10000;
*** row 1 ***
table: score
type: range
possible_keys: class_id,score_multi_2,class_id_date_reverse,score_multi_5
key: class_id_date_reverse
key_len: 4
ref: NULL
rows: 1287726
Extra: Using where; Using index; Using filesort
CREATE TABLE `score` (
`log_id` bigint(20) NOT NULL,
`profile_id` bigint(20) DEFAULT NULL,
`date` datetime DEFAULT NULL,
`class_id` int(11) NOT NULL,
`score` float(10,6) DEFAULT NULL,
`score_date` datetime DEFAULT NULL,
`process_date` datetime DEFAULT NULL,
`status_type_id` int(3) NOT NULL DEFAULT '0',
`date_reverse` int(11) DEFAULT NULL,
UNIQUE KEY `unique_key` (`log_id`,`class_id`),
KEY `class_id` (`class_id`),
KEY `profile_id` (`profile_id`),
KEY `date` (`date`),
KEY `score` (`score`),
KEY `status_type_id` (`status_type_id`),
KEY `status_type_id_date` (`status_type_id`,`date`),
KEY `class_status_type_id_date_log_id` (`class_id`,`status_type_id`,`date`,`log_id`),
KEY `date_reverse` (`date_reverse`),
KEY `class_id_date_reverse` (`class_id`,`date_reverse`),
KEY `class_id_date_reverse_log_id` (`class_id`,`date_reverse`,`log_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
1 Answer
#1
My guess is that the fastest way to run this query is to bite the bullet and accept a sort of up to 20,000 rows: fetch the first 10,000 rows by date_reverse for each class separately, then sort the combined result and keep the first 10,000. The query I have in mind is:
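SELECT log_id
-- date_reverse is selected in each subquery so the outer ORDER BY can reference it
FROM ((SELECT log_id, date_reverse
       FROM score
       WHERE class_id = 17
       ORDER BY date_reverse
       LIMIT 10000
      ) UNION ALL
      (SELECT log_id, date_reverse
       FROM score
       WHERE class_id = 395
       ORDER BY date_reverse
       LIMIT 10000
      )
     ) s
ORDER BY date_reverse
LIMIT 10000;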
For this query, you want the composite index on score(class_id, date_reverse, log_id), which your table already has as class_id_date_reverse_log_id. Each subquery should use this index quite effectively, because within a single class_id the index entries are already ordered by date_reverse. However, the final sort of the combined rows will still need a filesort.
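One way to check that each branch really reads from the composite index is to run EXPLAIN on the rewritten query. The sketch below assumes the index is named class_id_date_reverse_log_id, as in the CREATE TABLE from the question. The FORCE INDEX hints are an optional addition for testing, not something the query requires, and are probably only worth keeping if the optimizer picks a different index on its own.

```sql
EXPLAIN
SELECT log_id
FROM ((SELECT log_id, date_reverse
       -- hint is an assumption for testing; drop it if the optimizer
       -- already chooses this index
       FROM score FORCE INDEX (class_id_date_reverse_log_id)
       WHERE class_id = 17
       ORDER BY date_reverse
       LIMIT 10000
      ) UNION ALL
      (SELECT log_id, date_reverse
       FROM score FORCE INDEX (class_id_date_reverse_log_id)
       WHERE class_id = 395
       ORDER BY date_reverse
       LIMIT 10000
      )
     ) s
ORDER BY date_reverse
LIMIT 10000;
```

If this behaves as hoped, the two inner SELECTs should show the composite index under key with no "Using filesort" in Extra, and only the outer query over the derived table should report a filesort. The exact output depends on your MySQL version, so treat this as something to verify rather than a guaranteed plan.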