I have several million records in the following table:
CREATE TABLE `customers` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`store_id` int(10) unsigned DEFAULT NULL,
`first_name` varchar(64) DEFAULT NULL,
`middle_name` varchar(64) DEFAULT NULL,
`last_name` varchar(64) DEFAULT NULL,
`email` varchar(128) DEFAULT NULL,
`phone` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_store_email` (`store_id`,`email`),
KEY `index_store_phone` (`store_id`,`phone`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
Query #1 takes ~800ms:
SELECT COUNT(*) FROM `customers` WHERE `store_id` = 1;
Query #2 takes ~1.5ms:
SELECT COUNT(*) FROM `customers` WHERE `store_id` = 1 AND `email` IS NULL;
Query #3 takes a whopping 5 seconds:
SELECT COUNT(*) FROM `customers` WHERE `store_id` = 1 AND `email` IS NOT NULL;
Notes:
- I've simplified the table to ask the question, but the query is identical.
- Yes, my table is optimized.
- Yes, both fields are indexed, see the create syntax above.
- There are only a few distinct store_id values, but every record has one.
- There are very few customers with email set to null.
I find a few things strange here:
- Query #1 is simplest! There are only a few possible INT values. Shouldn't it be fastest?
- Why is Query #3 so slow? I could cut the time by more than half by running the other two queries and subtracting #2 from #1, but I shouldn't have to.
Any thoughts on this seemingly basic question? Feel like I'm missing something simple. Did I sleep through a class in db school?
2 solutions
#1
At times the MySQL query optimizer guesses wrong when deciding which indexes to use. In cases like these, index hints can be useful (http://dev.mysql.com/doc/refman/5.7/en/index-hints.html).
To restrict the optimizer to the named indexes (it may still choose a table scan instead):
SELECT * FROM table1 USE INDEX (col1_index,col2_index)
WHERE col1=1 AND col2=2 AND col3=3;
To force the use of an index, so that a table scan is chosen only if none of the named indexes can be used:
SELECT * FROM table1 FORCE INDEX (col1_index,col2_index)
WHERE col1=1 AND col2=2 AND col3=3;
To ignore a certain index:
SELECT * FROM table1 IGNORE INDEX (col3_index)
WHERE col1=1 AND col2=2 AND col3=3;
To check which index is actually being used, use the EXPLAIN statement (https://dev.mysql.com/doc/refman/5.7/en/explain-output.html):
EXPLAIN SELECT * FROM table1
WHERE col1=1 AND col2=2 AND col3=3;
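Applied to the table in the question, the hints above might look like the following sketch. The index name index_store_email comes from the CREATE TABLE statement in the question; whether the hint actually speeds up Query #3 is an assumption to verify on your own data, not something this example guarantees.

```sql
-- See which index the optimizer picks on its own for the slow query.
EXPLAIN SELECT COUNT(*) FROM `customers`
WHERE `store_id` = 1 AND `email` IS NOT NULL;

-- Same query, but forcing the (store_id, email) index from the question.
EXPLAIN SELECT COUNT(*) FROM `customers` FORCE INDEX (`index_store_email`)
WHERE `store_id` = 1 AND `email` IS NOT NULL;
```

Comparing the key and rows columns of the two EXPLAIN results should show whether the optimizer was choosing a different index before the hint was added.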
#2
Drop the index on just (store_id); it is redundant, because the two other indexes both start with store_id.
This will probably also obviate the need for FORCE INDEX, etc.
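A minimal sketch of that cleanup follows. The simplified CREATE TABLE in the question does not show a single-column index on store_id, so the name index_store below is purely hypothetical; list the real indexes first and substitute the actual name.

```sql
-- List every index on the table to find the one that covers only store_id.
SHOW INDEX FROM `customers`;

-- Drop it. `index_store` is a hypothetical name used for illustration only.
ALTER TABLE `customers` DROP INDEX `index_store`;
```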
INDEX(store_id, email) works for all three queries.
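As an illustration of that point, all three counts can also be computed in one pass. This is only a sketch, not something from the original answer: it relies on the standard behavior that COUNT(expr) counts only rows where expr is not NULL, and the column aliases are made up for the example.

```sql
SELECT
  COUNT(*)                  AS total_customers,  -- same result as Query #1
  COUNT(*) - COUNT(`email`) AS email_is_null,    -- same result as Query #2
  COUNT(`email`)            AS email_not_null    -- same result as Query #3
FROM `customers`
WHERE `store_id` = 1;
```

Because the query only touches store_id and email, it should be answerable from the (store_id, email) index alone, though it is worth confirming that with EXPLAIN.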