I have a keywords table like this:
+---------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(255) | YES | MUL | NULL | |
| country | varchar(2) | YES | | NULL | |
+---------+--------------+------+-----+---------+----------------+
And I have a compound index on [name, country]:
+----------+------------+------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type |
+----------+------------+------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+
| keywords | 0 | PRIMARY | 1 | id | A | 377729 | NULL | NULL | | BTREE |
| keywords | 1 | index_keywords_on_name_and_country | 1 | name | A | 377729 | NULL | NULL | YES | BTREE |
| keywords | 1 | index_keywords_on_name_and_country | 2 | country | A | 377729 | NULL | NULL | YES | BTREE |
+----------+------------+------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+
I need to use BINARY LOWER to compare the name field, so my query looks like this:
SELECT keywords.* FROM `keywords` WHERE (BINARY LOWER(`name`) = BINARY LOWER('Apple') AND `country` = 'US');
But the problem is that it's not using the index. Running EXPLAIN on the query gives:
+------+-------------+----------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+----------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | keywords | ALL | NULL | NULL | NULL | NULL | 366519 | Using where |
+------+-------------+----------+------+---------------+------+---------+------+--------+-------------+
However, if I select specific fields instead of select *, it does use the index:
Explain SELECT keywords.id, keywords.name FROM `keywords` WHERE (BINARY LOWER(`name`) = BINARY LOWER('Apple') AND `country` = 'US');
+------+-------------+----------+-------+---------------+------------------------------------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+----------+-------+---------------+------------------------------------+---------+------+--------+--------------------------+
| 1 | SIMPLE | keywords | index | NULL | index_keywords_on_name_and_country | 777 | NULL | 366519 | Using where; Using index |
+------+-------------+----------+-------+---------------+------------------------------------+---------+------+--------+--------------------------+
I'm using MySQL 5.5.
Any reason why this happens?
And is there a way to make my query use the index? Or how can I change my query and table so that the index is used to speed up the query?
Thanks
2 Solutions
#1
1
Why do you need to compare against BINARY LOWER()? This seems like a very odd requirement for keywords.
In any case, you could do this with subqueries:
SELECT k.*
FROM (SELECT k.*
FROM `keywords` k
WHERE name = 'Apple' and country = 'US'
) k
WHERE (BINARY LOWER(`name`) = BINARY LOWER('Apple') AND `country` = 'US');
The inner subquery should use the index. The outer filter then only has to scan the small subset of rows the subquery returns, so it should be fast.
#2
0
Yes, changing the character set (and collation) spoils the use of an index. The optimizer can't rely on the collation you specify ordering strings the same way they are stored in the index, so it doesn't use the index.
If you use a case-insensitive COLLATION, you don't have to do this BINARY LOWER expression at all.
mysql> select 'apple' = 'Apple';
+-------------------+
| 'apple' = 'Apple' |
+-------------------+
| 1 |
+-------------------+
The "ci" suffix in collations indicates case-insensitivity.
mysql> show session variables like 'collation%';
+----------------------+-------------------+
| Variable_name | Value |
+----------------------+-------------------+
| collation_connection | utf8_general_ci |
| collation_database | latin1_swedish_ci |
| collation_server | latin1_swedish_ci |
+----------------------+-------------------+
So just do the simple string comparison (provided you have set the collation order for this table to a ci collation):
SELECT keywords.* FROM `keywords` WHERE `name` = 'Apple' AND `country` = 'US';
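If the name column is not already on a case-insensitive collation, something along these lines could switch it over. This is only a sketch: it assumes the column uses the utf8 character set (the key_len of 777 in the question's EXPLAIN output is consistent with 3 bytes per character), and utf8_general_ci is just one possible "ci" collation. Changing the collation rebuilds the index, so try it on a copy of the table first.

```sql
-- Inspect the current collation of each column before changing anything.
SHOW FULL COLUMNS FROM keywords;

-- Assumption: utf8 character set; utf8_general_ci chosen as an example
-- of a case-insensitive collation. The column stays nullable, as in the
-- original table definition.
ALTER TABLE keywords
  MODIFY name VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci NULL;

-- Re-check the plan for the plain comparison; the hope is that
-- index_keywords_on_name_and_country now appears in the key column.
EXPLAIN SELECT keywords.* FROM keywords
WHERE name = 'Apple' AND country = 'US';
```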
Re your comment:
Comparing accented characters depends on the character set and collation.
mysql> SELECT 'Lé' = 'le';
mysql> SET NAMES latin1 COLLATE latin1_general_ci;
mysql> select 'lé' = 'Lé';
+---------------+
| 'lé' = 'Lé' |
+---------------+
| 1 |
+---------------+
mysql> select 'lé' = 'Le';
+--------------+
| 'lé' = 'Le' |
+--------------+
| 0 |
+--------------+
I can't find a Unicode collation in MySQL that treats accented characters as different but preserves case-insensitivity.
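That holds for MySQL 5.5, the version in the question. For readers on MySQL 8.0, which is outside the asker's setup, the utf8mb4_0900_as_ci collation (accent-sensitive, case-insensitive) may fit. A minimal sketch, assuming an 8.0 server and a utf8mb4 connection:

```sql
-- Assumption: MySQL 8.0 or later; this collation does not exist in 5.5.
SET NAMES utf8mb4;

-- Same letters, different case: expected to compare as equal.
SELECT 'lé' = 'Lé' COLLATE utf8mb4_0900_as_ci;

-- Accent differs: expected to compare as not equal.
SELECT 'lé' = 'Le' COLLATE utf8mb4_0900_as_ci;
```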