Why isn't my index used for a SELECT DISTINCT query on a text column?

I expected either index to be used for my SELECT DISTINCT query below:

CREATE TABLE test(
  value TEXT
);

INSERT INTO test (value) VALUES ('a');
INSERT INTO test (value) VALUES ('b');
INSERT INTO test (value) VALUES ('c');

CREATE INDEX value_i ON test(value(32));
CREATE FULLTEXT INDEX value_i_ft ON test(value);

SELECT DISTINCT value FROM test;

EXPLAIN SELECT DISTINCT value FROM test;

However, it seems not:

--------------
EXPLAIN SELECT DISTINCT value FROM test
--------------

+----+-------------+-------+------+---------------+------+---------+------+------+-----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra           |
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------+
|  1 | SIMPLE      | test  | ALL  | NULL          | NULL | NULL    | NULL |    2 | Using temporary |
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------+
1 row in set (0.00 sec)

I'm using mysql Ver 14.12 Distrib 5.0.77, for redhat-linux-gnu (i686) using readline 5.1. This kind of query is taking 1.5s on 70,000 rows with 250 distinct values, versus about 10ms on the same table for indexed integer columns.

3 Answers

#1

Check out http://dev.mysql.com/doc/refman/5.1/en/fulltext-fine-tuning.html

The minimum and maximum lengths of words to be indexed are defined by the ft_min_word_len and ft_max_word_len system variables. The default minimum length is 3. In your example all records are 1 char long.
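
To see which limits apply on a given server, the variables can be inspected directly. This is only a sketch: the search term 'example' is a made-up value, not something from the question. Note also that, as far as I know, MySQL only uses a FULLTEXT index for MATCH ... AGAINST searches, so it would not be expected to help a plain SELECT DISTINCT in any case.

```sql
-- Show the full-text settings, including ft_min_word_len and ft_max_word_len.
SHOW VARIABLES LIKE 'ft_%';

-- The kind of query a FULLTEXT index is meant for.
-- 'example' is a hypothetical search term, at least ft_min_word_len characters long.
SELECT value FROM test WHERE MATCH(value) AGAINST('example');
```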

#2

Does it do this for a fully-populated table as well?

In my experience (at least with MS SQL), the optimizer knows when a table is tiny (like your small example), decides that reading the index would be wasteful, and just scans the entire table.
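
One way to test this on the real 70,000-row table is to ask MySQL to prefer the index and compare the plan. A sketch using the index name from the question; whether the plan changes depends on the server version and the data:

```sql
-- Ask the optimizer to use value_i if it can, then inspect the plan.
EXPLAIN SELECT DISTINCT value FROM test FORCE INDEX (value_i);
```

If the plan still shows type ALL with no key, the optimizer is probably unable to use that index for this query, rather than merely choosing not to.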

#3

If the optimizer determines that it's quicker not to use the index, it won't. Are you sure the index has been populated by the time you run the query?

Try refreshing the table statistics before running the query. (RUNSTATS is DB2 syntax; the MySQL equivalent is ANALYZE TABLE.)

ANALYZE TABLE test;
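
To check whether the indexes exist and have statistics, MySQL can list them for the table. A minimal sketch, assuming the table from the question; the Cardinality column is an estimate of the number of distinct values in each index:

```sql
-- List the indexes on the table together with their estimated cardinality.
SHOW INDEX FROM test;
```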
