How can I improve the performance of COUNT(DISTINCT field1) ... GROUP BY field2?

I have the following query:

EXPLAIN SELECT COUNT(DISTINCT ip_address) as ip_address, exec_date
    FROM requests
    GROUP BY exec_date;

id  select_type table       type        possible_keys   key         key_len ref      rows   Extra
1   SIMPLE      requests    range       NULL            daily_ips   263     NULL    488213  Using index for group-by (scanning)

It uses the covering index daily_ips:

Table       Non_unique  Key_name    Seq_in_index    Column_name Collation   Cardinality Sub_part    Packed  Null    Index_type  Comment Index_comment
requests    1           daily_ips   1               exec_date   A           16          NULL        NULL    YES BTREE       
requests    1           daily_ips   2               ip_address  A           483492      NULL        NULL    YES BTREE

Is there any way I can further optimize this query?

What exactly does Using index for group-by (scanning) mean? Does it mean that the GROUP BY clause is resolved entirely from the index, while the COUNT(DISTINCT ip_address) part of the statement is not?

2 solutions

#1

Based on the data you've provided, I don't see any way you can further optimize the query.

As to your follow-up question, the MySQL manual's description of the EXPLAIN output value Using index for group-by says:

Similar to the Using index table access method, Using index for group-by indicates that MySQL found an index that can be used to retrieve all columns of a GROUP BY or DISTINCT query without any extra disk access to the actual table. Additionally, the index is used in the most efficient way so that for each group, only a few index entries are read. For details, see Section 8.13.10, “GROUP BY Optimization”.

Your index is particularly well-suited to speeding up your query. Because only indexed fields are being selected (each column in your query also appears in the index), MySQL may not even need to hit the table at all, as all the relevant data appears in the index.
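
To see what "all the relevant data appears in the index" looks like in practice, here is a minimal sketch. Note that it is not MySQL: it uses SQLite through Python's sqlite3 module as a stand-in, so the plan text differs from MySQL's EXPLAIN and there is no Using index for group-by line. The sample rows and the user_agent column are made up for illustration; only the table name, the two indexed columns, and the index name come from the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests ("
    " id INTEGER PRIMARY KEY,"
    " exec_date TEXT,"
    " ip_address TEXT,"
    " user_agent TEXT)"  # hypothetical extra column that is NOT in the index
)
# Same column order as the daily_ips index in the question.
conn.execute("CREATE INDEX daily_ips ON requests (exec_date, ip_address)")

# Made-up sample data.
rows = [
    ("2024-01-01", "10.0.0.1", "a"),
    ("2024-01-01", "10.0.0.1", "b"),  # duplicate IP on the same day
    ("2024-01-01", "10.0.0.2", "a"),
    ("2024-01-02", "10.0.0.3", "a"),
    ("2024-01-02", None, "a"),        # NULLs are ignored by COUNT(DISTINCT ...)
]
conn.executemany(
    "INSERT INTO requests (exec_date, ip_address, user_agent) VALUES (?, ?, ?)",
    rows,
)

query = (
    "SELECT COUNT(DISTINCT ip_address) AS distinct_ips, exec_date "
    "FROM requests GROUP BY exec_date"
)

# The fourth column of each EXPLAIN QUERY PLAN row is the human-readable detail.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
counts = conn.execute(query).fetchall()  # rows of (distinct_ips, exec_date)

print(plan)
print(counts)
```

On the SQLite builds I have seen, the plan reports a covering index scan on daily_ips, which is the same idea as the Using index family of values in MySQL: every column the query touches is in the index, so the table rows themselves are never read.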

If executing a query were like searching on Google, imagine not having to click through to any of the linked sites because the information you wanted was right there in the search results. Not needing to read the table data is a bit like that. Here is some more information on how MySQL uses indexes:

In some cases, a query can be optimized to retrieve values without consulting the data rows. (An index that provides all the necessary results for a query is called a covering index.) If a query uses only columns from a table that are numeric and that form a leftmost prefix for some key, the selected values can be retrieved from the index tree for greater speed:

SELECT key_part3 FROM tbl_name WHERE key_part1=1
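
A rough way to check whether a query is covered is to compare the plan for a covered column against the plan for a column that is not in the index. The sketch below again uses SQLite via Python's sqlite3 as a stand-in rather than MySQL, so only the general idea carries over. The index name idx_parts, the column other_col, and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_name ("
    " key_part1 INTEGER,"
    " key_part2 INTEGER,"
    " key_part3 INTEGER,"
    " other_col TEXT)"  # hypothetical column outside the index
)
# key_part1 is the leftmost prefix; key_part3 is also stored in the index.
conn.execute(
    "CREATE INDEX idx_parts ON tbl_name (key_part1, key_part2, key_part3)"
)
conn.executemany(
    "INSERT INTO tbl_name VALUES (?, ?, ?, ?)",
    [(1, 10, 100, "x"), (1, 20, 200, "y"), (2, 30, 300, "z")],
)


def plan_for(sql):
    """Return the detail strings from SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Only indexed columns are referenced: the index alone can answer this.
covered = plan_for("SELECT key_part3 FROM tbl_name WHERE key_part1 = 1")

# other_col is not in the index: the table row must be fetched as well.
not_covered = plan_for("SELECT other_col FROM tbl_name WHERE key_part1 = 1")

values = sorted(
    r[0] for r in conn.execute("SELECT key_part3 FROM tbl_name WHERE key_part1 = 1")
)

print(covered)
print(not_covered)
print(values)
```

In SQLite the first plan says COVERING INDEX while the second says only INDEX. In MySQL the comparable signal is Using index appearing (or not) in the Extra column of EXPLAIN.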

#2

You can use Objectify:

// YourEntity, "propertyName", and value are placeholders: your entity class, the property to filter on, and the value to match.
Objectify ofy = ObjectifyService.begin();
List<YourEntity> results = ofy.query(YourEntity.class).filter("propertyName", value).list();

Before that, you might need to add the Objectify jar to your project.
