Percona MySQL 5.7
Percona MySQL 5.7
table scheeme:
表scheeme:
CREATE TABLE Developer.Rate (
ID bigint(20) UNSIGNED NOT NULL AUTO_INCREMENT,
TIME datetime NOT NULL,
BASE varchar(3) NOT NULL,
QUOTE varchar(3) NOT NULL,
BID double NOT NULL,
ASK double NOT NULL,
PRIMARY KEY (ID),
INDEX IDX_TIME (TIME),
UNIQUE INDEX IDX_UK (BASE, QUOTE, TIME)
)
ENGINE = INNODB
ROW_FORMAT = COMPRESSED;
I try to make request for latests data before selected period. The optimazer use no-complete unique key, only 2 columns of 3.
我尝试在选定的时间段之前请求获取最新的数据。optimazer使用不完整的唯一键,只有2列3。
If I do request in common way:
如我以共同的方式提出要求:
EXPLAIN FORMAT=JSON
SELECT
BID
FROM
Rate
WHERE
BASE = 'EUR'
AND QUOTE = 'USD'
AND `TIME` <= (NOW() - INTERVAL 1 MONTH)
ORDER BY
`TIME` DESC
LIMIT 1
;
"Explain" shows that only 2 first columns of index are used: BASE, QUOTE
“Explain”表示只使用了索引的前两列:BASE, QUOTE
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "10231052.40"
},
"ordering_operation": {
"using_filesort": false,
"table": {
"table_name": "Rate",
"access_type": "ref",
"possible_keys": [
"IDX_UK",
"IDX_TIME"
],
"key": "IDX_UK",
"used_key_parts": [
"BASE",
"QUOTE"
],
"key_length": "22",
"ref": [
"const",
"const"
],
"rows_examined_per_scan": 45966462,
"rows_produced_per_join": 22983231,
"filtered": "50.00",
"cost_info": {
"read_cost": "1037760.00",
"eval_cost": "4596646.20",
"prefix_cost": "10231052.40",
"data_read_per_join": "1G"
},
"used_columns": [
"ID",
"TIME",
"BASE",
"QUOTE",
"BID"
],
"attached_condition": "((`Developer`.`Rate`.`BASE` <=> 'EUR') and (`Developer`.`Rate`.`QUOTE` <=> 'USD') and (`Developer`.`Rate`.`TIME` <= <cache>((now() - interval 1 month))))"
}
}
}
}
But if you force the optimizer to use IDX_UK, MySQL uses all 3 columns in the request:
但是如果您强迫优化器使用IDX_UK,那么MySQL将使用请求中的所有3列:
EXPLAIN FORMAT=JSON
SELECT
BID
FROM
Rate FORCE INDEX(IDX_UK)
WHERE
BASE = 'EUR'
AND QUOTE = 'USD'
AND `TIME` <= (NOW() - INTERVAL 1 MONTH)
ORDER BY
`TIME` DESC
LIMIT 1
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "10231052.40"
},
"ordering_operation": {
"using_filesort": false,
"table": {
"table_name": "Rate",
"access_type": "range",
"possible_keys": [
"IDX_UK"
],
"key": "IDX_UK",
"used_key_parts": [
"BASE",
"QUOTE",
"TIME"
],
"key_length": "27",
"rows_examined_per_scan": 45966462,
"rows_produced_per_join": 15320621,
"filtered": "100.00",
"index_condition": "((`Developer`.`Rate`.`BASE` = 'EUR') and (`Developer`.`Rate`.`QUOTE` = 'USD') and (`Developer`.`Rate`.`TIME` <= <cache>((now() - interval 1 month))))",
"cost_info": {
"read_cost": "1037760.00",
"eval_cost": "3064124.31",
"prefix_cost": "10231052.40",
"data_read_per_join": "818M"
},
"used_columns": [
"ID",
"TIME",
"BASE",
"QUOTE",
"BID"
]
}
}
}
}
Why the optimizer don't use all 3 columns without explicit declaration of index?
为什么优化器不使用所有3列而没有显式的索引声明?
Added:
补充道:
A'm I understanding right, I should to use request like this?
A我理解对了吗?我应该使用这样的请求吗?
Reuest example:
活的例子:
EXPLAIN FORMAT=JSON
SELECT
BID
FROM
Rate
WHERE
BASE = 'EUR'
AND QUOTE = 'USD'
AND `TIME` <= (NOW() - INTERVAL 1 MONTH)
ORDER BY
BASE DESC, QUOTE DESC, TIME DESC
LIMIT 1
If I understand it right, the output of Explain vouldn't be better. There are still only 2 columns are used without TIME
如果我理解正确的话,解释的输出就不会更好。仍然只有2列没有时间使用
Explain Output
解释输出
{ "query_block": { "select_id": 1, "cost_info": { "query_cost": "10384642.20" }, "ordering_operation": { "using_filesort": false, "table": { "table_name": "Rate", "access_type": "ref", "possible_keys": [ "IDX_UK", "IDX_TIME" ], "key": "IDX_UK", "used_key_parts": [ "BASE", "QUOTE" ], "key_length": "22", "ref": [ "const", "const" ], "rows_examined_per_scan": 46734411, "rows_produced_per_join": 23367205, "filtered": "50.00", "index_condition": "((
Developer
.Rate
.BASE
<=> 'EUR') and (Developer
.Rate
.QUOTE
<=> 'USD') and (Developer
.Rate
.TIME
<= ((now() - interval 1 month))))", "cost_info": { "read_cost": "1037760.00", "eval_cost": "4673441.10", "prefix_cost": "10384642.20", "data_read_per_join": "1G" }, "used_columns": [ "ID", "TIME", "BASE", "QUOTE", "BID" ] } } } }
" {" " per_7_block ": "select_id": 1, " _cost_info ": {"query_cost": "10384642.20"}, "ordering_operation": {"using_filesort": false, "table": " table_example ": "Rate", "access_type": " _ref "基础<=> 'EUR')和(Developer.Rate)。报价<=> 'USD')和(Developer.Rate)。时间<=(现在(现在()-间隔1个月))”、“cost_info”:{“read_cost”:“1037760.00”,“eval_cost”:“4673441.10”,“prefix_cost”:“10384642.20”,“data_read_per_join”:“1 g”},“used_columns”:[“ID”、“时间”、“基地”,“引用”、“收购”)} } } }
Added 2:
补充2:
I made these 4 requests:
我提出了以下4个要求:
— 1 —
- 1 -
<code>FLUSH STATUS;
SELECT
BID
FROM
Rate
WHERE
BASE = 'EUR'
AND QUOTE = 'USD'
AND `TIME` <= (NOW() - INTERVAL 1 MONTH)
LIMIT 1;
SHOW SESSION STATUS LIKE 'Handler%';</code>
— 2 —
- 2
<code>FLUSH STATUS;
SELECT
BID
FROM
Rate FORCE INDEX (IDX_UK)
WHERE
BASE = 'EUR'
AND QUOTE = 'USD'
AND `TIME` <= (NOW() - INTERVAL 1 MONTH)
LIMIT 1;
SHOW SESSION STATUS LIKE 'Handler%';
</code>
— 3 —
- 3 -
<code>FLUSH STATUS;
SELECT
BID
FROM
Rate
WHERE
BASE = 'EUR'
AND QUOTE = 'USD'
AND `TIME` <= (NOW() - INTERVAL 1 MONTH)
ORDER BY
`TIME` DESC
LIMIT 1;
SHOW SESSION STATUS LIKE 'Handler%';</code>
— 4 —
- 4 -
<code>
FLUSH STATUS;
SELECT
BID
FROM
Rate FORCE INDEX (IDX_UK)
WHERE
BASE = 'EUR'
AND QUOTE = 'USD'
AND `TIME` <= (NOW() - INTERVAL 1 MONTH)
ORDER BY
`TIME` DESC
LIMIT 1;
SHOW SESSION STATUS LIKE 'Handler%';</code>
The output of session_status is the same in all requests except request 3. In output of request 3: Handler_read_prev = 486474; In output of all ather requests: Handler_read_prev = 0;
在所有请求中,session_status的输出都是相同的,除了请求3。在请求的输出中:Handler_read_prev = 486474;在所有通气请求的输出中:Handler_read_prev = 0;
Added 3:
添加3:
I made a copy of the table, removed Id field, promoted UNIQUE key as PRIMARY.
我复制了表,删除了Id字段,将唯一键提升为主键。
The scheme:
的方案:
CREATE TABLE Developer.Rate2 (
TIME datetime NOT NULL,
BASE varchar(3) NOT NULL,
QUOTE varchar(3) NOT NULL,
BID double NOT NULL,
ASK double NOT NULL,
PRIMARY KEY (BASE, QUOTE, TIME),
INDEX IDX_BID_ASK (BID, ASK)
)
ENGINE = INNODB
AVG_ROW_LENGTH = 26
CHARACTER SET utf8
COLLATE utf8_general_ci
ROW_FORMAT = COMPRESSED;
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "9673452.20"
},
"ordering_operation": {
"using_filesort": false,
"table": {
"table_name": "Rate2",
"access_type": "range",
"possible_keys": [
"PRIMARY"
],
"key": "PRIMARY",
"used_key_parts": [
"BASE",
"QUOTE",
"TIME"
],
"key_length": "27",
"rows_examined_per_scan": 48023345,
"rows_produced_per_join": 16006180,
"filtered": "100.00",
"cost_info": {
"read_cost": "68783.20",
"eval_cost": "3201236.12",
"prefix_cost": "9673452.20",
"data_read_per_join": "732M"
},
"used_columns": [
"TIME",
"BASE",
"QUOTE",
"BID"
],
"attached_condition": "((`Developer`.`Rate2`.`BASE` = 'EUR') and (`Developer`.`Rate2`.`QUOTE` = 'USD') and (`Developer`.`Rate2`.`TIME` <= <cache>((now() - interval 1 month))))"
}
}
}
}
Now the request really works and Explain shows all 3 columns are used. This variant works.
现在这个请求确实有效,Explain显示了所有3列都被使用。这种变体。
2 个解决方案
#1
1
Get rid of ID
, it is of no use. Promote your UNIQUE
key to be PRIMARY
. Now, magically, the query will be faster, and the Question you posed will become moot. (You may also need the DESC
trick that lorraine suggested.)
去掉ID就没用了。提升你独特的钥匙为主要。现在,神奇的是,查询将会更快,您提出的问题将变得毫无意义。(你也可能需要洛林建议的DESC技巧。)
Here's another technique to compare performance:
下面是另一种比较性能的技术:
FLUSH STATUS;
SELECT ...;
SHOW SESSION STATUS LIKE 'Handler%';
I would be interested to see the output from the SHOW
for with and without the DESC
trick. And with/without the FORCE INDEX
you alluded to.
我很想看看这个节目的输出有和没有DESC技巧。没有你提到的力指数。
Why faster? Your query was using a secondary index, but it needed bid
, which was not 'covered' by the index. To get bid
, the PRIMARY KEY
needed to be drilled down in the 'data'. By changing it so that the PK is used, this extra drill-down is obviated.
为什么快?您的查询使用的是辅助索引,但它需要投标,而索引没有“覆盖”该投标。为了获得投标,需要在“数据”中钻取主键。通过改变它,使PK被使用,这个额外的钻取被排除。
#2
1
The behavior you describe (ref access instead of range access over more columns) reminds me of Bug#81341 and Bug#87613. These bugs were fixed in MySQL 5.7.17 and 5.7.21, respectively. Which version are you using?
您描述的行为(引用访问而不是更多列上的范围访问)使我想起了Bug#81341和Bug#87613。这些错误分别在MySQL 5.7.17和5.7.21中被修复。你用的是哪个版本?
#1
1
Get rid of ID
, it is of no use. Promote your UNIQUE
key to be PRIMARY
. Now, magically, the query will be faster, and the Question you posed will become moot. (You may also need the DESC
trick that lorraine suggested.)
去掉ID就没用了。提升你独特的钥匙为主要。现在,神奇的是,查询将会更快,您提出的问题将变得毫无意义。(你也可能需要洛林建议的DESC技巧。)
Here's another technique to compare performance:
下面是另一种比较性能的技术:
FLUSH STATUS;
SELECT ...;
SHOW SESSION STATUS LIKE 'Handler%';
I would be interested to see the output from the SHOW
for with and without the DESC
trick. And with/without the FORCE INDEX
you alluded to.
我很想看看这个节目的输出有和没有DESC技巧。没有你提到的力指数。
Why faster? Your query was using a secondary index, but it needed bid
, which was not 'covered' by the index. To get bid
, the PRIMARY KEY
needed to be drilled down in the 'data'. By changing it so that the PK is used, this extra drill-down is obviated.
为什么快?您的查询使用的是辅助索引,但它需要投标,而索引没有“覆盖”该投标。为了获得投标,需要在“数据”中钻取主键。通过改变它,使PK被使用,这个额外的钻取被排除。
#2
1
The behavior you describe (ref access instead of range access over more columns) reminds me of Bug#81341 and Bug#87613. These bugs were fixed in MySQL 5.7.17 and 5.7.21, respectively. Which version are you using?
您描述的行为(引用访问而不是更多列上的范围访问)使我想起了Bug#81341和Bug#87613。这些错误分别在MySQL 5.7.17和5.7.21中被修复。你用的是哪个版本?