I have a table game_log with fields id, game_id, and several varchar fields.
id is the primary key and game_id is a non-unique key.
There are two frequent queries:
SELECT * FROM game_log ORDER BY id DESC LIMIT 20
SELECT * FROM game_log WHERE game_id = <value> ORDER BY id DESC
The table is huge (6.1 GB, 32M rows) and uses InnoDB. Rows are added at random, one per query. Some games are also deleted.
I need to reduce disk IO and improve responsiveness.
Should I use key or range partitioning? If range, then by id or by game_id? Is there any theory?
1 solution
Use partitioning by range.
If you partition by key, both of your example queries have to touch every partition.
The theory is that partitioning by KEY is like partitioning by HASH: consecutive values of the primary key are bound to be stored in separate partitions. So when you query a range of id values, partition pruning cannot help, because the matching rows are spread over all partitions.
Demo:
CREATE TABLE `game_log` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`game_id` int(11) NOT NULL DEFAULT '0',
`xyz` varchar(15) DEFAULT NULL,
PRIMARY KEY (`id`,`game_id`)
)
PARTITION BY KEY ()
PARTITIONS 13;
INSERT INTO game_log (game_id) VALUES (1), (2), (3), (4), (5), (6);
EXPLAIN PARTITIONS SELECT * FROM game_log ORDER BY id DESC LIMIT 3\G
id: 1
select_type: SIMPLE
table: game_log
partitions: p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12
EXPLAIN PARTITIONS SELECT * FROM game_log WHERE game_id = 4 ORDER BY id DESC LIMIT 3\G
id: 1
select_type: SIMPLE
table: game_log
partitions: p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12
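To check where the six rows actually landed, you can ask MySQL directly. This is only a sketch: it assumes the KEY-partitioned demo table above and MySQL 5.6 or later (needed for explicit partition selection). The partition name p0 is picked arbitrarily, and TABLE_ROWS is an estimate for InnoDB, not an exact count.

```sql
-- Approximate number of rows stored in each partition of game_log.
-- For InnoDB tables TABLE_ROWS is an estimate.
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'game_log';

-- Exact contents of a single partition (p0 chosen arbitrarily).
-- Repeat with other partition names to see how the ids are scattered.
SELECT id, game_id FROM game_log PARTITION (p0);
```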
Whereas if you partition by range on game_id, partition pruning helps you at least when you query for a specific game_id. But the query that orders by id DESC without filtering on game_id is still bound to touch every partition.
CREATE TABLE `game_log` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`game_id` int(11) NOT NULL DEFAULT '0',
`xyz` varchar(15) DEFAULT NULL,
PRIMARY KEY (`id`,`game_id`)
)
PARTITION BY RANGE (game_id)
(PARTITION p1 VALUES LESS THAN (3),
PARTITION p2 VALUES LESS THAN (6),
PARTITION p3 VALUES LESS THAN MAXVALUE);
INSERT INTO game_log (game_id) VALUES (1), (2), (3), (4), (5), (6);
EXPLAIN PARTITIONS SELECT * FROM game_log ORDER BY id DESC LIMIT 3\G
id: 1
select_type: SIMPLE
table: game_log
partitions: p1,p2,p3
EXPLAIN PARTITIONS SELECT * FROM game_log WHERE game_id = 4 ORDER BY id DESC LIMIT 3\G
id: 1
select_type: SIMPLE
table: game_log
partitions: p2
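On newer servers the PARTITIONS keyword should not be needed: as far as I know it was deprecated in MySQL 5.7 and removed in 8.0, where plain EXPLAIN already prints a partitions column. The sketch below assumes the RANGE-partitioned demo table above. The BETWEEN predicate is a hypothetical query, not one of the two from the question, added only to see how pruning behaves for a range on the partitioning column.

```sql
-- Plain EXPLAIN (MySQL 5.7+) includes the partitions column.
EXPLAIN SELECT * FROM game_log WHERE game_id = 4 ORDER BY id DESC LIMIT 3\G

-- Hypothetical range predicate on the partitioning column; check which
-- partitions are listed in the output rather than assuming the result.
EXPLAIN SELECT * FROM game_log WHERE game_id BETWEEN 3 AND 5 ORDER BY id DESC\G
```

If the partitions column lists fewer partitions than the table has, pruning is working for that predicate.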