知识点:Mysql 索引原理完全手册(1)
知识点:Mysql 索引原理完全手册(2)
知识点:Mysql 索引优化实战(3)
知识点:Mysql 数据库索引优化实战(4)
一:插入订单
业务逻辑:插入订单数据,为了避免重复导单,一般会通过交易号去数据库中查询,判断该订单是否已经存在。
最基础的sql语句
mysql> select * from book_order where order_id = "10000"; +-------+--------------------+-------+------+----------+--------------+----------+------------------+-------------+-------------+------------+---------------------+ | id | order_id | general | net | stock_id | order_status | description | finance_desc | create_type | order_state | creator | create_time | +-------+--------------------+-------+------+----------+--------------+----------+------------------+-------------+-------------+------------+---------------------+ | 10000 | 10000 | 6.6 | 6.13 | 1 | 10 | ok | ok | auto | 1 | itdragon | 2018-06-18 17:01:52 | +-------+--------------------+-------+------+----------+--------------+----------+------------------+-------------+-------------+------------+---------------------+ mysql> explain select * from book_order where order_id = "10000"; +----+-------------+---------------------+------------+------+---------------+------+---------+------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------------------+------------+------+---------------+------+---------+------+------+----------+-------------+ | 1 | SIMPLE | book_order | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 33.33 | Using where | +----+-------------+---------------------+------------+------+---------------+------+---------+------+------+----------+-------------+
几百上千万的订单,全表扫描!
通过explain命令可以清楚MySQL是如何处理sql语句的。打印的内容分别表示:
- id : 查询序列号为1。
- select_type : 查询类型是简单查询,简单的select语句没有union和子查询。
- table : 表是 book_order。
- partitions : 没有分区。
- type : 连接类型,all表示采用全表扫描的方式。
- possible_keys : 可能用到索引为null。
- key : 实际用到索引是null。
- key_len : 索引长度当然也是null。
- ref : 没有哪个列或者参数和key一起被使用。
- Extra : 使用了where查询。
是type为ALL,全表扫描,假设数据库中有几百万条数据,在没有索引的帮助下会异常卡顿。
初步优化:为order_id创建索引
mysql> create unique index idx_order_transaID on book_order (order_id); mysql> explain select * from book_order where order_id = "10000"; +----+-------------+---------------------+------------+-------+--------------------+--------------------+---------+-------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------------------+------------+-------+--------------------+--------------------+---------+-------+------+----------+-------+ | 1 | SIMPLE | book_order | NULL | const | idx_order_transaID | idx_order_transaID | 453 | const | 1 | 100 | NULL | +----+-------------+---------------------+------------+-------+--------------------+--------------------+---------+-------+------+----------+-------+
这里创建的索引是唯一索引,而非普通索引。
唯一索引打印的type值是const。表示通过索引一次就可以找到。即找到值就结束扫描返回查询结果。
普通索引打印的type值是ref。表示非唯一性索引扫描。找到值还要继续扫描,直到将索引文件扫描完为止。(这里没有贴出代码),显而易见,const的性能要远高于ref。并且根据业务逻辑来判断,创建唯一索引是合情合理的。
再次优化:覆盖索引
mysql> explain select order_id from book_order where order_id = "10000"; +----+-------------+---------------------+------------+-------+--------------------+--------------------+---------+-------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------------------+------------+-------+--------------------+--------------------+---------+-------+------+----------+-------------+ | 1 | SIMPLE | book_order | NULL | const | idx_order_transaID | idx_order_transaID | 453 | const | 1 | 100 | Using index | +----+-------------+---------------------+------------+-------+--------------------+--------------------+---------+-------+------+----------+-------------+
这里将select * from改为了select order_id from后,Extra 显示 Using index,表示该查询使用了覆盖索引,说明该sql语句的性能很好。若提示的是Using filesort(使用内部排序)和Using temporary(使用临时表)则表明该sql需要立即优化了。
根据业务逻辑来的,查询结构返回order_id 是可以满足业务逻辑要求的。
二:查询订单
业务逻辑:优先处理订单级别高,录入时间长的订单。
通过订单级别和订单录入时间排序
最基础的sql语句
mysql> explain select * from book_order order by order_state,create_time;
+----+-------------+---------------------+------------+------+---------------+------+---------+------+------+----------+----------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------------------+------------+------+---------------+------+---------+------+------+----------+----------------+ | 1 | SIMPLE | book_order | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100 | Using filesort | +----+-------------+---------------------+------------+------+---------------+------+---------+------+------+----------+----------------+
首先,采用全表扫描就不合理,还使用了文件排序Using filesort,更加拖慢了性能。 MySQL在4.1版本之前文件排序是采用双路排序的算法,由于两次扫描磁盘,I/O耗时太长。后优化成单路排序算法。其本质就是用空间换时间,但如果数据量太大,buffer的空间不足,会导致多次I/O的情况。其效果反而更差。与其找运维同事修改MySQL配置,还不如自己乖乖地建索引。
初步优化:为order_state,create_time 创建复合索引
mysql> create index idx_order_stateDate on book_order (order_state,create_time); mysql> explain select * from book_order order by order_state,create_time; +----+-------------+---------------------+------------+------+---------------+------+---------+------+------+----------+----------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------------------+------------+------+---------------+------+---------+------+------+----------+----------------+ | 1 | SIMPLE | book_order | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100 | Using filesort | +----+-------------+---------------------+------------+------+---------------+------+---------+------+------+----------+----------------+
创建复合索引后你会惊奇的发现,和没创建索引一样???都是全表扫描,都用到了文件排序。是索引失效?还是索引创建失败?
我们试着看看下面打印情况
mysql> explain select order_state,create_time from book_order order by order_state,create_time;
+----+-------------+---------------------+------------+-------+---------------+---------------------+---------+------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------------------+------------+-------+---------------+---------------------+---------+------+------+----------+-------------+ | 1 | SIMPLE | book_order | NULL | index | NULL | idx_order_stateDate | 68 | NULL | 3 | 100 | Using index | +----+-------------+---------------------+------------+-------+---------------+---------------------+---------+------+------+----------+-------------+
将select * from 换成了 select order_state,create_time from 后。type从all升级为index,表示(full index scan)全索引文件扫描,Extra也显示使用了覆盖索引。可是不对啊!!!!检索虽然快了,但返回的内容只有order_state和create_time 两个字段,让业务同事怎么用?难道把每个字段都建一个复合索引?
MySQL没有这么笨,可以使用force index 强制指定索引。在原来的sql语句上修改 force index(idx_order_stateDate) 即可。
mysql> explain select * from book_order force index(idx_order_stateDate) order by order_state,create_time; +----+-------------+---------------------+------------+-------+---------------+---------------------+---------+------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------------------+------------+-------+---------------+---------------------+---------+------+------+----------+-------+ | 1 | SIMPLE | book_order | NULL | index | NULL | idx_order_stateDate | 68 | NULL | 3 | 100 | NULL | +----+-------------+---------------------+------------+-------+---------------+---------------------+---------+------+------+----------+-------+
再次优化:订单级别真的要排序么?
对于这种重复且分布平均的字段,排序和加索引的作用不大。
我们能否先固定 order_state 的值,然后再给 create_time 排序?
mysql> explain select * from book_order where order_state=3 order by create_time;
+----+-------------+---------------------+------------+------+---------------------+---------------------+---------+-------+------+----------+-----------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------------------+------------+------+---------------------+---------------------+---------+-------+------+----------+-----------------------+ | 1 | SIMPLE | book_order | NULL | ref | idx_order_stateDate | idx_order_stateDate | 5 | const | 1 | 100 | Using index condition | +----+-------------+---------------------+------------+------+---------------------+---------------------+---------+-------+------+----------+-----------------------+
和之前的sql比起来,type从index 升级为 ref(非唯一性索引扫描)。索引的长度从68变成了5,说明只用了一个索引。ref也是一个常量。Extra 为Using index condition 表示自动根据临界值,选择索引扫描还是全表扫描。总的来说性能远胜于之前的sql。
小结
建索引:
-
- 主键,唯一索引
-
- 经常用作查询条件的字段需要创建索引
-
- 经常需要排序、分组和统计的字段需要建立索引
-
- 查询中与其他表关联的字段,外键关系建立索引
不要建索引:
-
- 百万级以下的数据不需要创建索引
-
- 经常增删改的表不需要创建索引
-
- 数据重复且分布平均的字段不需要创建索引
-
- 频发更新的字段不适合创建索引
-
- where条件里用不到的字段不需要创建索引
talk is esay , show me the code