MySQL数据库分区操作【RANGE】

时间:2021-02-20 00:21:39

客服平台,线上查询存在性能问题,为了解决或者说是缓解这个问题,除了加必要的索引,另外就是将表进行分区。

这里主要是针对既有的表进行分区,采用的是alter table xxx的方式,当然,也可以采用create table xxx partition by range(abc)的方式,都是可以的。两种方式,都验证和测试过,都可行!这里主要介绍alter的方式!

主要是因为alter的过程,遇到一点小小的问题,以备后查。

通过show create table 的方式查看我们的chat_message_history表,结构如下:

Table    Create Table
chat_message_history CREATE TABLE `chat_message_history` (
`id` int() NOT NULL AUTO_INCREMENT,
`visitor_id` varchar() DEFAULT NULL,
`visitor_name` varchar() DEFAULT NULL,
`contentBlob` blob,
`sender` varchar() DEFAULT NULL,
`message_time` datetime DEFAULT NULL COMMENT '消息发送时间',
`jobId` varchar() DEFAULT NULL,
`robot_response` varchar() DEFAULT NULL COMMENT '机器人的回复消息',
`skill_group_id` varchar() DEFAULT NULL,
`type` varchar() DEFAULT NULL,
`new_skill_group_id` varchar() DEFAULT NULL,
`channel` varchar() DEFAULT 'WXAPP' COMMENT '渠道',
`message_id` varchar() DEFAULT NULL,
`sessionId` varchar() DEFAULT NULL,
`message_status` varchar() DEFAULT NULL,
`error_message` varchar() DEFAULT NULL,
`businessType` varchar() DEFAULT NULL COMMENT '1-欢迎语;',
`pFlag` varchar() DEFAULT NULL COMMENT '消息的产品属性 1-微医保 2-微重疾',
PRIMARY KEY (`id`),
KEY `IDX_jobId` (`jobId`),
KEY `idx_his_vis_ctdesc_key` (`visitor_id`,`skill_group_id`,`message_time`)
) DEFAULT CHARSET=utf8

然后就是alter table的方式添加分区,分区按照消息时间,大体是每个月一个分区:

alter table chat_message_history partition by range(to_days(message_time)) (
partition p201708 values less than (to_days('2017-08-31')),
partition p201709 values less than (to_days('2017-09-30')),
partition p201710 values less than (to_days('2017-10-31')),
partition p201711 values less than (to_days('2017-11-30')),
partition p201712 values less than (to_days('2017-12-31')),
partition p201801 values less than (to_days('2018-01-31')),
partition p201802 values less than (to_days('2018-02-30')),
partition p201803 values less than (to_days('2018-03-31')),
partition p201804 values less than (to_days('2018-04-30')),
partition p201805 values less than (to_days('2018-05-31')),
partition p201806 values less than (to_days('2018-06-30')),
partition p201807 values less than (to_days('2018-07-31')),
partition p201808 values less than (to_days('2018-08-31')),
partition p201809 values less than (to_days('2018-09-30')),
partition p201810 values less than (to_days('2018-10-31')),
partition p201811 values less than (to_days('2018-11-30')),
partition p201812 values less than (to_days('2018-12-31')),
partition p201901 values less than (to_days('2019-01-31')),
partition p201902 values less than (to_days('2019-02-30')),
partition p201903 values less than (to_days('2019-03-31')),
partition p201904 values less than (to_days('2019-04-30')),
partition p201905 values less than (to_days('2019-05-31')),
partition p201906 values less than (to_days('2019-06-30')),
partition p201907 values less than (to_days('2019-07-31')),
partition p201908 values less than (to_days('2019-08-31')),
partition p201909 values less than (to_days('2019-09-30')),
partition p201910 values less than (to_days('2019-10-31')),
partition p201911 values less than (to_days('2019-11-30')),
partition p201912 values less than (to_days('2019-12-31')),
partition p202001 values less than (to_days('2020-01-31')),
partition p202002 values less than (to_days('2020-02-30')),
partition p202003 values less than (to_days('2020-03-31')),
partition p202004 values less than (to_days('2020-04-30')),
partition p202005 values less than (to_days('2020-05-31')),
partition p202006 values less than (to_days('2020-06-30')),
partition p202007 values less than (to_days('2020-07-31')),
partition p202008 values less than (to_days('2020-08-31')),
partition p202009 values less than (to_days('2020-09-30')),
partition p202010 values less than (to_days('2020-10-31')),
partition p202011 values less than (to_days('2020-11-30')),
partition p202012 values less than (to_days('2020-12-31')),
PARTITION p202XYZ VALUES LESS THAN (MAXVALUE));

上述SQL执行报错:

ERROR  (HY000): Not allowed to use NULL value in VALUES LESS THAN

仔细查看,上述的LESS THAN后面没有NULL的值啊,都是写的很明确的年月日进行获取天数来得到分界线的啊。。。 最后研究下to_days(expr)函数,

官方文档:

TO_DAYS(date)

Given a date date, returns a day number (the number of days since year 0).

我怀疑是因为我给定的每年的2月份的年月日信息不合法,验证一下:

mysql> select TO_DAYS('2017-02-30');
+-----------------------+
| TO_DAYS('2017-02-30') |
+-----------------------+
| NULL |
+-----------------------+
row in set, warning (0.00 sec) mysql>
mysql> show warnings;
+---------+------+----------------------------------------+
| Level | Code | Message |
+---------+------+----------------------------------------+
| Warning | | Incorrect datetime value: '2017-02-30' |
+---------+------+----------------------------------------+
row in set (0.00 sec)

结合上述错误提示,将分区SQL语句调整一下如下:

alter table chat_message_history partition by range(to_days(message_time)) (
partition p201708 values less than (to_days('2017-09-01')),
partition p201709 values less than (to_days('2017-10-01')),
partition p201710 values less than (to_days('2017-11-01')),
partition p201711 values less than (to_days('2017-12-01')),
partition p201712 values less than (to_days('2018-01-01')),
partition p201801 values less than (to_days('2018-02-01')),
partition p201802 values less than (to_days('2018-03-01')),
partition p201803 values less than (to_days('2018-04-01')),
partition p201804 values less than (to_days('2018-05-01')),
partition p201805 values less than (to_days('2018-06-01')),
partition p201806 values less than (to_days('2018-07-01')),
partition p201807 values less than (to_days('2018-08-01')),
partition p201808 values less than (to_days('2018-09-01')),
partition p201809 values less than (to_days('2018-10-01')),
partition p201810 values less than (to_days('2018-11-01')),
partition p201811 values less than (to_days('2018-12-01')),
partition p201812 values less than (to_days('2019-01-01')),
partition p201901 values less than (to_days('2019-02-01')),
partition p201902 values less than (to_days('2019-03-01')),
partition p201903 values less than (to_days('2019-04-01')),
partition p201904 values less than (to_days('2019-05-01')),
partition p201905 values less than (to_days('2019-06-01')),
partition p201906 values less than (to_days('2019-07-01')),
partition p201907 values less than (to_days('2019-08-01')),
partition p201908 values less than (to_days('2019-09-01')),
partition p201909 values less than (to_days('2019-10-01')),
partition p201910 values less than (to_days('2019-11-01')),
partition p201911 values less than (to_days('2019-12-01')),
partition p201912 values less than (to_days('2020-01-01')),
partition p202001 values less than (to_days('2020-02-01')),
partition p202002 values less than (to_days('2020-03-01')),
partition p202003 values less than (to_days('2020-04-01')),
partition p202004 values less than (to_days('2020-05-01')),
partition p202005 values less than (to_days('2020-06-01')),
partition p202006 values less than (to_days('2020-07-01')),
partition p202007 values less than (to_days('2020-08-01')),
partition p202008 values less than (to_days('2020-09-01')),
partition p202009 values less than (to_days('2020-10-01')),
partition p202010 values less than (to_days('2020-11-01')),
partition p202011 values less than (to_days('2020-12-01')),
partition p202012 values less than (to_days('2021-01-01')),
PARTITION p202XYZ VALUES LESS THAN (MAXVALUE));

执行后还是报错:

ERROR  (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

这个错误是说,分区函数里面,主键必须包含所有的用于建立分区的列。我这里分区,是按照message_time进行分区,所以,这里将message_time和既有的id主键建立联合主键。SQL如下(先删除既有的id主键,再建联合主键):

alter table chat_message_history drop primary key,add primary key (`id`,`message_time`); 

再次执行创建分区的SQL:

mysql> alter table chat_message_history partition by range(to_days(message_time)) (
-> partition p201708 values less than (to_days('2017-09-01')),
-> partition p201709 values less than (to_days('2017-10-01')),
-> partition p201710 values less than (to_days('2017-11-01')),
-> partition p201711 values less than (to_days('2017-12-01')),
-> partition p201712 values less than (to_days('2018-01-01')),
-> partition p201801 values less than (to_days('2018-02-01')),
-> partition p201802 values less than (to_days('2018-03-01')),
-> partition p201803 values less than (to_days('2018-04-01')),
-> partition p201804 values less than (to_days('2018-05-01')),
-> partition p201805 values less than (to_days('2018-06-01')),
-> partition p201806 values less than (to_days('2018-07-01')),
-> partition p201807 values less than (to_days('2018-08-01')),
-> partition p201808 values less than (to_days('2018-09-01')),
-> partition p201809 values less than (to_days('2018-10-01')),
-> partition p201810 values less than (to_days('2018-11-01')),
-> partition p201811 values less than (to_days('2018-12-01')),
-> partition p201812 values less than (to_days('2019-01-01')),
-> partition p201901 values less than (to_days('2019-02-01')),
-> partition p201902 values less than (to_days('2019-03-01')),
-> partition p201903 values less than (to_days('2019-04-01')),
-> partition p201904 values less than (to_days('2019-05-01')),
-> partition p201905 values less than (to_days('2019-06-01')),
-> partition p201906 values less than (to_days('2019-07-01')),
-> partition p201907 values less than (to_days('2019-08-01')),
-> partition p201908 values less than (to_days('2019-09-01')),
-> partition p201909 values less than (to_days('2019-10-01')),
-> partition p201910 values less than (to_days('2019-11-01')),
-> partition p201911 values less than (to_days('2019-12-01')),
-> partition p201912 values less than (to_days('2020-01-01')),
-> partition p202001 values less than (to_days('2021-02-01')),
-> partition p202002 values less than (to_days('2020-03-01')),
-> partition p202003 values less than (to_days('2020-04-01')),
-> partition p202004 values less than (to_days('2020-05-01')),
-> partition p202005 values less than (to_days('2020-06-01')),
-> partition p202006 values less than (to_days('2020-07-01')),
-> partition p202007 values less than (to_days('2020-08-01')),
-> partition p202008 values less than (to_days('2020-09-01')),
-> partition p202009 values less than (to_days('2020-10-01')),
-> partition p202010 values less than (to_days('2020-11-01')),
-> partition p202011 values less than (to_days('2020-12-01')),
-> partition p202012 values less than (to_days('2021-01-01')),
-> PARTITION p202XYZ VALUES LESS THAN (MAXVALUE));
Query OK, rows affected (1.28 sec)
Records: Duplicates: Warnings:

这回成功了,真是折腾!!!

现在就要来验证一下,我们的分区是否起到作用了。主要是进行对比呗,先看没有建立分区的SQL查询:

mysql> explain select * from chat_message_history where message_time > '2017-12-01' and message_time < '2018-01-01';
+----+-------------+----------------------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------+------+---------------+------+---------+------+---------+-------------+
| | SIMPLE | chat_message_history | ALL | NULL | NULL | NULL | NULL | 5103176 | Using where |
+----+-------------+----------------------+------+---------------+------+---------+------+---------+-------------+
row in set (0.00 sec)

涉及到表扫描行数是5103176,这个表一共530W行记录,这里就扫描了510W行,够可以的。。。

那么,加了分区后呢?请看下面的SQL查询:

mysql> explain select * from chat_message_history where message_time > '2017-12-01' and message_time < '2018-01-01';
+----+-------------+----------------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------+------+---------------+------+---------+------+--------+-------------+
| | SIMPLE | chat_message_history | ALL | NULL | NULL | NULL | NULL | 829848 | Using where |
+----+-------------+----------------------+------+---------------+------+---------+------+--------+-------------+
row in set (0.00 sec)

这回查询扫描的行数,就变成了80多万行了,少了不少啊!

从这次分区看,分区查询和不分区查询,影响到的扫描行数还是挺明显的。

总结一下:

1,MySQL数据量达到几百万后,多表联合查询时,性能极其不稳定,这个是我们线上系统的真实写照,几天内,两次查询导致数据库连接数耗尽,这次600个连接,全部占用,导致系统不可用!

2,数据量大了,采用分区,或者加索引,可以缓解眼前的问题,但是,随着时间推移,若查询数据量不做限制,最终还是会出现查询响应非常慢的问题。所以,建议采用数据分割或者说是表拆分的方式,基于一定的业务场景或者需要进行,可以保证系统的高可用性。

MySQL数据库分区操作【RANGE】的更多相关文章

  1. (转)运维角度浅谈MySQL数据库优化

    转自:http://lizhenliang.blog.51cto.com/7876557/1657465 一个成熟的数据库架构并不是一开始设计就具备高可用.高伸缩等特性的,它是随着用户量的增加,基础架 ...

  2. MySQL数据库在linux的安装,编程与操作

    一.安装 ubuntu上安装MySQL非常简单只需要几条命令就可以完成. 1. sudo apt-get install mysql-server   2. apt-get isntall mysql ...

  3. 配合crond服务实现自定义周期备份MySQL数据库(使用innobackupex进行备份)

    备份 新建一个脚本/root/backup.py,内容如下: #!/usr/bin/env python# -*- coding: utf-8 -*- ''' 脚本作者:昨夜星辰 脚本作用:配合cro ...

  4. MySQL数据库之数据类型BOOL&sol;BOOLEAN与TINYINT测试总结

    From: http://database.51cto.com/art/201203/323863.htm 网络上很多人咨询mysql是否提供布尔类型?MySQL数据库确实提供布尔类型,此数据类型的关 ...

  5. python2&period;7爬取豆瓣电影top250并写入到TXT,Excel,MySQL数据库

    python2.7爬取豆瓣电影top250并分别写入到TXT,Excel,MySQL数据库 1.任务 爬取豆瓣电影top250 以txt文件保存 以Excel文档保存 将数据录入数据库 2.分析 电影 ...

  6. 利用python2&period;7正则表达式进行豆瓣电影Top250的网络数据采集及MySQL数据库操作

    转载请注明出处 利用python2.7正则表达式进行豆瓣电影Top250的网络数据采集 1.任务 采集豆瓣电影名称.链接.评分.导演.演员.年份.国家.评论人数.简评等信息 将以上数据存入MySQL数 ...

  7. python【第十二篇下】操作MySQL数据库以及ORM之 sqlalchemy

    内容一览: 1.Python操作MySQL数据库 2.ORM sqlalchemy学习 1.Python操作MySQL数据库 2. ORM sqlachemy 2.1 ORM简介 对象关系映射(英语: ...

  8. 用python3&period;x与mysql数据库构建简单的爬虫系统(转)

    这是在博客园的第一篇文章,由于本人还是一个编程菜鸟,也写不出那些高大上的牛逼文章,这篇文章就是对自己这段时间学习python的一个总结吧. 众所周知python是一门对初学编程的人相当友好的编程语言, ...

  9. 第 8 章 MySQL 数据库 Query 的优化

      前言: 在之前“影响 MySQL 应用系统性能的相关因素”一章中我们就已经分析过了Query语句对数据库性能的影响非常大,所以本章将专门针对 MySQL 的 Query 语句的优化进行相应的分析. ...

随机推荐

  1. No&period;25

    每天三件事必做: 1.背单词: 2.跑步: 3.读书.

  2. Windows Azure Web Site &lpar;13&rpar; Azure Web Site备份

    <Windows Azure Platform 系列文章目录> 我们在使用Windows Azure Web Site的时候,经常会遇到需要对Web Site进行备份的情况.在这里笔者简单 ...

  3. mysqli&lowbar;multi&lowbar;query&lpar;&dollar;link&comma; &dollar;wsql&rpar;

    if (mysqli_multi_query($link, $wsql)) { do { if ($result = mysqli_store_result($link)) { mysqli_free ...

  4. 现代程序设计 homework-07

    现代程序设计 homework-07 这次作业是要阅读C++11的新特性,按照老师blog提供的链接稍微学习了一下,一下就是一些学习总结(或者说就是介绍)之类的:由于英文能力有限,并且很多中文资料也都 ...

  5. yumdebug

  6. 对boost&colon;&colon;shared&lowbar;from&lowbar;this的进一步封装

    对boost::shared_from_this的进一步封装 熟悉异步编程的同学可能会对boost::shared_from_this有所了解.我们在传入回调的时候,通常会想要其带上当前类对象的上下文 ...

  7. java 包之 BeanUtils包的使用

    BeanUtils工具包是由Apache公司所开发,主要是方便程序员对Bean类能够进行简便的操作. 在使用BeanUtils工具包之前我们需要的Jar包有以下几种: (1)   BeanUtils相 ...

  8. Educational Codeforces Round 15&lowbar;C&period; Cellular Network

    C. Cellular Network time limit per test 3 seconds memory limit per test 256 megabytes input standard ...

  9. ORACLE设置自启动记录

    设置开机自启动1. 修改Oracle系统配置文件:/etc/oratab,只有这样,Oracle 自带的dbstart和dbshut才能够发挥作用.[root@hailiang ~]# vi /etc ...

  10. docker搭建elk

    docker run -d -p 5601:5601 -p 9200:9200  -p 5044:5044 -v /opt/data/elk-data:/var/lib/elasticsearch - ...