1,关于MySQL Group Replication
基于组的复制(Group-basedReplication)是一种被使用在容错系统中的技术。Replication-group(复制组)是由能够相互通信的多个服务器(节点)组成的。
在通信层,Groupreplication实现了一系列的机制:比如原子消息(atomicmessage delivery)和全序化消息(totalorderingof messages)。这些原子化,抽象化的机制,为实现更先进的数据库复制方案提供了强有力的支持。
MySQL Group Replication正是基于这些技术和概念,实现了一种多主全更新的复制协议。
简而言之,一个Replication-group就是一组节点,每个节点都可以独立执行事务,而读写事务则会在于group内的其他节点进行协调之后再commit。
因此,当一个事务准备提交时,会自动在group内进行原子性的广播,告知其他节点变更了什么内容/执行了什么事务。
这种原子广播的方式,使得这个事务在每一个节点上都保持着同样顺序。
这意味着每一个节点都以同样的顺序,接收到了同样的事务日志,所以每一个节点以同样的顺序重演了这些事务日志,最终整个group保持了完全一致的状态。
然而,不同的节点上执行的事务之间有可能存在资源争用。这种现象容易出现在两个不同的并发事务上。
假设在不同的节点上有两个并发事务,更新了同一行数据,那么就会发生资源争用。
面对这种情况,GroupReplication判定先提交的事务为有效事务,会在整个group里面重演,后提交的事务会直接中断,或者回滚,最后丢弃掉。
因此,这也是一个无共享的复制方案,每一个节点都保存了完整的数据副本。看如下图片01.png,描述了具体的工作流程,能够简洁的和其他方案进行对比。这个复制方案,在某种程度上,和数据库状态机(DBSM)的Replication方法比较类似。
2,安装mysql 5.7.17
官方下载地址:http://dev.mysql.com/downloads/mysql/,不过官方只保留最新的version,5.7.17这个url地址不一定长期有效,所以,需要的不一定有,我这里在百度云盘保留了下来,版本是5.7.17,可以随时去下载使用,分享地址:链接:http://pan.baidu.com/s/1jIhqSXw 密码:ifx5
安装过程参考我的blog地址:http://blog.csdn.net/mchdba/article/details/53889781,大概变化的地方就是my.cnf里面的server-id需要修改。
在三台db服务器上面设置/etc/hosts映射,如下:
192.168.121.71 db1 192.168.121.111 db2 192.168.121.24 db3 |
安装的数据库服务器:
数据库服务器地址 |
端口 |
数据目录 |
Server-id |
192.168.121.71(db1) |
3317 |
/data/mysql/data |
12001 |
192.168.121.111(db2) |
3317 |
/data/mysql/data |
12002 |
192.168.121.24(db3) |
3317 |
/data/mysql/data |
12003 |
3,创建复制环境
设置hostname和ip映射
在db1、db2、db3上都设置
vim /etc/hosts 192.168.121.71 db1 hch_test_dbm2_121_71 192.168.121.111 db2 bpe_service 192.168.121.24 db3 hch_test_web_1_24 |
在db1/db2/db3上建立复制账号:
mysql> SET SQL_LOG_BIN=0; Query OK, 0 rows affected (0.00 sec)
mysql> mysql> GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.%' IDENTIFIED BY 'rlpbright_1927@ys'; Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> mysql> SET SQL_LOG_BIN=1; Query OK, 0 rows affected (0.00 sec)
mysql> |
blog源地址:http://blog.csdn.net/mchdba/article/details/54381854,作者 mchdba(黄杉),谢绝转载
4,安装group replication插件
在db1、db2、db3上依次安装group replication插件
mysql> INSTALL PLUGIN group_replication SONAME 'group_replication.so'; Query OK, 0 rows affected (0.01 sec)
mysql> |
或者直接在配置文件my.cnf中配置:
plugin-load=group_replication
查看group replication组件
mysql> show plugins; +----------------------------+----------+--------------------+----------------------+---------+ |
看到有group_replication ACTIVE,表示插group_replication插件安装成功。
5,配置group replication参数
确保binlog_format是row格式。
mysql> show variables like 'binlog_format'; +---------------+-------+| Variable_name | Value |+---------------+-------+| binlog_format | ROW |+---------------+-------+1 row in set (0.00 sec) mysql> |
两种配置方式,在线添加 OR 配置文件
(1) db1上的my.cnf配置: # replication config server_id=12001 gtid_mode=ON enforce_gtid_consistency=ON master_info_repository=TABLE relay_log_info_repository=TABLE binlog_checksum=NONE log_slave_updates=ON log_bin=binlog binlog_format=ROW
# group replication config transaction_write_set_extraction=XXHASH64 loose-group_replication_group_name="e4668cea-d7ca-11e6-86b5-18a99b76310d" loose-group_replication_start_on_boot=off loose-group_replication_local_address= "db1:24901" loose-group_replication_group_seeds= "db1:24901,db2:24902,db3:24903" loose-group_replication_bootstrap_group= off loose-group_replication_single_primary_mode=FALSE loose-group_replication_enforce_update_everywhere_checks= TRUE
(2)db2上的my.cnf配置: # replication config server_id=12002 gtid_mode=ON enforce_gtid_consistency=ON master_info_repository=TABLE relay_log_info_repository=TABLE binlog_checksum=NONE log_slave_updates=ON log_bin=binlog binlog_format=ROW
# group replication config transaction_write_set_extraction=XXHASH64 loose-group_replication_group_name="e4668cea-d7ca-11e6-86b5-18a99b76310d" loose-group_replication_start_on_boot=off loose-group_replication_local_address= "db2:24902" loose-group_replication_group_seeds= "db1:24901,db2:24902,db3:24903" loose-group_replication_bootstrap_group= off loose-group_replication_single_primary_mode=FALSE loose-group_replication_enforce_update_everywhere_checks= TRUE
(3)db3上的my.cnf配置: # replication config server_id=12003 gtid_mode=ON enforce_gtid_consistency=ON master_info_repository=TABLE relay_log_info_repository=TABLE binlog_checksum=NONE log_slave_updates=ON log_bin=binlog binlog_format=ROW
# group replication config transaction_write_set_extraction=XXHASH64 loose-group_replication_group_name="e4668cea-d7ca-11e6-86b5-18a99b76310d" loose-group_replication_start_on_boot=off loose-group_replication_local_address= "db3:24903" loose-group_replication_group_seeds= "db1:24901,db2:24902,db3:24903" loose-group_replication_bootstrap_group= off loose-group_replication_single_primary_mode=FALSE loose-group_replication_enforce_update_everywhere_checks= TRUE
|
replication配置参数请参考:
http://mysqlhighavailability.com/mysqlha/gr/doc/limitations.html;
group replication配置参考:
http://mysqlhighavailability.com/mysqlha/gr/doc/getting_started.html#group-replication
配置完后,重启3个db上的mysql服务
Db1启动:
[root@hch_test_dbm2_121_71 ~]# service mysqld start Starting MySQL.. [确定] [root@hch_test_dbm2_121_71 ~]# |
Db2启动:
[root@bpe_service ~]# service mysqld start Starting MySQL.. [确定] [root@bpe_service ~]# |
Db3启动:
[root@hch_test_web_1_24 ~]# service mysqld start Starting MySQL.. [确定] [root@hch_test_web_1_24 ~]# |
6,启动mgr集群
构建之前,保证db1、db2、db3的read_only是关闭的
开始构建group replication集群,通常操作命令
mysql> CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='rlpbright_1927@ys' FOR CHANNEL 'group_replication_recovery'; Query OK, 0 rows affected, 2 warnings (0.02 sec)
mysql> |
db1上建立基本主库master库:
# 设置group_replication_bootstrap_group为ON是为了标示以后加入集群的服务器以这台服务器为基准,以后加入的就不需要设置。 mysql> SET GLOBAL group_replication_bootstrap_group = ON;Query OK, 0 rows affected (0.00 sec) mysql> START GROUP_REPLICATION;Query OK, 0 rows affected (1.03 sec) mysql> SET GLOBAL group_replication_bootstrap_group=OFF;Query OK, 0 rows affected (0.00 sec) mysql>mysql> SELECT * FROM performance_schema.replication_group_members;+---------------------------+--------------------------------------+----------------------+-------------+--------------+| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |+---------------------------+--------------------------------------+----------------------+-------------+--------------+| group_replication_applier | 3381d155-d7d1-11e6-94f7-b8ca3af6e36c | hch_test_dbm2_121_71 | 3317 | ONLINE |+---------------------------+--------------------------------------+----------------------+-------------+--------------+1 row in set (0.00 sec) mysql> |
Db2上启动group_replication:
Db2上mysql命令行上执行启动: mysql> START GROUP_REPLICATION; Query OK, 0 rows affected (1.02 sec)
mysql>
db1后台error log日志显示,看到db2加入进来了,进了group: 2017-01-11T07:51:40.623093Z 0 [Note] Plugin group_replication reported: 'Marking group replication view change with view_id 14841207424144277:8' 2017-01-11T07:51:40.650962Z 21 [Note] Start binlog_dump to master_thread_id(21) slave_server(12002), pos(, 4) 2017-01-11T07:51:40.687441Z 0 [Note] Plugin group_replication reported: 'The member with address bpe_service:3317 was declared online within the replication group' 2017-01-11T07:52:40.651481Z 21 [Note] Aborted connection 21 to db: 'unconnected' user: 'repl' host: '192.168.121.111' (failed on flush_net())
再去master库db1上,查看group_replication成员,会有db2的显示,看到MEMBER_STATE都是ONLINE,表示都是MASTER: mysql> SELECT * FROM performance_schema.replication_group_members;+---------------------------+--------------------------------------+----------------------+-------------+--------------+| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |+---------------------------+--------------------------------------+----------------------+-------------+--------------+| group_replication_applier | 3381d155-d7d1-11e6-94f7-b8ca3af6e36c | hch_test_dbm2_121_71 | 3317 | ONLINE || group_replication_applier | 84dba8ff-d7d2-11e6-aa9a-18a99b76310d | bpe_service | 3317 | ONLINE |+---------------------------+--------------------------------------+----------------------+-------------+--------------+2 rows in set (0.00 sec) mysql> |
db3上启动group_replication:
Db3命令行上执行: mysql> set global group_replication_allow_local_disjoint_gtids_join=ON; Query OK, 0 rows affected (0.00 sec)
mysql> start group_replication; Query OK, 0 rows affected (1.99 sec)
mysql>
再去master库db1上,查看group_replication成员,会有db3的显示,而且已经是ONLINE了 mysql> SELECT * FROM performance_schema.replication_group_members;+---------------------------+--------------------------------------+----------------------+-------------+--------------+| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |+---------------------------+--------------------------------------+----------------------+-------------+--------------+| group_replication_applier | 21750571-d7d3-11e6-91e2-18a99b763071 | hch_test_web_1_24 | 3317 | ONLINE || group_replication_applier | 3381d155-d7d1-11e6-94f7-b8ca3af6e36c | hch_test_dbm2_121_71 | 3317 | ONLINE || group_replication_applier | 84dba8ff-d7d2-11e6-aa9a-18a99b76310d | bpe_service | 3317 | ONLINE |+---------------------------+--------------------------------------+----------------------+-------------+--------------+3 rows in set (0.00 sec) mysql> db1上后台error log显示: 2017-01-11T08:13:05.972276Z 0 [Note] Plugin group_replication reported: 'getstart group_id 8da193f6' 2017-01-11T08:13:08.146144Z 0 [Note] Plugin group_replication reported: 'Marking group replication view change with view_id 14841207424144277:11' 2017-01-11T08:13:08.174808Z 25 [Note] Start binlog_dump to master_thread_id(25) slave_server(12003), pos(, 4) 2017-01-11T08:13:08.218338Z 0 [Note] Plugin group_replication reported: 'The member with address hch_test_web_1_24:3317 was declared online within the replication group' 2017-01-11T08:14:08.175308Z 25 [Note] Aborted connection 25 to db: 'unconnected' user: 'repl' host: '192.168.121.24' (failed on flush_net()) |
最后查看集群状态:
mysql> SELECT * FROM performance_schema.replication_group_members;+---------------------------+--------------------------------------+----------------------+-------------+--------------+| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |+---------------------------+--------------------------------------+----------------------+-------------+--------------+| group_replication_applier | 3d872c2e-d670-11e6-ac1f-b8ca3af6e36c | hch_test_dbm2_121_71 | 3317 | ONLINE || group_replication_applier | ef8ac2de-d671-11e6-9ba4-18a99b763071 | hch_test_web_1_24 | 3317 | ONLINE || group_replication_applier | fdf2b02e-d66f-11e6-98a8-18a99b76310d | bpe_service | 3317 | ONLINE |+---------------------------+--------------------------------------+----------------------+-------------+--------------+3 rows in set (0.00 sec) mysql> |
7,验证集群复制功能
测试,在master库db1上建立测试库db1,测试表t1,录入一条数据
mysql> create database db1; Query OK, 1 row affected (0.00 sec)
mysql> create table db1.t1(id int,cn varchar(32)); Query OK, 0 rows affected (0.02 sec)
mysql> mysql> insert into t1(id,cn)values( 1,'a'); ERROR 3098 (HY000): The table does not comply with the requirements by an external plugin. mysql> mysql> # 这里原因是group_replaction环境下面,表必须有主键不然不允许往里insert值。所以修改表t1,将id字段设置程主键即可。 mysql> alter table t1 modify id int primary key; Query OK, 0 rows affected (0.02 sec) Records: 0 Duplicates: 0 Warnings: 0
|
然后在db2上执行insert操作,则成功
然后再db3上执行insert操作,则成功
在db1、db2、db3分部录入的数据,再分表去db1、db2、db3上查看t1表的数据,数据都已经复制同步过来了,已经实现了group里面多成员member多处写操作。
8 添加新的成员
简要步骤:
(1) 安装mysql实例
(2) 准备my.cnf
#replication server_id=12001 # 这个是随着新服务器的添加会变化 gtid_mode=ON enforce_gtid_consistency=ON master_info_repository=TABLE relay_log_info_repository=TABLE binlog_checksum=NONE log_slave_updates=ON log_bin=binlog binlog_format=ROW
#group replication transaction_write_set_extraction=XXHASH64 loose-group_replication_group_name="e4668cea-d7ca-11e6-86b5-18a99b76310d" loose-group_replication_start_on_boot=off loose-group_replication_local_address= "db1:24901" # 这个是随着新服务器的添加会变化 loose-group_replication_group_seeds= "db1:24901,db2:24902,db3:24903"# 这个是随着新服务器的添加会变化 loose-group_replication_bootstrap_group= off loose-group_replication_single_primary_mode=FALSE loose-group_replication_enforce_update_everywhere_checks= TRUE
|
(3) 启动mysql实例,service mysqld start;
(4) 开启group replication
SET SQL_LOG_BIN=0; # 在建立复制账号的时候,不需要进入binlog日志 GRANT REPLICATION SLAVE ON *.* TO 'repl'@'192.168.%' IDENTIFIED BY 'rlpbright_1927@ys'; SET SQL_LOG_BIN=1; CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='rlpbright_1927@ys' FOR CHANNEL 'group_replication_recovery'; START group_replication; SELECT * FROM performance_schema.replication_group_members; |
(5)验证组成员,SELECT * FROM performance_schema.replication_group_members;
9,管理操作命令
参考资料地址:http://mysqlhighavailability.com/mysqlha/gr/doc/monitoring.html
查看group状态信息:SELECT * FROM performance_schema.replication_group_members;
查询replication协议的其它数据:select * from performance_schema.replication_group_member_stats\G
|
查看后台进程2个:
[root@hch_test_dbm2_121_71 soft]# netstat -ntpl|grep mysql tcp 0 0 0.0.0.0:24901 0.0.0.0:* LISTEN 3665/mysqld tcp 0 0 :::3317 :::* LISTEN 3665/mysqld [root@hch_test_dbm2_121_71 soft]# |
组复制成员状态表:select * from performance_schema.replication_group_member_stats;查看的是当前的状态:
performance_schema.replication_applier_status
10,问题记录
10.1问题记录一
Db2的mysql窗口报错: mysql> START GROUP_REPLICATION; ERROR 3092 (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log. mysql>
db2的error log信息如下: 2017-01-11T07:57:41.161360Z 0 [ERROR] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: 84dba8ff-d7d2-11e6-aa9a-18a99b76310d:1 > Group transactions: 3381d155-d7d1-11e6-94f7-b8ca3af6e36c:1, e4668cea-d7ca-11e6-86b5-18a99b76310d:1-3' 2017-01-11T07:57:41.161430Z 0 [ERROR] Plugin group_replication reported: 'The member contains transactions not present in the group. The member will now exit the group.' 2017-01-11T07:57:41.161445Z 0 [Note] Plugin group_replication reported: 'To force this member into the group you can use the group_replication_allow_local_disjoint_gtids_join option' 2017-01-11T07:57:41.161589Z 3 [Note] Plugin group_replication reported: 'Going to wait for view modification' 2017-01-11T07:57:41.162126Z 0 [Note] Plugin group_replication reported: 'getstart group_id 8da193f6' 2017-01-11T07:57:44.508236Z 0 [Note] Plugin group_replication reported: 'state 4330 action xa_terminate' 2017-01-11T07:57:44.508429Z 0 [Note] Plugin group_replication reported: 'new state x_start' 2017-01-11T07:57:44.508561Z 0 [Note] Plugin group_replication reported: 'state 4257 action xa_exit' 2017-01-11T07:57:44.508720Z 0 [Note] Plugin group_replication reported: 'Exiting xcom thread' 2017-01-11T07:57:44.508746Z 0 [Note] Plugin group_replication reported: 'new state x_start' 2017-01-11T07:57:49.549137Z 3 [Note] Plugin group_replication reported: 'auto_increment_increment is reset to 1' 2017-01-11T07:57:49.549192Z 3 [Note] Plugin group_replication reported: 'auto_increment_offset is reset to 1' 2017-01-11T07:57:49.549472Z 23 [Note] Error reading relay log event for channel 'group_replication_applier': slave SQL thread was killed 2017-01-11T07:57:49.550037Z 20 [Note] Plugin group_replication reported: 'The group replication applier thread was killed'
【解决办法】设置group_replication_allow_local_disjoint_gtids_join为ON mysql> set global group_replication_allow_local_disjoint_gtids_join=ON; Query OK, 0 rows affected (0.00 sec)
mysql> START GROUP_REPLICATION; Query OK, 0 rows affected (5.52 sec)
mysql> |
10.2问题记录二
Sql界面显示状态为ERROR:
Errorlog显示:
2017-01-11T08:07:34.587378Z 12 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='bpe_service', master_port= 3317, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''. 2017-01-11T08:07:34.591679Z 12 [ERROR] Plugin group_replication reported: 'Fatal error during the Recovery process of Group Replication. The server will leave the group.' 2017-01-11T08:07:34.592447Z 0 [Note] Plugin group_replication reported: 'getstart group_id 8da193f6' 2017-01-11T08:07:37.943536Z 0 [Note] Plugin group_replication reported: 'state 4330 action xa_terminate' 2017-01-11T08:07:37.943852Z 0 [Note] Plugin group_replication reported: 'new state x_start' 2017-01-11T08:07:37.943879Z 0 [Note] Plugin group_replication reported: 'state 4257 action xa_exit' 2017-01-11T08:07:37.943981Z 0 [Note] Plugin group_replication reported: 'Exiting xcom thread' 2017-01-11T08:07:37.943999Z 0 [Note] Plugin group_replication reported: 'new state x_start' |
【解决办法】:
mysql> CHANGE MASTER TO MASTER_USER='repl', MASTER_PASSWORD='rlpbright_1927@ys' FOR CHANNEL 'group_replication_recovery';Query OK, 0 rows affected, 2 warnings (0.01 sec) mysql> stop group_replication;Query OK, 0 rows affected (0.01 sec) mysql> start group_replication;Query OK, 0 rows affected (2.20 sec) mysql>mysql> SELECT * FROM performance_schema.replication_group_members;+---------------------------+--------------------------------------+----------------------+-------------+--------------+| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |+---------------------------+--------------------------------------+----------------------+-------------+--------------+| group_replication_applier | 21750571-d7d3-11e6-91e2-18a99b763071 | hch_test_web_1_24 | 3317 | ONLINE || group_replication_applier | 3381d155-d7d1-11e6-94f7-b8ca3af6e36c | hch_test_dbm2_121_71 | 3317 | ONLINE || group_replication_applier | 84dba8ff-d7d2-11e6-aa9a-18a99b76310d | bpe_service | 3317 | ONLINE |+---------------------------+--------------------------------------+----------------------+-------------+--------------+3 rows in set (0.00 sec) mysql> |
参考文章: