centos7下mysql半同步复制原理安装测试详解

时间:2022-01-23 07:50:34

原理简介

在MySQL5.5之前,MySQL的复制其实都是异步复制(见下图),主库和从库的数据之间存在一定的延迟,这样存在一个隐患:当在主库上写入一个事务并提交成功,而从库尚未得到主库推送的BinLog日志时,恰好主库宕机了,例如主库可能因磁盘损坏、内存故障等造成主库上该事务Binlog丢失,此时从库就可能损失这个事务,从而造成主从不一致。

      centos7下mysql半同步复制原理安装测试详解

为了解决这个问题,从MySQL5.5开始引入了半同步复制机制(Semi_synchronous Replication)。为了保证主库上的每一个Binlog事务都能够被可靠的复制到从库上,主库在每次事务成功提交时,并不及时反馈给前端用户,而是等待其中一个从库也接受到Binlog事务并成功写入中继日志后,主库才返回Commit操作成功给客户端。半同步复制保证了事务成功提交后,至少有两份日志记录,一份在主库的Binlog日志上,另一份在至少一个从库的中继日志Relay Log上,从而更进一步保证了数据的完整性。半同步复制的大致流程如下图:

    centos7下mysql半同步复制原理安装测试详解

半同步复制模式下,假如在上图步骤①②③中任何一个步骤中主库宕机,则事务并未提交成功,从库上也没收到事务对应的Binlog日志,所以主从数据是一致的;假如在步骤④传送Binlog日志到从库时,从库宕机或者网络故障,导致Binlog并没有及时地传送到从库上,此时主库上的事务会等待一段时间(时间长短由参数rpl_semi_sync_master_timeout设置的毫秒数决定),如果Binlog在这段时间内都无法成功推送到从库上,则MySQL自动调整复制为异步复制,事务正常返回提交结果给客户端。

半同步复制很大程度上取决于主从库之间的网络情况,往返时延RTT(Round-Trip Time)越小决定了从库的实时性越好。通俗地说,主从库之间的网络越快,从库越实时。

测试环境

centos7下mysql半同步复制原理安装测试详解

安装步骤

半同步复制是以插件形式来实现的,安装比较简单,在异步复制的环境上(这里已经安装好传统的异步复制),安装半同步插件即可,也可在新建时写入my.cnf,参考文章尾部;

1. 查看MySQL服务器是否支持动态增加插件

mysql> select @@have_dynamic_loading;
+------------------------+
| @@have_dynamic_loading |
+------------------------+
| YES |  //YES表示支持
+------------------------+
row in set (0.00 sec)

2. 确认支持动态插件后,检查安装目录是否存在所需插件,一般在mysql安装目录中的一个.../plugin/目录下,可以搜索一下:

[root@server- ~]# find / -name semisync_*.so
/usr/lib64/mysql/plugin/debug/semisync_master.so
/usr/lib64/mysql/plugin/debug/semisync_slave.so
/usr/lib64/mysql/plugin/semisync_master.so
/usr/lib64/mysql/plugin/semisync_slave.so

3. 在主库上安装插件semisync_master.so

mysql> install plugin rpl_semi_sync_master SONAME 'semisync_master.so';
Query OK, rows affected (0.03 sec)

4. 在从库上安装插件semisync_master.so

mysql> install plugin rpl_semi_sync_slave SONAME 'semisync_slave.so';
Query OK, rows affected (0.00 sec)

插件安装完成后,从plugin表中能够看到刚才安装的插件(这里在主库上查看一下):

mysql> select * from mysql.plugin;
+----------------------+--------------------+
| name | dl |
+----------------------+--------------------+
| rpl_semi_sync_master | semisync_master.so |
+----------------------+--------------------+
row in set (0.00 sec)

也就是说,安装完成后,MySQL会在系统表plugin中记录刚才安装的插件,下次系统重启后会自动加载插件。

5. 分别在主库和从库上配置参数打开半同步semi-sync,默认半同步设置是不打开的。

在主库上配置全局参数:

mysql> show variables like 'rpl_semi_sync_master%';
+------------------------------------+-------+
| Variable_name | Value |
+------------------------------------+-------+
| rpl_semi_sync_master_enabled | OFF |
| rpl_semi_sync_master_timeout | |
| rpl_semi_sync_master_trace_level | |
| rpl_semi_sync_master_wait_no_slave | ON |
+------------------------------------+-------+
rows in set (0.00 sec) mysql> set global rpl_semi_sync_master_enabled = 1;
Query OK, rows affected (0.00 sec) mysql> set global rpl_semi_sync_master_timeout = 20000;
Query OK, rows affected (0.00 sec) mysql> show variables like 'rpl_semi_sync_master%';
+------------------------------------+-------+
| Variable_name | Value |
+------------------------------------+-------+
| rpl_semi_sync_master_enabled | ON |
| rpl_semi_sync_master_timeout | |
| rpl_semi_sync_master_trace_level | |
| rpl_semi_sync_master_wait_no_slave | ON |
+------------------------------------+-------+
rows in set (0.00 sec)

在从库上配置参数:

mysql> set global rpl_semi_sync_slave_enabled = 1;
Query OK, rows affected (0.00 sec)

6. 由于之前配置的是传统的异步复制,所以需要重启一下从库上的I/O线程(如果是全新配置的半同步复制则不需要,后面会提到全新配置):

mysql> stop slave io_thread;
Query OK, rows affected (0.05 sec) mysql> start slave io_thread;
Query OK, rows affected (0.00 sec)

到此,半同步复制配置完成,下面可以来验证一下。

实际测试

1. 先查看当前主库上半同步复制的一些状态值:

mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients | |
| Rpl_semi_sync_master_net_avg_wait_time | |
| Rpl_semi_sync_master_net_wait_time | |
| Rpl_semi_sync_master_net_waits | |
| Rpl_semi_sync_master_no_times | |
| Rpl_semi_sync_master_no_tx | |  //留意该值,后面测试会有变化
| Rpl_semi_sync_master_status | ON |  //留意该值,后面测试会有变化
| Rpl_semi_sync_master_timefunc_failures | |
| Rpl_semi_sync_master_tx_avg_wait_time | |
| Rpl_semi_sync_master_tx_wait_time | |
| Rpl_semi_sync_master_tx_waits | |
| Rpl_semi_sync_master_wait_pos_backtraverse | |
| Rpl_semi_sync_master_wait_sessions | |
| Rpl_semi_sync_master_yes_tx | |  //留意该值,后面测试会有变化
+--------------------------------------------+-------+
rows in set (0.00 sec)

注意环境不一样,可能显示也不一样,着重关注以下3个状态值的变化,而不是上面这些初始值。

Rpl_semi_sync_master_status :值为ON,表示半同步复制目前处于打开状态。

Rpl_semi_sync_master_yes_tx:值为0,表示主库当前尚未有任何一个事务是通过半同步复制到从库。

Rpl_semi_sync_master_no_tx:值为0,表示当前有0个事务不是半同步模式下从库及时响应的。

在主库上执行一个事务,然后再检查一下状态:

mysql> use mydb;
Database changed
mysql> show tables;
Empty set (0.00 sec) mysql> CREATE TABLE customers
-> (
-> cust_id int NOT NULL AUTO_INCREMENT,
-> cust_name char() NOT NULL ,
-> cust_address char() NULL ,
-> cust_city char() NULL ,
-> cust_state char() NULL ,
-> cust_zip char() NULL ,
-> cust_country char() NULL ,
-> cust_contact char() NULL ,
-> cust_email char() NULL ,
-> PRIMARY KEY (cust_id)
-> ) ENGINE=InnoDB;
Query OK, rows affected (0.10 sec) mysql> show tables;
+----------------+
| Tables_in_mydb |
+----------------+
| customers |
+----------------+
row in set (0.00 sec) mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+-------+
| Variable_name | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients | |
| Rpl_semi_sync_master_net_avg_wait_time | |
| Rpl_semi_sync_master_net_wait_time | |
| Rpl_semi_sync_master_net_waits | |
| Rpl_semi_sync_master_no_times | |
| Rpl_semi_sync_master_no_tx | |
| Rpl_semi_sync_master_status | ON |
| Rpl_semi_sync_master_timefunc_failures | |
| Rpl_semi_sync_master_tx_avg_wait_time | |
| Rpl_semi_sync_master_tx_wait_time | |
| Rpl_semi_sync_master_tx_waits | |
| Rpl_semi_sync_master_wait_pos_backtraverse | |
| Rpl_semi_sync_master_wait_sessions | |
| Rpl_semi_sync_master_yes_tx | |
+--------------------------------------------+-------+
rows in set (0.00 sec)

此时会发现Rpl_semi_sync_master_yes_tx的值变为1,即刚才的CREATE事务通过半同步复制到从库上了,Rpl_semi_sync_master_yes_tx计数增加1。

到从库确认一下,新建的customers表确实被复制过去了:

2. 接下来模仿网络异常的场景下,主库在等待 rpl_semi_sync_master_timeout毫秒超时后,自动转成异步复制的场景。

在主库上确认半同步复制会等待20s超时:

mysql> show variables like 'rpl_semi_sync_master_timeout';
+------------------------------+-------+
| Variable_name | Value |
+------------------------------+-------+
| rpl_semi_sync_master_timeout | 20000 |    //单位:ms
+------------------------------+-------+
row in set (0.00 sec)

从库上通过iptables命令模拟从库宕机或者网络故障:

[root@server- ~]# iptables -A INPUT -s 138.138.82.10 -j DROP

在主库上执行一个事务并提交(默认提交即可),主库上的提交操作会被阻塞20秒

mysql> INSERT INTO customers(cust_id, cust_name, cust_address, cust_city, cust_state, cust_zip, cust_country, cust_contact, cust_email)
-> VALUES(, 'Coyote Inc.', '200 Maple Lane', 'Detroit', 'MI', '', 'USA', 'Y Lee', 'ylee@coyote.com');
Query OK, row affected (20.05 sec)   //回车后,会卡主(阻塞)20秒,然后才会跳出Query OK...这行,并显示用时20秒

在这个20秒阻塞过程中,新开一个窗口检查当前主库的线程,会发现提交操作在等待从库上半同步复制操作的响应:

mysql> show processlist\G
......
*************************** . row ***************************
Id:
User: root
Host: localhost
db: mydb
Command: Query
Time:
State: Waiting for semi-sync ACK from slave    //阻塞,等待从库确认
Info: INSERT INTO customers(cust_id, cust_name, cust_address, cust_city, cust_state, cust_zip, cust_countr
rows in set (0.00 sec)

阻塞结束后,再次查看半同步复制的一些状态值:

mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+----------+
| Variable_name | Value |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients | |
| Rpl_semi_sync_master_net_avg_wait_time | |
| Rpl_semi_sync_master_net_wait_time | |
| Rpl_semi_sync_master_net_waits | |
| Rpl_semi_sync_master_no_times | |
| Rpl_semi_sync_master_no_tx | 1 |//该值更新为,表示在半同步复制模式下,从库没有及时响应的事务增加1个
| Rpl_semi_sync_master_status | OFF |//表示主库上半同步复制已经关闭了
| Rpl_semi_sync_master_timefunc_failures | |
| Rpl_semi_sync_master_tx_avg_wait_time | |
| Rpl_semi_sync_master_tx_wait_time | |
| Rpl_semi_sync_master_tx_waits | |
| Rpl_semi_sync_master_wait_pos_backtraverse | |
| Rpl_semi_sync_master_wait_sessions | |
| Rpl_semi_sync_master_yes_tx | |//该值仍然为,表示刚才的事务并不是通过半同步复制完成的,所以半同步成功事务仍然为1个
+--------------------------------------------+----------+
rows in set (0.00 sec)

继续测试:如果从库正常连接上主库之后,主库是否会自动切换回半同步复制模式呢?

那么把之前从库上面的iptables限制条目去除:

[root@server- ~]# iptables -F
[root@server- ~]# iptables -nL   //查看一下确实没了
Chain INPUT (policy ACCEPT)
target prot opt source destination Chain FORWARD (policy ACCEPT)
target prot opt source destination Chain OUTPUT (policy ACCEPT)
target prot opt source destination

然后在从库上查看slave状态:

mysql> show slave status\G
*************************** . row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 138.138.82.10
Master_User: repl_user
Master_Port:
Connect_Retry:
Master_Log_File: master-bin.
Read_Master_Log_Pos:
Relay_Log_File: relay-bin.
Relay_Log_Pos:
Relay_Master_Log_File: master-bin.
Slave_IO_Running: Yes   
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno:
Last_Error:
Skip_Counter:
Exec_Master_Log_Pos:
Relay_Log_Space:
Until_Condition: None
Until_Log_File:
Until_Log_Pos:
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master:
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno:
Last_IO_Error:
Last_SQL_Errno:
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id:
Master_UUID: 8086bac0-a428-11e8-8bf9-00505691656b
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay:
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count:
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position:
row in set (0.00 sec) mysql> select * from customers;   //之前在主库插入时阻塞20.05秒的条目也复制过来了
+---------+-------------+----------------+-----------+------------+----------+--------------+--------------+-----------------+
| cust_id | cust_name | cust_address | cust_city | cust_state | cust_zip | cust_country | cust_contact | cust_email |
+---------+-------------+----------------+-----------+------------+----------+--------------+--------------+-----------------+
| | Coyote Inc. | Maple Lane | Detroit | MI | | USA | Y Lee | ylee@coyote.com |
+---------+-------------+----------------+-----------+------------+----------+--------------+--------------+-----------------+
row in set (0.00 sec)

以上显示说明在网络状态恢复后(去掉iptables),从库会自动尝试连接主库,几秒钟后I/O线程状态从Connecting变成了YES,并且主库和从库的数据一致了。

再次查看主库上半同步复制的状态值

mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+----------+
| Variable_name | Value |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients | |
| Rpl_semi_sync_master_net_avg_wait_time | |
| Rpl_semi_sync_master_net_wait_time | |
| Rpl_semi_sync_master_net_waits | |
| Rpl_semi_sync_master_no_times | |
| Rpl_semi_sync_master_no_tx | |
| Rpl_semi_sync_master_status | ON |
| Rpl_semi_sync_master_timefunc_failures | |
| Rpl_semi_sync_master_tx_avg_wait_time | |
| Rpl_semi_sync_master_tx_wait_time | |
| Rpl_semi_sync_master_tx_waits | |
| Rpl_semi_sync_master_wait_pos_backtraverse | |
| Rpl_semi_sync_master_wait_sessions | |
| Rpl_semi_sync_master_yes_tx | |
+--------------------------------------------+----------+
rows in set (0.00 sec)

以上发现Rpl_semi_sync_master_status的值自动从OFF变成ON,说明在检测到从库正常之后,主库到从库的复制方式会自动切换为半同步复制模式。

我们继续主库上做一个INSERT事务测试,确认当前的复制模式确实是半同步复制:

mysql> INSERT INTO customers(cust_id, cust_name, cust_address, cust_city, cust_state, cust_zip, cust_country, cust_contact)
-> VALUES(10002, 'Mouse House', '333 Fromage Lane', 'Columbus', 'OH', '', 'USA', 'Jerry Mouse');
Query OK, row affected (0.07 sec) mysql> show status like 'Rpl_semi_sync%';
+--------------------------------------------+----------+
| Variable_name | Value |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients | |
| Rpl_semi_sync_master_net_avg_wait_time | |
| Rpl_semi_sync_master_net_wait_time | |
| Rpl_semi_sync_master_net_waits | |
| Rpl_semi_sync_master_no_times | |
| Rpl_semi_sync_master_no_tx | |
| Rpl_semi_sync_master_status | ON |
| Rpl_semi_sync_master_timefunc_failures | |
| Rpl_semi_sync_master_tx_avg_wait_time | |
| Rpl_semi_sync_master_tx_wait_time | |
| Rpl_semi_sync_master_tx_waits | |
| Rpl_semi_sync_master_wait_pos_backtraverse | |
| Rpl_semi_sync_master_wait_sessions | |
| Rpl_semi_sync_master_yes_tx | 2 |  //计数增加了1个,变为
+--------------------------------------------+----------+
rows in set (0.00 sec)

可以看出,以上的一个INSERT事务提交后,Rpl_semi_sync_master_yes_tx 值从1变成2,确认了刚才事务的复制事半同步复制。

测试结束;

小结

从半同步复制的流程会发现,半同步复制的“半”就体现在:虽然主库和从库的Binlog日志时同步的,但是主库并不等待从库应用这部分日志就返回提交结果,这部分操作是异步的,从库的数据并不是和主库实时同步的,所以只能成为半同步,而不是完全的实时同步。

补充

通过配置文件添加半同步插件和参数,操作如下:

在/etc/my.cnf中添加以下参数

plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl_semi_sync_master_enabled = 1
rpl_semi_sync_slave_enabled = 1
rpl_semi_sync_master_timeout = 20000

结束.