mysql change master导致gtid丢失

时间:2023-03-08 20:44:33

change master导致gtid丢失
从innobackupex恢复导致binlog的拉取位置会导致主备gtid不一致。
此类错误通过构造空事务方式无法修复。
此时就需要change master 方式指向失败事件的下一个位点。然后按位点的方式(master_auto_position=0)来拉binlog。

Slave_IO_State: Queueing master event to the relay log
Master_Host: 10.1.1.111
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 478283
Relay_Log_File: relay-bin.000002
Relay_Log_Pos: 361
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1050
Last_Error: Error 'Table 'kelvin' already exists' on query. Default database: 'test'. Query: 'CREATE TABLE `kelvin` (
`id` bigint(20) NOT NULL,
`username` varchar(10) NOT NULL DEFAULT 'kelvin',
`passwd` varchar(4000) NOT NULL DEFAULT 'kelvin',
`createdate` int(10) NOT NULL,
`groups` varchar(2) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8'
Skip_Counter: 0
Exec_Master_Log_Pos: 151
Relay_Log_Space: 478691
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1050
Last_SQL_Error: Error 'Table 'kelvin' already exists' on query. Default database: 'test'. Query: 'CREATE TABLE `kelvin` (
`id` bigint(20) NOT NULL,
`username` varchar(10) NOT NULL DEFAULT 'kelvin',
`passwd` varchar(4000) NOT NULL DEFAULT 'kelvin',
`createdate` int(10) NOT NULL,
`groups` varchar(2) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8'
Replicate_Ignore_Server_Ids:
Master_Server_Id: 113306
Master_UUID: 26e3db40-51d4-11e7-adc8-000c29a459b4
Master_Info_File: /opt/56/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 170616 00:36:53
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 26e3db40-51d4-11e7-adc8-000c29a459b4:1-1612
Executed_Gtid_Set:
Auto_Position: 1
1 row in set (0.00 sec)

stop slave;
change master to master_log_file='Relay_Master_Log_File', master_log_pos=Exec_Master_Log_Pos+1,master_auto_position=0;
start slave;

show slave status \G

Slave_IO_State:
Master_Host: 10.1.1.111
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 152
Relay_Log_File: relay-bin.000002
Relay_Log_Pos: 314
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 152
Relay_Log_Space: 512
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master; the first event 'mysql-bin.000003' at 152, the last event read from '/opt/56/binlog/mysql-bin.000003' at 152, the last byte read from '/opt/56/binlog/mysql-bin.000003' at 171.'
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 113306
Master_UUID: 26e3db40-51d4-11e7-adc8-000c29a459b4
Master_Info_File: /opt/56/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp: 170616 00:39:13
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
1 row in set (0.00 sec)

当你观察你会发现Master服务器不再要求Slave服务器需要拉才能同步数据的二进制日志。可能的原因包括主服务器过期系统变量 expire_logs_days — — 通过二进制日志或有人手动从Master服务器通过清除二进制日志命令或 rm -f 命令删除二进制日志或者可能是你有一些 cronjob 的档案较旧的二进制日志,要求磁盘空间等。所以,请确保你总是有需要的二进制日志存在于主服务器上,您可以更新您的程序,以保持Slave服务器需要通过监测"Relay_master_log_file"变量Slave显示的Slave状态输出的二进制日志。此外,如果设置了 expire_log_days 在 my.cnf 老 binlogs 自动过期并移除。这意味着当 MySQL 打开一个新的 binlog 文件,它会检查旧的 binlogs,且清除任何早比 expire_logs_days 的值 (单位为天)。略服务器添加一个功能,过期日志基于而年龄的 binlog 文件不是使用的文件的总数量。所以在该配置中,如果你得到穗交通,它可能导致 binlogs 要比你预期的更早消失。详细信息请查看限制 binlog 文件的数目。

CHANGE MASTER TO MASTER_LOG_FILE='Relay_Master_Log_File+1', MASTER_LOG_POS=4;

show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.1.1.111
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000004
Read_Master_Log_Pos: 68299
Relay_Log_File: relay-bin.000002
Relay_Log_Pos: 68469
Relay_Master_Log_File: mysql-bin.000004
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 68299
Relay_Log_Space: 68667
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 113306
Master_UUID: 26e3db40-51d4-11e7-adc8-000c29a459b4
Master_Info_File: /opt/56/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 26e3db40-51d4-11e7-adc8-000c29a459b4:10008-10152
Executed_Gtid_Set: 26e3db40-51d4-11e7-adc8-000c29a459b4:10008-10152
Auto_Position: 0
1 row in set (0.00 sec)

最后使用  pt-table-checksum 和 pt-table-sync 检查数据一致性问题