创建GTID主从连接:
mysql> change master to master_host='master_id', master_port=3306, master_user='user', master_password='password', master_auto_position=1;
报错显示:
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Error: Error 'Can't drop database 'test'; database doesn't exist' on query. Default database: 'test'. Query: 'drop database test'
Retrieved_Gtid_Set: 988b8684-3e21-22e6-a801-24505689c77d:1-9
Executed_Gtid_Set:
两个新实例启动后我预先做了一些安全处理,然后就直接在从库执行change master,导致主库上的日志在从库里执行不了,需要把无法执行的日志都跳过!
机械的搬运老师上课讲的内容如下:
mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)
mysql> set gtid_next="988b8684-3e21-22e6-a801-24505689c77d:9";
Query OK, 0 rows affected (0.00 sec)
mysql> begin;commit;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
mysql> set gtid_next="AUTOMATIC";
Query OK, 0 rows affected (0.00 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
结果
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Error: Error 'Can't drop database 'test'; database doesn't exist' on query. Default database: 'test'. Query: 'drop database test'
Retrieved_Gtid_Set: 988b8684-3e21-22e6-a801-24505689c77d:1-9
Executed_Gtid_Set: 988b8684-3e21-22e6-a801-24505689c77d:9
没起作用!这是什么问题?
再去查看主库上的binlog日志,查找drop database test相关的日志:
# at 151
#160630 1:55:19 server id 623306 end_log_pos 199 CRC32 0x5954bb4c GTID [commit=yes]
SET @@SESSION.GTID_NEXT= '988b8684-3e21-22e6-a801-24505689c77d:1'/*!*/;
# at 199
#160630 1:55:19 server id 623306 end_log_pos 284 CRC32 0x6db10369 Query thread_id=1 exec_time=0 error_code=0
SET TIMESTAMP=1467222919/*!*/;
SET @@session.pseudo_thread_id=1/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1073741824/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=83,@@session.collation_connection=83,@@session.collation_server=83/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
drop database test
/*!*/;
问题找到了,drop database test对应的事务号是1,而不是顺序显示的9,接下来就简单了,重复上面的操作,但把事务号9改成1:
mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)
mysql> set gtid_next="988b8684-3e21-22e6-a801-24505689c77d:1";
Query OK, 0 rows affected (0.00 sec)
mysql> begin;commit;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
mysql> set gtid_next="AUTOMATIC";
Query OK, 0 rows affected (0.00 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
再有相似的报错也同样处理,最后终于没问题了:
mysql> show slave status\G
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Retrieved_Gtid_Set: 988b8684-3e21-22e6-a801-24505689c77d:1-9
Executed_Gtid_Set: 988b8684-3e21-22e6-a801-24505689c77d:1-9
over
收获:知其然很重要,知其所以然更重要
要理解GTID代表的是什么,每个事务的提交都代表着一个GTID的生成,正如其全名:全局事务ID(global transaction identifier),所以如果想跳过导致错误的事务不执行的话,需要找到对应事务的gtid号,设置(set gtid_next="....")并提交空事务后重新启用自动模式后,再重启slave就可以,并不是每个导致错误的事务都是binlog中最后一个事务
另外,还要理解相关的参数:
Retrieved_Gtid_Set
The set of global transaction IDs corresponding to all transactions received by this slave. Empty if GTIDs are not in use.
This is the set of all GTIDs that exist or have existed in the relay logs
即IO线程接收到的日志
Executed_Gtid_Set
The set of global transaction IDs written in the binary log. This is the same as the value for the global gtid_executed system variable on this server, as well as the value for Executed_Gtid_Set in the output of SHOW MASTER STATUS on this server. Empty if GTIDs are not in use.
即SQL线程执行到的位置
理解了这个后就能明白,之前的处理还是太复杂,其实直接看show slave satus\G的结果,看这两个参数的值:
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Error: Error 'Can't drop database 'test'; database doesn't exist' on query. Default database: 'test'. Query: 'drop database test'
Retrieved_Gtid_Set: 988b8684-3e21-22e6-a801-24505689c77d:1-9
Executed_Gtid_Set:
能够看到,Executed_Gtid_Set是空的,而且GTID是肯定开启了的,所以,说明日志传过来后压根还没开始执行,所以,第一个事务就已经被卡住了,首先应该跳过的事务号就是1,也不必再去看日志了
查资料过程中看到一个很好的GTID介绍文章,链接在此:http://mysqllover.com/?p=594