A strange replication error 1236 under GTID

1. Background
Version: master and slave are both MySQL 5.7.21
GTID enabled, multi-threaded asynchronous replication, dual-master architecture
binlog_gtid_simple_recovery = on
enforce_gtid_consistency = on
gtid_mode = on
 
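2. Alert
Yesterday a colleague ran stop slave and then start slave on the slave node in production, and replication immediately failed with error 1236. To avoid affecting the business, we switched to auto_position=0 to restore replication temporarily, but as soon as we switch back to auto_position=1 the 1236 error returns. The binlog file named in the error definitely still exists on the master and has not been purged.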
 
3. Logs
===== Slave status with auto_position=0 =====
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.64.67.30
                  Master_User: slave
                  Master_Port: 6331
                Connect_Retry: 60
              Master_Log_File: mysql-bin.004549
          Read_Master_Log_Pos: 7044470
               Relay_Log_File: mysql-relay-bin.000028
                Relay_Log_Pos: 58123976
        Relay_Master_Log_File: mysql-bin.004543
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: mysql
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: mysql.%
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 58123843
              Relay_Log_Space: 636230788
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 2031
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 11530439
                  Master_UUID: 00cc2886-6fb6-11e8-acd4-0894ef3573ea
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Waiting for dependent transaction to commit
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 00cc2886-6fb6-11e8-acd4-0894ef3573ea:1063702888-1067075642
            Executed_Gtid_Set: 00cc2886-6fb6-11e8-acd4-0894ef3573ea:1-1065793153,
9e18ef59-84d7-11e8-9bcf-246e966c35b0:1-30124
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)
 
 
====== Slave status with auto_position=1 ======
*************************** 1. row ***************************
               Slave_IO_State: 
                  Master_Host: 10.64.67.30
                  Master_User: slave
                  Master_Port: 6331
                Connect_Retry: 60
              Master_Log_File: mysql-bin.004543
          Read_Master_Log_Pos: 64864979
               Relay_Log_File: mysql-relay-bin.000001 (this file does not exist locally)
                Relay_Log_Pos: 4
        Relay_Master_Log_File: mysql-bin.004543
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: mysql
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: mysql.%
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 64864979
              Relay_Log_Space: 194
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1236
                Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 11530439
                  Master_UUID: 00cc2886-6fb6-11e8-acd4-0894ef3573ea
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 180719 16:40:15
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 00cc2886-6fb6-11e8-acd4-0894ef3573ea:1-1065807665,
9e18ef59-84d7-11e8-9bcf-246e966c35b0:1-30140
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)
 
 
 
===== Master status =====
The master's UUID is 00cc2886-6fb6-11e8-acd4-0894ef3573ea
 
db_monitor@0:(none) 04:40:54>select @@global.gtid_purged;
+-----------------------------------------------------------------------------------------------------------------------------------------------+
| @@global.gtid_purged                                                                                                                          |
+-----------------------------------------------------------------------------------------------------------------------------------------------+
| 00878f21-6fb6-11e8-9092-246e966c35b0:1-258698,
00cc2886-6fb6-11e8-acd4-0894ef3573ea:1-728438011,
9e18ef59-84d7-11e8-9bcf-246e966c35b0:1-32250 |
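
The error text says the master has purged GTIDs that the slave requires, so one check worth making is whether the master's gtid_purged contains anything missing from the slave's Executed_Gtid_Set (taken here from the auto_position=1 status, where Retrieved_Gtid_Set is empty). The Python sketch below is only an illustration of that set difference; the helper names (parse_gtid_set, subtract_intervals, gtid_subtract) are hypothetical, and it assumes the GTID sets are copied verbatim from the output in this post. On a live server the built-in GTID_SUBTRACT() function computes the same difference.

```python
import re


def parse_gtid_set(text):
    """Parse 'uuid:1-5:7,uuid2:3' into {uuid: [(start, end), ...]} (inclusive)."""
    result = {}
    for part in re.split(r",\s*", text.strip()):
        if not part:
            continue
        uuid, *ranges = part.strip().split(":")
        intervals = result.setdefault(uuid, [])
        for r in ranges:
            lo, _, hi = r.partition("-")
            intervals.append((int(lo), int(hi or lo)))
    return result


def subtract_intervals(a, b):
    """Return the parts of the inclusive intervals in a that b does not cover."""
    out = []
    for lo, hi in sorted(a):
        cur = lo
        for blo, bhi in sorted(b):
            if bhi < cur or blo > hi:
                continue  # no overlap with the remaining part of this interval
            if blo > cur:
                out.append((cur, blo - 1))
            cur = max(cur, bhi + 1)
            if cur > hi:
                break
        if cur <= hi:
            out.append((cur, hi))
    return out


def gtid_subtract(a_text, b_text):
    """GTIDs present in a_text but absent from b_text, grouped by UUID."""
    a, b = parse_gtid_set(a_text), parse_gtid_set(b_text)
    missing = {}
    for uuid, intervals in a.items():
        rest = subtract_intervals(intervals, b.get(uuid, []))
        if rest:
            missing[uuid] = rest
    return missing


# Values copied from this post: master @@global.gtid_purged and the slave's
# Executed_Gtid_Set while auto_position=1.
master_purged = """00878f21-6fb6-11e8-9092-246e966c35b0:1-258698,
00cc2886-6fb6-11e8-acd4-0894ef3573ea:1-728438011,
9e18ef59-84d7-11e8-9bcf-246e966c35b0:1-32250"""
slave_executed = """00cc2886-6fb6-11e8-acd4-0894ef3573ea:1-1065807665,
9e18ef59-84d7-11e8-9bcf-246e966c35b0:1-30140"""

missing = gtid_subtract(master_purged, slave_executed)
for uuid, intervals in missing.items():
    print(uuid, intervals)
```

With the values posted here, the difference is not empty: 00878f21-6fb6-11e8-9092-246e966c35b0:1-258698 and 9e18ef59-84d7-11e8-9bcf-246e966c35b0:30141-32250 are in the master's gtid_purged but not in the slave's executed set. Whether that is the actual trigger for the 1236 error would still need to be confirmed on the servers themselves.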
 
### All binlogs are still present on the master ###
-rw-r----- 1 mysql mysql 104861798 Jul 19 15:29 mysql-bin.004537
-rw-r----- 1 mysql mysql 104864693 Jul 19 15:36 mysql-bin.004538
-rw-r----- 1 mysql mysql 104865523 Jul 19 15:42 mysql-bin.004539
-rw-r----- 1 mysql mysql 104861547 Jul 19 15:49 mysql-bin.004540
-rw-r----- 1 mysql mysql 104861516 Jul 19 15:56 mysql-bin.004541
-rw-r----- 1 mysql mysql 104863756 Jul 19 16:02 mysql-bin.004542
-rw-r----- 1 mysql mysql 104863621 Jul 19 16:08 mysql-bin.004543
-rw-r----- 1 mysql mysql 104865037 Jul 19 16:14 mysql-bin.004544
-rw-r----- 1 mysql mysql 104860063 Jul 19 16:20 mysql-bin.004545
-rw-r----- 1 mysql mysql 104868570 Jul 19 16:26 mysql-bin.004546
-rw-r----- 1 mysql mysql 104862480 Jul 19 16:32 mysql-bin.004547
-rw-r----- 1 mysql mysql 104861326 Jul 19 16:39 mysql-bin.004548
-rw-r----- 1 mysql mysql 104871605 Jul 19 16:45 mysql-bin.004549
-rw-r----- 1 mysql mysql 104864103 Jul 19 16:50 mysql-bin.004550
-rw-r----- 1 mysql mysql 104863224 Jul 19 16:56 mysql-bin.004551
-rw-r----- 1 mysql mysql 104862329 Jul 19 17:02 mysql-bin.004552
-rw-r----- 1 mysql mysql 104861815 Jul 19 17:08 mysql-bin.004553
-rw-r----- 1 mysql mysql 104861768 Jul 19 17:14 mysql-bin.004554
-rw-r----- 1 mysql mysql 104870066 Jul 19 17:19 mysql-bin.004555
-rw-r----- 1 mysql mysql 104857986 Jul 19 17:25 mysql-bin.004556
-rw-r----- 1 mysql mysql 104865507 Jul 19 17:30 mysql-bin.004557
-rw-r----- 1 mysql mysql  63390343 Jul 19 17:33 mysql-bin.004558
 
 
Question: what could the cause be, and how do we restore GTID-based replication?
 
 
 