• MySQL GTID Error Handling Summary


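    MySQL GTID replication evolved from traditional MySQL master-slave replication: every transaction is made unique by combining the server UUID with a transaction ID. With this approach we no longer need to track log_file and log_pos; we simply tell the slave which server is the master. This simplifies both replication setup and failover, and is safer and more reliable than traditional replication. Because a GTID sequence is contiguous with no gaps, when master and slave hit a data conflict the offending transaction can be skipped by injecting an empty transaction. This article describes how to handle errors in a GTID-based master-slave architecture.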

    http://blog.csdn.net/leshami/article/details/52778480

    I. GTID-related features

    Configuring MySQL GTID master-slave replication
    Building GTID master-slave replication with mysqldump

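    II. How to skip a conflicting transaction with GTID

        Many unforeseen situations can cause transaction conflicts between master and slave, so that replication fails or stops and has to be repaired
        In a GTID-based master-slave setup, repairing replication mostly means resolving transaction conflicts
        GTID replication does not support skipping transactions with the traditional sql_slave_skip_counter setting
        Method: inject an empty transaction to fill the gap, the equivalent of (set global sql_slave_skip_counter = 1) in traditional replication
        Steps:
                stop slave;
                set gtid_next='xxxxxxx:N'; -- the GTID to assign to the next transaction, i.e. the GTID to skip
                begin;
                commit;  -- inject an empty transaction
                set gtid_next='AUTOMATIC'; -- return to automatic GTID assignment
                start slave; -- resume replication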
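The steps above can be wrapped in a small helper. What follows is a minimal Python sketch that only builds the statement list for a given GTID; it does not connect to MySQL. The function name `build_skip_statements` and the validation regex are assumptions made for illustration and are not part of the original procedure.

```python
import re

# A single GTID looks like '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:92'
# (server UUID, a colon, then a positive transaction number).
GTID_RE = re.compile(
    r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}:[1-9][0-9]*$"
)


def build_skip_statements(gtid):
    """Return the statements that inject an empty transaction for `gtid`."""
    if not GTID_RE.match(gtid):
        raise ValueError("not a single GTID: %r" % gtid)
    return [
        "STOP SLAVE;",
        "SET gtid_next='%s';" % gtid,    # the GTID to skip
        "BEGIN;",
        "COMMIT;",                       # empty transaction consumes the GTID
        "SET gtid_next='AUTOMATIC';",    # back to automatic assignment
        "START SLAVE;",
    ]


if __name__ == "__main__":
    for stmt in build_skip_statements("1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:92"):
        print(stmt)
```

Rejecting anything that is not a single well-formed GTID guards against pasting a whole GTID set or the `xxxxxxx:N` placeholder into `gtid_next`.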
    

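    III. Common types of GTID transaction conflicts

        1. A row is inserted on the master and the slave reports a primary key conflict
        2. A row is updated on the master but the slave has no matching row to update
        3. A row is deleted on the master but the slave has no matching row to delete
        4. Using a delayed slave to recover an object accidentally removed on the master
        5. The master's binary logs have been purged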
    

    IV. Demonstration

    Replication topology used in this demonstration
    # mysqlrplshow --master=root:pass@192.168.1.233:3306 --discover-slaves-login=root:pass --verbose
    WARNING: Using a password on the command line interface can be insecure.
    # master on 192.168.1.233: ... connected.
    # Finding slaves for master: 192.168.1.233:3306
    
    # Replication Topology Graph
    192.168.1.233:3306 (MASTER)
       |
       +--- 192.168.1.245:3306 [IO: Yes, SQL: Yes] - (SLAVE)
       |
       +--- 192.168.1.247:3306 [IO: Yes, SQL: Yes] - (SLAVE)
    
    (root@192.168.1.233)[tempdb]>show slave hosts;
    +-----------+---------------+------+-----------+--------------------------------------+
    | Server_id | Host          | Port | Master_id | Slave_UUID                           |
    +-----------+---------------+------+-----------+--------------------------------------+
    |       245 | 192.168.1.245 | 3306 |       233 | 78336cdc-8cfb-11e6-ba9f-000c29328504 |
    |       247 | 192.168.1.247 | 3306 |       233 | 13a26fc1-555a-11e6-b5e0-000c292e1642 |
    +-----------+---------------+------+-----------+--------------------------------------+
    
    -- MySQL version used in the demonstration
    (root@192.168.1.233)[tempdb]>show variables like 'version';
    +---------------+------------+
    | Variable_name | Value      |
    +---------------+------------+
    | version       | 5.7.12-log |
    +---------------+------------+
    
    -- Check whether GTID is enabled
    (root@192.168.1.233)[tempdb]>show variables like '%gtid_mode%';
    +---------------+-------+
    | Variable_name | Value |
    +---------------+-------+
    | gtid_mode     | ON    |
    +---------------+-------+
    
    -- On the master, the GTID-based dump threads are visible, as shown below
    (root@192.168.1.233)[tempdb]>show processlist;
    +----+------+-----------------------+--------+------------------+------+
    | Id | User | Host                  | db     | Command          | Time |
    +----+------+-----------------------+--------+------------------+------+
    | 17 | repl | node245.edq.com:52685 | NULL   | Binlog Dump GTID | 2738 |
    | 18 | repl | node247.edq.com:33516 | NULL   | Binlog Dump GTID | 2690 |
    | 24 | root | localhost             | tempdb | Query            |    0 |
    +----+------+-----------------------+--------+------------------+------+

    1. Slave reports a duplicate primary key (Errno: 1062)

    (root@Master)[tempdb]>create table t1 (
                -> id tinyint not null primary key,ename varchar(20),blog varchar(50));
    
    (root@Master)[tempdb]>insert into t1 
                -> values(1,'leshami','http://blog.csdn.net/leshami');
    
    (root@Master)[tempdb]>insert into t1 
                -> values(2,'robin','http://blog.csdn.net/robinson_0612');
    
    (root@Master)[tempdb]>set sql_log_bin=off;
    
    (root@Master)[tempdb]>delete from t1 where ename='robin';
    
    (root@Master)[tempdb]>set sql_log_bin=on;
    
    (root@Master)[tempdb]>insert into t1 
                -> values(2,'robin','http://blog.csdn.net/robinson_0612');
    
    -- The slave status shows an error: duplicate primary key
    (root@Slave)[tempdb]>show slave status \G
    *************************** 1. row ***************************
    Last_Errno: 1062
    Last_Error: Could not execute Write_rows event on table tempdb.t1; Duplicate entry '2' for key 'PRIMARY', 
                            Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; 
                            the event's master log node233-binlog.000004, end_log_pos 4426
    Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-90
     Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-89
         Auto_Position: 1
    
    -- Fix: delete the conflicting row on the slave
    (root@Slave)[tempdb]>stop slave;
    
    (root@Slave)[tempdb]>delete from t1 where ename='robin';
    
    (root@Slave)[tempdb]>start slave;
    
    (root@Slave)[tempdb]>show slave status \G
    *************************** 1. row ***************************
               Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-90
                Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-90,
     78336cdc-8cfb-11e6-ba9f-000c29328504:1  -- an extra GTID appears here; it comes from the delete executed on the slave, and its UUID matches the server_uuid of 192.168.1.245
                    Auto_Position: 1
             Replicate_Rewrite_DB: 
                     Channel_Name: 
               Master_TLS_Version: 
    
    (root@Slave)[tempdb]>show variables like '%uuid%';
    +---------------+--------------------------------------+
    | Variable_name | Value                                |
    +---------------+--------------------------------------+
    | server_uuid   | 78336cdc-8cfb-11e6-ba9f-000c29328504 |
    +---------------+--------------------------------------+
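The extra `78336cdc-...:1` entry above is a transaction that exists only on the slave. As a rough illustration, the Python sketch below lists every entry of an `Executed_Gtid_Set` whose UUID is not the master's. `foreign_gtids` is a hypothetical helper name, and the input is assumed to be the text copied from `show slave status`.

```python
def foreign_gtids(executed_gtid_set, master_uuid):
    """Return {uuid: intervals} for entries not originating from the master."""
    foreign = {}
    # Entries are comma-separated; the status output may wrap them over lines.
    for entry in executed_gtid_set.replace("\n", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Everything before the first colon is the UUID, the rest is intervals.
        uuid, _, intervals = entry.partition(":")
        if uuid.lower() != master_uuid.lower():
            foreign[uuid] = intervals
    return foreign


if __name__ == "__main__":
    executed = (
        "1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-90,\n"
        " 78336cdc-8cfb-11e6-ba9f-000c29328504:1"
    )
    print(foreign_gtids(executed, "1b64c25d-8d2b-11e6-9ac0-000c29b82d0d"))
```

For the status output shown above, the only foreign entry is the slave's own UUID with interval `1`.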

    2. Slave cannot find the row to be updated (Errno: 1032)

    -- First delete the leshami row on the slave
    (root@Slave)[tempdb]>delete from t1 where ename='leshami';
    
    -- Then try to update the leshami row on the master
    (root@Master)[tempdb]>update t1 set 
                -> blog='http://blog.csdn.net/robinson_0612' where ename='leshami';
    
    Query OK, 1 row affected (0.02 sec)
    Rows matched: 1  Changed: 1  Warnings: 0
    
    -- Check the slave status
    (root@Slave)[tempdb]>show slave status \G
    *************************** 1. row ***************************
    Last_SQL_Errno: 1032
    Last_SQL_Error: Could not execute Update_rows event on table tempdb.t1; Can't find record in 't1',
                                    Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND;
                                the event's master log node233-binlog.000004, end_log_pos 4769
    Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-91
    Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-90,
            78336cdc-8cfb-11e6-ba9f-000c29328504:1-2
    
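    -- Use mysqlbinlog on the master to locate the failing binlog file and position and find the corresponding SQL statement, as shown below
    -- In the update, the part after WHERE is the row before the update and the SET part is the row after it, so the pre-update row can be inserted into the slave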
    # mysqlbinlog --no-defaults -v -v --base64-output=DECODE-ROWS /data/node233-binlog.000004|grep -A '10' 4769
    #161009 13:46:34 server id 233 end_log_pos 4769 CRC32 0xb60df74e Update_rows: table id 147 flags: STMT_END_F
    ### UPDATE `tempdb`.`t1`
    ### WHERE
    ###   @1=1 /* TINYINT meta=0 nullable=0 is_null=0 */
    ###   @2='leshami' /* VARSTRING(20) meta=20 nullable=1 is_null=0 */
    ###   @3='http://blog.csdn.net/leshami' /* VARSTRING(50) meta=50 nullable=1 is_null=0 */
    ### SET
    ###   @1=1 /* TINYINT meta=0 nullable=0 is_null=0 */
    ###   @2='leshami' /* VARSTRING(20) meta=20 nullable=1 is_null=0 */
    ###   @3='http://blog.csdn.net/robinson_0612' /* VARSTRING(50) meta=50 nullable=1 is_null=0 */
    # at 4769
    #161009 13:46:34 server id 233  end_log_pos 4800 CRC32 0xa9669811       Xid = 1749
    COMMIT/*!*/;
    SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
    DELIMITER ;
    # End of log file
    /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
    /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;   
    
    (root@Slave)[tempdb]>select * from t1;
    +----+-------+------------------------------------+
    | id | ename | blog                               |
    +----+-------+------------------------------------+
    |  2 | robin | http://blog.csdn.net/robinson_0612 |
    +----+-------+------------------------------------+
    
    (root@Slave)[tempdb]>stop slave sql_thread;
    
    (root@Slave)[tempdb]>insert into t1 values(1,'leshami','http://blog.csdn.net/leshami');
    
    (root@Slave)[tempdb]>start slave sql_thread;
    
    (root@Slave)[tempdb]>show slave status \G
    *************************** 1. row ***************************
               Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-91
                Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-91,
                                   78336cdc-8cfb-11e6-ba9f-000c29328504:1-3
                    Auto_Position: 1
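Rebuilding the missing row by hand from the decoded binlog is easy to get wrong when a table has many columns. The Python sketch below turns the WHERE (before-image) part of a decoded Update_rows event into an INSERT statement. It assumes the `mysqlbinlog -v -v` output format shown above, where every column line ends with a `/* ... */` type comment; `before_image_insert` is a hypothetical helper name, and only the first row of the event is handled.

```python
import re

# A decoded column line looks like:
#   ###   @2='leshami' /* VARSTRING(20) meta=20 nullable=1 is_null=0 */
COLUMN_RE = re.compile(r"^###\s+@\d+=(.*?)\s*/\*.*\*/\s*$")


def before_image_insert(decoded_event, table):
    """Build an INSERT from the WHERE block of a decoded Update_rows event."""
    values = []
    in_where = False
    for line in decoded_event.splitlines():
        line = line.strip()
        if line == "### WHERE":
            in_where = True          # before image starts here
        elif line == "### SET":
            break                    # after image: not needed for the repair
        elif in_where:
            match = COLUMN_RE.match(line)
            if match:
                values.append(match.group(1))
    if not values:
        raise ValueError("no before image found")
    return "INSERT INTO %s VALUES (%s);" % (table, ", ".join(values))
```

Applied to the event above with `table="t1"`, this yields the same row that is inserted manually on the slave. A column value that itself contains `/*` would confuse the regex, so the generated statement should be reviewed before it is run.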

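    3. Slave cannot find the row to be deleted (Errno: 1032)

    -- If a row is deleted on the master and the slave has no matching row, the transaction can simply be skipped
    -- First we delete a row on the slave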
    (root@Slave)[tempdb]>delete from t1 where ename='robin';
    
    -- Then delete the same row on the master
    (root@Master)[tempdb]>delete from t1 where ename='robin';
    
    -- The slave reports that it cannot find the matching row
    (root@Slave)[tempdb]>show slave status \G
    *************************** 1. row ***************************
    Last_SQL_Error: Could not execute Delete_rows event on table tempdb.t1; Can't find record in 't1',
                    Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; 
                    the event's master log node233-binlog.000004, end_log_pos 5070
    Last_SQL_Error_Timestamp: 161009 15:08:06
        Master_SSL_Crl: 
    Master_SSL_Crlpath: 
    Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-92
     Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-91,
                        78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
         Auto_Position: 1      
    
    -- Skip it by injecting an empty transaction
    (root@Slave)[tempdb]>stop slave sql_thread;
    
    (root@Slave)[tempdb]>set gtid_next='1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:92';
    
    (root@Slave)[tempdb]>begin;commit;
    
    (root@Slave)[tempdb]>set gtid_next='AUTOMATIC';
    
    (root@Slave)[tempdb]>start slave sql_thread;
    
    (root@Slave)[tempdb]>show slave status \G
    *************************** 1. row ***************************
               Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-92
                Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-92,
                                   78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                    Auto_Position: 1
             Replicate_Rewrite_DB: 
                     Channel_Name: 
               Master_TLS_Version: 
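The GTID to skip (`...:92` above) is the lowest one that has been retrieved but not yet executed. The Python sketch below derives it from the two sets reported by `show slave status`. The helper names are hypothetical, and the result only identifies the failing transaction under the assumption of a single-threaded applier, as in this demonstration.

```python
def parse_intervals(gtid_set, uuid):
    """Return the transaction numbers listed for `uuid` in a GTID set."""
    numbers = set()
    for entry in gtid_set.replace("\n", "").split(","):
        parts = entry.strip().split(":")
        if parts[0].lower() != uuid.lower():
            continue
        # Each remaining part is either 'N' or 'N-M'.
        for interval in parts[1:]:
            first, _, last = interval.partition("-")
            numbers.update(range(int(first), int(last or first) + 1))
    return numbers


def first_unapplied_gtid(retrieved, executed, master_uuid):
    """Return the lowest retrieved-but-not-executed GTID, or None."""
    pending = parse_intervals(retrieved, master_uuid) - parse_intervals(
        executed, master_uuid
    )
    return "%s:%d" % (master_uuid, min(pending)) if pending else None
```

Expanding intervals into Python sets is fine for the small ranges in this article but wasteful for sets with millions of transactions; on the server side, MySQL's built-in `GTID_SUBTRACT()` function performs the same subtraction.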

    4. Using a delayed slave to recover from an accidental truncate on the master

    -- Create a table and rows on the master
    (root@Master)[tempdb]>create table t2 (id tinyint not null primary key, 
            -> ename varchar(20),blog varchar(50));
    
    (root@Master)[tempdb]>insert into t2  
                -> values(1,'leshami','http://blog.csdn.net/leshami');
    
    (root@Master)[tempdb]>insert into t2  
                -> values(2,'robin','http://blog.csdn.net/robinson_0612');
    
    (root@Master)[tempdb]>select * from t2;
    +----+---------+------------------------------------+
    | id | ename   | blog                               |
    +----+---------+------------------------------------+
    |  1 | leshami | http://blog.csdn.net/leshami       |
    |  2 | robin   | http://blog.csdn.net/robinson_0612 |
    +----+---------+------------------------------------+
    
    -- First configure the slave as a delayed slave
    (root@Slave)[tempdb]>stop slave sql_thread;
    Query OK, 0 rows affected (0.01 sec)
    
    (root@Slave)[tempdb]>CHANGE MASTER TO MASTER_DELAY = 300;
    Query OK, 0 rows affected (0.00 sec)
    
    (root@Slave)[tempdb]>start slave sql_thread;
    Query OK, 0 rows affected (0.02 sec)
    
    (root@Slave)[tempdb]>show slave status \G
    *************************** 1. row ***************************
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
                        SQL_Delay: 300  
    
    (root@Slave)[tempdb]>show slave status \G
    *************************** 1. row ***************************
               Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-99
                Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-99,
                                   78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                    Auto_Position: 1
    
    -- Check the binlog GTID on the master
    (root@Master)[tempdb]>show master status\G
    *************************** 1. row ***************************
                 File: node233-binlog.000004
             Position: 6970
         Binlog_Do_DB: 
     Binlog_Ignore_DB: 
    Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-99
    1 row in set (0.00 sec)
    
    -- Truncate t2 on the master
    (root@Master)[tempdb]>truncate table t2;
    Query OK, 0 rows affected (0.03 sec)
    
    -- Check the master's binlog GTID again: it has gone from 99 to 100, and 100 is the ID we need to skip
    (root@Master)[tempdb]>show master status\G
    *************************** 1. row ***************************
                 File: node233-binlog.000004
             Position: 7121
         Binlog_Do_DB: 
     Binlog_Ignore_DB: 
    Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100
    1 row in set (0.00 sec)
    
    -- On the slave, skip the accidental truncate transaction
    (root@Slave)[tempdb]>stop slave sql_thread;
    Query OK, 0 rows affected (0.01 sec)
    
    (root@Slave)[tempdb]>set gtid_next='1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100';
    Query OK, 0 rows affected (0.00 sec)
    
    (root@Slave)[tempdb]>begin;commit;
    Query OK, 0 rows affected (0.00 sec)
    
    Query OK, 0 rows affected (0.01 sec)
    
    (root@Slave)[tempdb]>set gtid_next='AUTOMATIC';
    Query OK, 0 rows affected (0.00 sec)
    
    (root@Slave)[tempdb]>start slave sql_thread;
    Query OK, 0 rows affected (0.02 sec)
    
    (root@Slave)[tempdb]>show slave status \G
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: Master
                      Master_User: repl
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: node233-binlog.000004
              Read_Master_Log_Pos: 7121
                   Relay_Log_File: node245-relay-bin.000003
                    Relay_Log_Pos: 2982
            Relay_Master_Log_File: node233-binlog.000004
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
                 ...........................         
               Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-100
                Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100,
                                                                 78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                    Auto_Position: 1
    
    -- Often we do not know when the table was truncated, so the GTID can be looked up in the binlog
    -- As shown below, this yields the line SET @@SESSION.GTID_NEXT= '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100'
    -- 100 is the transaction number of the GTID for this truncate
    # mysqlbinlog --no-defaults -v -v --base64-output=DECODE-ROWS /data/node233-binlog.000004|grep -i \
    > "truncate table t2" -A3 -B10  
    ###   @3='http://blog.csdn.net/robinson_0612' /* VARSTRING(50) meta=50 nullable=1 is_null=0 */
    # at 6939
    #161009 18:04:58 server id 233  end_log_pos 6970 CRC32 0x71c5121c     Xid = 1775
    COMMIT/*!*/;
    # at 6970
    #161009 18:08:42 server id 233 end_log_pos 7035 CRC32 0x00ba9437 GTID last_committed=26 sequence_number=27
    SET @@SESSION.GTID_NEXT= '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:100'/*!*/;
    # at 7035
    #161009 18:08:42 server id 233 end_log_pos 7121 CRC32 0x5a8b9723 Query thread_id=26 exec_time=0 error_code=0
    SET TIMESTAMP=1476007722/*!*/;
    truncate table t2
    /*!*/;
    SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
    DELIMITER ;
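Relying on a fixed `-B10` context window can miss the `GTID_NEXT` line when a transaction is long. As an alternative, the Python sketch below scans decoded `mysqlbinlog` text, remembers the most recent `SET @@SESSION.GTID_NEXT` value, and reports the GTID of every transaction containing a given statement. `gtids_for_statement` is a hypothetical helper, and the input is assumed to be text in the format shown above.

```python
import re

# Matches: SET @@SESSION.GTID_NEXT= '1b64c25d-...:100'/*!*/;
GTID_NEXT_RE = re.compile(r"SET @@SESSION\.GTID_NEXT= '([^']+)'")


def gtids_for_statement(binlog_text, needle):
    """Return GTIDs of transactions whose text contains `needle`."""
    found = []
    current = None
    for line in binlog_text.splitlines():
        match = GTID_NEXT_RE.search(line)
        if match:
            current = match.group(1)   # a new transaction begins
        elif current not in (None, "AUTOMATIC") and needle.lower() in line.lower():
            if current not in found:
                found.append(current)
    return found
```

Fed the output above with `needle="truncate table t2"`, it returns a single GTID ending in `:100`, the one skipped on the delayed slave.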

    5. The master's binlog has been purged (Errno: 1236)

    -- First stop the slave to simulate an unexpected slave outage
    (root@Slave)[tempdb]>stop slave;
    Query OK, 0 rows affected (0.08 sec)
    
    -- Perform the following operations on the master
    -- At this point gtid_purged on the master is empty
    (root@Master)[tempdb]>show variables like '%gtid_purged%';
    +---------------+-------+
    | Variable_name | Value |
    +---------------+-------+
    | gtid_purged   |       |
    +---------------+-------+
    
    -- List the master's binary logs
    (root@Master)[tempdb]>show binary logs;
    +-----------------------+-----------+
    | Log_name              | File_size |
    +-----------------------+-----------+
    | node233-binlog.000001 |   1362104 |
    | node233-binlog.000002 |      1331 |
    | node233-binlog.000003 |       217 |
    | node233-binlog.000004 |      7121 |
    +-----------------------+-----------+
    
    (root@Master)[tempdb]>select * from t1;
    +----+---------+------------------------------------+
    | id | ename   | blog                               |
    +----+---------+------------------------------------+
    |  1 | leshami | http://blog.csdn.net/robinson_0612 |
    |  2 | robin   | http://blog.csdn.net/leshami       |
    +----+---------+------------------------------------+
    
    -- Delete the rows on the master
    (root@Master)[tempdb]>delete from t1;
    
    -- Rotate the binary log
    (root@Master)[tempdb]>flush logs;
    
    -- Insert a new row
    (root@Master)[tempdb]>insert into t1 values(1,
        -> 'xuputi','http://blog.csdn.net/leshami');
    
    (root@Master)[tempdb]>show binary logs;
    +-----------------------+-----------+
    | Log_name              | File_size |
    +-----------------------+-----------+
    | node233-binlog.000001 |   1362104 |
    | node233-binlog.000002 |      1331 |
    | node233-binlog.000003 |       217 |
    | node233-binlog.000004 |      7513 |
    | node233-binlog.000005 |       490 |
    +-----------------------+-----------+
    
    -- Purge the binary logs
    (root@Master)[tempdb]>purge binary logs to 'node233-binlog.000005';
    Query OK, 0 rows affected (0.01 sec)
    
    (root@Master)[tempdb]>show binary logs;
    +-----------------------+-----------+
    | Log_name              | File_size |
    +-----------------------+-----------+
    | node233-binlog.000005 |       490 |
    +-----------------------+-----------+
    
    -- gtid_purged now shows the corresponding value
    (root@Master)[tempdb]>show variables like '%gtid_purged%';
    +---------------+--------------------------------------------+
    | Variable_name | Value                                      |
    +---------------+--------------------------------------------+
    | gtid_purged   | 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101 |
    +---------------+--------------------------------------------+
    
    -- Now start the slave
    (root@Slave)[tempdb]>start slave;
    Query OK, 0 rows affected (0.00 sec)
    
    -- The slave status reports that required logs have been purged
    (root@Slave)[tempdb]>show slave status\G
    *************************** 1. row ***************************
                   Slave_IO_State: 
                      Master_Host: Master
                      Master_User: repl
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: node233-binlog.000004
              Read_Master_Log_Pos: 7121
                   Relay_Log_File: node245-relay-bin.000003
                    Relay_Log_Pos: 3133
            Relay_Master_Log_File: node233-binlog.000004
                 Slave_IO_Running: No
                Slave_SQL_Running: Yes
                        ...............
                    Last_IO_Errno: 1236
                    Last_IO_Error: Got fatal error 1236 from master when reading data from binary log:
                    'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, 
                     but the master has purged binary logs containing GTIDs that the slave requires.'
                           ..................
               Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-100
                Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100,
                                   78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
                    Auto_Position: 1
    
    -- gtid_purged on the slave, currently 1-75
    (root@Slave)[tempdb]>show variables like '%gtid_purged%';
    +---------------+-------------------------------------------+
    | Variable_name | Value                                     |
    +---------------+-------------------------------------------+
    | gtid_purged   | 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-75 |
    +---------------+-------------------------------------------+                
    
    -- Stop the slave
    (root@Slave)[tempdb]>stop slave;
    Query OK, 0 rows affected (0.01 sec)
    
    -- Now try to skip the transactions by setting gtid_purged; as shown, the error says it can only be set when GLOBAL.GTID_EXECUTED is empty
    (root@Slave)[tempdb]>set global gtid_purged = '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101';
    ERROR 1840 (HY000): @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.
    
    -- As shown below, GTIDs have already been executed, so gtid_executed is not empty; these GTIDs are recorded in the slave's binary log
    (root@Slave)[tempdb]>show global variables like '%gtid_executed%'\G
    *************************** 1. row ***************************
    Variable_name: gtid_executed
            Value: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-100,
                   78336cdc-8cfb-11e6-ba9f-000c29328504:1-4
    *************************** 2. row ***************************
    Variable_name: gtid_executed_compression_period
            Value: 1000
    
    -- Now run reset master on the slave, which clears the slave's binlog
    (root@Slave)[tempdb]>reset master;
    Query OK, 0 rows affected (0.05 sec)
    
    -- Check again: gtid_executed is now empty
    (root@Slave)[tempdb]>show global variables like '%gtid_executed%'\G
    *************************** 1. row ***************************
    Variable_name: gtid_executed
            Value: 
    *************************** 2. row ***************************
    Variable_name: gtid_executed_compression_period
            Value: 1000
    
    -- Now set gtid_purged again
    (root@Slave)[tempdb]>set global gtid_purged = '1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101';
    Query OK, 0 rows affected (0.01 sec)
    
    -- Start the slave
    (root@Slave)[tempdb]>start slave;
    Query OK, 0 rows affected (0.03 sec)
    
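    -- A duplicate-row error is reported, as shown below
    -- This is because the delete transaction issued while the slave was stopped never reached the slave's relay log
    -- In addition, the master's binlog was purged and gtid_purged was set on the slave after the restart, so the row newly inserted on the master hits a duplicate primary key on the slave
    (root@Slave)[tempdb]>show slave status \G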
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: Master
                      Master_User: repl
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: node233-binlog.000005
              Read_Master_Log_Pos: 490
                   Relay_Log_File: node245-relay-bin.000004
                    Relay_Log_Pos: 417
            Relay_Master_Log_File: node233-binlog.000005
                 Slave_IO_Running: Yes
                Slave_SQL_Running: No
                    ................
                   Last_SQL_Error: Could not execute Write_rows event on table tempdb.t1; 
     Duplicate entry '1' for key 'PRIMARY', Error_code: 1062;
     handler error HA_ERR_FOUND_DUPP_KEY; the event's master log node233-binlog.000005, end_log_pos 459
               Retrieved_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:76-100:102
                Executed_Gtid_Set: 1b64c25d-8d2b-11e6-9ac0-000c29b82d0d:1-101
                    Auto_Position: 1
    
    -- Delete the row with id 1 on the slave
    (root@Slave)[tempdb]>delete from t1 where id=1;
    Query OK, 1 row affected (0.05 sec)
    
    -- Start the slave's sql_thread
    (root@Slave)[tempdb]>start slave sql_thread;
    Query OK, 0 rows affected (0.02 sec)
    
    -- Check again: replication is back to normal
    (root@Slave)[tempdb]>show slave status \G
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: Master
                      Master_User: repl
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: node233-binlog.000005
              Read_Master_Log_Pos: 490
                   Relay_Log_File: node245-relay-bin.000004
                    Relay_Log_Pos: 713
            Relay_Master_Log_File: node233-binlog.000005
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
    
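    -- The example above mainly demonstrates using gtid_purged to skip transactions
    -- In fact, master and slave data are now inconsistent; decide whether to repair them based on actual needs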
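Before forcing `gtid_purged` on the slave, it helps to know exactly which transactions will be lost. The Python sketch below compares the master's `gtid_purged` with the slave's `gtid_executed` and lists the transaction numbers that were purged on the master but never applied on the slave. The helper names are hypothetical, and both inputs are assumed to be the variable values as printed above.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:7,uuid2:3' into {uuid: {transaction numbers}}."""
    result = {}
    for entry in text.replace("\n", "").split(","):
        parts = entry.strip().split(":")
        if len(parts) < 2:
            continue                      # skip empty fragments
        numbers = result.setdefault(parts[0].lower(), set())
        for interval in parts[1:]:        # each part is 'N' or 'N-M'
            first, _, last = interval.partition("-")
            numbers.update(range(int(first), int(last or first) + 1))
    return result


def purged_but_not_executed(master_purged, slave_executed):
    """Return {uuid: sorted numbers} purged on the master yet missing on the slave."""
    executed = parse_gtid_set(slave_executed)
    missing = {}
    for uuid, numbers in parse_gtid_set(master_purged).items():
        gap = numbers - executed.get(uuid, set())
        if gap:
            missing[uuid] = sorted(gap)
    return missing
```

With the values from this example (master `gtid_purged` of `1-101`, slave `gtid_executed` of `1-100`), the only missing transaction is number 101, the `delete from t1` that the slave never received, which is why the later insert collides with the stale row.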

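    V. Summary

    1. A GTID is a global transaction ID; it simplifies deploying master-slave replication because the slave no longer needs to track log_file and log_pos
    2. Because transaction IDs are unique, GTIDs from one slave can be applied to another slave, which makes failover convenient
    3. GTIDs are contiguous with no gaps, so a conflicting transaction is skipped by injecting an empty transaction
    4. A delayed slave can be configured to guard against human error such as accidentally dropping objects on the master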

  • Original address: https://www.cnblogs.com/moss_tan_jun/p/7881567.html