• 腾讯云CDB回档失败浅析


    Ⅰ、先看问题

    先简单介绍下cdb的回档功能,回档分为极速、快速、普通,分别对应指定表、指定库、整个实例回档。

    控制台报错回档任务执行失败

    提示信息:rollback table failed:SQL thread error(1146):Error 'Table xxx doesn't exist' on query. Default database: xxx, Query: 'xxxxxx'
    

    从字面意思看是某个表不存在导致sql线程中断(和sql线程什么关系? 这和回档功能原理相关,此处先忽略)
    这种回档失败只会存在于极速和快速两种模式下,普通回档不会有问题。

    用大白话讲,我控制台选择极速回档a表,binlog中的记录涉及到其他表的操作,比如:delete from a where (select xxx from b);

    临时解决方案
    选择普通回档,整个实例回档,抽取需要的数据,当然这个过程就比较慢了

    长期解决方案
    设置binlog_format为row,或者设置transaction_isolation为read-committed

    Ⅱ、借题发挥

    知道问题怎么解决还远远不够,为什么这么设置就没问题,我们需要借助这个问题回顾一下binlog相关知识点,事务隔离级别简单提一下

    binlog的作用

    三个主要作用为:复制、恢复、两阶段提交中担任重要角色,保证主从数据一致性

    binlog_format

    statement:记录原生sql
    row:记录每一行记录的变更
    mixed:默认记录为statement,特殊场景触发格式切换为row,如uuid等不确定性函数

    statement格式优势为日志文件空间小,劣势是主从一致性得不到保证
    row格式优势为较好地保证主从数据一致性,劣势是日志文件占空间大
    关于一致性这里不展开说明,mixed格式基本可以满足大部分业务场景,所以cdb默认为mixed,但是这里十分建议大家用row

    测试

    测试数据与sql
    这里用了一个普通sql和一个跨表sql

    数据
    (root@localhost) [test]> select * from t;
    +------+------+
    | id   | name |
    +------+------+
    |    1 | a    |
    |    2 | b    |
    |    3 | c    |
    |    4 | d    |
    |    5 | e    |
    +------+------+
    5 rows in set (0.00 sec)
    
    (root@localhost) [test]> select * from tt;
    +------+------+
    | id   | name |
    +------+------+
    |    1 | a    |
    |    2 | b    |
    |    3 | c    |
    |    4 | d    |
    |    5 | e    |
    +------+------+
    5 rows in set (0.00 sec)
    
    sql
    (root@localhost) [test]> delete from t where id in (select id from tt where id < 3);
    Query OK, 2 rows affected (0.01 sec)
    
    (root@localhost) [test]> delete from t where id = 5;
    Query OK, 1 row affected (0.02 sec)
    

    四种回档场景

    - read-committed repeatable-read
    row
    mixed ×

    这个表可直观地反应了如何解决回档报错的问题

    分析binlog
    不想看过程请直接拉到最下面看结论

    transaction_isolation binlog_format
    read-committed row
    BEGIN
    /*!*/;
    # at 331
    #190124 11:41:58 server id 1  end_log_pos 378 CRC32 0x80a164cd 	Table_map: `test`.`t` mapped to number 108
    # at 378
    #190124 11:41:58 server id 1  end_log_pos 427 CRC32 0x7ee92842 	Delete_rows: table id 108 flags: STMT_END_F
    ### DELETE FROM `test`.`t`
    ### WHERE
    ###   @1=1
    ###   @2='a'
    ### DELETE FROM `test`.`t`
    ### WHERE
    ###   @1=2
    ###   @2='b'
    # at 427
    #190124 11:41:58 server id 1  end_log_pos 458 CRC32 0x97b7e158 	Xid = 26
    COMMIT/*!*/;
    
    BEGIN
    /*!*/;
    # at 595
    #190124 11:42:06 server id 1  end_log_pos 642 CRC32 0xf0a5f266 	Table_map: `test`.`t` mapped to number 108
    # at 642
    #190124 11:42:06 server id 1  end_log_pos 684 CRC32 0x42239094 	Delete_rows: table id 108 flags: STMT_END_F
    ### DELETE FROM `test`.`t`
    ### WHERE
    ###   @1=5
    ###   @2='e'
    # at 684
    #190124 11:42:06 server id 1  end_log_pos 715 CRC32 0xb95abaf4 	Xid = 27
    COMMIT/*!*/;
    
    transaction_isolation binlog_format
    repeatable-read row
    BEGIN
    /*!*/;
    # at 331
    #190124 12:18:50 server id 1  end_log_pos 378 CRC32 0xc4d70096 	Table_map: `test`.`t` mapped to number 108
    # at 378
    #190124 12:18:50 server id 1  end_log_pos 427 CRC32 0x6d794dea 	Delete_rows: table id 108 flags: STMT_END_F
    ### DELETE FROM `test`.`t`
    ### WHERE
    ###   @1=1
    ###   @2='a'
    ### DELETE FROM `test`.`t`
    ### WHERE
    ###   @1=2
    ###   @2='b'
    # at 427
    #190124 12:18:50 server id 1  end_log_pos 458 CRC32 0x3f3946c1 	Xid = 10
    COMMIT/*!*/;
    
    BEGIN
    /*!*/;
    # at 595
    #190124 12:18:58 server id 1  end_log_pos 642 CRC32 0x1ecaec0b 	Table_map: `test`.`t` mapped to number 108
    # at 642
    #190124 12:18:58 server id 1  end_log_pos 684 CRC32 0xda32a16e 	Delete_rows: table id 108 flags: STMT_END_F
    ### DELETE FROM `test`.`t`
    ### WHERE
    ###   @1=5
    ###   @2='e'
    # at 684
    #190124 12:18:58 server id 1  end_log_pos 715 CRC32 0x4fa0b638 	Xid = 11
    COMMIT/*!*/;
    
    transaction_isolation binlog_format
    read-committed mixed
    BEGIN
    /*!*/;
    # at 331
    #190124 12:26:37 server id 1  end_log_pos 378 CRC32 0x6cac93f1 	Table_map: `test`.`t` mapped to number 108
    # at 378
    #190124 12:26:37 server id 1  end_log_pos 427 CRC32 0x2ec3da0f 	Delete_rows: table id 108 flags: STMT_END_F
    ### DELETE FROM `test`.`t`
    ### WHERE
    ###   @1=1
    ###   @2='a'
    ### DELETE FROM `test`.`t`
    ### WHERE
    ###   @1=2
    ###   @2='b'
    # at 427
    #190124 12:26:37 server id 1  end_log_pos 458 CRC32 0xa4d92d55 	Xid = 24
    COMMIT/*!*/;
    
    BEGIN
    /*!*/;
    # at 595
    #190124 12:26:42 server id 1  end_log_pos 642 CRC32 0xa2926b8d 	Table_map: `test`.`t` mapped to number 108
    # at 642
    #190124 12:26:42 server id 1  end_log_pos 684 CRC32 0x05059ae7 	Delete_rows: table id 108 flags: STMT_END_F
    ### DELETE FROM `test`.`t`
    ### WHERE
    ###   @1=5
    ###   @2='e'
    # at 684
    #190124 12:26:42 server id 1  end_log_pos 715 CRC32 0x86e936fe 	Xid = 25
    COMMIT/*!*/;
    
    transaction_isolation binlog_format
    repeatable-read mixed
    BEGIN
    /*!*/;
    # at 338
    #190124 12:36:35 server id 1  end_log_pos 470 CRC32 0xfb5e71cd 	Query	thread_id=2	exec_time=0	error_code=0
    use `test`/*!*/;
    SET TIMESTAMP=1548304595/*!*/;
    delete from t where id in (select id from tt where id < 3)
    /*!*/;
    # at 470
    #190124 12:36:35 server id 1  end_log_pos 501 CRC32 0xb0ab1a2a 	Xid = 10
    COMMIT/*!*/;
    
    BEGIN
    /*!*/;
    # at 645
    #190124 12:36:42 server id 1  end_log_pos 745 CRC32 0x264f35c7 	Query	thread_id=2	exec_time=0	error_code=0
    SET TIMESTAMP=1548304602/*!*/;
    delete from t where id = 5
    /*!*/;
    # at 745
    #190124 12:36:42 server id 1  end_log_pos 776 CRC32 0x6eb54ec8 	Xid = 11
    COMMIT/*!*/;
    

    我们可以看到前三种情况binlog都是row格式,记录每行记录的变化,而最后一种情况却记录了原生sql,就这个例子你只回档t表,binlog要select tt表这样就有问题了,而新购的腾讯云cdb默认就是最后一种情况,所以有一定概率出现回档失败

    原生MySQL5.7默认为row+repeatable-read,腾讯云cdb默认为mixed+repeatable-read

    当然,看完几块binlog的话也能感觉到mixed占空间少很多,对不对?嘿嘿

    问题来了?
    到这里我们知道,说白了,只要是row格式的binlog就基本上不会有问题,那为什么说用read-committed的事务隔离级别也可以解决问题呢,细心的同学应该也发现了测试过程中,事务隔离级别设置为read-committed,binlog_format为mixed,这个delete语句并没有涉及不确定函数等,但也被强行以row格式记录下来了

    If you are using InnoDB tables and the transaction isolation level is READ COMMITTED or READ UNCOMMITTED, only row-based logging can be used. 
    
    官网摘录,意思就是在read-committed的事务隔离级别下,binlog_format强行row,和你怎么设置无关
    

    至于MySQL为什么这么做,这里不做分析,可以学习91洲际哥博客中事务相关章节

    Ⅲ、小结

    cdb回档失败,建议设置transaction-isolation为read-committed或者设置binlog_format为row

  • 相关阅读:
    Linux中设置tomcat启动内存
    Linux访问Windows磁盘实现共享
    mysql 5.7 ONLY_FULL_GROUP_BY问题
    MAVENCOMPILERPLUGIN编译错误FATAL ERROR: UNABLE TO FIND PACKAGE JAVA.LANG IN CLASSPATH OR BOOTCLASSPATH
    Linux共享Window文件夹权限问题
    Linux之解决每次git pull/git push都需输入密码设置
    20192427李睿智汇编语言学习笔记(14章)
    双亲委派机制
    字体颜色渐变
    视频播放器
  • 原文地址:https://www.cnblogs.com/---wunian/p/10314674.html
Copyright © 2020-2023  润新知