• 5.7.17版本mysqlbinlog实时拉取的二进制日志不完整的原因分析


    问题描述:

    同事使用mysqlbinlog工具的--read-from-remote-server --raw选项,从远程实例实时拉取二进制日志时,发现得到的二进制日志文件大小与远程实例上的源文件大小不相同,并且使用mysqlbinlog解析时会报错。

    测试环境版本信息如下:

    MySQL版本:5.7.17 log MySQL Community Server (GPL)  通用tar包安装

    Mysqlbinlog版本:5.7.17 自带版本,mysqlbinlog Ver 3.4 for linux-glibc2.5 at x86_64

    操作系统版本:CentOS Linux release 7.6.1810 (Core)

    分析过程:

    1、对比拉取到的二进制日志文件、源文件的大小,发现拉取到的二进制日志文件比源文件小,并且该文件大小:7315456/4096,刚好可以整除,源库上进行多次更新后,多次观察,发现该数值都可以被4096整除。

    因为Linux块大小是4096 Bytes,所以,我先做个假设:mysqlbinlog工具以4096字节为单位进行写入

    [root@192_168_129_128 tmp]# ll /usr/local/mysql-5.7.17-linux-glibc2.5-x86_64/data/mybinlog.000006 ; ll mybinlog.000006
    -rw-r----- 1 mysql mysql 7318723 May 12 16:55 /usr/local/mysql-5.7.17-linux-glibc2.5-x86_64/data/mybinlog.000006
    -rw-r----- 1 root root 7315456 May 12 16:55 mybinlog.000006

    2、我在8.0.19版本中测试,问题却不能复现。

    3、写一个简单的shell脚本,将mysqlbinlog拉取进程放到后台执行,获取到上一步的pid之后,使用pstack命令和strace命令,分别查看该进程的函数调用和系统调用

    #!/bin/bash 
    
    nohup mysqlbinlog --read-from-remote-server --host=192.168.129.128 --user=dba --password=123456 --raw --to-last-log binlog.000002 --stop-never &
    
    pid=`ps -ef | grep mysqlbinlog | grep -v 'grep' | awk '{print $2}'`
    
    pstack $pid > pstack.log
    
    strace -f -t -p $pid -o strace.log

    pstack命令的输出结果如下,虽然没有直接的线索,但是后续可以利用这些函数栈名,去源码中找线索:

    #0  0x00007faf58f60a2d in recv () from /lib64/libpthread.so.0
    #1  0x000000000045e9f9 in inline_mysql_socket_recv (flags=0, n=5826, buf=0x1d959d1, mysql_socket=..., src_line=123, src_file=0x550918 "/export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/vio/viosocket.c") at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/include/mysql/psi/mysql_socket.h:823
    #2  vio_read (vio=0x1d90e60, buf=0x1d959d1 "27399199-10542799900-96491182343-85303866742-80196460272-89060617578-51177070778-10421134155;80308169113-21291571753-18876715410-91134905277-85771492482360j03", size=5826) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/vio/viosocket.c:123
    #3  0x0000000000434873 in net_read_raw_loop (count=5826, net=0x1d8e550) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/sql/net_serv.cc:672
    #4  net_read_packet (net=0x1d8e550, complen=0x7fff6d023f88) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/sql/net_serv.cc:859
    #5  0x0000000000434a7c in my_net_read (net=0x1d8e550) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/sql/net_serv.cc:899
    #6  0x000000000043985b in cli_safe_read_with_ok (mysql=0x1d8e550, parse_ok=0 '00', is_data_packet=0x0) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/sql-common/client.c:1040
    #7  0x0000000000426b36 in dump_remote_log_entries (logname=0x7fff6d0277e9 "binlog.000002", print_event_info=0x7fff6d0250f0) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/client/mysqlbinlog.cc:2664
    #8  dump_single_log (logname=0x7fff6d0277e9 "binlog.000002", print_event_info=0x7fff6d0250f0) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/client/mysqlbinlog.cc:2334
    #9  dump_multiple_logs (argc=1, argv=<optimized out>) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/client/mysqlbinlog.cc:2374
    #10 0x0000000000427821 in main (argc=1, argv=0x1d58468) at /export/home/pb2/build/sb_0-21378219-1480347226.17/mysql-5.7.17/client/mysqlbinlog.cc:3436

    strace命令的输出结果如下:

    ....................省略若干行....................
    20528 11:24:25 recvfrom(3, "6813690-49945547265-44609719076-"..., 6605, 0, NULL, NULL) = 6605
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3424256
    20528 11:24:25 write(4, "553290165-79173089006-0778194955"..., 4096) = 4096
    20528 11:24:25 write(4, "2360E	p3w71247594556-464035229"..., 4096) = 4096
    20528 11:24:25 recvfrom(3, " 211K200266^20/20137320d4211"..., 16384, 0, NULL, NULL) = 1523
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 lseek(4, 0, SEEK_CUR)    = 3432448
    20528 11:24:25 recvfrom(3, 

    strace到这里就没有后续的输出了。

    其中,recvfrom系统调用,是经socket接收数据;lseek是一个用于改变读写一个文件时读写指针位置的一个系统调用;write把buf中nbyte写入文件描述符handle所指的文档,成功时返回写的字节数,错误时返回-1.

    分析strace日志中的多条write记录,发现每次write的写入字节数都是4096,看起来与前面的假设吻合

    但是,每达到4096字节才写入磁盘,那么拉取到的binlog几乎不可能与源库一致。而mysqlbinlog的help显示,--read-from-remote-server参数,实质上是调用MySQL Server的BINLOG-DUMP-NON-GTIDS接口,远程实例应该是实时发送binlog events才对。

    -R, --read-from-remote-server 
                          Read binary logs from a MySQL server. This is an alias
                          for read-from-remote-master=BINLOG-DUMP-NON-GTIDS.

    好吧,从pstack日志中得到的函数名,我们去源码中逐个查看,最终与本问题相关的是client/mysqlbinlog.cc中的dump_remote_log_entries()函数,如下:

    /**
      Requests binlog dump from a remote server and prints the events it
      receives.
    
      @param[in,out] print_event_info Parameters and context state
      determining how to print.
      @param[in] logname Name of input binlog.
    
      @retval ERROR_STOP An error occurred - the program should terminate.
      @retval OK_CONTINUE No error, the program should continue.
      @retval OK_STOP No error, but the end of the specified range of
      events to process has been reached and the program should terminate.
    */
    static Exit_status dump_remote_log_entries(PRINT_EVENT_INFO *print_event_info,
                                               const char* logname)
    {
      ....................省略若干行....................
      for (;;)
      {
        const char *error_msg= NULL;
        Log_event *ev= NULL;
        Log_event_type type= binary_log::UNKNOWN_EVENT;
    
        len= cli_safe_read(mysql, NULL);
        if (len == packet_error)
        {
          error("Got error reading packet from server: %s", mysql_error(mysql));
          DBUG_RETURN(ERROR_STOP);
        }
        if (len < 8 && net->read_pos[0] == 254)
          break; // end of data
        DBUG_PRINT("info",( "len: %lu  net->read_pos[5]: %d
    ",
                len, net->read_pos[5]));
        /*
          In raw mode We only need the full event details if it is a 
          ROTATE_EVENT or FORMAT_DESCRIPTION_EVENT
        */
    
     
        ....................省略若干行....................
        if (raw_mode || (type != binary_log::LOAD_EVENT))
        {
    
     
    ....................省略若干行....................
          if (raw_mode)
          {
            DBUG_EXECUTE_IF("simulate_result_file_write_error",
                            DBUG_SET("+d,simulate_fwrite_error"););
            if (my_fwrite(result_file, net->read_pos + 1 , len - 1, MYF(MY_NABP)))
                /*可以看到,cli_safe_read()方法读取到的binlog event,都会调用my_fwrite函数进行写入,my_fwrite是对fwrite()函数的封装
                这里并没有以4096字节为单位写入,而是读多少就写入多少
                这就无法解释了,代码逻辑显示每次拿到数据之后,都会写入磁盘,为什么实际上却不是呢?*/
            {
              error("Could not write into log file '%s'", log_file_name);
              retval= ERROR_STOP;
            }
            if (ev)
              reset_temp_buf_and_delete(ev);
          }
    ....................省略若干行....................
    
          if (retval != OK_CONTINUE)
            DBUG_RETURN(retval);
        }
        else
        {
    ....................省略若干行....................
        old_off+= len-1;
      }
    
      DBUG_RETURN(OK_CONTINUE);
    }

    请看我用中文注释在上述源码中的分析。

    代码逻辑没有问题,那就有可能是BUG了。。。

    结论:

    通过上述分析,怀疑遇到了BUG,于是去官方的Release Notes中查找,最终在https://dev.mysql.com/doc/relnotes/mysql/5.7/en/news-5-7-19.html 中找到如下说明:

    Replication:

    mysqlbinlog, if invoked with the --raw option, does not flush the output file until the process terminates. But if also invoked with the --stop-never option,

    the process never terminates, thus nothing is ever written to the output file. Now the output is flushed after each event. (Bug #24609402)

     但是这个BUG在BUG system中找不到,应该是需要有权限的MOS账号才能查看。

  • 相关阅读:
    30.3 Collections 集合体系的工具类
    30.2 案例:ArrayList本身数据可以重复,写出去重的方法
    30 HashSet
    29.3 ArrayList、List、LinkedList(链表)优缺点
    foreach-- for循环的增强型:for(类型 变量 : 数组对象) { 使用变量 }
    <E> 泛型
    29.2 Iterator 迭代器ConcurrentModificationException:并发修改异常处理
    29.2 Iterator 迭代器
    29.1 collection层次结构 中的根接口
    29 collection 集合体系结构
  • 原文地址:https://www.cnblogs.com/dba-john/p/12877608.html
Copyright © 2020-2023  润新知