• MySQL High Availability with MHA


    How MHA works

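    1. MHA uses the statement SELECT 1 As Value to check the health of the master. Once the master goes down, MHA saves the binary log events (binlog events) from the crashed master
    2. It identifies the slave that holds the most recent updates
    3. It applies the differential relay log (relay log) events to the other slaves
    4. It applies the binary log events (binlog events) saved from the master
    5. It promotes one slave to be the new master
    6. It points the other slaves at the new master for replication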
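
The monitoring half of step 1 can be sketched as a small loop. This is a hypothetical illustration, not MHA's own code: the function name `monitor_master`, the failure threshold, and passing the probe as a command are all assumptions. The exit code 20 only mirrors the "Master dead" code that the manager log in this article reports.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a "SELECT 1"-style health check loop (not MHA's implementation).
# Usage: monitor_master MAX_FAILURES INTERVAL_SECONDS PROBE_COMMAND [ARGS...]
# The probe is any command that exits 0 while the master answers.
monitor_master() {
    local max_failures=$1 interval=$2
    shift 2
    local failures=0
    while [ "$failures" -lt "$max_failures" ]; do
        if "$@" >/dev/null 2>&1; then
            failures=0                      # a successful ping resets the counter
        else
            failures=$((failures + 1))
            echo "Connection failed $failures time(s).."
        fi
        sleep "$interval"
    done
    echo "Master is down!"
    return 20                               # same code the manager log shows for a dead master
}

# Example with the host and account used in this article (runs until the master fails):
# monitor_master 4 1 mysql -h 172.31.0.28 -umhauser -pcentos -e 'SELECT 1 As Value'
```

The loop never returns while the probe keeps succeeding, which matches the role of a long-running monitor; failover logic would start only after it returns.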
    

    All of i(1)--->i(2)--->i(x) are combined into a single binary log

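    Note: to minimize data loss when the master goes down because of a hardware failure, it is recommended to configure MySQL semi-synchronous replication alongside MHA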
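
A minimal sketch of enabling semi-synchronous replication follows, assuming the classic plugin names (`rpl_semi_sync_master` / `rpl_semi_sync_slave`) that ship with MySQL 8.0; the helper name `semisync_sql` is hypothetical. It only prints the SQL, so it can be reviewed before being piped into `mysql`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the SQL that enables semi-synchronous replication
# for the given role. Assumes the classic MySQL plugin names.
semisync_sql() {
    case "$1" in
        master)
            cat <<'SQL'
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SQL
            ;;
        slave)
            cat <<'SQL'
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;
SQL
            ;;
        *)
            echo "usage: semisync_sql master|slave" >&2
            return 2
            ;;
    esac
}

# Example (run on the matching node, as a privileged MySQL user):
# semisync_sql master | mysql -uroot
# semisync_sql slave  | mysql -uroot
```

Values set with SET GLOBAL do not survive a restart, so the same settings would also need to go into my.cnf. Because any slave can be promoted under MHA, it is common to install both plugins on every node.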

    Example: a hands-on MHA deployment

    Note: running the MHA manager on CentOS 8 produces errors and is not recommended

    Environment: four hosts
    172.31.0.17 CentOS7 MHA manager
    172.31.0.28 CentOS8 MySQL8.0 Master
    172.31.0.38 CentOS8 MySQL8.0 Slave1
    172.31.0.48 CentOS8 MySQL8.0 Slave2
    
    Install the two packages mha4mysql-manager and mha4mysql-node on the manager node

    Notes:

    mha4mysql-manager-0.56-0.el6.noarch.rpm does not support CentOS 8; it only supports CentOS 7 and earlier
    mha4mysql-manager-0.58-0.el7.centos.noarch.rpm supports MySQL 5.7 and MySQL 8.0, but is incompatible with MariaDB 10.3.17 as shipped on CentOS 8
    [root@centos8 ~]# ls
    anaconda-ks.cfg  mha4mysql-manager-0.58-0.el7.centos.noarch.rpm  mha4mysql-node-0.58-0.el7.centos.noarch.rpm  original-ks.cfg
    
    [root@centos8 ~]# yum install -y mha4mysql-manager-0.58-0.el7.centos.noarch.rpm mha4mysql-node-0.58-0.el7.centos.noarch.rpm
    

    Install the mha4mysql-node package on all MySQL servers.
    This package supports CentOS 8, 7, and 6

    [root@sz-kx-centos8 ~]# yum install -y mha4mysql-node-0.58-0.el7.centos.noarch.rpm
    

    Set up SSH key authentication between all nodes

    MHA manager
    [root@centos8 ~]# yum install rsync -y
    [05:47:52 root@centos8 ~]# ssh-keygen 
    Generating public/private rsa key pair.
    Enter file in which to save the key (/root/.ssh/id_rsa): 
    Created directory '/root/.ssh'.
    Enter passphrase (empty for no passphrase): 
    Enter same passphrase again: 
    Your identification has been saved in /root/.ssh/id_rsa.
    Your public key has been saved in /root/.ssh/id_rsa.pub.
    The key fingerprint is:
    SHA256:GA6eD2oTXm2a30Nq3oo0VjiEUfPy9YS/h9vIjfnkdFo root@centos8.longxuan.vip
    The key's randomart image is:
    +---[RSA 3072]----+
    |  ..o            |
    |   o o   .       |
    |  . + o o .      |
    |   o O + +       |
    |  . B B S o      |
    | . + O  .  o     |
    |  = * .o  o + E  |
    | . + +oo.. % +   |
    |    .o+.o.*.*    |
    +----[SHA256]-----+
    [05:48:02 root@centos8 ~]# ssh-copy-id 127.0.0.1
    /usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
    The authenticity of host '127.0.0.1 (127.0.0.1)' can't be established.
    ECDSA key fingerprint is SHA256:UxQsAjgLsmA4tpc7HO0xU9txsXgxqhyba9KbywIvZTA.
    Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
    /usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
    /usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
    root@127.0.0.1's password: 
    
    Number of key(s) added: 1
    
    Now try logging into the machine, with:   "ssh '127.0.0.1'"
    and check to make sure that only the key(s) you wanted were added.
    
    [05:51:01 root@centos8 ~]# rsync -av .ssh 172.31.0.18:/root/
    [05:51:01 root@centos8 ~]# rsync -av .ssh 172.31.0.38:/root/
    [05:51:01 root@centos8 ~]# rsync -av .ssh 172.31.0.48:/root/
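
The three rsync commands above can be wrapped in a loop. The sketch below is an illustration only: the function name `distribute_ssh_dir` and the `DRY_RUN` switch are assumptions, and it defaults to printing the commands instead of running them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy root's .ssh directory to every listed host so that
# all nodes share one key pair. DRY_RUN=1 (the default) only prints the commands.
distribute_ssh_dir() {
    local host
    for host in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "rsync -av /root/.ssh ${host}:/root/"
        else
            rsync -av /root/.ssh "${host}:/root/"
        fi
    done
}

# Example: really copy to the three MySQL nodes of this article
# DRY_RUN=0 distribute_ssh_dir 172.31.0.28 172.31.0.38 172.31.0.48
```

Sharing one private key across all hosts is simple for a lab setup; generating a key per host and exchanging public keys is the stricter alternative.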
    

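    Create the configuration file on the manager node

    Note: do not leave trailing spaces or other stray characters at the end of any line in this file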

    [root@centos8 ~]# mkdir /etc/mastermha/
    
    [root@centos8 ~]# vim /etc/mastermha/app1.cnf
    [server default]
    user=mhauser
    password=centos
    manager_workdir=/data/mastermha/app1/
    manager_log=/data/mastermha/app1/manager.log
    remote_workdir=/data/mastermha/app1/
    ssh_user=root
    repl_user=repluser
    repl_password=123456
    ping_interval=1
    master_ip_failover_script=/usr/local/bin/master_ip_failover
    report_script=/usr/local/bin/sendmail.sh
    check_repl_delay=0
    master_binlog_dir=/data/mysql/
    [server1]
    hostname=172.31.0.28
    candidate_master=1
    [server2]
    hostname=172.31.0.38
    candidate_master=1
    [server3]
    hostname=172.31.0.48
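
Since trailing whitespace in this file causes trouble, it can be checked mechanically before starting the manager. The helper name `check_trailing_ws` below is hypothetical; it relies only on standard `grep`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report lines that end in whitespace.
# Prints "line_number:content" for each offending line and returns 1 if any exist.
check_trailing_ws() {
    if grep -n '[[:space:]]$' "$1"; then
        echo "trailing whitespace found in $1" >&2
        return 1
    fi
    return 0
}

# Example:
# check_trailing_ws /etc/mastermha/app1.cnf && echo "config is clean"
```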
    

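    Note: which slave takes over as the new master when the master goes down

    1. If the logs on all slaves are identical, a new master is chosen by default in the order the servers appear in the configuration file
    2. If the logs on the slaves differ, the slave closest to the master is chosen automatically as the new master
    3. If a node has been given priority (candidate_master=1), that node is preferred. However, if it lags behind the master by more than 100 MB of logs, it is not chosen either. Setting check_repl_delay=0 disables this lag check and forces the candidate node to be chosen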
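
The three rules can be sketched as a small selection function. This is a simplified model under stated assumptions, not MHA's actual algorithm: the function name `pick_new_master` and the input format (one slave per line as `host candidate_master position`, in configuration-file order, with the position as a byte offset in the same binlog file) are invented for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical, simplified model of the election rules described above.
# $1: check_repl_delay (0 disables the 100 MB lag check)
# stdin: "host candidate_master position" per slave, in config-file order
pick_new_master() {
    awk -v check_delay="$1" -v limit=$((100 * 1024 * 1024)) '
        BEGIN { max = 0 }
        {
            host[NR] = $1; cand[NR] = $2 + 0; pos[NR] = $3 + 0
            if (pos[NR] > max) max = pos[NR]
        }
        END {
            # Rule 3: prefer a candidate unless it lags too far behind
            for (i = 1; i <= NR; i++)
                if (cand[i] == 1 && (check_delay == 0 || max - pos[i] <= limit)) {
                    print host[i]; exit
                }
            # Rules 1 and 2: otherwise the most up-to-date slave, first in config order on ties
            for (i = 1; i <= NR; i++)
                if (pos[i] == max) { print host[i]; exit }
        }'
}

# Example: both slaves at the same position, 172.31.0.48 marked as candidate
# printf '%s\n' '172.31.0.38 0 1391' '172.31.0.48 1 1391' | pick_new_master 1
```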
    

    Related files and scripts

    # Install mail software for alerting
    [root@centos8 ~]# yum install postfix mailx -y
    # Start the service
    [root@centos8 ~]# systemctl start postfix.service
    # Configure mail
    [root@centos8 ~]# vim /etc/mail.rc
    set from=llxuan@163.com
    set smtp=smtp.163.com
    set smtp-auth-user=llxuan@163.com
    # SMTP authorization code
    set smtp-auth-password=xxxxxxxxxxx
    
    # Alert script
    [root@centos8 ~]# cat /usr/local/bin/sendmail.sh
    #!/bin/bash
    echo "MySQL is down" | mail -s "MHA Warning" llxuan@163.com
    

    Make the script executable

    [root@centos8 ~]# chmod +x /usr/local/bin/sendmail.sh
    

    Failover script

    [06:02:52 root@centos8 ~]# vim /usr/local/bin/master_ip_failover
     
    #!/usr/bin/env perl
    use strict;
    use warnings FATAL => 'all';
    use Getopt::Long;

    my (
        $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
        $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
    );

    # VIP settings; adjust these to your own network
    my $vip       = '172.31.0.100/16';
    my $gateway   = '172.31.0.254';
    my $interface = 'eth0';
    my $key       = "1";
    # Note: arping -s expects a bare IP address, so the /16 suffix in $vip may make
    # the arping call fail; its output is discarded by the redirect.
    my $ssh_start_vip = "/sbin/ifconfig $interface:$key $vip;/sbin/arping -I $interface -c 3 -s $vip $gateway >/dev/null 2>&1";
    my $ssh_stop_vip  = "/sbin/ifconfig $interface:$key down";

    # GetOptions needs references to the variables it fills in
    GetOptions(
        'command=s'          => \$command,
        'ssh_user=s'         => \$ssh_user,
        'orig_master_host=s' => \$orig_master_host,
        'orig_master_ip=s'   => \$orig_master_ip,
        'orig_master_port=i' => \$orig_master_port,
        'new_master_host=s'  => \$new_master_host,
        'new_master_ip=s'    => \$new_master_ip,
        'new_master_port=i'  => \$new_master_port,
    );

    exit &main();

    sub main {
        print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
        if ( $command eq "stop" || $command eq "stopssh" ) {
            # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
            # If you manage master ip address at global catalog database,
            # invalidate orig_master_ip here.
            my $exit_code = 1;
            eval {
                print "Disabling the VIP on old master: $orig_master_host \n";
                &stop_vip();
                $exit_code = 0;
            };
            if ($@) {
                warn "Got Error: $@\n";
                exit $exit_code;
            }
            exit $exit_code;
        }
        elsif ( $command eq "start" ) {
            # all arguments are passed.
            # If you manage master ip address at global catalog database,
            # activate new_master_ip here.
            # You can also grant write access (create user, set read_only=0, etc) here.
            my $exit_code = 10;
            eval {
                print "Enabling the VIP - $vip on the new master - $new_master_host \n";
                &start_vip();
                $exit_code = 0;
            };
            if ($@) {
                warn $@;
                exit $exit_code;
            }
            exit $exit_code;
        }
        elsif ( $command eq "status" ) {
            print "Checking the Status of the script.. OK \n";
            `ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
            exit 0;
        }
        else {
            &usage();
            exit 1;
        }
    }

    # A simple system call that enable the VIP on the new master
    sub start_vip() {
        `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
    }

    # A simple system call that disable the VIP on the old_master
    sub stop_vip() {
        `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
    }

    sub usage {
        print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
    }
    

    Make the failover script executable

    [root@centos8 ~]# chmod +x /usr/local/bin/master_ip_failover
    

    Set up the master

    [root@sz-kx-centos8 ~]# yum install mysql-server -y
    
    [root@sz-kx-centos8 ~]# mkdir -p /data/mysql/
    [root@sz-kx-centos8 ~]# chown mysql:mysql /data/mysql/
    
    [root@sz-kx-centos8 ~]# vim /etc/my.cnf
    [mysqld]
    server-id=28
    log-bin=/data/mysql/mysql-bin
    skip-name-resolve=1
    general-log
    
    [root@sz-kx-centos8 ~]# systemctl restart mysqld
    # Check the binary log file and position
    mysql> show master logs;
    
    # Create the replication user and grant privileges
    mysql> create user repluser@'172.31.0.%' identified by '123456';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> grant replication slave on *.* to repluser@'172.31.0.%';
    Query OK, 0 rows affected (0.01 sec)
    
    # Create the MHA user and grant privileges
    mysql> create user mhauser@'172.31.0.%' identified by 'centos';
    Query OK, 0 rows affected (0.00 sec)
    
    mysql> grant all on *.* to mhauser@'172.31.0.%';
    Query OK, 0 rows affected (0.01 sec)
    
    # Add the VIP as a labeled alias on the interface
    [root@sz-kx-centos8 ~]# ifconfig eth0:1 172.31.0.100/16
    [root@sz-kx-centos8 ~]# ifconfig 
    eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 172.31.0.18  netmask 255.255.0.0  broadcast 172.31.255.255
            inet6 fe80::20c:29ff:fe43:49b  prefixlen 64  scopeid 0x20<link>
            ether 00:0c:29:43:04:9b  txqueuelen 1000  (Ethernet)
            RX packets 42588  bytes 55076155 (52.5 MiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 20092  bytes 1443183 (1.3 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
    
    eth0:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 172.31.0.100  netmask 255.255.0.0  broadcast 172.31.255.255
            ether 00:0c:29:43:04:9b  txqueuelen 1000  (Ethernet)
    

    Set up the two slaves

    [root@centos8 ~]# yum install mysql-server -y
    [root@centos8 ~]# mkdir /data/mysql -p
    [root@centos8 ~]# chown mysql:mysql /data/mysql/
    
    [root@centos8 ~]# vim /etc/my.cnf
    [mysqld]
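    # server-id must be unique on every node; this value is for 172.31.0.48, use a different one (for example 38) on 172.31.0.38
    server-id=48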
    log-bin=/data/mysql/mysql-bin
    read-only
    relay_log_purge=0
    skip_name_resolve=1
    general_log
    
    [root@centos8 ~]# systemctl start mysqld
    
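    # Point the slave at the master's binary log. Note: if you reconfigure replication later, use the master's current log file and position, not an earlier one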
    CHANGE MASTER TO
      MASTER_HOST='172.31.0.28',
      MASTER_USER='repluser',
      MASTER_PASSWORD='123456',
      MASTER_PORT=3306,
      MASTER_LOG_FILE='mysql-bin.000002',
      MASTER_LOG_POS=156;
    
    mysql> start slave;
    Query OK, 0 rows affected (0.05 sec)
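
To avoid copying a stale file name or position by hand, the statement can be generated from the master's current status. The sketch below is an assumption-laden illustration: the function name `change_master_sql` is hypothetical, the replication account is the one created earlier in this article, and the input is expected to be the tab-separated output of `SHOW MASTER STATUS` without column headers.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a CHANGE MASTER TO statement from the first two
# fields (file, position) of "SHOW MASTER STATUS" read on stdin.
# $1: master host
change_master_sql() {
    local master_host=$1 file pos _rest
    read -r file pos _rest
    cat <<SQL
CHANGE MASTER TO
  MASTER_HOST='${master_host}',
  MASTER_USER='repluser',
  MASTER_PASSWORD='123456',
  MASTER_PORT=3306,
  MASTER_LOG_FILE='${file}',
  MASTER_LOG_POS=${pos};
SQL
}

# Example, using the mhauser account from this article (-N suppresses column names):
# mysql -h 172.31.0.28 -umhauser -pcentos -N -e 'SHOW MASTER STATUS' \
#   | change_master_sql 172.31.0.28
```

The generated statement is only valid for a slave whose data already matches the master at that position, for example a freshly installed, empty slave as in this walkthrough.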
    

    Check the MHA environment

    # SSH trust check
    [root@centos8 ~]# masterha_check_ssh --conf=/etc/mastermha/app1.cnf 
    Sat May 22 06:32:13 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Sat May 22 06:32:13 2021 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
    Sat May 22 06:32:13 2021 - [info] Reading server configuration from /etc/mastermha/app1.cnf..
    Sat May 22 06:32:13 2021 - [info] Starting SSH connection tests..
    Sat May 22 06:32:14 2021 - [debug] 
    Sat May 22 06:32:13 2021 - [debug]  Connecting via SSH from root@172.31.0.18(172.31.0.18:22) to root@172.31.0.38(172.31.0.38:22)..
    Sat May 22 06:32:13 2021 - [debug]   ok.
    Sat May 22 06:32:13 2021 - [debug]  Connecting via SSH from root@172.31.0.18(172.31.0.18:22) to root@172.31.0.48(172.31.0.48:22)..
    Warning: Permanently added '172.31.0.48' (ECDSA) to the list of known hosts.
    Sat May 22 06:32:14 2021 - [debug]   ok.
    Sat May 22 06:32:14 2021 - [debug] 
    Sat May 22 06:32:13 2021 - [debug]  Connecting via SSH from root@172.31.0.38(172.31.0.38:22) to root@172.31.0.18(172.31.0.18:22)..
    Sat May 22 06:32:14 2021 - [debug]   ok.
    Sat May 22 06:32:14 2021 - [debug]  Connecting via SSH from root@172.31.0.38(172.31.0.38:22) to root@172.31.0.48(172.31.0.48:22)..
    Sat May 22 06:32:14 2021 - [debug]   ok.
    Sat May 22 06:32:15 2021 - [debug] 
    Sat May 22 06:32:14 2021 - [debug]  Connecting via SSH from root@172.31.0.48(172.31.0.48:22) to root@172.31.0.18(172.31.0.18:22)..
    Sat May 22 06:32:14 2021 - [debug]   ok.
    Sat May 22 06:32:14 2021 - [debug]  Connecting via SSH from root@172.31.0.48(172.31.0.48:22) to root@172.31.0.38(172.31.0.38:22)..
    Sat May 22 06:32:15 2021 - [debug]   ok.
    Sat May 22 06:32:15 2021 - [info] All SSH connection tests passed successfully.
    Use of uninitialized value in exit at /usr/bin/masterha_check_ssh line 44.
    
    # SSH trust check, repeated against the current hosts
    [root@centos8 ~]# masterha_check_ssh --conf=/etc/mastermha/app1.cnf 
    Sat May 22 18:00:02 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Sat May 22 18:00:02 2021 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
    Sat May 22 18:00:02 2021 - [info] Reading server configuration from /etc/mastermha/app1.cnf..
    Sat May 22 18:00:02 2021 - [info] Starting SSH connection tests..
    Sat May 22 18:00:03 2021 - [debug] 
    Sat May 22 18:00:02 2021 - [debug]  Connecting via SSH from root@172.31.0.28(172.31.0.28:22) to root@172.31.0.48(172.31.0.48:22)..
    Sat May 22 18:00:02 2021 - [debug]   ok.
    Sat May 22 18:00:02 2021 - [debug]  Connecting via SSH from root@172.31.0.28(172.31.0.28:22) to root@172.31.0.38(172.31.0.38:22)..
    Sat May 22 18:00:02 2021 - [debug]   ok.
    Sat May 22 18:00:03 2021 - [debug] 
    Sat May 22 18:00:02 2021 - [debug]  Connecting via SSH from root@172.31.0.48(172.31.0.48:22) to root@172.31.0.28(172.31.0.28:22)..
    Sat May 22 18:00:02 2021 - [debug]   ok.
    Sat May 22 18:00:02 2021 - [debug]  Connecting via SSH from root@172.31.0.48(172.31.0.48:22) to root@172.31.0.38(172.31.0.38:22)..
    Sat May 22 18:00:03 2021 - [debug]   ok.
    Sat May 22 18:00:04 2021 - [debug] 
    Sat May 22 18:00:03 2021 - [debug]  Connecting via SSH from root@172.31.0.38(172.31.0.38:22) to root@172.31.0.28(172.31.0.28:22)..
    Sat May 22 18:00:03 2021 - [debug]   ok.
    Sat May 22 18:00:03 2021 - [debug]  Connecting via SSH from root@172.31.0.38(172.31.0.38:22) to root@172.31.0.48(172.31.0.48:22)..
    Sat May 22 18:00:03 2021 - [debug]   ok.
    Sat May 22 18:00:04 2021 - [info] All SSH connection tests passed successfully.
    
    # Replication check
    [root@localhost ~]# masterha_check_repl --conf=/etc/mastermha/app1.cnf 
    Sat May 22 18:00:08 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Sat May 22 18:00:08 2021 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
    Sat May 22 18:00:08 2021 - [info] Reading server configuration from /etc/mastermha/app1.cnf..
    Sat May 22 18:00:08 2021 - [info] MHA::MasterMonitor version 0.58.
    Sat May 22 18:00:09 2021 - [info] GTID failover mode = 0
    Sat May 22 18:00:09 2021 - [info] Dead Servers:
    Sat May 22 18:00:09 2021 - [info] Alive Servers:
    Sat May 22 18:00:09 2021 - [info]   172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:00:09 2021 - [info]   172.31.0.48(172.31.0.48:3306)
    Sat May 22 18:00:09 2021 - [info]   172.31.0.38(172.31.0.38:3306)
    Sat May 22 18:00:09 2021 - [info] Alive Slaves:
    Sat May 22 18:00:09 2021 - [info]   172.31.0.48(172.31.0.48:3306)  Version=8.0.21 (oldest major version between slaves) log-bin:enabled
    Sat May 22 18:00:09 2021 - [info]     Replicating from 172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:00:09 2021 - [info]     Primary candidate for the new Master (candidate_master is set)
    Sat May 22 18:00:09 2021 - [info]   172.31.0.38(172.31.0.38:3306)  Version=8.0.21 (oldest major version between slaves) log-bin:enabled
    Sat May 22 18:00:09 2021 - [info]     Replicating from 172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:00:09 2021 - [info] Current Alive Master: 172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:00:09 2021 - [info] Checking slave configurations..
    Sat May 22 18:00:09 2021 - [info] Checking replication filtering settings..
    Sat May 22 18:00:09 2021 - [info]  binlog_do_db= , binlog_ignore_db= 
    Sat May 22 18:00:09 2021 - [info]  Replication filtering check ok.
    Sat May 22 18:00:09 2021 - [info] GTID (with auto-pos) is not supported
    Sat May 22 18:00:09 2021 - [info] Starting SSH connection tests..
    Sat May 22 18:00:12 2021 - [info] All SSH connection tests passed successfully.
    Sat May 22 18:00:12 2021 - [info] Checking MHA Node version..
    Sat May 22 18:00:12 2021 - [info]  Version check ok.
    Sat May 22 18:00:12 2021 - [info] Checking SSH publickey authentication settings on the current master..
    Sat May 22 18:00:13 2021 - [info] HealthCheck: SSH to 172.31.0.28 is reachable.
    Sat May 22 18:00:13 2021 - [info] Master MHA Node version is 0.58.
    Sat May 22 18:00:13 2021 - [info] Checking recovery script configurations on 172.31.0.28(172.31.0.28:3306)..
    Sat May 22 18:00:13 2021 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql/ --output_file=/data/mastermha/app1//save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000002 
    Sat May 22 18:00:13 2021 - [info]   Connecting to root@172.31.0.28(172.31.0.28:22).. 
      Creating /data/mastermha/app1 if not exists.. Creating directory /data/mastermha/app1.. done.
       ok.
      Checking output directory is accessible or not..
       ok.
      Binlog found at /data/mysql/, up to mysql-bin.000002
    Sat May 22 18:00:13 2021 - [info] Binlog setting check done.
    Sat May 22 18:00:13 2021 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
    Sat May 22 18:00:13 2021 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mhauser' --slave_host=172.31.0.48 --slave_ip=172.31.0.48 --slave_port=3306 --workdir=/data/mastermha/app1/ --target_version=8.0.21 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=centos8-relay-bin.000002  --slave_pass=xxx
    Sat May 22 18:00:13 2021 - [info]   Connecting to root@172.31.0.48(172.31.0.48:22).. 
    Creating directory /data/mastermha/app1/.. done.
      Checking slave recovery environment settings..
        Relay log found at /var/lib/mysql, up to centos8-relay-bin.000002
        Temporary relay log file is /var/lib/mysql/centos8-relay-bin.000002
        Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
        Testing mysql connection and privileges..
    mysql: [Warning] Using a password on the command line interface can be insecure.
     done.
        Testing mysqlbinlog output.. done.
        Cleaning up test file(s).. done.
    Sat May 22 18:00:13 2021 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mhauser' --slave_host=172.31.0.38 --slave_ip=172.31.0.38 --slave_port=3306 --workdir=/data/mastermha/app1/ --target_version=8.0.21 --manager_version=0.58 --relay_dir=/var/lib/mysql --current_relay_log=centos8-relay-bin.000002  --slave_pass=xxx
    Sat May 22 18:00:13 2021 - [info]   Connecting to root@172.31.0.38(172.31.0.38:22).. 
    Creating directory /data/mastermha/app1/.. done.
      Checking slave recovery environment settings..
        Relay log found at /var/lib/mysql, up to centos8-relay-bin.000002
        Temporary relay log file is /var/lib/mysql/centos8-relay-bin.000002
        Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
        Testing mysql connection and privileges..
    mysql: [Warning] Using a password on the command line interface can be insecure.
     done.
        Testing mysqlbinlog output.. done.
        Cleaning up test file(s).. done.
    Sat May 22 18:00:14 2021 - [info] Slaves settings check done.
    Sat May 22 18:00:14 2021 - [info] 
    172.31.0.28(172.31.0.28:3306) (current master)
     +--172.31.0.48(172.31.0.48:3306)
     +--172.31.0.38(172.31.0.38:3306)
    
    Sat May 22 18:00:14 2021 - [info] Checking replication health on 172.31.0.48..
    Sat May 22 18:00:14 2021 - [info]  ok.
    Sat May 22 18:00:14 2021 - [info] Checking replication health on 172.31.0.38..
    Sat May 22 18:00:14 2021 - [info]  ok.
    Sat May 22 18:00:14 2021 - [info] Checking master_ip_failover_script status:
    Sat May 22 18:00:14 2021 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=172.31.0.28 --orig_master_ip=172.31.0.28 --orig_master_port=3306 
    
    
    IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 172.31.0.100/16;/sbin/arping -I eth0 -c 3 -s 172.31.0.100/16 172.31.0.254 >/dev/null 2>&1===
    
    Checking the Status of the script.. OK 
    Sat May 22 18:00:14 2021 - [info]  OK.
    Sat May 22 18:00:14 2021 - [warning] shutdown_script is not defined.
    Sat May 22 18:00:14 2021 - [info] Got exit code 0 (Not master dead).
    
    MySQL Replication Health is OK.
    
    # Check the manager status
    [root@centos8 ~]# masterha_check_status --conf=/etc/mastermha/app1.cnf 
    app1 is stopped(2:NOT_RUNNING).
    
    # Start the manager (it runs in the foreground and occupies this terminal)
    [root@centos8 ~]# masterha_manager --conf=/etc/mastermha/app1.cnf &> /dev/null
    
    # On the master, the general log shows the health-check queries
    [root@sz-kx-centos8 ~]# tail -f /var/lib/mysql/centos8.log
    2021-05-22T18:05:00.408005Z	   24 Query	SELECT 1 As Value
    2021-05-22T18:05:01.408492Z	   24 Query	SELECT 1 As Value
    2021-05-22T18:05:02.409002Z	   24 Query	SELECT 1 As Value
    2021-05-22T18:05:03.409469Z	   24 Query	SELECT 1 As Value
    2021-05-22T18:05:04.410620Z	   24 Query	SELECT 1 As Value
    2021-05-22T18:05:05.411095Z	   24 Query	SELECT 1 As Value
    
    # Check the manager status again
    [root@localhost ~]# masterha_check_status --conf=/etc/mastermha/app1.cnf
    app1 (pid:27237) is running(0:PING_OK), master:172.31.0.28
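
For scripted monitoring, the status line can be parsed instead of read by eye. The helper below is a hypothetical sketch (the name `mha_status` is an assumption); it relies only on the two output formats shown in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read one line of masterha_check_status output on stdin.
# Prints the current master and returns 0 when the manager reports PING_OK,
# otherwise prints the line to stderr and returns 1.
mha_status() {
    local line
    read -r line || true
    case "$line" in
        *"is running(0:PING_OK)"*)
            echo "${line##*master:}"      # keep only the text after "master:"
            return 0
            ;;
        *)
            echo "not running: $line" >&2
            return 1
            ;;
    esac
}

# Example:
# masterha_check_status --conf=/etc/mastermha/app1.cnf | mha_status
```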
    
    

    Simulate a failure

    # After the master goes down, the MHA manager process exits automatically
    
    # Follow the manager log
    [root@localhost ~]# tail /data/mastermha/app1/manager.log -f
    Sat May 22 18:08:32 2021 - [warning] Got error on MySQL select ping: 1053 (Server shutdown in progress)
    Sat May 22 18:08:32 2021 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql/ --output_file=/data/mastermha/app1//save_binary_logs_test --manager_version=0.58 --binlog_prefix=mysql-bin
    Sat May 22 18:08:32 2021 - [info] HealthCheck: SSH to 172.31.0.28 is reachable.
    Sat May 22 18:08:33 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.31.0.28' (111))
    Sat May 22 18:08:33 2021 - [warning] Connection failed 2 time(s)..
    Sat May 22 18:08:34 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.31.0.28' (111))
    Sat May 22 18:08:34 2021 - [warning] Connection failed 3 time(s)..
    Sat May 22 18:08:35 2021 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.31.0.28' (111))
    Sat May 22 18:08:35 2021 - [warning] Connection failed 4 time(s)..
    Sat May 22 18:08:35 2021 - [warning] Master is not reachable from health checker!
    Sat May 22 18:08:35 2021 - [warning] Master 172.31.0.28(172.31.0.28:3306) is not reachable!
    Sat May 22 18:08:35 2021 - [warning] SSH is reachable.
    Sat May 22 18:08:35 2021 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mastermha/app1.cnf again, and trying to connect to all servers to check server status..
    Sat May 22 18:08:35 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Sat May 22 18:08:35 2021 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
    Sat May 22 18:08:35 2021 - [info] Reading server configuration from /etc/mastermha/app1.cnf..
    Sat May 22 18:08:36 2021 - [info] GTID failover mode = 0
    Sat May 22 18:08:36 2021 - [info] Dead Servers:
    Sat May 22 18:08:36 2021 - [info]   172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:08:36 2021 - [info] Alive Servers:
    Sat May 22 18:08:36 2021 - [info]   172.31.0.48(172.31.0.48:3306)
    Sat May 22 18:08:36 2021 - [info]   172.31.0.38(172.31.0.38:3306)
    Sat May 22 18:08:36 2021 - [info] Alive Slaves:
    Sat May 22 18:08:36 2021 - [info]   172.31.0.48(172.31.0.48:3306)  Version=8.0.21 (oldest major version between slaves) log-bin:enabled
    Sat May 22 18:08:36 2021 - [info]     Replicating from 172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:08:36 2021 - [info]     Primary candidate for the new Master (candidate_master is set)
    Sat May 22 18:08:36 2021 - [info]   172.31.0.38(172.31.0.38:3306)  Version=8.0.21 (oldest major version between slaves) log-bin:enabled
    Sat May 22 18:08:36 2021 - [info]     Replicating from 172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:08:36 2021 - [info] Checking slave configurations..
    Sat May 22 18:08:36 2021 - [info] Checking replication filtering settings..
    Sat May 22 18:08:36 2021 - [info]  Replication filtering check ok.
    Sat May 22 18:08:36 2021 - [info] Master is down!
    Sat May 22 18:08:36 2021 - [info] Terminating monitoring script.
    Sat May 22 18:08:36 2021 - [info] Got exit code 20 (Master dead).
    Sat May 22 18:08:36 2021 - [info] MHA::MasterFailover version 0.58.
    Sat May 22 18:08:36 2021 - [info] Starting master failover.
    Sat May 22 18:08:36 2021 - [info] 
    Sat May 22 18:08:36 2021 - [info] * Phase 1: Configuration Check Phase..
    Sat May 22 18:08:36 2021 - [info] 
    Sat May 22 18:08:37 2021 - [info] GTID failover mode = 0
    Sat May 22 18:08:37 2021 - [info] Dead Servers:
    Sat May 22 18:08:37 2021 - [info]   172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:08:37 2021 - [info] Checking master reachability via MySQL(double check)...
    Sat May 22 18:08:37 2021 - [info]  ok.
    Sat May 22 18:08:37 2021 - [info] Alive Servers:
    Sat May 22 18:08:37 2021 - [info]   172.31.0.48(172.31.0.48:3306)
    Sat May 22 18:08:37 2021 - [info]   172.31.0.38(172.31.0.38:3306)
    Sat May 22 18:08:37 2021 - [info] Alive Slaves:
    Sat May 22 18:08:37 2021 - [info]   172.31.0.48(172.31.0.48:3306)  Version=8.0.21 (oldest major version between slaves) log-bin:enabled
    Sat May 22 18:08:37 2021 - [info]     Replicating from 172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:08:37 2021 - [info]     Primary candidate for the new Master (candidate_master is set)
    Sat May 22 18:08:37 2021 - [info]   172.31.0.38(172.31.0.38:3306)  Version=8.0.21 (oldest major version between slaves) log-bin:enabled
    Sat May 22 18:08:37 2021 - [info]     Replicating from 172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:08:37 2021 - [info] Starting Non-GTID based failover.
    Sat May 22 18:08:37 2021 - [info] 
    Sat May 22 18:08:37 2021 - [info] ** Phase 1: Configuration Check Phase completed.
    Sat May 22 18:08:37 2021 - [info] 
    Sat May 22 18:08:37 2021 - [info] * Phase 2: Dead Master Shutdown Phase..
    Sat May 22 18:08:37 2021 - [info] 
    Sat May 22 18:08:37 2021 - [info] Forcing shutdown so that applications never connect to the current master..
    Sat May 22 18:08:37 2021 - [info] Executing master IP deactivation script:
    Sat May 22 18:08:37 2021 - [info]   /usr/local/bin/master_ip_failover --orig_master_host=172.31.0.28 --orig_master_ip=172.31.0.28 --orig_master_port=3306 --command=stopssh --ssh_user=root  
    
    IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 172.31.0.100/16;/sbin/arping -I eth0 -c 3 -s 172.31.0.100/16 172.31.0.254 >/dev/null 2>&1===
    
    Disabling the VIP on old master: 172.31.0.28 
    Sat May 22 18:08:37 2021 - [info]  done.
    Sat May 22 18:08:37 2021 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
    Sat May 22 18:08:37 2021 - [info] * Phase 2: Dead Master Shutdown Phase completed.
    Sat May 22 18:08:37 2021 - [info] 
    Sat May 22 18:08:37 2021 - [info] * Phase 3: Master Recovery Phase..
    Sat May 22 18:08:37 2021 - [info] 
    Sat May 22 18:08:37 2021 - [info] * Phase 3.1: Getting Latest Slaves Phase..
    Sat May 22 18:08:37 2021 - [info] 
    Sat May 22 18:08:37 2021 - [info] The latest binary log file/position on all slaves is mysql-bin.000002:1391
    Sat May 22 18:08:37 2021 - [info] Latest slaves (Slaves that received relay log files to the latest):
    Sat May 22 18:08:37 2021 - [info]   172.31.0.48(172.31.0.48:3306)  Version=8.0.21 (oldest major version between slaves) log-bin:enabled
    Sat May 22 18:08:37 2021 - [info]     Replicating from 172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:08:37 2021 - [info]     Primary candidate for the new Master (candidate_master is set)
    Sat May 22 18:08:37 2021 - [info]   172.31.0.38(172.31.0.38:3306)  Version=8.0.21 (oldest major version between slaves) log-bin:enabled
    Sat May 22 18:08:37 2021 - [info]     Replicating from 172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:08:37 2021 - [info] The oldest binary log file/position on all slaves is mysql-bin.000002:1391
    Sat May 22 18:08:37 2021 - [info] Oldest slaves:
    Sat May 22 18:08:37 2021 - [info]   172.31.0.48(172.31.0.48:3306)  Version=8.0.21 (oldest major version between slaves) log-bin:enabled
    Sat May 22 18:08:37 2021 - [info]     Replicating from 172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:08:37 2021 - [info]     Primary candidate for the new Master (candidate_master is set)
    Sat May 22 18:08:37 2021 - [info]   172.31.0.38(172.31.0.38:3306)  Version=8.0.21 (oldest major version between slaves) log-bin:enabled
    Sat May 22 18:08:37 2021 - [info]     Replicating from 172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:08:37 2021 - [info] 
    Sat May 22 18:08:37 2021 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
    Sat May 22 18:08:37 2021 - [info] 
    Sat May 22 18:08:37 2021 - [info] Fetching dead master's binary logs..
    Sat May 22 18:08:37 2021 - [info] Executing command on the dead master 172.31.0.28(172.31.0.28:3306): save_binary_logs --command=save --start_file=mysql-bin.000002  --start_pos=1391 --binlog_dir=/data/mysql/ --output_file=/data/mastermha/app1//saved_master_binlog_from_172.31.0.28_3306_20210522180836.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.58
      Creating /data/mastermha/app1 if not exists..    ok.
     Concat binary/relay logs from mysql-bin.000002 pos 1391 to mysql-bin.000002 EOF into /data/mastermha/app1//saved_master_binlog_from_172.31.0.28_3306_20210522180836.binlog ..
     Binlog Checksum enabled
      Dumping binlog format description event, from position 0 to 156.. ok.
      No need to dump effective binlog data from /data/mysql//mysql-bin.000002 (pos starts 1391, filesize 1391). Skipping.
     Binlog Checksum enabled
     /data/mastermha/app1//saved_master_binlog_from_172.31.0.28_3306_20210522180836.binlog has no effective data events.
    Event not exists.
    Sat May 22 18:08:38 2021 - [info] Additional events were not found from the orig master. No need to save.
    Sat May 22 18:08:38 2021 - [info] 
    Sat May 22 18:08:38 2021 - [info] * Phase 3.3: Determining New Master Phase..
    Sat May 22 18:08:38 2021 - [info] 
    Sat May 22 18:08:38 2021 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
    Sat May 22 18:08:38 2021 - [info] All slaves received relay logs to the same position. No need to resync each other.
    Sat May 22 18:08:38 2021 - [info] Searching new master from slaves..
    Sat May 22 18:08:38 2021 - [info]  Candidate masters from the configuration file:
    Sat May 22 18:08:38 2021 - [info]   172.31.0.48(172.31.0.48:3306)  Version=8.0.21 (oldest major version between slaves) log-bin:enabled
    Sat May 22 18:08:38 2021 - [info]     Replicating from 172.31.0.28(172.31.0.28:3306)
    Sat May 22 18:08:38 2021 - [info]     Primary candidate for the new Master (candidate_master is set)
    Sat May 22 18:08:38 2021 - [info]  Non-candidate masters:
    Sat May 22 18:08:38 2021 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
    Sat May 22 18:08:38 2021 - [info] New master is 172.31.0.48(172.31.0.48:3306)
    Sat May 22 18:08:38 2021 - [info] Starting master failover..
    Sat May 22 18:08:38 2021 - [info] 
    From:
    172.31.0.28(172.31.0.28:3306) (current master)
     +--172.31.0.48(172.31.0.48:3306)
     +--172.31.0.38(172.31.0.38:3306)
    
    To:
    172.31.0.48(172.31.0.48:3306) (new master)
     +--172.31.0.38(172.31.0.38:3306)
    Sat May 22 18:08:38 2021 - [info] 
    Sat May 22 18:08:38 2021 - [info] * Phase 3.4: New Master Diff Log Generation Phase..
    Sat May 22 18:08:38 2021 - [info] 
    Sat May 22 18:08:38 2021 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
    Sat May 22 18:08:38 2021 - [info] 
    Sat May 22 18:08:38 2021 - [info] * Phase 3.5: Master Log Apply Phase..
    Sat May 22 18:08:38 2021 - [info] 
    Sat May 22 18:08:38 2021 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
    Sat May 22 18:08:38 2021 - [info] Starting recovery on 172.31.0.48(172.31.0.48:3306)..
    Sat May 22 18:08:38 2021 - [info]  This server has all relay logs. Waiting all logs to be applied.. 
    Sat May 22 18:08:38 2021 - [info]   done.
    Sat May 22 18:08:38 2021 - [info]  All relay logs were successfully applied.
    Sat May 22 18:08:38 2021 - [info] Getting new master's binlog name and position..
    Sat May 22 18:08:38 2021 - [info]  mysql-bin.000002:1426
    Sat May 22 18:08:38 2021 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.31.0.48', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=1426, MASTER_USER='repluser', MASTER_PASSWORD='xxx';
    Sat May 22 18:08:38 2021 - [info] Executing master IP activate script:
    Sat May 22 18:08:38 2021 - [info]   /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=172.31.0.28 --orig_master_ip=172.31.0.28 --orig_master_port=3306 --new_master_host=172.31.0.48 --new_master_ip=172.31.0.48 --new_master_port=3306 --new_master_user='mhauser'   --new_master_password=xxx
    Unknown option: new_master_user
    Unknown option: new_master_password
    
    IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 172.31.0.100/16;/sbin/arping -I eth0 -c 3 -s 172.31.0.100/16 172.31.0.254 >/dev/null 2>&1===
    
    Enabling the VIP - 172.31.0.100/16 on the new master - 172.31.0.48 
    Sat May 22 18:08:38 2021 - [info]  OK.
    Sat May 22 18:08:38 2021 - [info] Setting read_only=0 on 172.31.0.48(172.31.0.48:3306)..
    Sat May 22 18:08:38 2021 - [info]  ok.
    Sat May 22 18:08:38 2021 - [info] ** Finished master recovery successfully.
    Sat May 22 18:08:38 2021 - [info] * Phase 3: Master Recovery Phase completed.
    Sat May 22 18:08:38 2021 - [info] 
    Sat May 22 18:08:38 2021 - [info] * Phase 4: Slaves Recovery Phase..
    Sat May 22 18:08:38 2021 - [info] 
    Sat May 22 18:08:38 2021 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
    Sat May 22 18:08:38 2021 - [info] 
    Sat May 22 18:08:38 2021 - [info] -- Slave diff file generation on host 172.31.0.38(172.31.0.38:3306) started, pid: 27571. Check tmp log /data/mastermha/app1//172.31.0.38_3306_20210522180836.log if it takes time..
    Sat May 22 18:08:39 2021 - [info] 
    Sat May 22 18:08:39 2021 - [info] Log messages from 172.31.0.38 ...
    Sat May 22 18:08:39 2021 - [info] 
    Sat May 22 18:08:38 2021 - [info]  This server has all relay logs. No need to generate diff files from the latest slave.
    Sat May 22 18:08:39 2021 - [info] End of log messages from 172.31.0.38.
    Sat May 22 18:08:39 2021 - [info] -- 172.31.0.38(172.31.0.38:3306) has the latest relay log events.
    Sat May 22 18:08:39 2021 - [info] Generating relay diff files from the latest slave succeeded.
    Sat May 22 18:08:39 2021 - [info] 
    Sat May 22 18:08:39 2021 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase..
    Sat May 22 18:08:39 2021 - [info] 
    Sat May 22 18:08:39 2021 - [info] -- Slave recovery on host 172.31.0.38(172.31.0.38:3306) started, pid: 27573. Check tmp log /data/mastermha/app1//172.31.0.38_3306_20210522180836.log if it takes time..
    Sat May 22 18:08:40 2021 - [info] 
    Sat May 22 18:08:40 2021 - [info] Log messages from 172.31.0.38 ...
    Sat May 22 18:08:40 2021 - [info] 
    Sat May 22 18:08:39 2021 - [info] Starting recovery on 172.31.0.38(172.31.0.38:3306)..
    Sat May 22 18:08:39 2021 - [info]  This server has all relay logs. Waiting all logs to be applied.. 
    Sat May 22 18:08:39 2021 - [info]   done.
    Sat May 22 18:08:39 2021 - [info]  All relay logs were successfully applied.
    Sat May 22 18:08:39 2021 - [info]  Resetting slave 172.31.0.38(172.31.0.38:3306) and starting replication from the new master 172.31.0.48(172.31.0.48:3306)..
    Sat May 22 18:08:39 2021 - [info]  Executed CHANGE MASTER.
    Sat May 22 18:08:39 2021 - [info]  Slave started.
    Sat May 22 18:08:40 2021 - [info] End of log messages from 172.31.0.38.
    Sat May 22 18:08:40 2021 - [info] -- Slave recovery on host 172.31.0.38(172.31.0.38:3306) succeeded.
    Sat May 22 18:08:40 2021 - [info] All new slave servers recovered successfully.
    Sat May 22 18:08:40 2021 - [info] 
    Sat May 22 18:08:40 2021 - [info] * Phase 5: New master cleanup phase..
    Sat May 22 18:08:40 2021 - [info] 
    Sat May 22 18:08:40 2021 - [info] Resetting slave info on the new master..
    Sat May 22 18:08:40 2021 - [info]  172.31.0.48: Resetting slave info succeeded.
    Sat May 22 18:08:40 2021 - [info] Master failover to 172.31.0.48(172.31.0.48:3306) completed successfully.
    Sat May 22 18:08:40 2021 - [info] 
    
    ----- Failover Report -----
    
    app1: MySQL Master failover 172.31.0.28(172.31.0.28:3306) to 172.31.0.48(172.31.0.48:3306) succeeded
    
    Master 172.31.0.28(172.31.0.28:3306) is down!
    
    Check MHA Manager logs at localhost.localdomain:/data/mastermha/app1/manager.log for details.
    
    Started automated(non-interactive) failover.
    Invalidated master IP address on 172.31.0.28(172.31.0.28:3306)
    The latest slave 172.31.0.48(172.31.0.48:3306) has all relay logs for recovery.
    Selected 172.31.0.48(172.31.0.48:3306) as a new master.
    172.31.0.48(172.31.0.48:3306): OK: Applying all logs succeeded.
    172.31.0.48(172.31.0.48:3306): OK: Activated master IP address.
    172.31.0.38(172.31.0.38:3306): This host has the latest relay log events.
    Generating relay diff files from the latest slave succeeded.
    172.31.0.38(172.31.0.38:3306): OK: Applying all logs succeeded. Slave started, replicating from 172.31.0.48(172.31.0.48:3306)
    172.31.0.48(172.31.0.48:3306): Resetting slave info succeeded.
    Master failover to 172.31.0.48(172.31.0.48:3306) completed successfully.
    Sat May 22 18:08:40 2021 - [info] Sending mail..
    sh: /usr/local/bin/sendmail.sh: No such file or directory
    Sat May 22 18:08:40 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterFailover.pm, ln2089] Failed to send mail with return code 127:0
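
    The failover log is long, and the outcome sits in only a few lines. A minimal sketch for pulling those lines out of the manager log is shown below. The function name `failover_summary` is hypothetical, and the default path is simply the one used in this article.

```shell
# Hypothetical helper: print the key outcome lines of an MHA failover
# from a manager log. The default path is the one used in this article.
failover_summary() {
    log="${1:-/data/mastermha/app1/manager.log}"
    # Keep only the lines that name the chosen master, state the final
    # result, or carry an [error] tag (such as the failed mail report).
    grep -E 'New master is |completed successfully\.|\[error\]' "$log"
}

# Usage on the manager host:
#   failover_summary /data/mastermha/app1/manager.log
```

    Any `[error]` line that shows up here, like the missing `/usr/local/bin/sendmail.sh` above, is worth checking even when the failover itself succeeded.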
    

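    Check the status again. The manager process exits once a failover has completed, so it now reports as stopped: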

    [root@localhost ~]# masterha_check_status --conf=/etc/mastermha/app1.cnf
    app1 is stopped(2:NOT_RUNNING).
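
    For scripting, the status line can be reduced to a single word. The sketch below is an assumption-laden helper: `mha_state` is a hypothetical name, and it recognizes only the "is running(" and "is stopped(" message shapes.

```shell
# Hypothetical helper: map one line of masterha_check_status output to a
# short state word. Only the two message shapes below are recognized.
mha_state() {
    case "$1" in
        *"is running("*) echo running ;;
        *"is stopped("*) echo stopped ;;
        *)               echo unknown ;;
    esac
}

# Usage on the manager host:
#   mha_state "$(masterha_check_status --conf=/etc/mastermha/app1.cnf)"
```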
    

    The health-check entries being tailed in the original master's log also stop:

    [root@centos8 ~]# tail -f /var/lib/mysql/centos8.log
    

    Verify that the VIP has moved to the new master:

    [root@centos8 ~]# ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host 
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 00:0c:29:16:9a:81 brd ff:ff:ff:ff:ff:ff
        inet 172.31.0.48/16 brd 172.31.255.255 scope global noprefixroute eth0
           valid_lft forever preferred_lft forever
        inet 172.31.0.100/16 brd 172.31.255.255 scope global secondary eth0:1
           valid_lft forever preferred_lft forever
        inet6 fe80::20c:29ff:fe16:9a81/64 scope link 
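
    The same check can be done without reading the output by eye. A minimal sketch, assuming the VIP 172.31.0.100 and interface eth0 from this article; `has_ip` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the given IPv4 address appears as an
# "inet" entry in `ip addr` style output read from stdin.
has_ip() {
    # Match "inet <addr>/" so that 172.31.0.10 does not match 172.31.0.100.
    grep -qF "inet $1/"
}

# Usage on the new master:
#   ip -4 addr show dev eth0 | has_ip 172.31.0.100 && echo "VIP present"
```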
    

    Errors:

    # The replication check (masterha_check_repl) reports an error
    [root@centos8 ~]# masterha_check_repl --conf=/etc/mastermha/app1.cnf
    Sat May 22 19:11:34 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Sat May 22 19:11:34 2021 - [info] Reading application default configuration from /etc/mastermha/app1.cnf..
    Sat May 22 19:11:34 2021 - [info] Reading server configuration from /etc/mastermha/app1.cnf..
    Sat May 22 19:11:34 2021 - [info] MHA::MasterMonitor version 0.58.
    Sat May 22 19:11:36 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. Redundant argument in sprintf at /usr/share/perl5/vendor_perl/MHA/NodeUtil.pm line 201.
    Sat May 22 19:11:36 2021 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
    Sat May 22 19:11:36 2021 - [info] Got exit code 1 (Not master dead).
    
    MySQL Replication Health is NOT OK!
    

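    Approach:

    In most cases the cause is that replication was not set up correctly. First make sure the data on the master is consistent with the data on the slaves. Then check that the configuration files on the master and the candidate master are correct, that semi-synchronous replication is enabled on both, that the master has granted the replication user the privileges it needs, and that the slaves are configured accordingly.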
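
    One quick way to rule out a broken replication setup is to confirm that both replication threads are running on each slave. The sketch below parses `SHOW SLAVE STATUS\G` output. The helper name `replication_ok` is hypothetical, and how you pass credentials to the `mysql` client is left to your environment.

```shell
# Hypothetical helper: read `SHOW SLAVE STATUS\G` output on stdin and
# succeed only if both replication threads report "Yes".
replication_ok() {
    awk -F': *' '
        $1 ~ /^ *Slave_IO_Running$/  { io  = $2 }
        $1 ~ /^ *Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Usage on a slave:
#   mysql -e 'SHOW SLAVE STATUS\G' | replication_ok || echo "replication not healthy"
```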
    

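    MySQL 8.0 is not yet fully supported by mha4mysql, so the source has to be patched. (The patch was not actually used this time; the problem came from running on CentOS 8, which MHA does not seem to work well with.)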

    [root@centos8 ~]# grep -rn 'sub parse_mysql_major_version($)' /usr/share/perl5/vendor_perl/MHA/
    /usr/share/perl5/vendor_perl/MHA/NodeUtil.pm:199:sub parse_mysql_major_version($)
    # Original code
    #sub parse_mysql_major_version($) {
    #  my $str = shift;
    #  my $result = sprintf( '%03d%03d', $str =~ m/(\d+)/g );
    #  return $result;
    #}
    
    # Modified code: keep only the major and minor numbers before formatting
    sub parse_mysql_major_version($) {
      my $str = shift;
      $str =~ /(\d+)\.(\d+)/;
      my $strmajor = "$1.$2";
      my $result = sprintf( '%03d%03d', $strmajor =~ m/(\d+)/g );
      return $result;
    }
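
    The error reported at NodeUtil.pm line 201 comes from the original function handing every number in the version string to a format with only two fields. A version such as 8.0.21 yields three numbers, hence "Redundant argument in sprintf". The patch keeps just the major and minor parts. The shell sketch below mirrors that intended result so it can be tried without touching the MHA source. `mysql_major_version` is a hypothetical name, not part of MHA.

```shell
# Hypothetical helper: format the major and minor parts of a MySQL
# version string as two zero-padded, three-digit fields.
mysql_major_version() {
    # Split on "." and use only the first two fields; awk converts
    # them to numbers, so any patch level or suffix is ignored.
    printf '%s\n' "$1" | awk -F. '{ printf "%03d%03d\n", $1, $2 }'
}

# Usage:
#   mysql_major_version 8.0.21    # prints 008000
```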
    

    Installing mha4mysql on CentOS 7 fails:

    --> Finished Dependency Resolution
    Error: Package: mha4mysql-manager-0.58-0.el7.centos.noarch (/mha4mysql-manager-0.58-0.el7.centos.
               Requires: perl(Log::Dispatch)
    Error: Package: mha4mysql-manager-0.58-0.el7.centos.noarch (/mha4mysql-manager-0.58-0.el7.centos.
               Requires: perl(Parallel::ForkManager)
    Error: Package: mha4mysql-manager-0.58-0.el7.centos.noarch (/mha4mysql-manager-0.58-0.el7.centos.
               Requires: perl(Log::Dispatch::File)
    Error: Package: mha4mysql-manager-0.58-0.el7.centos.noarch (/mha4mysql-manager-0.58-0.el7.centos.
               Requires: perl(Log::Dispatch::Screen)
     You could try using --skip-broken to work around the problem
     You could try running: rpm -Va --nofiles --nodigest
    

    Solution:

    [root@localhost ~]# yum install epel-release -y
    # Then install again
    [root@localhost ~]# yum install mha4mysql-*.rpm -y
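
    If the install still fails after enabling EPEL, it helps to see exactly which Perl modules are unresolved. A minimal sketch that extracts them from yum's error text; `missing_perl_modules` is a hypothetical helper, and it assumes the "Requires: perl(...)" line format shown above.

```shell
# Hypothetical helper: read yum dependency errors on stdin and print the
# unique Perl module names from "Requires: perl(...)" lines.
missing_perl_modules() {
    sed -n 's/^ *Requires: perl(\(.*\))$/\1/p' | sort -u
}

# Usage:
#   yum install mha4mysql-*.rpm -y 2>&1 | missing_perl_modules
```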
    
  • Original article: https://www.cnblogs.com/xuanlv-0413/p/14799801.html