• MHA Manual Failover, Part 1 (Master Failure)


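    MHA provides three ways to perform failover: automatic failover, which requires MHA monitoring to be enabled; manual failover without monitoring; and online manual switchover.

    Between them, the three methods cover the usual MySQL master/slave failure and maintenance scenarios. This article describes how to perform a manual failover when monitoring is not running, for your reference.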
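    As a rough orientation, the three methods map to three different command invocations. The sketch below only prints the command for each method and executes nothing; the helper name mha_command_for and its mode labels (auto, manual, online) are made up for illustration, and <hostname> is left as a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print the MHA command that corresponds to each
# failover method. Nothing is executed; the output is only a reminder.
mha_command_for() {
    mode=$1   # auto | manual | online (labels invented for this sketch)
    conf=$2   # path to the MHA application config file
    case "$mode" in
        auto)   printf 'masterha_manager --conf=%s\n' "$conf" ;;
        manual) printf 'masterha_master_switch --master_state=dead --conf=%s --dead_master_host=<hostname>\n' "$conf" ;;
        online) printf 'masterha_master_switch --master_state=alive --conf=%s\n' "$conf" ;;
        *)      printf 'unknown mode: %s\n' "$mode" >&2; return 1 ;;
    esac
}

mha_command_for manual /etc/masterha/app1.conf
```

    Only the manual case (--master_state=dead) is covered in the rest of this article.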

          For the other two MHA switching methods, see:
                MHA online switchover process
                MHA automatic failover: steps and process analysis

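    1. Characteristics of manual failover
        a. masterha_manager is not running on the monitoring node
        b. The master is already down, or is being moved to a higher-performance server
        c. Manual failover supports both interactive and non-interactive modes
        d. Sample command: $ masterha_master_switch --master_state=dead --conf=/etc/app1.cnf --dead_master_host=host1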

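    2. Key parameters of masterha_master_switch
    --master_state=dead
          Required. The value is "dead" or "alive": dead performs a manual failover, alive performs an online switchover.

    --dead_master_host=(hostname)
          Required. The hostname of the dead master. The two related parameters --dead_master_ip and --dead_master_port (default 3306) are optional.

    --new_master_host=(hostname)
          Optional. Specifies the new master. If omitted, the new master is chosen according to the candidate_master settings.

    --interactive=(0|1)
          Optional. Controls whether the switch is interactive. The default is 1 (interactive).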
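    The parameters above can be combined as in the following sketch, which assembles a manual failover command and prints it without running it. The function name build_failover_cmd and its argument order are assumptions made for illustration; only the masterha_master_switch options themselves come from the list above.

```shell
#!/bin/sh
# Hypothetical helper: assemble a masterha_master_switch command line for a
# manual failover. It only prints the command; nothing is executed.
build_failover_cmd() {
    conf=$1               # MHA application config file
    dead_host=$2          # hostname of the dead master (required)
    new_host=$3           # preferred new master (optional, may be empty)
    interactive=${4:-1}   # 1 = ask for confirmation (default), 0 = no prompts

    cmd="masterha_master_switch --master_state=dead --conf=$conf --dead_master_host=$dead_host"
    # Add --new_master_host only when a preferred new master was given.
    if [ -n "$new_host" ]; then
        cmd="$cmd --new_master_host=$new_host"
    fi
    cmd="$cmd --interactive=$interactive"
    printf '%s\n' "$cmd"
}

# Non-interactive failover from server1 to slave1.
build_failover_cmd /etc/masterha/app1.conf server1 slave1 0
```

    Printing the command first makes it easy to review the options before pasting it into a shell on the monitoring node.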




    1. On server1 (the current master), stop MySQL to simulate a master failure:
    service mysql.server stop


    2. On monitor, run the manual failover:
    [root@monitor tmp]# masterha_master_switch --master_state=dead --conf=/etc/masterha/app1.conf --dead_master_host=server1 --dead_master_port=3306 --new_master_host=slave1 --new_master_port=3306
    --dead_master_ip=<dead_master_ip> is not set. Using 10.24.220.232.
    Mon May 16 09:19:38 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Mon May 16 09:19:38 2016 - [info] Reading application default configuration from /etc/masterha/app1.conf..
    Mon May 16 09:19:38 2016 - [info] Reading server configuration from /etc/masterha/app1.conf..
    Mon May 16 09:19:38 2016 - [info] MHA::MasterFailover version 0.56.
    Mon May 16 09:19:38 2016 - [info] Starting master failover.
    Mon May 16 09:19:38 2016 - [info]
    Mon May 16 09:19:38 2016 - [info] * Phase 1: Configuration Check Phase..
    Mon May 16 09:19:38 2016 - [info]
    Mon May 16 09:19:38 2016 - [debug] Connecting to servers..
    Mon May 16 09:19:39 2016 - [debug] Connected to: slave1(10.24.220.70:3306), user=root
    Mon May 16 09:19:39 2016 - [debug] Number of slave worker threads on host slave1(10.24.220.70:3306): 0
    Mon May 16 09:19:39 2016 - [debug] Connected to: slave2(10.169.214.33:3306), user=root
    Mon May 16 09:19:39 2016 - [debug] Number of slave worker threads on host slave2(10.169.214.33:3306): 0
    Mon May 16 09:19:39 2016 - [debug] Comparing MySQL versions..
    Mon May 16 09:19:39 2016 - [debug] Comparing MySQL versions done.
    Mon May 16 09:19:39 2016 - [debug] Connecting to servers done.
    Mon May 16 09:19:39 2016 - [info] GTID failover mode = 1
    Mon May 16 09:19:39 2016 - [info] Dead Servers:
    Mon May 16 09:19:39 2016 - [info] server1(10.24.220.232:3306)
    Mon May 16 09:19:39 2016 - [info] Checking master reachability via MySQL(double check)...
    Mon May 16 09:19:39 2016 - [info] ok.
    Mon May 16 09:19:39 2016 - [info] Alive Servers:
    Mon May 16 09:19:39 2016 - [info] slave1(10.24.220.70:3306)
    Mon May 16 09:19:39 2016 - [info] slave2(10.169.214.33:3306)
    Mon May 16 09:19:39 2016 - [info] Alive Slaves:
    Mon May 16 09:19:39 2016 - [info] slave1(10.24.220.70:3306) Version=5.7.11-log (oldest major version between slaves) log-bin:enabled
    Mon May 16 09:19:39 2016 - [info] GTID ON
    Mon May 16 09:19:39 2016 - [debug] Relay log info repository: FILE
    Mon May 16 09:19:39 2016 - [info] Replicating from 10.24.220.232(10.24.220.232:3306)
    Mon May 16 09:19:39 2016 - [info] Primary candidate for the new Master (candidate_master is set)
    Mon May 16 09:19:39 2016 - [info] slave2(10.169.214.33:3306) Version=5.7.11-log (oldest major version between slaves) log-bin:enabled
    Mon May 16 09:19:39 2016 - [info] GTID ON
    Mon May 16 09:19:39 2016 - [debug] Relay log info repository: FILE
    Mon May 16 09:19:39 2016 - [info] Replicating from 10.24.220.232(10.24.220.232:3306)
    Mon May 16 09:19:39 2016 - [info] Not candidate for the new Master (no_master is set)
    Master server1(10.24.220.232:3306) is dead. Proceed? (yes/NO): yes
    Mon May 16 09:19:47 2016 - [info] Starting GTID based failover.
    Mon May 16 09:19:47 2016 - [info]
    Mon May 16 09:19:47 2016 - [info] ** Phase 1: Configuration Check Phase completed.
    Mon May 16 09:19:47 2016 - [info]
    Mon May 16 09:19:47 2016 - [info] * Phase 2: Dead Master Shutdown Phase..
    Mon May 16 09:19:47 2016 - [info]
    Mon May 16 09:19:47 2016 - [debug] SSH connection test to server1, option -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o BatchMode=yes -o ConnectTimeout=5, timeout 5
    Mon May 16 09:19:47 2016 - [debug] Stopping IO thread on slave2(10.169.214.33:3306)..
    Mon May 16 09:19:47 2016 - [debug] Stopping IO thread on slave1(10.24.220.70:3306)..
    Mon May 16 09:19:47 2016 - [debug] Stop IO thread on slave2(10.169.214.33:3306) done.
    Mon May 16 09:19:47 2016 - [debug] Stop IO thread on slave1(10.24.220.70:3306) done.
    Mon May 16 09:19:48 2016 - [info] HealthCheck: SSH to server1 is reachable.
    Mon May 16 09:19:49 2016 - [info] Forcing shutdown so that applications never connect to the current master..
    Mon May 16 09:19:49 2016 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master IP address.
    Mon May 16 09:19:49 2016 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
    Mon May 16 09:19:49 2016 - [info] * Phase 2: Dead Master Shutdown Phase completed.
    Mon May 16 09:19:49 2016 - [info]
    Mon May 16 09:19:49 2016 - [info] * Phase 3: Master Recovery Phase..
    Mon May 16 09:19:49 2016 - [info]
    Mon May 16 09:19:49 2016 - [info] * Phase 3.1: Getting Latest Slaves Phase..
    Mon May 16 09:19:49 2016 - [info]
    Mon May 16 09:19:49 2016 - [debug] Fetching current slave status..
    Mon May 16 09:19:49 2016 - [debug] Fetching current slave status done.
    Mon May 16 09:19:49 2016 - [info] The latest binary log file/position on all slaves is log.000005:528
    Mon May 16 09:19:49 2016 - [info] Retrieved Gtid Set: 191f7a9f-ffa2-11e5-a825-00163e00242a:1-4
    Mon May 16 09:19:49 2016 - [info] Latest slaves (Slaves that received relay log files to the latest):
    Mon May 16 09:19:49 2016 - [info] slave1(10.24.220.70:3306) Version=5.7.11-log (oldest major version between slaves) log-bin:enabled
    Mon May 16 09:19:49 2016 - [info] GTID ON
    Mon May 16 09:19:49 2016 - [debug] Relay log info repository: FILE
    Mon May 16 09:19:49 2016 - [info] Replicating from 10.24.220.232(10.24.220.232:3306)
    Mon May 16 09:19:49 2016 - [info] Primary candidate for the new Master (candidate_master is set)
    Mon May 16 09:19:49 2016 - [info] slave2(10.169.214.33:3306) Version=5.7.11-log (oldest major version between slaves) log-bin:enabled
    Mon May 16 09:19:49 2016 - [info] GTID ON
    Mon May 16 09:19:49 2016 - [debug] Relay log info repository: FILE
    Mon May 16 09:19:49 2016 - [info] Replicating from 10.24.220.232(10.24.220.232:3306)
    Mon May 16 09:19:49 2016 - [info] Not candidate for the new Master (no_master is set)
    Mon May 16 09:19:49 2016 - [info] The oldest binary log file/position on all slaves is log.000005:528
    Mon May 16 09:19:49 2016 - [info] Retrieved Gtid Set: 191f7a9f-ffa2-11e5-a825-00163e00242a:1-4
    Mon May 16 09:19:49 2016 - [info] Oldest slaves:
    Mon May 16 09:19:49 2016 - [info] slave1(10.24.220.70:3306) Version=5.7.11-log (oldest major version between slaves) log-bin:enabled
    Mon May 16 09:19:49 2016 - [info] GTID ON
    Mon May 16 09:19:49 2016 - [debug] Relay log info repository: FILE
    Mon May 16 09:19:49 2016 - [info] Replicating from 10.24.220.232(10.24.220.232:3306)
    Mon May 16 09:19:49 2016 - [info] Primary candidate for the new Master (candidate_master is set)
    Mon May 16 09:19:49 2016 - [info] slave2(10.169.214.33:3306) Version=5.7.11-log (oldest major version between slaves) log-bin:enabled
    Mon May 16 09:19:49 2016 - [info] GTID ON
    Mon May 16 09:19:49 2016 - [debug] Relay log info repository: FILE
    Mon May 16 09:19:49 2016 - [info] Replicating from 10.24.220.232(10.24.220.232:3306)
    Mon May 16 09:19:49 2016 - [info] Not candidate for the new Master (no_master is set)
    Mon May 16 09:19:49 2016 - [info]
    Mon May 16 09:19:49 2016 - [info] * Phase 3.3: Determining New Master Phase..
    Mon May 16 09:19:49 2016 - [info]
    Mon May 16 09:19:49 2016 - [info] slave1 can be new master.
    Mon May 16 09:19:49 2016 - [info] New master is slave1(10.24.220.70:3306)
    Mon May 16 09:19:49 2016 - [info] Starting master failover..
    Mon May 16 09:19:49 2016 - [info]
    From:
    server1(10.24.220.232:3306) (current master)
     +--slave1(10.24.220.70:3306)
     +--slave2(10.169.214.33:3306)
    To:
    slave1(10.24.220.70:3306) (new master)
     +--slave2(10.169.214.33:3306)
    Starting master switch from server1(10.24.220.232:3306) to slave1(10.24.220.70:3306)? (yes/NO): yes
    Mon May 16 09:20:43 2016 - [info] New master decided manually is slave1(10.24.220.70:3306)
    Mon May 16 09:20:43 2016 - [info]
    Mon May 16 09:20:43 2016 - [info] * Phase 3.3: New Master Recovery Phase..
    Mon May 16 09:20:43 2016 - [info]
    Mon May 16 09:20:43 2016 - [info] Waiting all logs to be applied..
    Mon May 16 09:20:43 2016 - [info] done.
    Mon May 16 09:20:43 2016 - [debug] Stopping slave IO/SQL thread on slave1(10.24.220.70:3306)..
    Mon May 16 09:20:43 2016 - [debug] done.
    Mon May 16 09:20:43 2016 - [info] Getting new master's binlog name and position..
    Mon May 16 09:20:43 2016 - [info] log.000001:818
    Mon May 16 09:20:43 2016 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='slave1 or 10.24.220.70', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
    Mon May 16 09:20:43 2016 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: log.000001, 818, 191f7a9f-ffa2-11e5-a825-00163e00242a:1-4
    Mon May 16 09:20:43 2016 - [warning] master_ip_failover_script is not set. Skipping taking over new master IP address.
    Mon May 16 09:20:43 2016 - [info] ** Finished master recovery successfully.
    Mon May 16 09:20:43 2016 - [info] * Phase 3: Master Recovery Phase completed.
    Mon May 16 09:20:43 2016 - [info]
    Mon May 16 09:20:43 2016 - [info] * Phase 4: Slaves Recovery Phase..
    Mon May 16 09:20:43 2016 - [info]
    Mon May 16 09:20:43 2016 - [info]
    Mon May 16 09:20:43 2016 - [info] * Phase 4.1: Starting Slaves in parallel..
    Mon May 16 09:20:43 2016 - [info]
    Mon May 16 09:20:43 2016 - [info] -- Slave recovery on host slave2(10.169.214.33:3306) started, pid: 7774. Check tmp log /var/log/masterha/app1/slave2_3306_20160516091938.log if it takes time..
    Mon May 16 09:20:44 2016 - [info]
    Mon May 16 09:20:44 2016 - [info] Log messages from slave2 ...
    Mon May 16 09:20:44 2016 - [info]
    Mon May 16 09:20:43 2016 - [info] Resetting slave slave2(10.169.214.33:3306) and starting replication from the new master slave1(10.24.220.70:3306)..
    Mon May 16 09:20:43 2016 - [debug] Stopping slave IO/SQL thread on slave2(10.169.214.33:3306)..
    Mon May 16 09:20:43 2016 - [debug] done.
    Mon May 16 09:20:43 2016 - [info] Executed CHANGE MASTER.
    Mon May 16 09:20:43 2016 - [debug] Starting slave IO/SQL thread on slave2(10.169.214.33:3306)..
    Mon May 16 09:20:44 2016 - [debug] done.
    Mon May 16 09:20:44 2016 - [info] Slave started.
    Mon May 16 09:20:44 2016 - [info] gtid_wait(191f7a9f-ffa2-11e5-a825-00163e00242a:1-4) completed on slave2(10.169.214.33:3306). Executed 0 events.
    Mon May 16 09:20:44 2016 - [info] End of log messages from slave2.
    Mon May 16 09:20:44 2016 - [info] -- Slave on host slave2(10.169.214.33:3306) started.
    Mon May 16 09:20:44 2016 - [info] All new slave servers recovered successfully.
    Mon May 16 09:20:44 2016 - [info]
    Mon May 16 09:20:44 2016 - [info] * Phase 5: New master cleanup phase..
    Mon May 16 09:20:44 2016 - [info]
    Mon May 16 09:20:44 2016 - [info] Resetting slave info on the new master..
    Mon May 16 09:20:44 2016 - [debug] Clearing slave info..
    Mon May 16 09:20:44 2016 - [debug] Stopping slave IO/SQL thread on slave1(10.24.220.70:3306)..
    Mon May 16 09:20:44 2016 - [debug] done.
    Mon May 16 09:20:44 2016 - [debug] SHOW SLAVE STATUS shows new master does not replicate from anywhere. OK.
    Mon May 16 09:20:44 2016 - [info] slave1: Resetting slave info succeeded.
    Mon May 16 09:20:44 2016 - [info] Master failover to slave1(10.24.220.70:3306) completed successfully.
    Mon May 16 09:20:44 2016 - [debug] Disconnected from slave1(10.24.220.70:3306)
    Mon May 16 09:20:44 2016 - [debug] Disconnected from slave2(10.169.214.33:3306)
    Mon May 16 09:20:44 2016 - [info]
    ----- Failover Report -----
    app1: MySQL Master failover server1(10.24.220.232:3306) to slave1(10.24.220.70:3306) succeeded
    Master server1(10.24.220.232:3306) is down!
    Check MHA Manager logs at monitor for details.
    Started manual(interactive) failover.
    Selected slave1(10.24.220.70:3306) as a new master.
    slave1(10.24.220.70:3306): OK: Applying all logs succeeded.
    slave2(10.169.214.33:3306): OK: Slave started, replicating from slave1(10.24.220.70:3306)
    slave1(10.24.220.70:3306): Resetting slave info succeeded.
    Master failover to slave1(10.24.220.70:3306) completed successfully.
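
    To check the outcome of such a run from a script, one option is to look for the final "completed successfully" message. The sketch below assumes the command output was saved one message per line (for example by piping it through tee); the function name new_master_from_log and the file name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read saved masterha_master_switch output on stdin and
# print the new master's hostname if the run reported success.
# Returns non-zero when no success message is found.
new_master_from_log() {
    # Capture the hostname between "Master failover to " and the "(ip:port)" part.
    host=$(sed -n 's/.*Master failover to \([^( ]*\)(.*completed successfully.*/\1/p' | head -n 1)
    [ -n "$host" ] || return 1
    printf '%s\n' "$host"
}

# Usage (failover.out is a hypothetical file holding the saved output):
#   new_master_from_log < failover.out
```

    This only confirms what MHA itself reported; replication on the remaining slave can still be checked separately with SHOW SLAVE STATUS.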
  • Original article: https://www.cnblogs.com/zengkefu/p/5496984.html