MySQL High Availability with MHA


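I. Preliminary environment setup

1. Set the hostname on every host

In the steps below, 111 to 115 stand for the hosts 192.168.200.111 to 192.168.200.115. Running bash after hostname starts a new shell so that the prompt picks up the new name. Note that hostname only sets the name until the next reboot; hostnamectl set-hostname makes it persistent on systemd-based systems.

111: hostname server01

bash

112: hostname server02

bash

113: hostname server03

bash

114: hostname server04

bash

115: hostname server05

bash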

2. Configure hostname mappings on all hosts

115: run vim /etc/hosts and add the following:

192.168.200.111 server01

192.168.200.112 server02

192.168.200.113 server03

192.168.200.114 server04

192.168.200.115 server05

Copy the file to the other hosts:

scp /etc/hosts 192.168.200.111:/etc/

scp /etc/hosts 192.168.200.112:/etc/

scp /etc/hosts 192.168.200.113:/etc/

scp /etc/hosts 192.168.200.114:/etc/
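The four scp commands above differ only in the last octet, so they can be generated in a loop. This is a minimal sketch, assuming the same 192.168.200.111-114 addresses; the function name print_hosts_sync_commands is made up for illustration, and it only prints the commands so they can be reviewed before anything is copied.

```shell
# Print the scp commands that push /etc/hosts from 115 to the other hosts.
# The addresses come from the hosts file above; the function name and the
# dry-run behaviour are illustrative assumptions.
print_hosts_sync_commands() {
    for last_octet in 111 112 113 114; do
        echo "scp /etc/hosts 192.168.200.${last_octet}:/etc/"
    done
}

print_hosts_sync_commands
```

To actually copy the file, the printed commands can be piped to a shell, for example print_hosts_sync_commands | sh.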

3. Disable the firewall and SELinux on all hosts

systemctl stop iptables

systemctl stop firewalld

setenforce 0

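4. Upload mha-manager and mha-node

115: upload three packages

111, 112, 113 and 114: upload two packages

(The later steps use mha4mysql-node-0.56.tar.gz on every host and mha4mysql-manager-0.56.tar.gz on 115.)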

II. Install MHA Node

1. Install the Perl dependencies of MHA Node on all hosts

Install the EPEL repository:

rpm -ivh epel-release-latest-7.noarch.rpm

yum install -y perl-DBD-MySQL.x86_64 perl-DBI.x86_64 perl-CPAN perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker

Note: after the installation, it is worth checking that all required packages are really installed:
rpm -q perl-DBD-MySQL.x86_64 perl-DBI.x86_64 perl-CPAN perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker
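With this many packages it is easy to overlook one failed install. The rpm -q output can be filtered for its "package ... is not installed" lines; the sketch below assumes rpm's English messages, and missing_packages is a hypothetical helper name.

```shell
# Read `rpm -q ...` output on stdin and print only the names of packages
# that rpm reports as not installed. The helper name is illustrative.
missing_packages() {
    sed -n 's/^package \(.*\) is not installed$/\1/p'
}

# Usage on a real host:
#   rpm -q perl-DBD-MySQL perl-DBI perl-CPAN perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker | missing_packages
```

An empty result means every queried package is present.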

2. Install MHA Node on all hosts

tar xf mha4mysql-node-0.56.tar.gz

cd mha4mysql-node-0.56/

perl Makefile.PL

make && make install

3. After installation, MHA Node places its scripts in /usr/local/bin; list them with:

ls -l /usr/local/bin/

III. Install MHA Manager (115)

Note: MHA Node must also be installed before installing MHA Manager

1. First install the Perl modules that MHA Manager depends on (yum is used here)

yum install -y perl perl-Log-Dispatch perl-Parallel-ForkManager perl-DBD-MySQL perl-DBI perl-Time-HiRes

yum -y install perl-Config-Tiny-2.14-7.el7.noarch.rpm

Check that everything is installed:

rpm -q perl perl-Log-Dispatch perl-Parallel-ForkManager perl-DBD-MySQL perl-DBI perl-Time-HiRes perl-Config-Tiny

2. Install the MHA Manager package

tar xf mha4mysql-manager-0.56.tar.gz

cd mha4mysql-manager-0.56/

perl Makefile.PL

make && make install

3. After installation, /usr/local/bin contains some additional scripts

ls -l /usr/local/bin/

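IV. Configure SSH key-pair authentication

The servers must be able to reach each other with key-based SSH authentication; the steps below set this up. One caveat: do not disable password login, otherwise errors will occur.

1. Generate a key pair on the manager (115)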

ssh-keygen -t rsa

2. Copy the public key to the other hosts

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.111

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.112

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.113

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.114

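Note: Server05 must connect to every host once as a test, because the first connection asks for a yes confirmation; if that prompt is still pending, it interferes with SSH control of each host during a later failover.

ssh server01

ssh server02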

ssh server03

ssh server04

3. Primary Master (111)

ssh-keygen -t rsa

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.112

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.113

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.114

4. Secondary Master (112)

ssh-keygen -t rsa

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.111

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.113

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.114

5. slave1 (113)

ssh-keygen -t rsa

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.111

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.112

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.114

6. slave2 (114)

ssh-keygen -t rsa

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.111

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.112

ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.113
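The pattern in steps 2 to 6 is the same on every host: copy the local public key to each database host except the local one. It can be sketched as a loop; print_ssh_copy_commands is a hypothetical name, the argument is the last octet of the local host, and the function only prints the commands (ssh-copy-id still asks for each password when they are run).

```shell
# Print the ssh-copy-id commands for one host.
#   $1 = last octet of the local host (111-115)
# Database hosts 111-114 are the targets; the local host is skipped.
# The function name and its dry-run behaviour are illustrative.
print_ssh_copy_commands() (
    self="$1"
    for peer in 111 112 113 114; do
        if [ "$peer" != "$self" ]; then
            echo "ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.200.${peer}"
        fi
    done
)

# Example: the commands to run on the Secondary Master (112)
print_ssh_copy_commands 112
```

Called with 115, it prints all four targets, which matches step 2 for the manager.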

V. Install MySQL (MariaDB)

111-114:

yum -y install mariadb mariadb-server mariadb-devel

systemctl start mariadb

netstat -lnpt | grep :3306

Set the initial database password:

mysqladmin -u root password 123456

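VI. Set up master-slave replication

Note: the binlog-do-db and replicate-ignore-db settings must be the same on all servers. MHA checks the replication filter rules at startup; if they differ, MHA does not start monitoring or failover.

1. Edit the configuration file on each MySQL host

(1) Primary Master (111)

Run vim /etc/my.cnf and add the following: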

[mysqld]

server-id = 1

log-bin=master-bin

log-slave-updates=true

relay_log_purge=0

Restart MariaDB:

systemctl restart mariadb

(2) Secondary Master (112)

Run vim /etc/my.cnf and add the following:

[mysqld]

server-id=2

log-bin=master-bin

log-slave-updates=true

relay_log_purge=0

Restart MariaDB:

systemctl restart mariadb

(3) slave1 (113)

Run vim /etc/my.cnf and add the following:

[mysqld]

server-id=3

log-bin=mysql-bin

relay-log=slave-relay-bin

log-slave-updates=true

relay_log_purge=0

Restart MariaDB:

systemctl restart mariadb

(4) slave2 (114)

Run vim /etc/my.cnf and add the following:

[mysqld]

server-id=4

log-bin=mysql-bin

relay-log=slave-relay-bin

log-slave-updates=true

relay_log_purge=0

Restart MariaDB:

systemctl restart mariadb
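The four my.cnf fragments above differ only in server-id and in the log names used on masters and slaves. The sketch below prints the fragment for one host; the function name print_mycnf and its role argument are assumptions made for illustration, while the option values are the ones used in this section.

```shell
# Print the [mysqld] settings used in this section for one host.
#   $1 = server-id (must be unique per host)
#   $2 = role: "master" for 111/112, anything else for the slaves 113/114
# The function name and the role argument are illustrative.
print_mycnf() {
    echo "[mysqld]"
    echo "server-id=$1"
    if [ "$2" = "master" ]; then
        echo "log-bin=master-bin"
    else
        echo "log-bin=mysql-bin"
        echo "relay-log=slave-relay-bin"
    fi
    echo "log-slave-updates=true"
    echo "relay_log_purge=0"
}

# Example: the fragment for slave1 (113)
print_mycnf 3 slave
```

The output is meant to be reviewed and pasted into /etc/my.cnf by hand, not appended blindly, since the file may already contain a [mysqld] section.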

2. Back up existing data (skip if there is none)

3. Create the replication user on the MySQL server (111)

grant replication slave on *.* to 'repl'@'192.168.200.%' identified by '123456';

flush privileges;

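4. Check the binlog file name and position on the master at backup time (111)

MariaDB [(none)]> show master status;

5. Run the replication commands on each slave (112-114), using the File and Position values reported by show master status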

stop slave;

CHANGE MASTER TO

MASTER_HOST='192.168.200.111',

MASTER_USER='repl',

MASTER_PASSWORD='123456',

MASTER_LOG_FILE='master-bin.000001',

MASTER_LOG_POS=474;

start slave;

Check that both the IO thread and the SQL thread report Yes:

show slave status\G
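Reading the full status output by eye on three slaves gets tedious, so the two thread fields can be checked mechanically. The following is a small sketch that reads show slave status\G text on stdin; replication_healthy is a hypothetical helper name, and the sample input is the failure case discussed in this section.

```shell
# Succeed only when both Slave_IO_Running and Slave_SQL_Running are "Yes".
# Reads the text output of `show slave status\G` on stdin.
# The helper name is illustrative.
replication_healthy() {
    awk -F': *' '
        $1 ~ /Slave_IO_Running$/  { io  = $2 }
        $1 ~ /Slave_SQL_Running$/ { sql = $2 }
        END { if (io == "Yes" && sql == "Yes") exit 0; exit 1 }
    '
}

# Sample input: IO thread stopped, SQL thread running.
if replication_healthy <<'EOF'
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
EOF
then
    echo "replication OK"
else
    echo "replication broken"
fi
```

On a real slave the input would come from the client, for example mysql -uroot -p -e 'show slave status\G' | replication_healthy.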

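Troubleshooting a replication failure

Slave_IO_Running: No

Slave_SQL_Running: Yes

----------------------------------- output omitted -----------------------------------

Last_IO_Errno: 1236

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'

----------------------------------- output omitted -----------------------------------

Error 1236 with this message means the slave asked for a binlog file that the master does not have, typically because MASTER_LOG_FILE does not match the File column of show master status. Fix: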

stop slave;

reset slave;

set global sql_slave_skip_counter =1 ;

start slave;

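6. Set the three slave servers to read_only

The slaves only serve reads. The setting is not written to the MySQL configuration file because server02 may be promoted to master at any time.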

[root@server02 ~]# mysql -uroot -p123456 -e 'set global read_only=1'

[root@server03 ~]# mysql -uroot -p123456 -e 'set global read_only=1'

[root@server04 ~]# mysql -uroot -p123456 -e 'set global read_only=1'

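7. Create the monitoring user (111-114)

grant all privileges on *.* to 'root'@'192.168.200.%' identified by '123456';

flush privileges;

Also grant access for the host's own hostname (shown here for server04; use each host's own name):

grant all privileges on *.* to 'root'@'server04' identified by '123456';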

flush privileges;

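VII. Configure the MHA environment

1. Create the MHA working directory and configuration file (115)

The unpacked source directory contains a sample configuration file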

mkdir /etc/masterha

cp mha4mysql-manager-0.56/samples/conf/app1.cnf /etc/masterha

2. Edit the app1.cnf configuration file

The /usr/local/bin/master_ip_failover script must be adapted to your own environment (IP address, network interface name and so on).

vim /etc/masterha/app1.cnf

[server default]

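# manager working directory

manager_workdir=/var/log/masterha/app1

# manager log file; these two entries are present by default

manager_log=/var/log/masterha/app1/manager.log

# directory where the master stores its binlogs, so that MHA can find them

master_binlog_dir=/var/lib/mysql

# switchover script run during automatic failover

master_ip_failover_script=/usr/local/bin/master_ip_failover

# password and name of the MySQL user MHA logs in as (root here)

password=123456

user=root

# interval between ping checks, in seconds

ping_interval=1

# directory on the remote MySQL hosts where binlogs are saved during a switchover

remote_workdir=/tmp

# password and name of the replication user

repl_password=123456

repl_user=repl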

[server1]

hostname=server01

port=3306

[server2]

hostname=server02

candidate_master=1

port=3306

check_repl_delay=0

[server3]

hostname=server03

port=3306

[server4]

hostname=server04

port=3306

3. Configure the failover script (115)

vim /usr/local/bin/master_ip_failover

#!/usr/bin/env perl

use strict;
use warnings FATAL => 'all';
use Getopt::Long;

my (
    $command, $ssh_user, $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip, $new_master_port,
);

my $vip = '192.168.200.100';    # the VIP
my $key = "1";                  # alias index, used by the non-keepalived switchover approach
my $ssh_start_vip = "/sbin/ifconfig ens32:$key $vip";    # if keepalived is used instead,
my $ssh_stop_vip  = "/sbin/ifconfig ens32:$key down";    # put the service start/stop commands here
$ssh_user = "root";

# Each option is bound to its variable by reference.
GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {

        # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
        # If you manage master ip address at global catalog database,
        # invalidate orig_master_ip here.
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";

            #my $ping=`ping -c 1 10.0.0.13 | grep "packet loss" | awk -F',' '{print \$3}' | awk '{print \$1}'`;
            #if ( $ping le "90.0%" && $ping gt "0.0%" ){
            #    $exit_code = 0;
            #}
            #else {
            &stop_vip();

            # updating global catalog, etc
            $exit_code = 0;

            #}
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {

        # all arguments are passed.
        # If you manage master ip address at global catalog database,
        # activate new_master_ip here.
        # You can also grant write access (create user, set read_only=0, etc) here.
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        `ssh $ssh_user\@$orig_master_ip \" $ssh_start_vip \"`;
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# A simple system call that enables the VIP on the new master
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

# A simple system call that disables the VIP on the old master
sub stop_vip() {
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

Make the script executable:

chmod +x /usr/local/bin/master_ip_failover

4. Set the relay log purge mode on the slaves (112-114)

mysql -uroot -p123456 -e 'set global relay_log_purge=0;'

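Note:

During a failover, MHA relies on relay log information to recover the slaves, so automatic relay log purging must be set to OFF and the relay logs purged manually instead. By default, relay logs on a slave are deleted automatically once the SQL thread has executed them. In an MHA environment those relay logs may still be needed to recover other slaves, which is why automatic purging has to be disabled. Periodic purging must take replication delay into account: on an ext3 file system, deleting a large file takes a noticeable amount of time and can cause serious replication delay. To avoid this, hard links to the relay logs are created temporarily, because on Linux deleting a large file through a hard link is fast. (In MySQL, the same hard-link technique is commonly used when dropping large tables.)

5. Purge the relay logs manually

purge_relay_logs --user=root --password=123456 --disable_relay_log_purge --port=3306 --workdir=/tmp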
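Since the note above talks about purging periodically, the manual command is usually wrapped in a cron job on each slave. The sketch below only prints a candidate crontab line; the 04:00 schedule, the log file path, the function name print_purge_cron, and passing the password as an argument are all assumptions, while the purge_relay_logs options are the ones from step 5.

```shell
# Print a crontab line that runs purge_relay_logs once a day.
#   $1 = MySQL root password
# Assumptions (not from the article): the 04:00 schedule and the log path.
print_purge_cron() {
    echo "0 4 * * * /usr/local/bin/purge_relay_logs --user=root --password=$1 --disable_relay_log_purge --port=3306 --workdir=/tmp >> /var/log/purge_relay_logs.log 2>&1"
}

print_purge_cron 'YOUR_PASSWORD'
```

Staggering the hour between slaves is worth considering, so that not every slave purges its relay logs at the same moment.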

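6. Check MHA SSH connectivity (115)

masterha_check_ssh --conf=/etc/masterha/app1.cnf

If the output ends with a message containing successfully, SSH connectivity is fine.

7. Check the status of the whole cluster

masterha_check_repl --conf=/etc/masterha/app1.cnf

The output should end with this line: MySQL Replication Health is OK.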

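VIII. VIP configuration and management

Managing the VIP address with a script, rather than with keepalived:

Open /etc/masterha/app1.cnf, edited earlier, check that the following line is correct, then check the cluster status again.

[root@server05 ~]# grep -n 'master_ip_failover_script' /etc/masterha/app1.cnf

1. Primary Master (111)

[root@server01 ~]# ip a | grep ens32

2. Review the failover script on Server05 (115)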

[root@server05 ~]# head -13 /usr/local/bin/master_ip_failover

#!/usr/bin/env perl

use strict;

use warnings FATAL => 'all';

use Getopt::Long;

my (

$command, $ssh_user, $orig_master_host, $orig_master_ip,

$orig_master_port, $new_master_host, $new_master_ip, $new_master_port,

);

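my $vip = '192.168.200.100'; # the VIP

my $key = "1"; # alias index, used by the non-keepalived switchover approach

my $ssh_start_vip = "/sbin/ifconfig ens32:$key $vip"; # if keepalived is used instead,

my $ssh_stop_vip = "/sbin/ifconfig ens32:$key down"; # put the service start/stop commands here

With this /usr/local/bin/master_ip_failover script, a master failure triggers an MHA switchover: MHA Manager brings down the ens32:1 interface on the failed master and the virtual IP moves to the candidate master, which completes the switchover.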

3. Check the manager status (115)

masterha_check_status --conf=/etc/masterha/app1.cnf

app1 is stopped(2:NOT_RUNNING).

4. Start manager monitoring on Server05 (115)

[root@server05 ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover< /dev/null >/var/log/masterha/app1/manager.log 2>&1 &

5. Check that monitoring on Server05 is working (115)

[root@server05 ~]# masterha_check_status --conf=/etc/masterha/app1.cnf

app1 (pid:65837) is running(0:PING_OK), master:server01

The VIP 192.168.200.100 is now bound to the interface ens32 on server01; verify it:

[root@server01 ~]# ip a | grep ens32
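The status line shown in step 5 already names the current master, which is handy in scripts that need to know where the VIP should be. Below is a minimal sketch that extracts it from the masterha_check_status output; mha_current_master is a hypothetical helper name, and it assumes the output format shown above.

```shell
# Read masterha_check_status output on stdin.
# Prints the master host name when the manager reports PING_OK;
# prints nothing and fails otherwise. The helper name is illustrative.
mha_current_master() {
    sed -n 's/^.* is running(0:PING_OK), master:\([^ ]*\)$/\1/p' | grep .
}

# Usage on the manager (115):
#   masterha_check_status --conf=/etc/masterha/app1.cnf | mha_current_master
```

When the manager is stopped, as in step 3, the helper returns a non-zero status, so it can be used directly in an if statement.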

IX. Simulate a master failure (111)

[root@server01 ~]# systemctl stop mariadb

[root@server01 ~]# netstat -lnpt | grep :3306

[root@server01 ~]# ip a | grep ens32

Check 113 and 114:

MariaDB [(none)]> show slave status\G

Monitoring on Server05 has stopped automatically (115)

The monitoring configuration file has changed (115): the server01 section has been removed

[root@server05 ~]# cat /etc/masterha/app1.cnf
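Instead of reading the whole file after each failover, a quick check can tell whether a given host is still listed. This is a sketch under the assumption that every server section has a hostname=... line as in the app1.cnf above; mha_has_host is a hypothetical helper name.

```shell
# Succeed when the MHA config file contains a hostname=<name> entry.
#   $1 = path to the config file, $2 = host name to look for
# The helper name is illustrative.
mha_has_host() {
    awk -F'=' -v want="$2" '
        {
            key = $1; val = $2
            gsub(/[ \t]/, "", key)   # tolerate spaces around the equals sign
            gsub(/[ \t]/, "", val)
        }
        key == "hostname" && val == want { found = 1 }
        END { if (found) exit 0; exit 1 }
    ' "$1"
}

# Usage on the manager (115):
#   mha_has_host /etc/masterha/app1.cnf server01 || echo "server01 was removed"
```

The same check is useful later, before restarting the manager, to confirm that a repaired host has been added back.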

X. Repair the failed master and test switching the VIP back

1. Primary Master (111)

[root@server01 ~]# systemctl start mariadb

[root@server01 ~]# netstat -lnpt | grep :3306

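tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 6131/mysqld

2. Point the Primary Master at the new master (111)

Because the statement below gives no MASTER_LOG_FILE or MASTER_LOG_POS, replication starts from the first binlog still available on 112. The manager log on 115 records the CHANGE MASTER statement MHA issued to the other slaves, which is a safer source for the coordinates.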

[root@server01 ~]# mysql -u root -p123456

stop slave;

CHANGE MASTER TO

MASTER_HOST='192.168.200.112',

MASTER_USER='repl',

MASTER_PASSWORD='123456';

start slave;

show slave status\G

3. On Server05, add the server1 section back to the monitoring configuration file (115)

[root@server05 ~]# vim /etc/masterha/app1.cnf

[server1]

hostname=server01

port=3306

4. Check the cluster status (115)

[root@server05 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf

The output should end with:

MySQL Replication Health is OK.

5. Start monitoring on Server05 (115)

[root@server05 ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover< /dev/null >/var/log/masterha/app1/manager.log 2>&1 &

6. Stop MySQL on the current master, the Secondary Master (112)

[root@server02 ~]# systemctl stop mariadb

[root@server02 ~]# netstat -lnpt | grep :3306

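7. Primary Master (111)

[root@server01 ~]# ip a | grep ens32

The VIP has moved from 112 to 111

Check the status of 113 and 114:

MariaDB [(none)]> show slave status\G

The configuration file on Server05 has changed (the section for the failed host server2 has been removed)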

[root@server05 ~]# cat /etc/masterha/app1.cnf

8. Repair the Secondary Master host (112)

[root@server02 ~]# systemctl start mariadb

[root@server02 ~]# netstat -lnpt | grep :3306

9. Point the Secondary Master at the new master (112)

[root@server02 ~]# mysql -u root -p123456

stop slave;

CHANGE MASTER TO

MASTER_HOST='192.168.200.111',

MASTER_USER='repl',

MASTER_PASSWORD='123456';

start slave;

show slave status\G

10. On Server05, add the server2 section back to the monitoring configuration file (115)

[root@server05 ~]# vim /etc/masterha/app1.cnf

[server2]

hostname=server02

candidate_master=1

port=3306

check_repl_delay=0

11. Check the cluster status on Server05 (115)

[root@server05 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf

MySQL Replication Health is OK.
Original article: https://www.cnblogs.com/990624lty-jhc/p/11750444.html