无论是传统行业,还是互联网行业,数据可用性都是至关重要的,虽然现在已经步入大数据时代,nosql比较流行,但是作为数据持久化及事务性的关系型数据库依然是项目首选,比如mysql。
现在几乎所有的公司项目,不说可用性必须达到5个9,至少也要要求,数据库出现问题,不能丢失数据,能够快速响应异常处理,下面使用mha来搭建mysql高可用集群(基于centos7+mysql5.7):
一、MHA简介
MHA(Master HA)是一款开源的 MySQL 的高可用程序,它为 MySQL 主从复制架构提供了 automating master failover 功能。MHA 在监控到 master 节点故障时,会提升其中拥有最新数据的 slave 节点成为新的master 节点,在此期间,MHA 会通过于其它从节点获取额外信息来避免一致性方面的问题。MHA 还提供了 master 节点的在线切换功能,即按需切换 master/slave 节点。MHA 能够在30秒内实现故障切换,并能在故障切换中,最大可能的保证数据一致性。
MHA由两部分组成:MHA Manager(管理节点)和MHA Node(数据节点)。管理节点可以单独部署在一台独立的机器上来管理多个master-slave集群,也可以部署在一台slave节点上。数据节点运行在每台mysql服务器上。Manager会定期检查master,若出现故障时,会自动将最新数据的slave提升为新的master,然后将其他的slave指向新的master。整个故障转移程序完全透明。
目前MHA主要支持一主多从的架构。要搭建MHA,要求一个复制集群中必须最少有三台数据库服务器,一主二从,即一台充当master,一台充当备用master,另外一台充当从库。
二、搭建环境架构
1.环境配置:
操作系统版本:CentOS7
MySQL版本:5.7.28
VIP(虚IP):192.168.3.140
机器列表及功能:
IP | hostname | server_id | 角色及功能 |
192.168.3.142 | s142 | 142 | Monitor Host(监控复制组)/ Master(响应写请求) |
192.168.3.143 | s143 | 143 | Candidate Master(响应读请求) |
192.168.3.144 | s144 | 144 | Slave(响应读请求) |
三、MHA搭建步骤
1.在s142、s143、s144机器上安装mysql5.7
详细安装可以参照另一篇文章:centos7安装mysql-5.7.28 在此不再详述
s142 my.cnf配置信息:
[mysqld] log-bin=/usr/local/mysql/logs/mysql-bin.log expire-logs-days=1 max-binlog-size=500M innodb_log_file_size=256M
binlog_format=row server-id=142 gtid_mode=on enforce_gtid_consistency=1 log_slave_updates=1
relay_log_recovery=ON
relay_log=/usr/local/mysql/logs/mysql-relay-bin
relay_log_index=/usr/local/mysql/logs/mysql-relay-bin.index
log_error=/usr/local/mysql/logs/mysql-error.log
#### replication ####
log_slave_updates=1
replicate_wild_ignore_table=information_schema.%,performance_schema.%,sys.%
plugin_load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so" loose_rpl_semi_sync_master_enabled=1 loose_rpl_semi_sync_slave_enabled=1 loose_rpl_semi_sync_master_timeout=5000 basedir=/usr/local/mysql datadir=/usr/local/mysql/data socket=/usr/local/mysql/mysql.sock user=mysql default-storage-engine=InnoDB character-set-server=utf8 lower_case_table_names = 1 explicit_defaults_for_timestamp=true [mysqld_safe] log-error=/usr/local/mysql/mysql-error.log pid-file=/usr/local/mysql/mysqld.pid [client] socket=/usr/local/mysql/mysql.sock [mysql] default-character-set=utf8 socket=/usr/local/mysql/mysql.sock
s143 my.cnf配置信息:
[mysqld] log-bin=/usr/local/mysql/logs/mysql-bin.log expire-logs-days=1 max-binlog-size=500M innodb_log_file_size=256M
binlog_format=row server-id=143 gtid_mode=on enforce_gtid_consistency=1 log_slave_updates=1
relay_log_recovery=ON
relay_log=/usr/local/mysql/logs/mysql-relay-bin
relay_log_index=/usr/local/mysql/logs/mysql-relay-bin.index
log_error=/usr/local/mysql/logs/mysql-error.log
#### replication ####
log_slave_updates=1
replicate_wild_ignore_table=information_schema.%,performance_schema.%,sys.%
read_only=1
relay_log_purge=0
plugin_load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so" loose_rpl_semi_sync_master_enabled=1 loose_rpl_semi_sync_slave_enabled=1 loose_rpl_semi_sync_master_timeout=5000 basedir=/usr/local/mysql datadir=/usr/local/mysql/data socket=/usr/local/mysql/mysql.sock user=mysql default-storage-engine=InnoDB character-set-server=utf8 lower_case_table_names = 1 explicit_defaults_for_timestamp=true [mysqld_safe] log-error=/usr/local/mysql/mysql-error.log pid-file=/usr/local/mysql/mysqld.pid [client] socket=/usr/local/mysql/mysql.sock [mysql] default-character-set=utf8 socket=/usr/local/mysql/mysql.sock
s144 my.cnf 配置信息:
[mysqld] log-bin=/usr/local/mysql/logs/mysql-bin.log expire-logs-days=1 max-binlog-size=500M innodb_log_file_size=256M
binlog_format=row server-id=144 gtid_mode=on enforce_gtid_consistency=1 log_slave_updates=1
relay_log_recovery=ON
relay_log=/usr/local/mysql/logs/mysql-relay-bin
relay_log_index=/usr/local/mysql/logs/mysql-relay-bin.index
log_error=/usr/local/mysql/logs/mysql-error.log
#### replication ####
log_slave_updates=1
replicate_wild_ignore_table=information_schema.%,performance_schema.%,sys.%
read_only=1
relay_log_purge=0
basedir=/usr/local/mysql datadir=/usr/local/mysql/data socket=/usr/local/mysql/mysql.sock user=mysql default-storage-engine=InnoDB character-set-server=utf8 lower_case_table_names = 1 explicit_defaults_for_timestamp=true [mysqld_safe] log-error=/usr/local/mysql/mysql-error.log pid-file=/usr/local/mysql/mysqld.pid [client] socket=/usr/local/mysql/mysql.sock [mysql] default-character-set=utf8 socket=/usr/local/mysql/mysql.sock
2.创建复制用户及复制配置
在主节点上配置复制用户:
create user canal_repl_user;
grant replication slave on *.* to canal_repl_user identified by '111111';
flush privileges;
grant all on *.* to root identified by '111111';
在从节点上执行主从复制命令:
CHANGE MASTER TO MASTER_HOST='192.168.30.142', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='canal_repl_user', MASTER_PASSWORD='111111'; #master_log_file='master-bin.000001',#5.6后不需要指定 #master_log_pos=189;
#启动主从复制
START SLAVE
#查看主从复制信息
SHOW SLAVE STATUS
说明主从复制成功,可以在主库中创建一个库,看看从库是否同步
3.在每台机器上安装yum源头及MHA依赖的perl包
wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
rpm -ivh epel-release-latest-7.noarch.rpm
yum install -y perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager
如图安装成功:
4.配置ssh免密登录
分别在s142/s143/s144机器上生成ssh秘钥:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
将各自公钥id_rsa.pub发送到另外两台机器,并追加到 ~/.ssh/authorized_keys中:
s142:
mv id_rsa.pub id_rsa_142.pub
scp id_rsa_142.pub s143:~/.ssh/
scp id_rsa_142.pub s144:~/.ssh/
s143:
mv id_rsa.pub id_rsa_143.pub
scp id_rsa_143.pub s142:~/.ssh/
scp id_rsa_143.pub s144:~/.ssh/
s144:
mv id_rsa.pub id_rsa_144.pub
scp id_rsa_144.pub s142:~/.ssh/
scp id_rsa_144.pub s143:~/.ssh/
s142:
cat id_rsa_143.pub >> authorized_keys
cat id_rsa_144.pub >> authorized_keys
s143:
cat id_rsa_142.pub >> authorized_keys
cat id_rsa_144.pub >> authorized_keys
s144:
cat id_rsa_143.pub >> authorized_keys
cat id_rsa_142.pub >> authorized_keys
5.安装MHA
分别在s142、s143、s144上下载node安装包并安装:
wget https://qiniu.wsfnk.com/mha4mysql-node-0.58-0.el7.centos.noarch.rpm rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm
在s142上安装manager
wget https://qiniu.wsfnk.com/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
6.配置MHA Manager
6.1配置全局配置文件
新建 /etc/masterha_default.cnf (一定要是这个路径,不然后期masterha_check_ssh会提示未找到全局文件)
vim /etc/masterha_default.cnf [server default] user=root password=111111 ssh_user=root repl_user=canal_repl_user repl_password=111111 ping_interval=1 #master_binlog_dir=/usr/local/mysql/logs secondary_check_script=masterha_secondary_check -s s142 -s s143 -s s144 master_ip_failover_script="/opt/soft/mha/scripts/master_ip_failover" master_ip_online_change_script="/opt/soft/mha/scripts/master_ip_online_change" report_script="/opt/soft/mha/scripts/send_report"
6.2 配置主配置文件
新建/opt/soft/mha/app1/app1.cnf文件,并配置如下信息:
[server default] manager_workdir=/opt/soft/mha manager_log=/opt/soft/mha/manager.log password=111111 user=root ping_interval=1 repl_password=111111 repl_user=canal_repl_user #master_binlog_dir=/usr/local/mysql/logs #secondary_check_script=masterha_secondary_check -s s142 -s s143 -s s144 #master_ip_failover_script="/opt/soft/mha/scripts/master_ip_failover" #master_ip_online_change_script="/opt/soft/mha/scripts/master_ip_online_change" #report_script="/opt/soft/mha/scripts/send_report" #ssh用户 ssh_user=root [server1] hostname=s142 port=3306 master_binlog_dir=/usr/local/mysql/logs candidate_master=1 check_repl_delay=0 [server2] hostname=s143 port=3306 master_binlog_dir=/usr/local/mysql/logs candidate_master=1 check_repl_delay=0 [server3] hostname=s144 port=3306 master_binlog_dir=/usr/local/mysql/logs ignore_fail=1 no_master=1
6.4 配置VIP切换
为了防止脑裂发生,推荐生产环境采用脚本的方式来管理虚拟 ip,而不是使用 keepalived来完成。
vim /opt/soft/mha/scripts/master_ip_failover #!/usr/bin/env perl use strict; use warnings FATAL => 'all'; use Getopt::Long; my ( $command, $ssh_user, $orig_master_host, $orig_master_ip,$orig_master_port, $new_master_host, $new_master_ip,$new_master_port ); #定义VIP变量 my $vip = '192.168.30.140/24'; my $key = '1'; my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip"; my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down"; GetOptions( 'command=s' => $command, 'ssh_user=s' => $ssh_user, 'orig_master_host=s' => $orig_master_host, 'orig_master_ip=s' => $orig_master_ip, 'orig_master_port=i' => $orig_master_port, 'new_master_host=s' => $new_master_host, 'new_master_ip=s' => $new_master_ip, 'new_master_port=i' => $new_master_port, ); exit &main(); sub main { print " IN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip=== "; if ( $command eq "stop" || $command eq "stopssh" ) { my $exit_code = 1; eval { print "Disabling the VIP on old master: $orig_master_host "; &stop_vip(); $exit_code = 0; }; if ($@) { warn "Got Error: $@ "; exit $exit_code; } exit $exit_code; } elsif ( $command eq "start" ) { my $exit_code = 10; eval { print "Enabling the VIP - $vip on the new master - $new_master_host "; &start_vip(); $exit_code = 0; }; if ($@) { warn $@; exit $exit_code; } exit $exit_code; } elsif ( $command eq "status" ) { print "Checking the Status of the script.. OK "; exit 0; } else { &usage(); exit 1; } } sub start_vip() { `ssh $ssh_user@$new_master_host " $ssh_start_vip "`; } sub stop_vip() { return 0 unless ($ssh_user); `ssh $ssh_user@$orig_master_host " $ssh_stop_vip "`; } sub usage { print "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port "; }
6.5 配置VIP脚本
vim /opt/soft/mha/scripts/master_ip_online_change #!/bin/bash source /root/.bash_profile vip=`echo '192.168.30.140/24'` #设置VIP key=`echo '1'` command=`echo "$1" | awk -F = '{print $2}'` orig_master_host=`echo "$2" | awk -F = '{print $2}'` new_master_host=`echo "$7" | awk -F = '{print $2}'` orig_master_ssh_user=`echo "${12}" | awk -F = '{print $2}'` new_master_ssh_user=`echo "${13}" | awk -F = '{print $2}'` #要求服务的网卡识别名一样,都为ens33(这里是) stop_vip=`echo "ssh root@$orig_master_host /usr/sbin/ifconfig ens33:$key down"` start_vip=`echo "ssh root@$new_master_host /usr/sbin/ifconfig ens33:$key $vip"` if [ $command = 'stop' ] then echo -e " **************************** " echo -e "Disabled thi VIP - $vip on old master: $orig_master_host " $stop_vip if [ $? -eq 0 ] then echo "Disabled the VIP successfully" else echo "Disabled the VIP failed" fi echo -e "*************************** " fi if [ $command = 'start' -o $command = 'status' ] then echo -e " ************************* " echo -e "Enabling the VIP - $vip on new master: $new_master_host " $start_vip if [ $? -eq 0 ] then echo "Enabled the VIP successfully" else echo "Enabled the VIP failed" fi echo -e "*************************** " fi
6.6.配置报警邮件脚本
首先配置邮件发送设置信息
#mail邮件发送程序,需要先配置好发送这信息 vim /etc/mail.rc set from=qixing@163.com set smtp=smtp.163.com set smtp-auth-user=qixing #拿163邮箱来说这个不是密码,而是授权码 set smtp-auth-password=qixing set smtp-auth=login
编写邮件发送脚本:
vim /opt/soft/mha/script/send_report #!/bin/bash source /root/.bash_profile # 解析变量 orig_master_host=`echo "$1" | awk -F = '{print $2}'` new_master_host=`echo "$2" | awk -F = '{print $2}'` new_slave_hosts=`echo "$3" | awk -F = '{print $2}'` subject=`echo "$4" | awk -F = '{print $2}'` body=`echo "$5" | awk -F = '{print $2}'` #定义收件人地址 email="qixing@163.com" tac /var/log/mha/app1/manager.log | sed -n 2p | grep 'successfully' > /dev/null if [ $? -eq 0 ] then messages=`echo -e "MHA $subject 主从切换成功 master:$orig_master_host --> $new_master_host $body 当前从库:$new_slave_hosts"` echo "$messages" | mail -s "Mysql 实例宕掉,MHA $subject 切换成功" $email >>/tmp/mailx.log 2>&1 else messages=`echo -e "MHA $subject 主从切换失败 master:$orig_master_host --> $new_master_host $body" ` echo "$messages" | mail -s ""Mysql 实例宕掉,MHA $subject 切换失败"" $email >>/tmp/mailx.log 2>&1 fi
6.7 将脚本赋予可执行权限
chmod +x /opt/soft/mha/scripts/master_ip_failover chmod +x /opt/soft/mha/scripts/master_ip_online_change chmod +x /opt/soft/mha/scripts/send_report
7.验证MHA配置信息是否正常
7.1 检查ssh配置:
masterha_check_ssh --conf=/opt/soft/mha/app1/app1.cnf
成功!!!
7.2 检查主从复制情况:
masterha_check_repl --conf=/opt/soft/mha/app1/app1.cnf
健康!!!
8.在master节点上绑定VIP,只需绑定一次,后续会随主备切换而自动切换
ifconfig ens33:1 192.168.30.140/24
如过遇到问题,需手动删除,可执行如下命令:
ifconfig ens33:1 del 192.168.30.140
或
ifconfig ens33:1 down #关闭vip
可以查看绑定VIP是否成功:
ip addr
说明绑定成功!
9.在MHA的manager节点上启动MHA管理进程
nohup masterha_manager --conf=/opt/soft/mha/app1/app1.cnf --ignore_last_failover /opt/soft/mha/app1/manager.log 2>&1 &
命令参数:
--remove_dead_master_conf 该参数代表当发生主从切换后,老的主库的ip将会从配置文件中移除。 --manger_log 日志存放位置 --ignore_last_failover 在缺省情况下,如果MHA检测到连续发生宕机,且两次宕机间隔不足8小时的话,则不会进行Failover,之所以这样限制是为了避免ping-pong效应。该参数代表忽略上次MHA触发切换产生的文件,默认情况下,MHA发生切换后会在日志目录,也就是上面设置的manager_workdir目录中产生app1.failover.complete文件,下次再次切换的时候如果发现该目录下存在该文件将不允许触发切换,除非在第一次切换后收到删除该文件,为了方便,这里设置为--ignore_last_failover。
观察manager.log日志,查看是否有成功,一般最后打印如下日志,说明成功:
Thu Jul 2 15:00:05 2020 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
10.查看MHA状态
masterha_check_status --conf=/opt/soft/mha/app1/app1.cnf
说明MHA正在运行中,主节点是s142
11.停止MHA管理进程
masterha_stop --conf=/opt/soft/mha/app1/app1.cnf
manager.log日志会打印终止日志:
12.手动进行主备切换(在进行手动切换前要先停值manager进程)
masterha_master_switch --conf=/opt/soft/mha/app1/app1.cnf --master_state=alive --new_master_host=s143 --orig_master_is_new_slave --running_updates_limit=10000 --interactive=0
命令参数:
--master_state=dead
强制参数. 可选有: "dead" or "alive". 如果设置为 alive,将执行在线切主操作。
--dead_master_host=(hostname)
强制参数,--dead_master_ip 和 --dead_master_port 可选。
--interactive=(0|1)
1为交互模式(默认),会输入几个yes;0为非交互。
--ssh_reachable=(0|1|2)
否通过SSH可达。0表示不可达;2表示未知(默认)。
--skip_change_master
跳过CHANGE MASTER TO 操作
--skip_disable_read_only
跳过在新主上 SET GLOBAL read_only=0的操作。以便稍后手动操作。
--last_failover_minute=(minutes)
最近故障转移时间间隔(默认480),如果之前的故障转移是最近完成的(默认情况下是8小时),MHA Manager不会执行故障转移,因为问题很可能无法通过执行故障转移来解决。此参数的目的是避免乒乓故障转移问题。您可以通过更改此参数来更改时间标准
--ignore_last_failover
如果前面的故障转移失败,MHA不会启动故障转移,因为问题可能会再次发生。启动故障转移的正常步骤是手动删除在(manager_workdir)/(app_name).failover下创建的故障转移错误文件。如果设置该参数,将忽略这个错误文件,直接进行故障转移。
--remove_dead_master_conf
设置此选项后,如果故障转移成功完成,MHA Manager将自动从配置文件中删除失效主服务器的部分。
--wait_until_gtid_in_sync(0|1)
适用于GTID模式,设置为1表示MHA将等待所有slave追上新master的GTID,默认;0表示不等。
--orig_master_is_new_slave
如果原主库alive,设置该参数,将会使原master作为新主库的slave
说明切换成功!
13.常用的命令
SHOW SLAVE STATUS; #查看从库复制状态
SHOW MASTER STATUS; #查看当前binlog位点
SHOW SLAVE HOSTS; #查看从库列表