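About MHA
MHA (Master High Availability) is an open-source high-availability tool for MySQL and a relatively mature solution in that space. It presupposes an existing MySQL Replication setup with master and slave nodes. Its main job is to detect a failure of the master and promote the slave holding the most recent data to be the new master. During the switch, MHA pulls additional information from the other slaves to avoid consistency problems, and the switchover is designed to be transparent to the application. MHA also supports online master switching, that is, swapping the master and slave roles on demand.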
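MHA Manager and MHA Node
MHA has two roles: MHA Manager and MHA Node.
MHA Manager: usually deployed on a dedicated machine to manage master/slave clusters; each master/slave cluster can be thought of as one application.
MHA Node: runs on every MySQL server (master and slaves). It speeds up failover by providing scripts that parse and purge logs.
The overall architecture is shown in the figure below.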
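During automatic failover, MHA tries to save the binary logs from the crashed MySQL master in order to preserve data integrity. But what if the master's host is down entirely, or cannot be reached over the network? In that case MHA cannot fetch the master's binary logs and cannot guarantee that no data is lost. This is why MHA should be combined with MySQL semi-synchronous replication: as long as one slave has received the latest log events from the master, MHA can apply them to the other slaves, which preserves as much data as possible.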
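MHA's automatic failover can be summarized in the following steps:
- Save the binary log events (binlog events) from the crashed master;
- Identify the slave that has the most recent updates;
- Apply the differential relay logs to the other slaves;
- Apply the binary log events saved from the master;
- Promote one slave to be the new master;
- Point the other slaves at the new master and resume replication.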
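The "identify the slave with the most recent updates" step comes down to comparing how far each slave has read the master's binlog. The following is a minimal shell sketch of that comparison, not MHA's actual implementation. It assumes each slave's progress is given as a line of `host file position` (in a real setup these values would come from `Master_Log_File` and `Read_Master_Log_Pos` in `SHOW SLAVE STATUS`); the function name `latest_slave` and the sample positions are hypothetical.

```shell
# Pick the slave that has received the most binlog data from the master.
# Input on stdin, one line per slave: "<host> <binlog file> <position>".
latest_slave() {
    # Sort by binlog file name, then numerically by position.
    # The last line after sorting is the most advanced slave.
    LC_ALL=C sort -k2,2 -k3,3n | tail -n 1 | awk '{print $1}'
}

# Hypothetical sample: sqls2 has read further into the same binlog file.
printf '%s\n' \
    "sqls1 mysql-bin.000003 473" \
    "sqls2 mysql-bin.000003 910" | latest_slave
```

Sorting on the file name works here because MySQL zero-pads the binlog sequence number, so lexical order matches chronological order.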
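MHA tool components
MHA ships with a number of programs that make it easy to manage an MHA cluster.
Manager node:
- masterha_check_ssh: checks the SSH environment that MHA depends on;
- masterha_check_repl: checks the MySQL replication environment;
- masterha_manager: the main MHA service program;
- masterha_check_status: reports the running state of MHA;
- masterha_master_monitor: checks the availability of the MySQL master;
- masterha_master_switch: switches the master node;
- masterha_conf_host: adds or removes a configured node;
- masterha_stop: stops the MHA service;
Node:
- save_binary_logs: saves and copies the master's binary logs;
- apply_diff_relay_logs: identifies differential relay log events and applies them to the other slaves;
- purge_relay_logs: purges relay logs (without blocking the SQL thread);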
The environment is as follows
Role          IP address       Hostname
MHA manager   192.168.94.11    manager
master1       192.168.94.22    sqlm1
slave1        192.168.94.33    sqls1
slave2        192.168.94.44    sqls2
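The later steps refer to the servers by hostname (for example `ssh sqlm1`), which only works if those names resolve on every machine. One way to arrange that, assuming no DNS is available, is to add the mapping to `/etc/hosts`. The sketch below prints the entries for the four machines listed here; the function name `hosts_entries` is hypothetical, and the append step is left commented out so the output can be reviewed first.

```shell
# Print /etc/hosts entries for the four machines in this setup.
hosts_entries() {
    cat <<'EOF'
192.168.94.11 manager
192.168.94.22 sqlm1
192.168.94.33 sqls1
192.168.94.44 sqls2
EOF
}

hosts_entries                     # review the output first
# hosts_entries >> /etc/hosts     # then append it on every server (as root)
```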
Preparation: have all installation packages and their dependencies ready.
Disable the firewall and SELinux.
Download mha-manager and mha-node
http://downloads.mariadb.com/MHA/
Either the source tarballs or the rpm packages will do.
Install the dependencies and MariaDB
# Same steps on all servers
[root@manager ~]# yum -y install ntp perl cpan perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes mariadb mariadb-server

# On the manager: generate a key pair and copy the public key to the nodes
[root@manager ~]# ssh-keygen -t rsa
[root@manager ~]# ssh-copy-id -i .ssh/id_rsa.pub 192.168.94.22
[root@manager ~]# ssh-copy-id -i .ssh/id_rsa.pub 192.168.94.33
[root@manager ~]# ssh-copy-id -i .ssh/id_rsa.pub 192.168.94.44

# Verify manually; the first connection asks you to type yes
[root@manager ~]# ssh sqlm1
[root@manager ~]# ssh sqls1
[root@manager ~]# ssh sqls2

# On sqlm1: generate a key pair and copy the public key to sqls1 and sqls2
# On sqls1: generate a key pair and copy the public key to sqlm1 and sqls2
# On sqls2: generate a key pair and copy the public key to sqlm1 and sqls1
# No manual verification is needed for these
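The key exchange follows one rule: every machine copies its public key to each database node other than itself. The sketch below only prints the `ssh-copy-id` commands for a given machine so they can be checked before running them. The function name `copy_id_commands` and the `NODES` variable are hypothetical, and the `root` user is taken from the `root@` prompts in this walkthrough.

```shell
# The three database nodes that must accept SSH logins.
NODES="192.168.94.22 192.168.94.33 192.168.94.44"

# Print the ssh-copy-id commands to run on the machine with IP $1.
copy_id_commands() {
    local me="$1" peer
    for peer in $NODES; do
        if [ "$peer" != "$me" ]; then        # never copy the key to ourselves
            echo "ssh-copy-id -i ~/.ssh/id_rsa.pub root@$peer"
        fi
    done
}

copy_id_commands 192.168.94.11    # on the manager: all three nodes
copy_id_commands 192.168.94.22    # on sqlm1: only sqls1 and sqls2
```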
Edit the MariaDB configuration file
# Same step on all servers: synchronize the time
[root@manager ~]# ntpdate cn.pool.ntp.org

[root@sqlm1 ~]# vim /etc/my.cnf
server-id=22
log-bin=mysql-bin
log-slave-updates=true
relay_log_purge=0

[root@sqls1 ~]# vim /etc/my.cnf
server-id=33
log-bin=mysql-bin
relay-log=slave-relay-bin
log-slave-updates=true
relay_log_purge=0

[root@sqls2 ~]# vim /etc/my.cnf
server-id=44
log-bin=mysql-bin
relay-log=slave-relay-bin
log-slave-updates=true
relay_log_purge=0
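The three configurations differ only in `server-id` (the last octet of each server's IP) and in whether `relay-log` is set. As an illustration, the fragment for any node can be generated from those two inputs. This is a sketch under that assumption; the function name `mycnf_fragment` and its `master`/`slave` argument are hypothetical, and the output still has to be placed under the `[mysqld]` section of `/etc/my.cnf` by hand.

```shell
# Print the replication settings for one server.
#   $1: the server's IP address (server-id is taken from the last octet)
#   $2: "master" or "slave"
mycnf_fragment() {
    local ip="$1" role="$2"
    echo "server-id=${ip##*.}"               # strip everything up to the last dot
    echo "log-bin=mysql-bin"
    if [ "$role" = "slave" ]; then
        echo "relay-log=slave-relay-bin"     # only the slaves set a relay log name here
    fi
    echo "log-slave-updates=true"
    echo "relay_log_purge=0"
}

mycnf_fragment 192.168.94.33 slave
```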
Create the replication user
[root@sqlm1 ~]# systemctl start mariadb
[root@sqlm1 ~]# mysql
MariaDB [(none)]> grant replication slave on *.* to 'repl'@'192.168.94.%' identified by '123123';
MariaDB [(none)]> flush privileges;
MariaDB [(none)]> show master status\G
File: mysql-bin.000003
Position: 473

[root@sqls1 ~]# systemctl start mariadb
[root@sqls1 ~]# mysql
MariaDB [(none)]> grant replication slave on *.* to 'repl'@'192.168.94.%' identified by '123123';
MariaDB [(none)]> flush privileges;
MariaDB [(none)]> change master to master_host='192.168.94.22',master_user='repl',master_password='123123',master_log_file='mysql-bin.000003',master_log_pos=473;
MariaDB [(none)]> start slave;
MariaDB [(none)]> show slave status\G

[root@sqls2 ~]# systemctl start mariadb
[root@sqls2 ~]# mysql
MariaDB [(none)]> grant replication slave on *.* to 'repl'@'192.168.94.%' identified by '123123';
MariaDB [(none)]> flush privileges;
MariaDB [(none)]> change master to master_host='192.168.94.22',master_user='repl',master_password='123123',master_log_file='mysql-bin.000003',master_log_pos=473;
MariaDB [(none)]> start slave;
MariaDB [(none)]> show slave status\G
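The `master_log_file` and `master_log_pos` values in `change master to` must match the `File` and `Position` reported by the master, and copying them by hand is easy to get wrong. The sketch below builds the statement from the status text instead. The function name `change_master_sql` is hypothetical, and the sample input simply repeats the two values shown above; in practice the input would come from running `show master status\G` on the master.

```shell
# Build a CHANGE MASTER TO statement from `SHOW MASTER STATUS\G` text on stdin.
#   $1: master host   $2: replication user   $3: replication password
change_master_sql() {
    local host="$1" user="$2" pass="$3" status file pos
    status="$(cat)"                          # read the whole status output once
    file="$(printf '%s\n' "$status" | awk '$1 == "File:" {print $2}')"
    pos="$(printf '%s\n' "$status" | awk '$1 == "Position:" {print $2}')"
    printf "CHANGE MASTER TO master_host='%s', master_user='%s', master_password='%s', master_log_file='%s', master_log_pos=%s;\n" \
        "$host" "$user" "$pass" "$file" "$pos"
}

# Sample input using the File and Position values from this walkthrough.
printf '%s\n' "            File: mysql-bin.000003" "        Position: 473" |
    change_master_sql 192.168.94.22 repl 123123
```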
Download and install mha-manager and mha-node
# The manager needs both packages; the other servers only need node
wget https://downloads.mariadb.com/MHA/mha4mysql-node-0.54-0.el6.noarch.rpm
wget https://downloads.mariadb.com/MHA/mha4mysql-manager-0.55-0.el6.noarch.rpm
rpm -ivh mha4mysql-node-0.54-0.el6.noarch.rpm
rpm -ivh mha4mysql-manager-0.55-0.el6.noarch.rpm
Configure MHA on the manager
[root@manager ~]# mkdir -p /masterha/app1
[root@manager ~]# mkdir /etc/masterha
[root@manager ~]# vim /etc/masterha/default.cnf
[server default]
user=mha
password=123123
manager_workdir=/masterha/app1
manager_log=/masterha/app1/manager.log
remote_workdir=/masterha/app1
ssh_user=root
repl_user=repl
repl_password=123123
ping_interval=1
master_ip_failover_script="/usr/bin/masterha_ip_failover"

[server1]
hostname=sqlm1
master_binlog_dir=/var/lib/mysql
candidate_master=1

[server2]
hostname=sqls1
master_binlog_dir=/var/lib/mysql
candidate_master=1

[server3]
hostname=sqls2
master_binlog_dir=/var/lib/mysql
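Before running the MHA check tools it can help to confirm which hosts the configuration actually declares. The sketch below lists each `[serverN]` section together with its `hostname`. It is a plain text scan written for this walkthrough, not an MHA tool; the function name `list_mha_hosts` is hypothetical, and the heredoc is a shortened copy of the configuration above.

```shell
# List "<section> <hostname>" pairs from an MHA-style config read on stdin.
list_mha_hosts() {
    awk -F= '
        /^\[.*\]$/       { section = substr($0, 2, length($0) - 2) }  # remember the current [section]
        $1 == "hostname" { print section, $2 }
    '
}

list_mha_hosts <<'EOF'
[server default]
user=mha
[server1]
hostname=sqlm1
candidate_master=1
[server2]
hostname=sqls1
EOF
```

Against the real file this would be run as `list_mha_hosts < /etc/masterha/default.cnf`.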
[root@manager ~]# vim /usr/bin/masterha_ip_failover
[root@manager ~]# chmod +x /usr/bin/masterha_ip_failover
Contents of masterha_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);

my $vip           = '192.168.94.111/24';    # virtual IP that follows the master
my $key           = '1';                    # alias index on the interface
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip  = "/sbin/ifconfig ens33:$key down";

# GetOptions needs references to the variables it fills in.
GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {
        # Take the VIP down on the old master.
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {
        # Release the VIP on the old master, then bring it up on the new one.
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &stop_vip();
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# The @ must be escaped so Perl does not treat it as an array sigil.
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

sub stop_vip() {
    return 0 unless ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
Grant privileges to the mha user
# Same step on all nodes
MariaDB [(none)]> grant all privileges on *.* to 'mha'@'sqlm1' identified by '123123';
MariaDB [(none)]> grant all privileges on *.* to 'mha'@'sqls1' identified by '123123';
MariaDB [(none)]> grant all privileges on *.* to 'mha'@'sqls2' identified by '123123';
MariaDB [(none)]> flush privileges;
Use masterha_check_ssh to verify that the SSH trust logins work
[root@manager ~]# masterha_check_ssh --conf=/etc/masterha/default.cnf
Use masterha_check_repl to verify that MySQL replication works
[root@manager ~]# masterha_check_repl --conf=/etc/masterha/default.cnf
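Test
[root@manager ~]# masterha_manager --conf=/etc/masterha/default.cnf
# In a second terminal on the manager, follow the log
[root@manager ~]# tail -f /masterha/app1/manager.log
# Stop MariaDB on the master to trigger a failover
[root@sqlm1 ~]# systemctl stop mariadb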
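Once the failover has run, a quick way to confirm the result is to look at the remaining slave: its `Master_Host` should now be the promoted server and both replication threads should be running. The sketch below condenses `show slave status\G` text into one line. The function name `slave_summary` is hypothetical, and the sample input is an assumed illustration of what sqls2 might report if sqls1 were promoted, not captured output.

```shell
# Summarise replication health from `SHOW SLAVE STATUS\G` text on stdin.
slave_summary() {
    awk '
        $1 == "Master_Host:"       { host = $2 }
        $1 == "Slave_IO_Running:"  { io = $2 }
        $1 == "Slave_SQL_Running:" { sql = $2 }
        END { print "master=" host, "io=" io, "sql=" sql }
    '
}

# Assumed sample: sqls2 replicating from sqls1 (192.168.94.33) after promotion.
slave_summary <<'EOF'
              Master_Host: 192.168.94.33
         Slave_IO_Running: Yes
        Slave_SQL_Running: Yes
EOF
```

On a live slave the input could be supplied with `mysql -e 'show slave status\G' | slave_summary`.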