• Redis Sentinel High-Availability Architecture


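    There are now quite a few high-availability (HA) architectures for Redis, a sign of how quickly Redis has grown. Many companies run Redis in production, so keeping it highly available matters a great deal. Current options include keepalived + Redis, Redis Cluster, twemproxy, and codis. This article focuses on the Redis Sentinel architecture.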

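    Redis Sentinel provides the following main functions:

    • Monitoring: it continuously checks whether the Redis instances are running as expected;

    • Notification: if a Redis node runs into a problem, it can notify another process (for example, its clients);

    • Automatic failover: when a master becomes unavailable, Sentinel promotes one of the master's slaves (if there is more than one, it picks one of them) to be the new master, and the remaining slaves are reconfigured to replicate from the newly promoted master.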

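    Sentinel is a monitor: it decides which action to take based on the role and state of the instances it watches. How does one Sentinel discover the others? Through its command connection, each Sentinel publishes a hello message to the monitored master and slaves (on the __sentinel__:hello channel). The message contains the Sentinel's IP, port, run ID and other details, and announces the Sentinel's existence to the others. At the same time, each Sentinel receives the hello messages of other Sentinels through its subscription connection, and in this way discovers the other Sentinels that monitor the same master.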

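    Sentinels also create command connections to each other for communication. Because the masters and slaves already act as intermediaries for sending and receiving hello messages, Sentinels do not create subscription connections to each other.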
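
    As a rough illustration of what such a hello message carries, the sketch below splits one payload into named fields. It assumes the eight-field, comma-separated layout used by Redis 3.0 (Sentinel IP, port, run ID, current epoch, then master name, IP, port and config epoch); the sample payload is hypothetical and only reuses the addresses from this article.

```python
from collections import namedtuple

# Assumed field order of a payload on the __sentinel__:hello channel.
Hello = namedtuple(
    "Hello",
    "sentinel_ip sentinel_port sentinel_runid current_epoch "
    "master_name master_ip master_port master_config_epoch",
)


def parse_hello(payload):
    """Split a comma-separated hello payload into typed fields."""
    parts = payload.split(",")
    if len(parts) != 8:
        raise ValueError("expected 8 fields, got %d" % len(parts))
    ip, port, runid, epoch, name, m_ip, m_port, m_epoch = parts
    return Hello(ip, int(port), runid, int(epoch),
                 name, m_ip, int(m_port), int(m_epoch))


# Hypothetical payload built from the hosts used in this article.
sample = ("192.168.10.129,26379,f391228f430177d881464e908c683bfc73d61c24,0,"
          "master-6379,192.168.10.131,6379,0")
hello = parse_hello(sample)
print(hello.sentinel_ip, hello.sentinel_port, hello.master_name)
```

    A Sentinel that receives such a message from an unknown run ID learns the address of a new peer watching the same master.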

    [Figure: Redis Sentinel architecture diagram]

    It is best to run an odd number of Sentinel nodes. For the reasons, see the following articles:

    http://segmentfault.com/a/1190000002680804

    http://segmentfault.com/a/1190000002685515
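
    One common argument for an odd count is simple arithmetic: a failover has to be authorized by a majority of the Sentinels, so going from 3 to 4 nodes adds a machine without adding any tolerance for failures. The small sketch below shows this; the function names are made up for illustration, and the linked articles may give further reasons.

```python
def majority(n):
    """Smallest number of Sentinels that forms a majority of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """How many Sentinels can fail while a majority is still reachable."""
    return n - majority(n)


for n in (3, 4, 5):
    print("sentinels=%d majority=%d tolerated_failures=%d"
          % (n, majority(n), tolerated_failures(n)))
```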

    The rest of this article deploys and tests Redis Sentinel. The experiment uses redis-3.0.7 in the following environment:

     192.168.10.128  Sentinel_1
     192.168.10.129  Sentinel_2
     192.168.10.130  Sentinel_3
     192.168.10.131  Redis_Master
     192.168.10.132  Redis_Slave

    I. Install redis-3.0.7 on each of the five servers, using Sentinel_1 as the example:

    [root@Sentinel_1 ~]# wget http://download.redis.io/releases/redis-3.0.7.tar.gz
    [root@Sentinel_1 ~]# tar xf redis-3.0.7.tar.gz 
    [root@Sentinel_1 ~]# cd redis-3.0.7/src/
    [root@Sentinel_1 src]# make PREFIX=/data/service/redis install

    After installation, a bin directory is created under /data/service/redis:

    [root@Sentinel_1 ~]# ll /data/service/redis/
    total 12
    drwxr-xr-x. 2 root root 4096 Mar  7 19:19 bin
    [root@Sentinel_1 ~]# 

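    On each of the five servers, you can optionally add the Redis bin directory to PATH so the commands are easier to run. Create /etc/profile.d/redis.sh (for example with vim) and add the following line: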

    export PATH=/data/service/redis/bin:$PATH

    Run source /etc/profile.d/redis.sh to apply the environment variable:

    [root@Sentinel_1 ~]# source /etc/profile.d/redis.sh

    II. Configure Redis master-slave replication. The setup is straightforward, so it is not shown here. Redis_Master: 192.168.10.131, Redis_Slave: 192.168.10.132

    Redis_Master startup log:

    [root@Redis_Master redis]# tail -f logs/redis_6379.log 
    1974:M 07 Mar 22:03:05.381 * DB loaded from disk: 0.001 seconds
    1974:M 07 Mar 22:03:05.381 * The server is now ready to accept connections on port 6379
    1974:M 07 Mar 22:03:44.592 * Slave 192.168.10.132:6379 asks for synchronization
    1974:M 07 Mar 22:03:44.593 * Full resync requested by slave 192.168.10.132:6379
    1974:M 07 Mar 22:03:44.593 * Starting BGSAVE for SYNC with target: disk
    1974:M 07 Mar 22:03:44.594 * Background saving started by pid 1977
    1977:C 07 Mar 22:03:44.632 * DB saved on disk
    1977:C 07 Mar 22:03:44.632 * RDB: 4 MB of memory used by copy-on-write
    1974:M 07 Mar 22:03:44.649 * Background saving terminated with success
    1974:M 07 Mar 22:03:44.650 * Synchronization with slave 192.168.10.132:6379 succeeded

    Redis_Slave startup log:

    [root@Redis_Slave redis]# tail -f logs/redis_6379.log 
    2437:S 07 Mar 22:03:44.246 * Connecting to MASTER 192.168.10.131:6379
    2437:S 07 Mar 22:03:44.247 * MASTER <-> SLAVE sync started
    2437:S 07 Mar 22:03:44.262 * Non blocking connect for SYNC fired the event.
    2437:S 07 Mar 22:03:44.268 * Master replied to PING, replication can continue...
    2437:S 07 Mar 22:03:44.269 * Partial resynchronization not possible (no cached master)
    2437:S 07 Mar 22:03:44.270 * Full resync from master: 5d1fbf46ddd1eb0a7728abbbad61e78908dd7963:1
    2437:S 07 Mar 22:03:44.326 * MASTER <-> SLAVE sync: receiving 34 bytes from master
    2437:S 07 Mar 22:03:44.326 * MASTER <-> SLAVE sync: Flushing old data
    2437:S 07 Mar 22:03:44.328 * MASTER <-> SLAVE sync: Loading DB in memory
    2437:S 07 Mar 22:03:44.329 * MASTER <-> SLAVE sync: Finished with success

    The logs show that master-slave replication is working correctly.

    III. Configure Sentinel, with an explanation of the configuration file.

    On the three Sentinel servers, create conf and log directories to hold the configuration file and the log:

    [root@Sentinel_1 ~]# mkdir -p /data/service/redis/sentinel/conf

    [root@Sentinel_1 ~]# mkdir -p /data/service/redis/sentinel/log

    Go to the conf directory and create the file 26379.conf. The configuration is identical on all three Sentinel servers:

    [root@Sentinel_1 conf]# pwd
    /data/service/redis/sentinel/conf
    [root@Sentinel_1 conf]# cat 26379.conf 
    port 26379
    dir "/data/service/redis/sentinel"
    daemonize yes
    logfile "/data/service/redis/sentinel/log/sentinel.log"
    
    # 6379
    sentinel monitor master-6379 192.168.10.131 6379 2
    sentinel down-after-milliseconds master-6379 15000
    sentinel parallel-syncs master-6379 1
    sentinel failover-timeout master-6379 180000
    sentinel auth-pass master-6379 123456
    sentinel client-reconfig-script master-6379 /data/script/python/notify.py
    [root@Sentinel_1 conf]# 

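    Explanation of 26379.conf:
    1. The first four lines define basic Sentinel settings (port, working directory, daemon mode, log file). They work much like the corresponding Redis server settings and need no further explanation.

    2. sentinel monitor master-6379 192.168.10.131 6379 2 (Sentinel monitors a master named master-6379 at 192.168.10.131:6379. The trailing 2 is the quorum: the master is considered truly unavailable only when at least 2 Sentinels agree that it is down.)

    3. down-after-milliseconds (Sentinel sends PING heartbeats to the master to check whether it is alive. If the master does not reply with PONG, or replies with an error, for longer than this period, this Sentinel subjectively (that is, on its own) considers the master unavailable. The value is in milliseconds; 15000 here means 15 seconds.)

    4. parallel-syncs (During a failover, this option sets the maximum number of slaves that may resynchronize with the new master at the same time. The smaller the number, the longer the failover takes to complete; the larger the number, the more slaves are unavailable at once because of replication. Setting it to 1 guarantees that only one slave at a time is unable to serve commands.)

    5. failover-timeout (Sentinels follow this rule: if a Sentinel has voted for another Sentinel to fail over a master, it waits for a period before attempting to fail over the same master itself. That waiting period is derived from failover-timeout (it is twice the configured value). As a consequence, Sentinels do not fail over the same master concurrently: if the first Sentinel's attempt fails, another one retries after the waiting period, and so on. The value is in milliseconds; 180000 here means 3 minutes.)

    6. auth-pass (This option is for master/slave setups that require password authentication. If no password was set when replication was configured, the option is not needed; if there is a password, specify it here.)

    7. client-reconfig-script (This option names a script that Sentinel runs after a master failover, for example to send an SMS or to switch an IP address.)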
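
    The interplay between down-after-milliseconds and the quorum can be sketched in a few lines. This is only a simplified model of the two decisions described above, not Sentinel's actual implementation; the function names are hypothetical, and the constants are the values from 26379.conf.

```python
DOWN_AFTER_MS = 15000  # sentinel down-after-milliseconds master-6379 15000
QUORUM = 2             # last argument of "sentinel monitor"


def is_subjectively_down(last_valid_reply_ms, now_ms, down_after_ms=DOWN_AFTER_MS):
    """One Sentinel's own view: no valid reply for longer than the limit."""
    return now_ms - last_valid_reply_ms > down_after_ms


def is_objectively_down(sdown_reports, quorum=QUORUM):
    """Shared view: enough Sentinels report the master as down."""
    return sdown_reports >= quorum


# Last valid PONG at t=0 ms, checked 16 seconds later.
print(is_subjectively_down(0, 16000))
# One Sentinel alone is not enough with quorum 2; two are.
print(is_objectively_down(1), is_objectively_down(2))
```

    In the Sentinel logs later in this article, these two states appear as the +sdown and +odown events.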

    The notify.py script, which sends an email after a failover, is adapted from this blog post: http://www.cnblogs.com/gomysql/p/5040847.html

    #!/usr/bin/python
    # coding: utf8
    # Sentinel client-reconfig-script: logs the failover and emails an alert.
    
    import sys
    import time
    import smtplib
    import logging
    from email.mime.text import MIMEText
    from email.header import Header
    
    
    # Recipients of the failover alert.
    alarm_mail = ['1111111111@qq.com']
    
    def main():
      
        failover_time = time.strftime("%Y-%m-%d %H:%M:%S")
    
        # Append every event to a log file.
        logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(filename)s[line:%(lineno)d] %(levelname)s %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S',
                    filename='/data/service/redis/failover.log',
                    filemode='a')
    
        # Also echo INFO and above to the console.
        console = logging.StreamHandler()
        console.setLevel(logging.INFO)
        formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s')
        console.setFormatter(formatter)
        logging.getLogger('').addHandler(console)
    
        # SMTP settings; fill in the account details before use.
        mail_host = 'smtp.163.com'
        mail_port = 25
        mail_user = ''
        mail_pass = ''
        mail_send_from = ''
    
        def send_mail(to_list, sub, content):
            me = mail_send_from
            msg = MIMEText(content, _subtype='html', _charset='utf-8')
            msg['Subject'] = Header(sub, 'utf-8')
            msg['From'] = me
            msg['To'] = ", ".join(to_list)
            try:
                smtp = smtplib.SMTP()
                smtp.connect(mail_host, mail_port)
                smtp.login(mail_user, mail_pass)
                smtp.sendmail(me, to_list, msg.as_string())
                smtp.quit()
                return True
            except Exception as error:
                logging.error("Failed to send mail: %s" % (error))
                return False
    
        # Sentinel calls the script with these arguments:
        # <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
        try:
            master_name = sys.argv[1]
            role = sys.argv[2]
            from_ip = sys.argv[4]
            from_port = sys.argv[5]
            to_ip = sys.argv[6]
            to_port = sys.argv[7]
        except IndexError as error:
            logging.error('Failed to read arguments from Sentinel: %s ' % (error))
            sys.exit(1)
    
        sub = 'redis %s failover' % (master_name)
        notify_message = "%s %s is failover end. sentinel find redis master %s:%s is down. failover to slave %s:%s" % (failover_time,master_name,from_ip,from_port,to_ip,to_port)
        
        # Only the Sentinel that led the failover sends the alert.
        if role == 'leader':
            logging.info(notify_message)
            send_mail(alarm_mail, sub, notify_message)
    
    if __name__ == "__main__":
        main()
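
    Before relying on the script in a real failover, it can help to check the argument handling on its own. The sketch below pulls that step into a standalone function and feeds it a made-up argument list in the order Sentinel uses (master name, role, state, old address, new address). The helper name parse_reconfig_args and the sample values, including the state string, are assumptions for illustration and are not part of the original script.

```python
from collections import namedtuple

Reconfig = namedtuple(
    "Reconfig", "master_name role state from_ip from_port to_ip to_port")


def parse_reconfig_args(argv):
    """Turn the argv passed by Sentinel into a named record."""
    if len(argv) != 8:
        raise ValueError("expected 7 arguments, got %d" % (len(argv) - 1))
    name, role, state, from_ip, from_port, to_ip, to_port = argv[1:8]
    return Reconfig(name, role, state, from_ip, int(from_port),
                    to_ip, int(to_port))


# Made-up invocation for a dry run; no Sentinel is involved here.
fake_argv = ["notify.py", "master-6379", "leader", "failover",
             "192.168.10.131", "6379", "192.168.10.132", "6379"]
args = parse_reconfig_args(fake_argv)
print("%s: %s:%d -> %s:%d" % (args.master_name, args.from_ip,
                              args.from_port, args.to_ip, args.to_port))
```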

    IV. Start the Sentinel service. There are two ways to start it:

    Method 1:

    redis-sentinel /path/to/sentinel.conf

    Method 2:

    redis-server /path/to/sentinel.conf --sentinel

    I prefer the first method. Start Sentinel on each of the three Sentinel servers:

    Startup log on Sentinel_1:

    [root@Sentinel_1 sentinel]# redis-sentinel /data/service/redis/sentinel/conf/26379.conf 
    [root@Sentinel_1 sentinel]# tail -f log/sentinel.log 
     |    `-._`-._        _.-'_.-'    |                                  
      `-._    `-._`-.__.-'_.-'    _.-'                                   
          `-._    `-.__.-'    _.-'                                       
              `-._        _.-'                                           
                  `-.__.-'                                               
    
    5153:X 07 Mar 22:37:16.290 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
    5153:X 07 Mar 22:37:16.290 # Sentinel runid is 21e629e6d2b26682e660258787d5fb995010e6c8
    5153:X 07 Mar 22:37:16.290 # +monitor master master-6379 192.168.10.131 6379 quorum 2
    5153:X 07 Mar 22:37:17.330 * +slave slave 192.168.10.132:6379 192.168.10.132 6379 @ master-6379 192.168.10.131 6379
    5153:X 07 Mar 22:38:29.406 * +sentinel sentinel 192.168.10.129:26379 192.168.10.129 26379 @ master-6379 192.168.10.131 6379
    5153:X 07 Mar 22:38:45.024 * +sentinel sentinel 192.168.10.130:26379 192.168.10.130 26379 @ master-6379 192.168.10.131 6379

    Startup log on Sentinel_2:

    [root@Sentinel_2 sentinel]# redis-sentinel /data/service/redis/sentinel/conf/26379.conf
    [root@Sentinel_2 sentinel]# tail -f log/sentinel.log 
      `-._    `-._`-.__.-'_.-'    _.-'                                   
          `-._    `-.__.-'    _.-'                                       
              `-._        _.-'                                           
                  `-.__.-'                                               
    
    4647:X 07 Mar 22:38:27.570 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
    4647:X 07 Mar 22:38:27.570 # Sentinel runid is f391228f430177d881464e908c683bfc73d61c24
    4647:X 07 Mar 22:38:27.571 # +monitor master master-6379 192.168.10.131 6379 quorum 2
    4647:X 07 Mar 22:38:28.582 * +slave slave 192.168.10.132:6379 192.168.10.132 6379 @ master-6379 192.168.10.131 6379
    4647:X 07 Mar 22:38:29.218 * +sentinel sentinel 192.168.10.128:26379 192.168.10.128 26379 @ master-6379 192.168.10.131 6379
    4647:X 07 Mar 22:38:45.200 * +sentinel sentinel 192.168.10.130:26379 192.168.10.130 26379 @ master-6379 192.168.10.131 6379

    Startup log on Sentinel_3:

    [root@Sentinel_3 sentinel]# redis-sentinel /data/service/redis/sentinel/conf/26379.conf
    [root@Sentinel_3 sentinel]# tail -f log/sentinel.log 
          `-._    `-.__.-'    _.-'                                       
              `-._        _.-'                                           
                  `-.__.-'                                               
    
    2115:X 07 Mar 22:38:43.161 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
    2115:X 07 Mar 22:38:43.161 # Sentinel runid is 7fbee9138d4e5c1e2def7bbc4f888cef04d95677
    2115:X 07 Mar 22:38:43.161 # +monitor master master-6379 192.168.10.131 6379 quorum 2
    2115:X 07 Mar 22:38:44.167 * +slave slave 192.168.10.132:6379 192.168.10.132 6379 @ master-6379 192.168.10.131 6379
    2115:X 07 Mar 22:38:44.818 * +sentinel sentinel 192.168.10.129:26379 192.168.10.129 26379 @ master-6379 192.168.10.131 6379
    2115:X 07 Mar 22:38:44.851 * +sentinel sentinel 192.168.10.128:26379 192.168.10.128 26379 @ master-6379 192.168.10.131 6379

    The whole Sentinel cluster is now up and running. Log in to any Sentinel to check the current monitoring status:

    [root@Sentinel_1 sentinel]# redis-cli -p 26379
    127.0.0.1:26379> INFO sentinel
    # Sentinel
    sentinel_masters:1
    sentinel_tilt:0
    sentinel_running_scripts:0
    sentinel_scripts_queue_length:0
    master0:name=master-6379,status=ok,address=192.168.10.131:6379,slaves=1,sentinels=3
    127.0.0.1:26379> 

    The output shows status=ok, and slaves=1 means there is one slave node.
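
    For scripted health checks, the master0 line of this output is easy to take apart. The following is a minimal sketch that assumes the key=value layout shown above; parse_master_line is a hypothetical helper, not part of any Redis tool.

```python
def parse_master_line(line):
    """Parse one 'masterN:...' line of INFO sentinel into (key, fields)."""
    # Split on the first colon only; the address itself contains a colon.
    key, _, rest = line.partition(":")
    fields = dict(item.split("=", 1) for item in rest.split(","))
    fields["slaves"] = int(fields["slaves"])
    fields["sentinels"] = int(fields["sentinels"])
    return key, fields


line = ("master0:name=master-6379,status=ok,"
        "address=192.168.10.131:6379,slaves=1,sentinels=3")
key, info = parse_master_line(line)
print(key, info["name"], info["status"], info["address"])
```

    A monitoring job could, for example, raise an alert when status is not ok or when sentinels drops below 3.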

    V. Redis failure tests

    Test 1: Stop Redis_Master and check whether Sentinel promotes the surviving slave to master

    [root@Redis_Master redis]# sh redis stop
    Stopping ...
    Waiting for Redis to shutdown ...
    Redis stopped
    [root@Redis_Master redis]# 

    1. Check the log on any Sentinel with tail -f log/sentinel.log:

    5153:X 07 Mar 22:48:20.986 # +sdown master master-6379 192.168.10.131 6379
    5153:X 07 Mar 22:48:21.047 # +odown master master-6379 192.168.10.131 6379 #quorum 2/2
    5153:X 07 Mar 22:48:21.049 # +new-epoch 1
    5153:X 07 Mar 22:48:21.050 # +try-failover master master-6379 192.168.10.131 6379
    5153:X 07 Mar 22:48:21.053 # +vote-for-leader 21e629e6d2b26682e660258787d5fb995010e6c8 1
    5153:X 07 Mar 22:48:21.057 # 192.168.10.130:26379 voted for 7fbee9138d4e5c1e2def7bbc4f888cef04d95677 1
    5153:X 07 Mar 22:48:21.062 # 192.168.10.129:26379 voted for 7fbee9138d4e5c1e2def7bbc4f888cef04d95677 1
    5153:X 07 Mar 22:48:22.441 # +config-update-from sentinel 192.168.10.130:26379 192.168.10.130 26379 @ master-6379 192.168.10.131 6379
    5153:X 07 Mar 22:48:22.442 # +switch-master master-6379 192.168.10.131 6379 192.168.10.132 6379
    5153:X 07 Mar 22:48:22.443 * +slave slave 192.168.10.131:6379 192.168.10.131 6379 @ master-6379 192.168.10.132 6379
    5153:X 07 Mar 22:48:37.496 # +sdown slave 192.168.10.131:6379 192.168.10.131 6379 @ master-6379 192.168.10.132 6379
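
    The vote lines in this log can be read as a small election: each Sentinel votes for one run ID, and the winner leads the failover. The sketch below tallies the three votes of epoch 1 as they appear in the log. It is a simplified model that assumes the winner needs both a majority of all Sentinels and at least the configured quorum; elect_leader is a hypothetical name.

```python
from collections import Counter


def elect_leader(votes, total_sentinels, quorum):
    """Return the winning run ID, or None if nobody has enough votes."""
    if not votes:
        return None
    runid, count = Counter(votes.values()).most_common(1)[0]
    needed = max(total_sentinels // 2 + 1, quorum)
    return runid if count >= needed else None


SENTINEL_1 = "21e629e6d2b26682e660258787d5fb995010e6c8"
SENTINEL_3 = "7fbee9138d4e5c1e2def7bbc4f888cef04d95677"

# Votes of epoch 1, taken from the Sentinel_1 log above.
votes_epoch_1 = {
    "192.168.10.128:26379": SENTINEL_1,  # +vote-for-leader (own vote)
    "192.168.10.129:26379": SENTINEL_3,
    "192.168.10.130:26379": SENTINEL_3,
}
leader = elect_leader(votes_epoch_1, total_sentinels=3, quorum=2)
print(leader)
```

    The result is the run ID of Sentinel_3 (192.168.10.130), which matches the +config-update-from line that Sentinel_1 logs right after the vote.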

    2. Then check the Redis_Slave log:

    2437:S 07 Mar 22:48:18.023 * Connecting to MASTER 192.168.10.131:6379
    2437:S 07 Mar 22:48:18.026 * MASTER <-> SLAVE sync started
    2437:S 07 Mar 22:48:18.029 # Error condition on socket for SYNC: Connection refused
    2437:S 07 Mar 22:48:19.050 * Connecting to MASTER 192.168.10.131:6379
    2437:S 07 Mar 22:48:19.053 * MASTER <-> SLAVE sync started
    2437:S 07 Mar 22:48:19.055 # Error condition on socket for SYNC: Connection refused
    2437:S 07 Mar 22:48:20.074 * Connecting to MASTER 192.168.10.131:6379
    2437:S 07 Mar 22:48:20.077 * MASTER <-> SLAVE sync started
    2437:S 07 Mar 22:48:20.079 # Error condition on socket for SYNC: Connection refused
    2437:M 07 Mar 22:48:20.724 * Discarding previously cached master state.
    2437:M 07 Mar 22:48:20.725 * MASTER MODE enabled (user request from 'id=7 addr=192.168.10.130:60991 fd=11 name=sentinel-7fbee913-cmd age=577 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=rw cmd=exec')
    2437:M 07 Mar 22:48:20.745 # CONFIG REWRITE executed with success.
    2437:M 07 Mar 22:48:20.796 * 1 changes in 900 seconds. Saving...
    2437:M 07 Mar 22:48:20.870 * Background saving started by pid 2442
    2442:C 07 Mar 22:48:20.915 * DB saved on disk
    2442:C 07 Mar 22:48:20.915 * RDB: 4 MB of memory used by copy-on-write
    2437:M 07 Mar 22:48:20.974 * Background saving terminated with success

    3. Log in to Sentinel again to see which node is now the master:

    [root@Sentinel_1 sentinel]# redis-cli -p 26379       
    127.0.0.1:26379> INFO sentinel
    # Sentinel
    sentinel_masters:1
    sentinel_tilt:0
    sentinel_running_scripts:0
    sentinel_scripts_queue_length:0
    master0:name=master-6379,status=ok,address=192.168.10.132:6379,slaves=1,sentinels=3
    127.0.0.1:26379> 

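    The new master is now 192.168.10.132. An email notification is sent after the switchover.

    4. After the stopped Redis instance is started again, it is automatically added back with the slave role: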

    [root@Redis_Master redis]# sh redis start
    Starting Redis server...
    [root@Redis_Master redis]# tail -f logs/redis_6379.log 
      `-._    `-._`-.__.-'_.-'    _.-'                                   
          `-._    `-.__.-'    _.-'                                       
              `-._        _.-'                                           
                  `-.__.-'                                               
    
    2050:M 07 Mar 22:55:21.357 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
    2050:M 07 Mar 22:55:21.357 # Server started, Redis version 3.0.7
    2050:M 07 Mar 22:55:21.357 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
    2050:M 07 Mar 22:55:21.357 * DB loaded from disk: 0.000 seconds
    2050:M 07 Mar 22:55:21.357 * The server is now ready to accept connections on port 6379
    2050:S 07 Mar 22:55:31.393 * SLAVE OF 192.168.10.132:6379 enabled (user request from 'id=4 addr=192.168.10.129:50326 fd=8 name=sentinel-f391228f-cmd age=10 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=rw cmd=exec')
    2050:S 07 Mar 22:55:31.397 # CONFIG REWRITE executed with success.
    2050:S 07 Mar 22:55:31.596 * Connecting to MASTER 192.168.10.132:6379
    2050:S 07 Mar 22:55:31.597 * MASTER <-> SLAVE sync started
    2050:S 07 Mar 22:55:31.597 * Non blocking connect for SYNC fired the event.
    2050:S 07 Mar 22:55:31.598 * Master replied to PING, replication can continue...
    2050:S 07 Mar 22:55:31.600 * Partial resynchronization not possible (no cached master)
    2050:S 07 Mar 22:55:31.634 * Full resync from master: 234202729a196fd6523e41bcb7e29d9866c905c6:1
    2050:S 07 Mar 22:55:31.648 * MASTER <-> SLAVE sync: receiving 34 bytes from master
    2050:S 07 Mar 22:55:31.649 * MASTER <-> SLAVE sync: Flushing old data
    2050:S 07 Mar 22:55:31.649 * MASTER <-> SLAVE sync: Loading DB in memory
    2050:S 07 Mar 22:55:31.649 * MASTER <-> SLAVE sync: Finished with success

    5. The Sentinel log shows that the instance has been added back and converted to a slave:

    4647:X 07 Mar 22:55:31.787 * +convert-to-slave slave 192.168.10.131:6379 192.168.10.131 6379 @ master-6379 192.168.10.132 6379

    Test 2: Stop the new master (192.168.10.132, originally the slave) and check whether the new slave (192.168.10.131, originally the master) is promoted to master:

    1. Stop Redis:

    [root@Redis_Slave redis]# sh redis stop
    Stopping ...
    Waiting for Redis to shutdown ...
    Redis stopped

    2. Check the Sentinel log:

    5153:X 07 Mar 23:01:54.895 # +try-failover master master-6379 192.168.10.132 6379
    5153:X 07 Mar 23:01:54.898 # +vote-for-leader 21e629e6d2b26682e660258787d5fb995010e6c8 2
    5153:X 07 Mar 23:01:54.908 # 192.168.10.129:26379 voted for f391228f430177d881464e908c683bfc73d61c24 2
    5153:X 07 Mar 23:01:54.913 # 192.168.10.130:26379 voted for 21e629e6d2b26682e660258787d5fb995010e6c8 2
    5153:X 07 Mar 23:01:54.968 # +elected-leader master master-6379 192.168.10.132 6379
    5153:X 07 Mar 23:01:54.968 # +failover-state-select-slave master master-6379 192.168.10.132 6379
    5153:X 07 Mar 23:01:55.027 # +selected-slave slave 192.168.10.131:6379 192.168.10.131 6379 @ master-6379 192.168.10.132 6379
    5153:X 07 Mar 23:01:55.027 * +failover-state-send-slaveof-noone slave 192.168.10.131:6379 192.168.10.131 6379 @ master-6379 192.168.10.132 6379
    5153:X 07 Mar 23:01:55.085 * +failover-state-wait-promotion slave 192.168.10.131:6379 192.168.10.131 6379 @ master-6379 192.168.10.132 6379
    5153:X 07 Mar 23:01:55.912 # +promoted-slave slave 192.168.10.131:6379 192.168.10.131 6379 @ master-6379 192.168.10.132 6379
    5153:X 07 Mar 23:01:55.915 # +failover-state-reconf-slaves master master-6379 192.168.10.132 6379
    5153:X 07 Mar 23:01:56.009 # +failover-end master master-6379 192.168.10.132 6379
    5153:X 07 Mar 23:01:56.010 # +switch-master master-6379 192.168.10.132 6379 192.168.10.131 6379
    5153:X 07 Mar 23:01:56.010 * +slave slave 192.168.10.132:6379 192.168.10.132 6379 @ master-6379 192.168.10.131 6379
    5153:X 07 Mar 23:02:11.066 # +sdown slave 192.168.10.132:6379 192.168.10.132 6379 @ master-6379 192.168.10.131 6379
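
    The +switch-master line is the one that records the outcome of a failover: master name, old address, new address. A log watcher could extract it roughly as follows. This is a sketch that assumes the line layout seen in the log above; parse_switch_master is a hypothetical helper.

```python
def parse_switch_master(line):
    """Extract name, old and new address from a +switch-master log line."""
    marker = "+switch-master "
    idx = line.find(marker)
    if idx == -1:
        return None  # some other event
    name, old_ip, old_port, new_ip, new_port = line[idx + len(marker):].split()
    return {
        "master_name": name,
        "old": (old_ip, int(old_port)),
        "new": (new_ip, int(new_port)),
    }


# Line copied from the Sentinel log above.
log_line = ("5153:X 07 Mar 23:01:56.010 # +switch-master master-6379 "
            "192.168.10.132 6379 192.168.10.131 6379")
event = parse_switch_master(log_line)
print(event["master_name"], event["old"], "->", event["new"])
```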

    3. Check the log of the new master (the Redis_Master host); its role has switched from slave back to master:

    [root@Redis_Master redis]# tail -f logs/redis_6379.log 
    2050:S 07 Mar 23:01:52.367 # Error condition on socket for SYNC: Connection refused
    2050:S 07 Mar 23:01:53.382 * Connecting to MASTER 192.168.10.132:6379
    2050:S 07 Mar 23:01:53.384 * MASTER <-> SLAVE sync started
    2050:S 07 Mar 23:01:53.384 # Error condition on socket for SYNC: Connection refused
    2050:S 07 Mar 23:01:54.404 * Connecting to MASTER 192.168.10.132:6379
    2050:S 07 Mar 23:01:54.405 * MASTER <-> SLAVE sync started
    2050:S 07 Mar 23:01:54.406 # Error condition on socket for SYNC: Connection refused
    2050:M 07 Mar 23:01:54.868 * Discarding previously cached master state.
    2050:M 07 Mar 23:01:54.868 * MASTER MODE enabled (user request from 'id=8 addr=192.168.10.128:37585 fd=6 name=sentinel-21e629e6-cmd age=383 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=rw cmd=exec')
    2050:M 07 Mar 23:01:54.870 # CONFIG REWRITE executed with success.

    4. Check Sentinel's monitoring information again; the master is now 192.168.10.131:

    [root@Sentinel_1 sentinel]# redis-cli -p 26379       
    127.0.0.1:26379> INFO sentinel
    # Sentinel
    sentinel_masters:1
    sentinel_tilt:0
    sentinel_running_scripts:0
    sentinel_scripts_queue_length:0
    master0:name=master-6379,status=ok,address=192.168.10.131:6379,slaves=1,sentinels=3
    127.0.0.1:26379> 

    An alert email is sent after this failover as well.

    5. Start the stopped Redis instance again; Sentinel adds it back with the slave role:

    [root@Redis_Slave redis]# sh redis start
    Starting Redis server...
    [root@Redis_Slave redis]#

    Check the Sentinel log:

    2115:X 07 Mar 23:18:49.494 * +convert-to-slave slave 192.168.10.132:6379 192.168.10.132 6379 @ master-6379 192.168.10.131 6379

    Then check the Redis_Master log:

    [root@Redis_Master redis]# tail -f logs/redis_6379.log 
    2050:S 07 Mar 23:01:52.367 # Error condition on socket for SYNC: Connection refused
    2050:S 07 Mar 23:01:53.382 * Connecting to MASTER 192.168.10.132:6379
    2050:S 07 Mar 23:01:53.384 * MASTER <-> SLAVE sync started
    2050:S 07 Mar 23:01:53.384 # Error condition on socket for SYNC: Connection refused
    2050:S 07 Mar 23:01:54.404 * Connecting to MASTER 192.168.10.132:6379
    2050:S 07 Mar 23:01:54.405 * MASTER <-> SLAVE sync started
    2050:S 07 Mar 23:01:54.406 # Error condition on socket for SYNC: Connection refused
    2050:M 07 Mar 23:01:54.868 * Discarding previously cached master state.
    2050:M 07 Mar 23:01:54.868 * MASTER MODE enabled (user request from 'id=8 addr=192.168.10.128:37585 fd=6 name=sentinel-21e629e6-cmd age=383 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=rw cmd=exec')
    2050:M 07 Mar 23:01:54.870 # CONFIG REWRITE executed with success.
    2050:M 07 Mar 23:10:22.009 * 1 changes in 900 seconds. Saving...
    2050:M 07 Mar 23:10:22.126 * Background saving started by pid 2084
    2084:C 07 Mar 23:10:22.167 * DB saved on disk
    2084:C 07 Mar 23:10:22.167 * RDB: 4 MB of memory used by copy-on-write
    2050:M 07 Mar 23:10:22.229 * Background saving terminated with success
    2050:M 07 Mar 23:18:49.389 * Slave 192.168.10.132:6379 asks for synchronization
    2050:M 07 Mar 23:18:49.389 * Full resync requested by slave 192.168.10.132:6379
    2050:M 07 Mar 23:18:49.389 * Starting BGSAVE for SYNC with target: disk
    2050:M 07 Mar 23:18:49.417 * Background saving started by pid 2085
    2085:C 07 Mar 23:18:49.428 * DB saved on disk
    2085:C 07 Mar 23:18:49.429 * RDB: 4 MB of memory used by copy-on-write
    2050:M 07 Mar 23:18:49.479 * Background saving terminated with success
    2050:M 07 Mar 23:18:49.479 * Synchronization with slave 192.168.10.132:6379 succeeded

    Then check the Redis_Slave log:

    2514:S 07 Mar 23:18:48.859 # CONFIG REWRITE executed with success.
    2514:S 07 Mar 23:18:49.049 * Connecting to MASTER 192.168.10.131:6379
    2514:S 07 Mar 23:18:49.053 * MASTER <-> SLAVE sync started
    2514:S 07 Mar 23:18:49.055 * Non blocking connect for SYNC fired the event.
    2514:S 07 Mar 23:18:49.059 * Master replied to PING, replication can continue...
    2514:S 07 Mar 23:18:49.065 * Partial resynchronization not possible (no cached master)
    2514:S 07 Mar 23:18:49.099 * Full resync from master: 3e1fbd2ec6f57b3362687051ab1bb6edf1d2ee27:1
    2514:S 07 Mar 23:18:49.157 * MASTER <-> SLAVE sync: receiving 34 bytes from master
    2514:S 07 Mar 23:18:49.157 * MASTER <-> SLAVE sync: Flushing old data
    2514:S 07 Mar 23:18:49.157 * MASTER <-> SLAVE sync: Loading DB in memory
    2514:S 07 Mar 23:18:49.157 * MASTER <-> SLAVE sync: Finished with success

    Master-slave replication is still working correctly. Further tests are left to the reader ^o^
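
    One further test worth trying is the client side: after a failover, an application should ask Sentinel where the master is instead of hard-coding an address. The sketch below uses the redis-py package listed in the references and assumes it is installed and that the three Sentinels from this article are reachable; the function name current_master is made up for illustration.

```python
# Sentinel addresses and master name from this article's environment.
SENTINELS = [
    ("192.168.10.128", 26379),
    ("192.168.10.129", 26379),
    ("192.168.10.130", 26379),
]
MASTER_NAME = "master-6379"


def current_master(sentinels=SENTINELS, name=MASTER_NAME):
    """Ask the Sentinels for the (host, port) of the current master."""
    # Imported lazily so this file still loads where redis-py is missing.
    from redis.sentinel import Sentinel

    sentinel = Sentinel(sentinels, socket_timeout=0.5)
    return sentinel.discover_master(name)


# Usage (needs the Sentinels to be running):
#     host, port = current_master()
```

    Running it before and after stopping the master should return 192.168.10.131 and then 192.168.10.132, mirroring what INFO sentinel reported above.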

    Summary:

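    1. Redis Sentinel is the high-availability (HA) solution recommended by the Redis project. It is fairly reliable, and I recommend deploying it in production.

    2. Redis Sentinel supports custom failover scripts, which is convenient; they can be written as shell or Python scripts.

    3. There are many Redis HA architectures today, each with its own strengths and weaknesses. Whichever one you choose, test it repeatedly before putting it into production.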

    References:

    http://segmentfault.com/a/1190000002680804

    http://segmentfault.com/a/1190000002685515

    http://redis.io/topics/sentinel-clients

    https://pypi.python.org/pypi/redis/

    http://www.cnblogs.com/gomysql/p/5040847.html

  • Original article: https://www.cnblogs.com/hujihon/p/6429230.html