• redis-3.0.1 sentinel: detailed master-slave high-availability configuration


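    For a recent production rollout we needed Redis to be highly available. Because Redis Cluster was not yet mature enough at the time, we chose Redis Sentinel instead. Redis itself provides replication, which gives master-slave backup; combined with Sentinel, it also supports automatic master-slave failover.
    In production you would normally run at least three nodes, so that a majority of Sentinels survives the loss of one machine. To keep the experiment simple, this article uses only two nodes: one master and one slave.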

    Deployment plan
    172.16.203.10  master node
    172.16.203.4   slave node
    Redis version: 3.0.1

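    Master node
    Redis is installed from source, which is simple: unpack the tarball, change into the extracted directory, and run make. The details are omitted here.
    The following settings in redis.conf need to be changed.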

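    # Run Redis in the background. (Keep comments on their own lines: Redis rejects trailing comments after a directive.)
    daemonize yes
    # Location of the Redis pid file.
    pidfile /apps/run/redis/redis.pid
    # Port Redis listens on.
    port 6379
    # Log file location. If empty, Redis logs to standard output, which goes to /dev/null when daemonized.
    logfile "/apps/logs/redis/redis.log"
    # Redis password. Leave unchanged if password authentication is not needed.
    requirepass 123456
    # Required if requirepass is set above, and must have the same value. This node uses it to authenticate against the master whenever it runs as a slave.
    masterauth 123456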
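    Because masterauth has to match requirepass, a quick consistency check can catch a mismatch before startup. The following is a minimal Python sketch and not part of the original setup: the helper names parse_conf and auth_is_consistent are hypothetical, and it assumes one directive per line with comments only on lines that start with #.

```python
def parse_conf(text):
    """Parse redis.conf-style text into {directive: value}."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        value = parts[1].strip().strip('"') if len(parts) > 1 else ""
        settings[parts[0].lower()] = value
    return settings


def auth_is_consistent(settings):
    """True when no password is set, or masterauth equals requirepass."""
    password = settings.get("requirepass")
    if password is None:
        return True
    return settings.get("masterauth") == password


# Sample values taken from the settings listed above.
SAMPLE = """
daemonize yes
port 6379
requirepass 123456
masterauth 123456
"""

conf = parse_conf(SAMPLE)
print(auth_is_consistent(conf))  # True
```

    The same check can be run against both nodes' files, since either node may end up as the slave after a failover.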
    Start Redis:
    src/redis-server redis.conf
    Check the current replication status:
    src/redis-cli -h 172.16.203.10 -a 123456 info Replication

    # Replication
    role:master
    connected_slaves:0
    master_repl_offset:544693
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:2
    repl_backlog_histlen:544692
    As shown, 172.16.203.10 is the master and currently has no slaves.
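    When this check is scripted rather than read by eye, the key:value lines of the INFO output are easy to turn into a dictionary. A minimal Python sketch, assuming the output has already been captured as text; parse_info is a hypothetical helper name, and the sample is an excerpt of the output shown above.

```python
def parse_info(text):
    """Turn `key:value` lines of INFO output into a dict, skipping section headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


# Excerpt of the `info Replication` output from the master.
MASTER_INFO = """\
# Replication
role:master
connected_slaves:0
master_repl_offset:544693
"""

info = parse_info(MASTER_INFO)
print(info["role"], int(info["connected_slaves"]))  # master 0
```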
    Next, configure sentinel.conf:

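    # Port Sentinel listens on. (Keep comments on their own lines: trailing comments after a directive are rejected.)
    port 26379
    # Run Sentinel in the background. This line is added.
    daemonize yes
    # Log file location. This line is added.
    logfile "/apps/logs/redis/sentinel.log"
    # Master to monitor. The last number is the quorum: how many Sentinels must agree the master is down before it is marked ODOWN (objectively down).
    sentinel monitor mymaster 172.16.203.10 6379 1
    # How long a node must be unreachable, in milliseconds, before this Sentinel marks it SDOWN (subjectively down).
    sentinel down-after-milliseconds mymaster 5000
    # Serves several purposes: 1. a Sentinel waits twice this value before retrying a failover of the same master; 2. a failover in progress that has produced no configuration change is cancelled after this time; 3. it is the maximum time a failover waits for all slaves to be reconfigured to the new master.
    sentinel failover-timeout mymaster 15000
    # Password used to authenticate with the monitored instances. Required if Redis has a password set.
    sentinel auth-pass mymaster 123456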
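    Those are the settings to change. Pay particular attention to sentinel auth-pass and do not forget to set it. Once the file is ready, make a backup copy first: Sentinel rewrites this configuration file automatically while it runs, and if something goes wrong later you can restore the original, known-good state from the backup.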
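    Since a missing sentinel auth-pass is the easiest mistake to make here, it can be checked mechanically before Sentinel is started. A minimal Python sketch under the assumption that the file uses the directive forms shown above; masters_missing_auth is a hypothetical helper, not a Redis tool.

```python
def masters_missing_auth(text):
    """Return names of monitored masters that have no `sentinel auth-pass` line."""
    monitored, with_auth = [], set()
    for raw in text.splitlines():
        words = raw.split()
        # Only `sentinel <directive> <master-name> ...` lines are of interest.
        if len(words) < 3 or words[0].lower() != "sentinel":
            continue
        directive, name = words[1].lower(), words[2]
        if directive == "monitor":
            monitored.append(name)
        elif directive == "auth-pass":
            with_auth.add(name)
    return [name for name in monitored if name not in with_auth]


GOOD = """
port 26379
sentinel monitor mymaster 172.16.203.10 6379 1
sentinel down-after-milliseconds mymaster 5000
sentinel auth-pass mymaster 123456
"""

# Same file with the auth-pass line forgotten.
BAD = GOOD.replace("sentinel auth-pass mymaster 123456\n", "")

print(masters_missing_auth(GOOD))  # []
print(masters_missing_auth(BAD))   # ['mymaster']
```

    An empty list only means the line is present; it does not verify that the password is correct.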
    Start Sentinel:
    src/redis-sentinel sentinel.conf
    Check the Sentinel log:

    _._
    _.-``__ ''-._
    _.-`` `. `_. ''-._ Redis 3.0.1 (00000000/0) 64 bit
    .-`` .-```. ```/ _.,_ ''-._
    ( ' , .-` | `, ) Running in sentinel mode
    |`-._`-...-` __...-.``-._|'` _.-'| Port: 26379
    | `-._ `._ / _.-' | PID: 19957
    `-._ `-._ `-./ _.-' _.-'
    |`-._`-._ `-.__.-' _.-'_.-'|
    | `-._`-._ _.-'_.-' | http://redis.io
    `-._ `-._`-.__.-'_.-' _.-'
    |`-._`-._ `-.__.-' _.-'_.-'|
    | `-._`-._ _.-'_.-' |
    `-._ `-._`-.__.-'_.-' _.-'
    `-._ `-.__.-' _.-'
    `-._ _.-'
    `-.__.-'

    19957:X 12 Dec 13:13:36.746 # Sentinel runid is 6ab6f8abdc3dba4097da202954ecece7bc6d3215
    19957:X 12 Dec 13:13:36.746 # +monitor master mymaster 172.16.203.10 6379 quorum 1
    The first line shows the run ID of this Sentinel; the second shows that the current master is 172.16.203.10 6379.
    Check the Sentinel status:
    src/redis-cli -h 172.16.203.10 -a 123456 -p 26379 info Sentinel

    # Sentinel
    sentinel_masters:1
    sentinel_tilt:0
    sentinel_running_scripts:0
    sentinel_scripts_queue_length:0
    master0:name=mymaster,status=ok,address=172.16.203.10:6379,slaves=0,sentinels=1
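    The master0 line packs several fields into one value. To read the slave and Sentinel counts programmatically, it can be split as in the following minimal Python sketch; parse_master_line is a hypothetical helper, and the input is the line shown above.

```python
def parse_master_line(line):
    """Parse a `masterN:name=...,status=...` line of INFO Sentinel into a dict."""
    # Split off the `master0` label at the first colon only, because the
    # address field contains a colon of its own.
    _, _, fields = line.partition(":")
    result = {}
    for pair in fields.split(","):
        key, _, value = pair.partition("=")
        result[key] = value
    return result


line = "master0:name=mymaster,status=ok,address=172.16.203.10:6379,slaves=0,sentinels=1"
master = parse_master_line(line)
print(master["address"], master["slaves"], master["sentinels"])
# 172.16.203.10:6379 0 1
```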
    Slave node
    Its redis.conf differs from the master's in only one respect; add the following line:
    slaveof 172.16.203.10 6379
    Start Redis:
    src/redis-server redis.conf
    Check the replication status on the slave:
    src/redis-cli -h 172.16.203.4 -a 123456 info Replication

    # Replication
    role:slave
    master_host:172.16.203.10
    master_port:6379
    master_link_status:up
    master_last_io_seconds_ago:0
    master_sync_in_progress:0
    slave_repl_offset:617956
    slave_priority:100
    slave_read_only:1
    connected_slaves:0
    master_repl_offset:0
    repl_backlog_active:0
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:0
    repl_backlog_histlen:0
    As shown, this node is a slave.
    The Sentinel configuration is the same as on the master node. Start Sentinel:
    src/redis-sentinel sentinel.conf
    Check the Sentinel log:

    12190:X 12 Dec 13:21:38.658 # Sentinel runid is 270f322d0f3f8605b92902417e499cedc8866163
    12190:X 12 Dec 13:21:38.658 # +monitor master mymaster 172.16.203.10 6379 quorum 1
    12190:X 12 Dec 13:21:38.659 * +slave slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
    12190:X 12 Dec 13:21:39.609 * +sentinel sentinel 172.16.203.10:26379 172.16.203.10 26379 @ mymaster 172.16.203.10 6379
    Check the Sentinel status:
    src/redis-cli -h 172.16.203.4 -a 123456 -p 26379 info Sentinel

    # Sentinel
    sentinel_masters:1
    sentinel_tilt:0
    sentinel_running_scripts:0
    sentinel_scripts_queue_length:0
    master0:name=mymaster,status=ok,address=172.16.203.10:6379,slaves=1,sentinels=2
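    As shown, there are now two Sentinels and one slave.
    That completes the master-slave high-availability setup. Next comes verification.

    Verification
    1. The slave node goes down, taking both Redis and Sentinel with it. Watch the Sentinel log on the master node: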
    +sdown sentinel 172.16.203.4:26379 172.16.203.4 26379 @ mymaster 172.16.203.10 6379
    +sdown slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
    2. Restart Redis and Sentinel on the slave node:
    -sdown slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
    -sdown sentinel 172.16.203.4:26379 172.16.203.4 26379 @ mymaster 172.16.203.10 6379
    -dup-sentinel master mymaster 172.16.203.10 6379 #duplicate of 172.16.203.4:26379 or 0b0bf0cddcf7aa5b518a8a62c65188f9c4a1ecaf
    +sentinel sentinel 172.16.203.4:26379 172.16.203.4 26379 @ mymaster 172.16.203.10 6379
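    As shown, the Sentinel's run ID has changed, and the corresponding entry in the Sentinel configuration file is updated automatically.
    Check the replication status:
    src/redis-cli -h 172.16.203.10 -a 123456 -p 6379 info Replication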

    # Replication
    role:master
    connected_slaves:1
    slave0:ip=172.16.203.4,port=6379,state=online,offset=862487,lag=1
    master_repl_offset:862642
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:2
    repl_backlog_histlen:862641
    3. The master node goes down
    First stop Redis on the master node and watch that node's Sentinel log:

    19957:X 12 Dec 15:10:19.207 # +sdown master mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:19.207 # +odown master mymaster 172.16.203.10 6379 #quorum 1/1
    19957:X 12 Dec 15:10:19.207 # +new-epoch 1
    19957:X 12 Dec 15:10:19.207 # +try-failover master mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:19.208 # +vote-for-leader 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
    19957:X 12 Dec 15:10:19.211 # 172.16.203.4:26379 voted for 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
    19957:X 12 Dec 15:10:19.275 # +elected-leader master mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:19.275 # +failover-state-select-slave master mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:19.375 # +selected-slave slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:19.375 * +failover-state-send-slaveof-noone slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:19.447 * +failover-state-wait-promotion slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:20.216 # +promoted-slave slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:20.216 # +failover-state-reconf-slaves master mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:20.297 # +failover-end master mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:20.297 # +switch-master mymaster 172.16.203.10 6379 172.16.203.4 6379
    19957:X 12 Dec 15:10:20.298 * +slave slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379
    19957:X 12 Dec 15:10:25.350 # +sdown slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379
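    After the Redis master goes down, the Sentinels first elect a leader (note the distinction: leader refers to a Sentinel, master to a Redis instance). The vote-for-leader lines show that the elected leader is the Sentinel on 172.16.203.10, whose run ID matches the one in its startup log. The leader then starts selecting a new master:
    failover-state-select-slave
    A suitable slave has been found, 172.16.203.4 6379:
    selected-slave 172.16.203.4 6379
    The leader sends SLAVEOF NO ONE to the selected slave to promote it:
    failover-state-send-slaveof-noone
    It waits for the slave to report that it has become a master:
    failover-state-wait-promotion
    The promotion is confirmed:
    promoted-slave
    The remaining slaves are reconfigured:
    failover-state-reconf-slaves
    The failover ends:
    failover-end
    The Sentinel switches to monitoring the new master:
    switch-master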
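    The event sequence described above can also be checked mechanically, for example in a failover drill. The following is a minimal Python sketch, assuming log lines in the format shown in this article (pid, date, time, a level marker, then the event name); sentinel_events and is_subsequence are hypothetical helpers, and the sample is an excerpt of the leader's log.

```python
def sentinel_events(log_text):
    """Extract event names such as `+sdown` from Sentinel log lines."""
    events = []
    for line in log_text.splitlines():
        tokens = line.split()
        # tokens: pid:X, day, month, time, level marker, event, details...
        if len(tokens) > 5 and tokens[5][0] in "+-":
            events.append(tokens[5])
    return events


def is_subsequence(expected, events):
    """True if every expected event appears in `events`, in the same order."""
    remaining = iter(events)
    return all(step in remaining for step in expected)


# Excerpt of the leader Sentinel's log from the failover above.
LOG = """\
19957:X 12 Dec 15:10:19.207 # +sdown master mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:19.207 # +odown master mymaster 172.16.203.10 6379 #quorum 1/1
19957:X 12 Dec 15:10:19.207 # +try-failover master mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:19.211 # 172.16.203.4:26379 voted for 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
19957:X 12 Dec 15:10:19.275 # +elected-leader master mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:20.216 # +promoted-slave slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:20.297 # +failover-end master mymaster 172.16.203.10 6379
19957:X 12 Dec 15:10:20.297 # +switch-master mymaster 172.16.203.10 6379 172.16.203.4 6379
"""

EXPECTED = ["+sdown", "+odown", "+try-failover", "+elected-leader",
            "+promoted-slave", "+failover-end", "+switch-master"]

events = sentinel_events(LOG)
print(is_subsequence(EXPECTED, events))  # True
```

    The "voted for" line carries no event name, so it is skipped rather than treated as a step.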

    Now the Sentinel log on the slave node:

    24199:X 12 Dec 15:10:19.210 # +vote-for-leader 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
    24199:X 12 Dec 15:10:19.249 # +sdown master mymaster 172.16.203.10 6379
    24199:X 12 Dec 15:10:19.249 # +odown master mymaster 172.16.203.10 6379 #quorum 1/1
    24199:X 12 Dec 15:10:19.249 # Next failover delay: I will not start a failover before Sat Dec 12 15:10:50 2015
    24199:X 12 Dec 15:10:20.299 # +config-update-from sentinel 172.16.203.10:26379 172.16.203.10 26379 @ mymaster 172.16.203.10 6379
    24199:X 12 Dec 15:10:20.299 # +switch-master mymaster 172.16.203.10 6379 172.16.203.4 6379
    24199:X 12 Dec 15:10:20.299 * +slave slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379
    24199:X 12 Dec 15:10:25.315 # +sdown slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379
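    The "Next failover delay" line in this log is consistent with the first role of failover-timeout: with 15000 ms configured, this Sentinel will not retry until about 30 seconds after the failover attempt that began at 15:10:19. A minimal Python sketch of that arithmetic follows; earliest_retry is a hypothetical name, and the sketch ignores whatever rounding or jitter accounts for the log showing 15:10:50 rather than 15:10:49.

```python
from datetime import datetime, timedelta


def earliest_retry(failover_start, failover_timeout_ms):
    """Earliest time a Sentinel retries a failover: start + 2 * failover-timeout."""
    return failover_start + timedelta(milliseconds=2 * failover_timeout_ms)


# Failover attempt time (to the second) from the log; timeout from sentinel.conf.
start = datetime(2015, 12, 12, 15, 10, 19)
print(earliest_retry(start, 15000))  # 2015-12-12 15:10:49
```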
    Then stop the Sentinel on the original master node (172.16.203.10):
    +sdown sentinel 172.16.203.10:26379 172.16.203.10 26379 @ mymaster 172.16.203.4 6379

    Problems
    1. Stop one Sentinel and then stop the master; the remaining Sentinel stays stuck in this state:


    18430:X 12 Dec 11:36:37.949 # +new-epoch 68
    18430:X 12 Dec 11:36:37.949 # +try-failover master mymaster 127.0.0.1 6380
    18430:X 12 Dec 11:36:39.179 # +vote-for-leader 1c9ea5336e95283251d9e53dccf8f6dedd51536d 68
    18430:X 12 Dec 11:36:48.077 # -failover-abort-not-elected master mymaster 127.0.0.1 6380
    18430:X 12 Dec 11:36:48.177 # Next failover delay: I will not start a failover before Sat Dec 12 11:42:38 2015
    18430:X 12 Dec 11:42:38.057 # +new-epoch 69
    18430:X 12 Dec 11:42:38.057 # +try-failover master mymaster 127.0.0.1 6380
    18430:X 12 Dec 11:42:38.106 # +vote-for-leader 1c9ea5336e95283251d9e53dccf8f6dedd51536d 69
    18430:X 12 Dec 11:42:48.443 # -failover-abort-not-elected master mymaster 127.0.0.1 6380
    18430:X 12 Dec 11:42:48.544 # Next failover delay: I will not start a failover before Sat Dec 12 11:48:38 2015
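    This comes down to how Sentinel elects a leader. Every Sentinel that finds the master objectively down includes its own run ID when it sends the is-master-down-by-addr query, asking the other Sentinels to set it as their local leader. Local leadership is first come, first served: a Sentinel accepts only the first such request it receives and rejects all later ones. A Sentinel that is set as local leader by **more than half** of the Sentinels becomes the leader.
    Note "more than half". Although we stopped one Sentinel, it is still recorded in the configuration file, so the Sentinel count remains 2, and more than half of 2 is 2. Only one Sentinel is actually running, so a leader can never be elected and the failover never happens.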
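    The majority rule can be written down as a small calculation. This is a minimal Python sketch of the reasoning above, not Sentinel's actual code: the function names are hypothetical, it assumes every running Sentinel votes for the same candidate, and it also applies the configured quorum as a lower bound, which is my understanding of how Sentinel behaves.

```python
def votes_needed(known_sentinels, quorum):
    """Votes required to become leader: a strict majority, and at least the quorum."""
    majority = known_sentinels // 2 + 1
    return max(majority, quorum)


def can_elect_leader(known_sentinels, running_sentinels, quorum):
    """True if the running Sentinels alone can supply enough votes."""
    return running_sentinels >= votes_needed(known_sentinels, quorum)


# This article's setup: 2 Sentinels recorded, 1 still running, quorum 1.
print(votes_needed(2, 1))         # 2
print(can_elect_leader(2, 1, 1))  # False

# With 3 Sentinels, losing one still leaves a majority.
print(can_elect_leader(3, 2, 2))  # True
```

    This is why a third Sentinel matters: with three recorded and one down, the two survivors still form a majority.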
  • Original article: https://www.cnblogs.com/hyhy904/p/10961714.html