• redis-3.0.1 Sentinel master-slave high availability: detailed configuration


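    A recent project deployment required Redis to be highly available. Because Redis Cluster was not yet mature enough, I chose Redis Sentinel instead. Redis itself provides replication for master-slave backup; combined with Sentinel, it can fail over between master and slave automatically.
    A production environment generally calls for three Redis nodes. To keep the experiment simple, this article uses only two: one master and one slave.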

    Deployment plan
    172.16.203.10 master node
    172.16.203.4 slave node
    Redis version: 3.0.1

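    Master node
    Redis is installed by compiling from source, which is simple: unpack the archive, enter the extracted directory, and run make. The details are omitted here.
    The changes needed in redis.conf are as follows.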

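    # Run Redis in the background.
    daemonize yes
    # Location of the Redis pid file.
    pidfile /apps/run/redis/redis.pid
    # Port Redis listens on.
    port 6379
    # Location of the log file. If left empty, Redis logs to standard output, which goes to /dev/null when daemonized.
    logfile "/apps/logs/redis/redis.log"
    # Redis password. Leave unchanged if password authentication is not needed.
    requirepass 123456
    # Required if requirepass is set above, and must have the same value. When this node connects to a master as a slave, it authenticates with this password.
    masterauth 123456
    Start Redis:
    src/redis-server redis.conf
    Check the current replication state:
    src/redis-cli -h 172.16.203.10 -a 123456 info Replication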

    # Replication
    role:master
    connected_slaves:0
    master_repl_offset:544693
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:2
    repl_backlog_histlen:544692
    As shown, 172.16.203.10 is the master and currently has no slaves.
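    The check above can be scripted. A minimal sketch, assuming the text printed by `info Replication` has already been captured in a string; `parse_info` is a hypothetical helper name, not part of Redis:

```python
def parse_info(text):
    """Parse the key:value lines of a redis-cli INFO section into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Replication".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


sample = """\
# Replication
role:master
connected_slaves:0
master_repl_offset:544693
"""

info = parse_info(sample)
print(info["role"], info["connected_slaves"])
```

    All values are kept as strings; convert them with int() where a number is needed.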
    Next, configure sentinel.conf:

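    # Port Sentinel listens on.
    port 26379
    # Run Sentinel in the background. This line was added.
    daemonize yes
    # Log file location. This line was added.
    logfile "/apps/logs/redis/sentinel.log"
    # The master to monitor. The trailing number is the quorum: how many Sentinels must consider the master down before it enters the ODOWN (objectively down) state, that is, really down.
    sentinel monitor mymaster 172.16.203.10 6379 1
    # How long a node must be unreachable before this Sentinel marks it SDOWN (subjectively down).
    sentinel down-after-milliseconds mymaster 5000
    # This setting serves several purposes: 1. the delay before retrying a failover of the same master is twice this value; 2. a failover that has produced no configuration change is cancelled after this time; 3. it is the maximum time a failover waits for all slaves to be reconfigured to the new master.
    sentinel failover-timeout mymaster 15000
    # Password used for authentication. This must be set if Redis has a password.
    sentinel auth-pass mymaster 123456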
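    Those are the items to change. Pay particular attention to sentinel auth-pass and do not forget to set it. After editing, save a backup copy first, because Sentinel rewrites this configuration file while running. If something goes wrong later, the backup lets you restore the original, known-good state.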
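    Since a mismatched password is easy to miss, the three password settings can be cross-checked before starting anything. A minimal sketch, assuming both configuration files have been read into strings; `directives` and `passwords_consistent` are hypothetical helper names, and lines beginning with # are treated as comments:

```python
def directives(conf_text):
    """Return the non-comment lines of a Redis-style config as lists of words."""
    rows = []
    for line in conf_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            rows.append(line.split())
    return rows


def passwords_consistent(redis_conf, sentinel_conf, master_name="mymaster"):
    """Check that requirepass, masterauth and sentinel auth-pass all agree."""
    redis = {row[0]: row[1] for row in directives(redis_conf) if len(row) >= 2}
    auth = [row[3] for row in directives(sentinel_conf)
            if len(row) >= 4 and row[:3] == ["sentinel", "auth-pass", master_name]]
    requirepass = redis.get("requirepass")
    if requirepass is None:
        # No password configured, so there is nothing to match.
        return True
    return redis.get("masterauth") == requirepass and auth == [requirepass]


redis_conf = "port 6379\nrequirepass 123456\nmasterauth 123456\n"
sentinel_conf = (
    "sentinel monitor mymaster 172.16.203.10 6379 1\n"
    "sentinel auth-pass mymaster 123456\n"
)
print(passwords_consistent(redis_conf, sentinel_conf))
```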
    Start Sentinel:
    src/redis-sentinel sentinel.conf
    Check the Sentinel log:

                    _._
               _.-``__ ''-._
          _.-``    `.  `_.  ''-._           Redis 3.0.1 (00000000/0) 64 bit
      .-`` .-```.  ```\/    _.,_ ''-._
     (    '      ,       .-`  | `,    )     Running in sentinel mode
     |`-._`-...-` __...-.``-._|'` _.-'|     Port: 26379
     |    `-._   `._    /     _.-'    |     PID: 19957
      `-._    `-._  `-./  _.-'    _.-'
     |`-._`-._    `-.__.-'    _.-'_.-'|
     |    `-._`-._        _.-'_.-'    |           http://redis.io
      `-._    `-._`-.__.-'_.-'    _.-'
     |`-._`-._    `-.__.-'    _.-'_.-'|
     |    `-._`-._        _.-'_.-'    |
      `-._    `-._`-.__.-'_.-'    _.-'
          `-._    `-.__.-'    _.-'
              `-._        _.-'
                  `-.__.-'

    19957:X 12 Dec 13:13:36.746 # Sentinel runid is 6ab6f8abdc3dba4097da202954ecece7bc6d3215
    19957:X 12 Dec 13:13:36.746 # +monitor master mymaster 172.16.203.10 6379 quorum 1
    The first line shows this Sentinel's run ID; the second shows that the monitored master is 172.16.203.10 6379.
    Check the Sentinel status:
    src/redis-cli -h 172.16.203.10 -a 123456 -p 26379 info Sentinel

    # Sentinel
    sentinel_masters:1
    sentinel_tilt:0
    sentinel_running_scripts:0
    sentinel_scripts_queue_length:0
    master0:name=mymaster,status=ok,address=172.16.203.10:6379,slaves=0,sentinels=1
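    Slave node
    The slave's redis.conf differs from the master's in only one place. Add the following line:
    slaveof 172.16.203.10 6379
    Start Redis:
    src/redis-server redis.conf
    Check the replication state on the slave node:
    src/redis-cli -h 172.16.203.4 -a 123456 info Replication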

    # Replication
    role:slave
    master_host:172.16.203.10
    master_port:6379
    master_link_status:up
    master_last_io_seconds_ago:0
    master_sync_in_progress:0
    slave_repl_offset:617956
    slave_priority:100
    slave_read_only:1
    connected_slaves:0
    master_repl_offset:0
    repl_backlog_active:0
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:0
    repl_backlog_histlen:0
    As shown, this node is a slave.
    The Sentinel configuration is the same as on the master node. Start Sentinel:
    src/redis-sentinel sentinel.conf
    Check the Sentinel log:

    12190:X 12 Dec 13:21:38.658 # Sentinel runid is 270f322d0f3f8605b92902417e499cedc8866163
    12190:X 12 Dec 13:21:38.658 # +monitor master mymaster 172.16.203.10 6379 quorum 1
    12190:X 12 Dec 13:21:38.659 * +slave slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
    12190:X 12 Dec 13:21:39.609 * +sentinel sentinel 172.16.203.10:26379 172.16.203.10 26379 @ mymaster 172.16.203.10 6379
    Check the Sentinel status:
    src/redis-cli -h 172.16.203.4 -a 123456 -p 26379 info Sentinel

    # Sentinel
    sentinel_masters:1
    sentinel_tilt:0
    sentinel_running_scripts:0
    sentinel_scripts_queue_length:0
    master0:name=mymaster,status=ok,address=172.16.203.10:6379,slaves=1,sentinels=2
    As shown, there are now two Sentinels and one slave.
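    The same reading of the master0 line can be done in code. A minimal sketch, assuming the line has been taken from the `info Sentinel` output as a string; `parse_master_line` is a hypothetical helper name:

```python
def parse_master_line(line):
    """Split 'master0:name=...,status=...,...' into a dict of its fields."""
    # Everything after the first colon is a comma-separated list of key=value pairs.
    _, _, fields = line.partition(":")
    return dict(field.split("=", 1) for field in fields.split(","))


line = "master0:name=mymaster,status=ok,address=172.16.203.10:6379,slaves=1,sentinels=2"
master = parse_master_line(line)
print(master["status"], master["slaves"], master["sentinels"])
```

    Splitting each field on the first = only keeps the address value, which itself contains a colon, intact.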
    That completes the Redis master-slave high-availability configuration. Verification follows.

    Verification
    1. The slave node goes down, taking both Redis and Sentinel with it. Watch the master node's Sentinel log:
    +sdown sentinel 172.16.203.4:26379 172.16.203.4 26379 @ mymaster 172.16.203.10 6379
    +sdown slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
    2. Restart Redis and Sentinel on the slave node:
    -sdown slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
    -sdown sentinel 172.16.203.4:26379 172.16.203.4 26379 @ mymaster 172.16.203.10 6379
    -dup-sentinel master mymaster 172.16.203.10 6379 #duplicate of 172.16.203.4:26379 or 0b0bf0cddcf7aa5b518a8a62c65188f9c4a1ecaf
    +sentinel sentinel 172.16.203.4:26379 172.16.203.4 26379 @ mymaster 172.16.203.10 6379
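    As shown, the Sentinel's run ID has changed, and the corresponding entry in the Sentinel configuration file was updated automatically.
    Check the replication state:
    src/redis-cli -h 172.16.203.10 -a 123456 -p 6379 info Replication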

    # Replication
    role:master
    connected_slaves:1
    slave0:ip=172.16.203.4,port=6379,state=online,offset=862487,lag=1
    master_repl_offset:862642
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:2
    repl_backlog_histlen:862641
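    The output above also shows how far the slave is behind: the master's offset minus the slave's offset. A minimal sketch of that arithmetic, assuming the `info Replication` text from the master is available as a string; `replication_gap` is a hypothetical helper name, and the result is only a rough indicator because the two offsets are sampled at slightly different moments:

```python
def replication_gap(info_text):
    """Return {slave address: master offset minus slave offset} from a master's INFO text."""
    master_offset = 0
    slave_offsets = {}
    for line in info_text.splitlines():
        key, _, value = line.strip().partition(":")
        if key == "master_repl_offset":
            master_offset = int(value)
        elif key.startswith("slave") and "=" in value:
            # e.g. slave0:ip=...,port=...,state=online,offset=...,lag=...
            fields = dict(f.split("=", 1) for f in value.split(","))
            slave_offsets[fields["ip"] + ":" + fields["port"]] = int(fields["offset"])
    return {addr: master_offset - off for addr, off in slave_offsets.items()}


sample = """\
# Replication
role:master
connected_slaves:1
slave0:ip=172.16.203.4,port=6379,state=online,offset=862487,lag=1
master_repl_offset:862642
"""
print(replication_gap(sample))
```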
    3. The master node goes down
    Stop Redis first and watch the master node's Sentinel log:

    19957:X 12 Dec 15:10:19.207 # +sdown master mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:19.207 # +odown master mymaster 172.16.203.10 6379 #quorum 1/1
    19957:X 12 Dec 15:10:19.207 # +new-epoch 1
    19957:X 12 Dec 15:10:19.207 # +try-failover master mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:19.208 # +vote-for-leader 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
    19957:X 12 Dec 15:10:19.211 # 172.16.203.4:26379 voted for 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
    19957:X 12 Dec 15:10:19.275 # +elected-leader master mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:19.275 # +failover-state-select-slave master mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:19.375 # +selected-slave slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:19.375 * +failover-state-send-slaveof-noone slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:19.447 * +failover-state-wait-promotion slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:20.216 # +promoted-slave slave 172.16.203.4:6379 172.16.203.4 6379 @ mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:20.216 # +failover-state-reconf-slaves master mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:20.297 # +failover-end master mymaster 172.16.203.10 6379
    19957:X 12 Dec 15:10:20.297 # +switch-master mymaster 172.16.203.10 6379 172.16.203.4 6379
    19957:X 12 Dec 15:10:20.298 * +slave slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379
    19957:X 12 Dec 15:10:25.350 # +sdown slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379
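    After the Redis master goes down, the Sentinels first elect a leader (note the distinction: leader refers to a Sentinel, master refers to Redis). Here the Sentinel on 172.16.203.10 is elected leader, and it then starts choosing a new master:
    failover-state-select-slave
    A suitable slave is found, 172.16.203.4 6379:
    selected-slave 172.16.203.4 6379
    The leader then sends SLAVEOF NO ONE to the selected node to turn it into a master:
    failover-state-send-slaveof-noone
    It waits for the selected slave to report its promotion:
    failover-state-wait-promotion
    The promotion is confirmed:
    promoted-slave
    The remaining slaves are reconfigured:
    failover-state-reconf-slaves
    The failover ends:
    failover-end
    Sentinel now monitors the new master:
    switch-master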
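    The event names walked through above can also be picked out of the log programmatically, for example to find which node became the new master. A minimal sketch, assuming log lines in the format shown in this article; `parse_event` and `find_new_master` are hypothetical helper names, and the regular expression is fitted to these sample lines rather than to every Sentinel log format:

```python
import re

# Matches Sentinel event lines such as
# "19957:X 12 Dec 15:10:20.297 # +switch-master mymaster ..."
EVENT_RE = re.compile(r"^\d+:X \d+ \w+ [\d:.]+ [#*] ([+-][\w-]+)\s*(.*)$")


def parse_event(line):
    """Return (event, args) for an event line, or None for any other line."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2).split()


def find_new_master(log_lines):
    """Return (ip, port) of the new master from the last +switch-master event."""
    new_master = None
    for line in log_lines:
        event = parse_event(line)
        if event and event[0] == "+switch-master":
            # args: <master name> <old ip> <old port> <new ip> <new port>
            args = event[1]
            new_master = (args[3], int(args[4]))
    return new_master


log = [
    "19957:X 12 Dec 15:10:20.297 # +failover-end master mymaster 172.16.203.10 6379",
    "19957:X 12 Dec 15:10:20.297 # +switch-master mymaster 172.16.203.10 6379 172.16.203.4 6379",
]
print(find_new_master(log))
```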

    Now the slave node's Sentinel log:

    24199:X 12 Dec 15:10:19.210 # +vote-for-leader 6ab6f8abdc3dba4097da202954ecece7bc6d3215 1
    24199:X 12 Dec 15:10:19.249 # +sdown master mymaster 172.16.203.10 6379
    24199:X 12 Dec 15:10:19.249 # +odown master mymaster 172.16.203.10 6379 #quorum 1/1
    24199:X 12 Dec 15:10:19.249 # Next failover delay: I will not start a failover before Sat Dec 12 15:10:50 2015
    24199:X 12 Dec 15:10:20.299 # +config-update-from sentinel 172.16.203.10:26379 172.16.203.10 26379 @ mymaster 172.16.203.10 6379
    24199:X 12 Dec 15:10:20.299 # +switch-master mymaster 172.16.203.10 6379 172.16.203.4 6379
    24199:X 12 Dec 15:10:20.299 * +slave slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379
    24199:X 12 Dec 15:10:25.315 # +sdown slave 172.16.203.10:6379 172.16.203.10 6379 @ mymaster 172.16.203.4 6379
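    The "Next failover delay" line in the log above is consistent with the failover-timeout setting: a Sentinel waits twice that value before retrying a failover of the same master. A small worked sketch, assuming the 15000 ms configured earlier and taking the vote time 15:10:19 from the log as the starting point; `earliest_retry` is a hypothetical helper name. The log shows 15:10:50 rather than 15:10:49, presumably because of the sub-second part of the timestamp and a small offset Sentinel adds internally:

```python
from datetime import datetime, timedelta


def earliest_retry(failover_start, failover_timeout_ms):
    """Earliest retry time for a failover: start time plus twice failover-timeout."""
    return failover_start + timedelta(milliseconds=2 * failover_timeout_ms)


start = datetime(2015, 12, 12, 15, 10, 19)
retry = earliest_retry(start, 15000)
print(retry)
```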
    Then stop the Sentinel on the old master node:
    +sdown sentinel 172.16.203.10:26379 172.16.203.10 26379 @ mymaster 172.16.203.4 6379

    Problem
    1. Stop one Sentinel and then stop the master. The remaining Sentinel stays stuck in this state:


    18430:X 12 Dec 11:36:37.949 # +new-epoch 68
    18430:X 12 Dec 11:36:37.949 # +try-failover master mymaster 127.0.0.1 6380
    18430:X 12 Dec 11:36:39.179 # +vote-for-leader 1c9ea5336e95283251d9e53dccf8f6dedd51536d 68
    18430:X 12 Dec 11:36:48.077 # -failover-abort-not-elected master mymaster 127.0.0.1 6380
    18430:X 12 Dec 11:36:48.177 # Next failover delay: I will not start a failover before Sat Dec 12 11:42:38 2015
    18430:X 12 Dec 11:42:38.057 # +new-epoch 69
    18430:X 12 Dec 11:42:38.057 # +try-failover master mymaster 127.0.0.1 6380
    18430:X 12 Dec 11:42:38.106 # +vote-for-leader 1c9ea5336e95283251d9e53dccf8f6dedd51536d 69
    18430:X 12 Dec 11:42:48.443 # -failover-abort-not-elected master mymaster 127.0.0.1 6380
    18430:X 12 Dec 11:42:48.544 # Next failover delay: I will not start a failover before Sat Dec 12 11:48:38 2015
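    This is where Sentinel's leader election comes in. Every Sentinel that finds the master objectively down includes its own run ID when it sends the is-master-down-by-addr request, asking the other Sentinels to set it as their local leader. Local leadership is first come, first served: a Sentinel accepts only the first is-master-down-by-addr request it receives as its local leader and rejects all later ones. A Sentinel that is set as local leader by **a majority** of the Sentinels becomes the leader.
    Note the majority requirement. Although one Sentinel was stopped, the configuration file still records it, so the Sentinel count is still 2 and a majority is also 2. Only one Sentinel is actually running, so a leader can never be elected and the failover never happens.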
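    The arithmetic behind this can be sketched as follows. This is an illustration under stated assumptions, not Sentinel's actual code: `votes_needed` and `can_elect_leader` are hypothetical names, and the rule modelled is that the winner needs a majority of all known Sentinels and at least as many votes as the configured quorum:

```python
def votes_needed(known_sentinels, quorum):
    """Votes required to become leader: a majority of known Sentinels, and at least the quorum."""
    majority = known_sentinels // 2 + 1
    return max(majority, quorum)


def can_elect_leader(known_sentinels, reachable_sentinels, quorum):
    """True if the Sentinels still running can supply enough votes."""
    return reachable_sentinels >= votes_needed(known_sentinels, quorum)


# The setup in this article: 2 known Sentinels, quorum 1, one Sentinel stopped.
print(can_elect_leader(known_sentinels=2, reachable_sentinels=1, quorum=1))
# With three Sentinels, one can fail and the other two still form a majority.
print(can_elect_leader(known_sentinels=3, reachable_sentinels=2, quorum=2))
```

    This is why two Sentinels cannot tolerate the loss of either one, while three can lose one and still fail over.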
  • Original article: https://www.cnblogs.com/hyhy904/p/10961714.html