Redis version: redis-3.0.6.tar.gz
master: 192.168.3.180
slave: 192.168.3.184 (for lack of machines, both slaves run on this host)
1. Redis installation
cd /root/tools/
tar -zxvf redis-3.0.6.tar.gz
cd redis-3.0.6
make install PREFIX=/usr/local/redis
ln -s /usr/local/redis/bin/redis-cli /usr/local/bin/redis-cli
cp utils/redis_init_script /etc/init.d/redis
mkdir /etc/redis
cp redis.conf /etc/redis/6379.conf
cp -rp sentinel.conf /etc/redis/sentinel_26379.conf
2. Configuration
Redis on the master server
vim 6379.conf
daemonize yes
pidfile /var/run/redis_6379.pid
port 6379
# Important: listen on an address the slaves can reach, otherwise master-slave replication fails
bind 0.0.0.0
logfile "/var/log/redis_6379.log"
dbfilename "dump_6379.rdb"
dir "/opt/redis/6379"
# Set the Redis password
requirepass 123456
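The `dir` directive above points at a directory that the installation steps do not create, and Redis exits at startup if it cannot change into that directory. A minimal sketch of a helper that reads the `dir` value from a configuration file and creates it; the function name `ensure_data_dir` is hypothetical, and the sketch assumes the path contains no spaces.

```shell
#!/bin/sh
# ensure_data_dir CONF: create the directory named by the last "dir"
# directive in a Redis configuration file, if it does not exist yet.
# Hypothetical helper; assumes the path contains no spaces.
ensure_data_dir() {
    # Strip the surrounding double quotes from the directive's value.
    data_dir=$(awk '$1 == "dir" { gsub(/"/, "", $2); d = $2 } END { print d }' "$1")
    [ -n "$data_dir" ] && mkdir -p "$data_dir"
}

# Example (run as a user allowed to write under /opt):
#   ensure_data_dir /etc/redis/6379.conf
```

The same helper can be pointed at the sentinel configuration, since it uses a `dir` directive as well.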
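Sentinel (this article runs a single sentinel. Sentinels can also be deployed on several machines to form a sentinel cluster, which avoids a single point of failure and keeps the system highly available.)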
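# Modify or add the following settings
vim sentinel_26379.conf
port 26379
daemonize yes
logfile "/tmp/sentinel.log"
dir "/opt/redis/redis_sentinel"
# The final 1 is the quorum: how many sentinels must consider the master down before it is treated as really (objectively) down
sentinel monitor mymaster 192.168.3.180 6379 1
# The final 2 is how many slaves may resynchronize with the new master at the same time after a failover (not the total number of slaves)
sentinel parallel-syncs mymaster 2
# If Redis has a password, the sentinel needs it too so that it can log in and monitor; use the same value as requirepass
sentinel auth-pass mymaster password-xxx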
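With a single sentinel a quorum of 1 is the only sensible value. When more sentinels are added, the quorum only decides when the master is flagged as objectively down; the failover itself still has to be authorized by a majority of all sentinels. The sketch below computes that majority and prints a matching `sentinel monitor` line. The helper names are hypothetical, and using the majority as the quorum is one common choice, not a requirement.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of Redis.

# sentinel_majority N: simple majority of N sentinels, floor(N/2) + 1.
sentinel_majority() {
    echo $(( $1 / 2 + 1 ))
}

# monitor_line NAME IP PORT N: print a "sentinel monitor" line whose
# quorum equals the majority of N sentinels.
monitor_line() {
    echo "sentinel monitor $1 $2 $3 $(sentinel_majority "$4")"
}

# Three sentinels watching the master from this article.
monitor_line mymaster 192.168.3.180 6379 3
```

For three sentinels this prints a quorum of 2, so the loss of any one sentinel still leaves enough votes to detect a failure and to authorize the failover.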
On the slave server
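# The slave configuration can be copied from the master's; change the port-related settings and add the following two lines
# Point at the master Redis
slaveof 192.168.3.180 6379
# The master's password; it must match requirepass on the master
masterauth "123456"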
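Copying the master configuration and editing the port by hand is easy to get wrong when two slaves share one host. A minimal sketch that derives a slave configuration with `sed`, assuming the master file uses 6379 in its `port`, `pidfile`, `logfile`, `dbfilename` and `dir` lines as shown earlier; the function name `make_slave_conf` and the output paths in the usage comment are hypothetical.

```shell
#!/bin/sh
# make_slave_conf SRC PORT MASTER_IP MASTER_PORT MASTER_PASS
# Print a slave configuration derived from the master file SRC.
# Only the port-related directives are rewritten, so requirepass and
# other lines that happen to contain "6379" are left alone.
make_slave_conf() {
    src=$1
    port=$2
    sed -e "/^port /s/6379/$port/" \
        -e "/^pidfile /s/6379/$port/" \
        -e "/^logfile /s/6379/$port/" \
        -e "/^dbfilename /s/6379/$port/" \
        -e "/^dir /s/6379/$port/" "$src"
    # Append the two replication directives.
    printf 'slaveof %s %s\n' "$3" "$4"
    printf 'masterauth "%s"\n' "$5"
}

# Example (output paths are assumptions):
#   make_slave_conf /etc/redis/6379.conf 6390 192.168.3.180 6379 123456 > /etc/redis/6390.conf
#   make_slave_conf /etc/redis/6379.conf 6391 192.168.3.180 6379 123456 > /etc/redis/6391.conf
```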
Testing:
Start the master and the two slaves, each with its own configuration file:
/usr/local/redis/bin/redis-server /etc/redis/6379.conf
On the master:
[root@novel tmp]# redis-cli -p 6379 -a password
127.0.0.1:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.3.184,port=6391,state=online,offset=190442,lag=0
slave1:ip=192.168.3.184,port=6390,state=online,offset=190442,lag=0
master_repl_offset:190442
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2
repl_backlog_histlen:190441
On the slaves:
[root@danny redis]# redis-cli -p 6390 -a password
127.0.0.1:6390> info replication
# Replication
role:slave
master_host:192.168.3.180
master_port:6379
master_link_status:up
[root@danny redis]# redis-cli -p 6391 -a password
127.0.0.1:6391> info replication
# Replication
role:slave
master_host:192.168.3.180
master_port:6379
master_link_status:up
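The output above shows that master-slave replication is working. You can also look at a slave's startup log to follow the synchronization in real time: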
[root@danny redis]# tailf /var/log/redis_6390.log
4706:S 28 Jan 13:05:29.985 * Connecting to MASTER 192.168.3.180:6390
4706:S 28 Jan 13:05:29.985 * MASTER <-> SLAVE sync started
4706:S 28 Jan 13:05:29.986 * Non blocking connect for SYNC fired the event.
4706:S 28 Jan 13:05:29.987 * Master replied to PING, replication can continue...
4706:S 28 Jan 13:05:29.987 * Partial resynchronization not possible (no cached master)
4706:S 28 Jan 13:05:29.989 * Full resync from master: 4da2c58d50928717d9a45216ced5c36a45a3b78c:29
4706:S 28 Jan 13:05:30.055 * MASTER <-> SLAVE sync: receiving 18 bytes from master
4706:S 28 Jan 13:05:30.055 * MASTER <-> SLAVE sync: Flushing old data
4706:S 28 Jan 13:05:30.055 * MASTER <-> SLAVE sync: Loading DB in memory
4706:S 28 Jan 13:05:30.055 * MASTER <-> SLAVE sync: Finished with success
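The `info replication` checks shown earlier can be scripted instead of read by eye. A minimal sketch of a filter that pulls one field out of `INFO` output; the function name `repl_field` is hypothetical. `INFO` output uses CRLF line endings, so the carriage returns are stripped first.

```shell
#!/bin/sh
# repl_field FIELD: read INFO output on stdin and print the value of FIELD.
# Hypothetical helper, not part of redis-cli.
repl_field() {
    # Drop the \r of each CRLF, then print whatever follows "FIELD:".
    tr -d '\r' | awk -v key="$1" \
        'index($0, key ":") == 1 { print substr($0, length(key) + 2) }'
}

# Example, using the placeholder password from the sessions above:
#   redis-cli -p 6390 -a password info replication | repl_field master_link_status
#   redis-cli -p 6379 -a password info replication | repl_field connected_slaves
```

On a healthy slave the first command should print `up`; on the master the second should print the number of attached slaves.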
3. Starting Sentinel
The Redis installation directory contains:
[root@danny bin]# ls
dump.rdb redis-benchmark redis-check-aof redis-check-dump redis-cli redis-sentinel redis-server
[root@danny bin]# pwd
/usr/local/redis/bin
Start the sentinel:
./redis-sentinel /etc/redis/sentinel_26379.conf
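Once the sentinel is running, it can be asked which address it currently regards as the master with the `SENTINEL get-master-addr-by-name` command. When its output is piped, `redis-cli` prints the IP and the port on separate lines; the sketch below joins them into `ip:port`. The function name `join_addr` is hypothetical.

```shell
#!/bin/sh
# join_addr: read an IP and a port on separate lines from stdin and
# print them as "ip:port". Hypothetical helper, not part of redis-cli.
join_addr() {
    tr -d '\r' | paste -s -d: -
}

# Example against the sentinel configured above (port 26379, master name mymaster):
#   redis-cli -p 26379 sentinel get-master-addr-by-name mymaster | join_addr
```

Running the same query before and after a failover is a quick way to confirm that the sentinel has switched to the promoted slave.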
4. Testing Sentinel
Open the sentinel log on the slave machine.
Shut down Redis on the master machine.
The log then shows:
4862:X 28 Jan 14:47:21.086 # +sdown master mymaster 192.168.3.180 6390
4862:X 28 Jan 14:47:21.086 # +odown master mymaster 192.168.3.180 6390 #quorum 1/1
4862:X 28 Jan 14:47:21.086 # +new-epoch 1
4862:X 28 Jan 14:47:21.086 # +try-failover master mymaster 192.168.3.180 6390
4862:X 28 Jan 14:47:21.132 # +vote-for-leader 911877d1d33938dbdfdd0224ea61a3245df12617 1
4862:X 28 Jan 14:47:21.132 # +elected-leader master mymaster 192.168.3.180 6390
4862:X 28 Jan 14:47:21.133 # +failover-state-select-slave master mymaster 192.168.3.180 6390
4862:X 28 Jan 14:47:21.224 # +selected-slave slave 192.168.3.184:6390 192.168.3.184 6390 @ mymaster 192.168.3.180 6390
4862:X 28 Jan 14:47:21.224 * +failover-state-send-slaveof-noone slave 192.168.3.184:6390 192.168.3.184 6390 @ mymaster 192.168.3.180 6390
4862:X 28 Jan 14:47:21.307 * +failover-state-wait-promotion slave 192.168.3.184:6390 192.168.3.184 6390 @ mymaster 192.168.3.180 6390
4862:X 28 Jan 14:47:22.189 # +promoted-slave slave 192.168.3.184:6390 192.168.3.184 6390 @ mymaster 192.168.3.180 6390
4862:X 28 Jan 14:47:22.189 # +failover-state-reconf-slaves master mymaster 192.168.3.180 6390
4862:X 28 Jan 14:47:22.237 * +slave-reconf-sent slave 192.168.3.184:6391 192.168.3.184 6391 @ mymaster 192.168.3.180 6390
4862:X 28 Jan 14:47:23.194 * +slave-reconf-inprog slave 192.168.3.184:6391 192.168.3.184 6391 @ mymaster 192.168.3.180 6390
4862:X 28 Jan 14:47:23.194 * +slave-reconf-done slave 192.168.3.184:6391 192.168.3.184 6391 @ mymaster 192.168.3.180 6390
4862:X 28 Jan 14:47:23.270 # +failover-end master mymaster 192.168.3.180 6390
4862:X 28 Jan 14:47:23.270 # +switch-master mymaster 192.168.3.180 6390 192.168.3.184 6390
4862:X 28 Jan 14:47:23.271 * +slave slave 192.168.3.184:6391 192.168.3.184 6391 @ mymaster 192.168.3.184 6390
4862:X 28 Jan 14:47:23.271 * +slave slave 192.168.3.180:6390 192.168.3.180 6390 @ mymaster 192.168.3.184 6390
4862:X 28 Jan 14:47:53.292 # +sdown slave 192.168.3.180:6390 192.168.3.180 6390 @ mymaster 192.168.3.184 6390
4862:X 28 Jan 14:48:40.874 # -sdown slave 192.168.3.180:6390 192.168.3.180 6390 @ mymaster 192.168.3.184 6390
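As the +switch-master line shows, a new master (192.168.3.184 6390) has been elected.
Note:
Sometimes the sentinel does not fail over after the Redis master goes down, and the sentinel log prints: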
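The `+switch-master` event carries the master name followed by the old and the new address, so the outcome of a failover can be read from the log mechanically. A minimal sketch that prints the most recent new master found in a sentinel log; the function name `last_switch` is hypothetical.

```shell
#!/bin/sh
# last_switch: read a sentinel log on stdin and print "ip:port" of the
# new master from the last +switch-master event, or nothing if there is none.
# Event layout: +switch-master <name> <old-ip> <old-port> <new-ip> <new-port>
last_switch() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i == "+switch-master")
                addr = $(i + 4) ":" $(i + 5)
    } END { if (addr != "") print addr }'
}

# Example, using the logfile path from the sentinel configuration:
#   last_switch < /tmp/sentinel.log
```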
failover-abort-not-elected master mymaster 192.168.1.88 6379
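One possible cause is that the following two initial values are not the same across the sentinel cluster; making sure every sentinel starts with the same values resolves it.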
sentinel config-epoch mymaster 14
sentinel leader-epoch mymaster 14
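Comparing these two values across several sentinel configuration files can be automated. The sketch below succeeds only when every file passed to it carries the same `config-epoch` and `leader-epoch` for the given master. The function name `epochs_consistent` and the file names in the usage comment are hypothetical, and the files are assumed to have been copied to one machine for comparison.

```shell
#!/bin/sh
# epochs_consistent NAME FILE...: succeed if all FILEs contain identical
# "sentinel config-epoch NAME" and "sentinel leader-epoch NAME" lines.
# Hypothetical helper, not part of Redis.
epochs_consistent() {
    name=$1
    shift
    # Reduce each file to one line holding its two epoch settings,
    # then count how many distinct such lines exist.
    n=$(for f in "$@"; do
            grep -E "^sentinel (config|leader)-epoch $name " "$f" | sort | tr '\n' ';'
            echo
        done | sort -u | wc -l | tr -d ' ')
    [ "$n" -eq 1 ]
}

# Example (file names are assumptions):
#   epochs_consistent mymaster sentinel_a.conf sentinel_b.conf sentinel_c.conf \
#       && echo "epochs match" || echo "epochs differ"
```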