• Redis Cluster Online Migration


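    The physical hardware hosting our existing Redis Cluster was underpowered, so we decided to move it to better servers.
    Because Redis is a core production database, we chose to migrate online, without interrupting service during the migration.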

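    Below are the complete migration steps, as carried out in the test environment:
    1. Original environment (test environment, no slaves created)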

    10.21.14.251:7000
    10.21.14.251:7001
    10.21.14.251:7002
     

    2. Start three Redis instances on the new host

    10.21.10.120:7000
    10.21.10.120:7001
    10.21.10.120:7002
     

    3. Add all three Redis instances to the cluster. Command format: redis-trib.rb add-node <new node host:port> <existing cluster node host:port>

    ./redis-trib.rb add-node  10.21.10.120:7000  10.21.14.251:7000
    ./redis-trib.rb add-node  10.21.10.120:7001  10.21.14.251:7000
    ./redis-trib.rb add-node  10.21.10.120:7002  10.21.14.251:7000
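
    With more than a handful of instances, the three add-node calls above could be generated in a loop. The following is only a sketch: the variable names and the DRY_RUN switch are assumptions introduced for illustration, and by default it prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: one add-node command per new instance.
# NEW_HOST, EXISTING_NODE and PORTS mirror the values used in this article.
# DRY_RUN is a hypothetical switch for this example (1 = print only).
NEW_HOST="10.21.10.120"
EXISTING_NODE="10.21.14.251:7000"
PORTS="7000 7001 7002"
DRY_RUN="${DRY_RUN:-1}"

add_nodes() {
    for port in $PORTS; do
        if [ "$DRY_RUN" = "1" ]; then
            # Only show what would be executed.
            echo "./redis-trib.rb add-node ${NEW_HOST}:${port} ${EXISTING_NODE}"
        else
            # Stop at the first failure so a half-joined cluster is noticed.
            ./redis-trib.rb add-node "${NEW_HOST}:${port}" "${EXISTING_NODE}" || return 1
        fi
    done
}

add_nodes
```

    Running it with DRY_RUN=0 would execute the commands for real, so it is worth reading the printed list first.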
     

    4. Once the nodes are confirmed to have joined, start resharding the slots

    ./redis-trib.rb reshard 10.21.10.120:7000
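
    The reshard command above is interactive: it asks for the slot count, the target node ID and the source node IDs. redis-trib.rb also accepts these as --from, --to, --slots and --yes options placed before the host:port argument, although it is worth confirming with ./redis-trib.rb help for the version in use. As a sketch, the helper below (reshard_cmd is a hypothetical name) only prints such a command. The node IDs come from the check output later in this article, and the slot count of 1000 is an arbitrary illustration.

```shell
#!/bin/sh
# Sketch: build a non-interactive reshard command line.
# $1 = source node ID, $2 = target node ID, $3 = number of slots to move.
# The entry node 10.21.10.120:7000 is the one used throughout this article.
reshard_cmd() {
    echo "./redis-trib.rb reshard --from $1 --to $2 --slots $3 --yes 10.21.10.120:7000"
}

# Old master 10.21.14.251:7001 -> new master 10.21.10.120:7000, 1000 slots (illustrative).
reshard_cmd bb1572074d41254e5b4d5aae5c52e54f5129d6d5 \
            4422ab38377fa8828e0f7884570b3b482a66496b \
            1000
```

    Moving slots in smaller batches like this keeps each run short, which may reduce the damage if a run is interrupted.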
     

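    5. We ran into a problem: resharding was interrupted by network timeouts and similar causes, which left a slot open on both sides (migrating on the source node, importing on the target node). This has to be repaired with fix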

    [redis@ip-10-21-14-251 redis]$ ./redis-trib.rb reshard 10.21.10.120:7000
    >>> Performing Cluster Check (using node 10.21.10.120:7000)
    M: 4422ab38377fa8828e0f7884570b3b482a66496b 10.21.10.120:7000
       slots:5026-5460 (435 slots) master
       0 additional replica(s)
    M: 5b38e63a1091baa3a871a52275489a2aa1d28bfb 10.21.10.120:7002
       slots:894-3397 (2504 slots) master
       0 additional replica(s)
    M: bb1572074d41254e5b4d5aae5c52e54f5129d6d5 10.21.14.251:7001
       slots:3398-4999,5461-15922 (12064 slots) master
       0 additional replica(s)
    M: 396a7fbd2ec61752f9e848a1d8cc7b405aef0356 10.21.14.251:7000
       slots: (0 slots) master
       0 additional replica(s)
    M: 9f215e7e4b511f3d2bbf5d734731899b71a62a3b 10.21.10.120:7001
       slots:0-893,5000-5025,15923-15948 (946 slots) master
       0 additional replica(s)
    M: 0c9b383f65ae4fefc5e02617fb76a845d7510a53 10.21.14.251:7002
       slots:15949-16383 (435 slots) master
       0 additional replica(s)
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    [WARNING] Node 10.21.10.120:7002 has slots in importing state (3398).
    [WARNING] Node 10.21.14.251:7001 has slots in migrating state (3398).
    [WARNING] The following slots are open: 3398
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
    *** Please fix your cluster problems before resharding     <<<<<<<<<<<<<<<<<<<<<
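
    When scripting around this failure, the open slots can be pulled out of the check output by matching the warning line shown above. This is a minimal sketch that relies only on the wording of that line; open_slots is a name made up for the example.

```shell
#!/bin/sh
# Sketch: print whatever follows "The following slots are open:" in
# redis-trib.rb check/reshard output read from stdin.
# Prints nothing when the cluster has no open slots.
open_slots() {
    sed -n 's/^.*The following slots are open: *//p'
}

# Example with two lines copied from the output above.
open_slots <<'EOF'
    [WARNING] Node 10.21.14.251:7001 has slots in migrating state (3398).
    [WARNING] The following slots are open: 3398
EOF
# prints: 3398
```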
     

    6. The following command checks the cluster

    ./redis-trib.rb check 10.21.10.120:7000
     

    7. Run the following command to repair the cluster, after which resharding can continue

    ./redis-trib.rb fix 10.21.10.120:7000
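
    The check-then-fix sequence from steps 6 and 7 can be wrapped so that fix only runs when check actually reports a problem. This is a sketch under assumptions: TRIB, NODE, cluster_clean and fix_if_needed are names introduced here, and the decision rests only on the [WARNING], [ERR] and coverage lines that redis-trib.rb prints.

```shell
#!/bin/sh
# Sketch: run "fix" only when "check" reports a problem.
# TRIB and NODE are assumptions for this example and can be overridden.
TRIB="${TRIB:-./redis-trib.rb}"
NODE="${NODE:-10.21.10.120:7000}"

# Succeeds only if the check output on stdin shows full slot coverage
# and contains no warning or error lines.
cluster_clean() {
    out="$(cat)"
    case "$out" in
        *"[WARNING]"*|*"[ERR]"*) return 1 ;;
    esac
    case "$out" in
        *"[OK] All 16384 slots covered."*) return 0 ;;
    esac
    return 1
}

# Not called here; invoke it manually between reshard runs.
fix_if_needed() {
    if "$TRIB" check "$NODE" | cluster_clean; then
        echo "cluster is clean, safe to continue resharding"
    else
        "$TRIB" fix "$NODE"
    fi
}
```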
     

    8. All slots have been migrated

    $ ./redis-trib.rb check 10.21.10.120:7000
    >>> Performing Cluster Check (using node 10.21.10.120:7000)
    M: 4422ab38377fa8828e0f7884570b3b482a66496b 10.21.10.120:7000
       slots:3399-4999,5026-7332 (3908 slots) master
       0 additional replica(s)
    M: 5b38e63a1091baa3a871a52275489a2aa1d28bfb 10.21.10.120:7002
       slots:894-3398,7333-15332,15949-16383 (10940 slots) master
       0 additional replica(s)
    M: bb1572074d41254e5b4d5aae5c52e54f5129d6d5 10.21.14.251:7001
       slots: (0 slots) master
       0 additional replica(s)
    M: 396a7fbd2ec61752f9e848a1d8cc7b405aef0356 10.21.14.251:7000
       slots: (0 slots) master
       0 additional replica(s)
    M: 9f215e7e4b511f3d2bbf5d734731899b71a62a3b 10.21.10.120:7001
       slots:0-893,5000-5025,15333-15948 (1536 slots) master
       0 additional replica(s)
    M: 0c9b383f65ae4fefc5e02617fb76a845d7510a53 10.21.14.251:7002
       slots: (0 slots) master
       0 additional replica(s)
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
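
    The per-node slot counts in this output can be cross-checked by hand: each range a-b contributes b - a + 1 slots, and the three new masters together should own all 16384. A small sketch of that arithmetic (count_slots is a hypothetical helper name):

```shell
#!/bin/sh
# Sketch: count the slots in a comma-separated list of ranges,
# e.g. "3399-4999,5026-7332", as printed on the "slots:" lines.
count_slots() {
    echo "$1" | tr ',' '\n' | awk -F- '
        NF == 1 { total += 1 }            # a single slot such as "3398"
        NF == 2 { total += $2 - $1 + 1 }  # an inclusive range
        END     { print total + 0 }'
}

a=$(count_slots "3399-4999,5026-7332")              # 10.21.10.120:7000
b=$(count_slots "894-3398,7333-15332,15949-16383")  # 10.21.10.120:7002
c=$(count_slots "0-893,5000-5025,15333-15948")      # 10.21.10.120:7001
echo "$a + $b + $c = $((a + b + c))"   # prints "3908 + 10940 + 1536 = 16384"
```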
     

    Confirm the status

    [redis@ip-10-21-14-251 redis]$ ./redis-cli -p 7000 cluster nodes
    bb1572074d41254e5b4d5aae5c52e54f5129d6d5 10.21.14.251:7001 master - 0 1509611814919 6 connected
    9f215e7e4b511f3d2bbf5d734731899b71a62a3b 10.21.10.120:7001 master - 0 1509611811917 14 connected 0-893 5000-5025 15333-15948
    5b38e63a1091baa3a871a52275489a2aa1d28bfb 10.21.10.120:7002 master - 0 1509611815923 13 connected 894-3398 7333-15332 15949-16383
    396a7fbd2ec61752f9e848a1d8cc7b405aef0356 10.21.14.251:7000 myself,master - 0 0 1 connected
    4422ab38377fa8828e0f7884570b3b482a66496b 10.21.10.120:7000 master - 0 1509611813919 12 connected 3399-4999 5026-7332
    0c9b383f65ae4fefc5e02617fb76a845d7510a53 10.21.14.251:7002 master - 0 1509611812917 3 connected
     

    9. If the status is OK, start removing the old nodes

     ./redis-trib.rb del-node 10.21.14.251:7000 396a7fbd2ec61752f9e848a1d8cc7b405aef0356
     ./redis-trib.rb del-node 10.21.14.251:7001 bb1572074d41254e5b4d5aae5c52e54f5129d6d5
     ./redis-trib.rb del-node 10.21.14.251:7002 0c9b383f65ae4fefc5e02617fb76a845d7510a53
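
    Rather than copying node IDs by hand, the del-node commands can be derived from the cluster nodes output: a master line with exactly eight fields has no slot ranges after the link-state column, so that node no longer owns any slots. The sketch below only prints the commands. del_node_cmds is a hypothetical name, and the old host address is passed in as an argument.

```shell
#!/bin/sh
# Sketch: turn "redis-cli cluster nodes" output (stdin) into del-node
# commands for masters on the old host that no longer own any slots.
# $1 = IP address of the old host.
del_node_cmds() {
    awk -v old="$1" '
        # 8 fields means no slot ranges follow the link-state column.
        NF == 8 && $3 ~ /master/ && index($2, old ":") == 1 {
            addr = $2
            sub(/@.*/, "", addr)   # newer Redis versions append "@cport"
            print "./redis-trib.rb del-node " addr " " $1
        }'
}

# Example with two lines copied from the status output above.
del_node_cmds 10.21.14.251 <<'EOF'
bb1572074d41254e5b4d5aae5c52e54f5129d6d5 10.21.14.251:7001 master - 0 1509611814919 6 connected
4422ab38377fa8828e0f7884570b3b482a66496b 10.21.10.120:7000 master - 0 1509611813919 12 connected 3399-4999 5026-7332
EOF
# prints: ./redis-trib.rb del-node 10.21.14.251:7001 bb1572074d41254e5b4d5aae5c52e54f5129d6d5
```

    Because nodes that still own slots are filtered out, the generated list doubles as a check that the reshard really emptied the old host.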
     

    10. Node 10.21.14.251:7002 was removed successfully:

     [redis@ip-10-21-14-251 redis]$ ./redis-trib.rb del-node 10.21.14.251:7002 0c9b383f65ae4fefc5e02617fb76a845d7510a53
    >>> Removing node 0c9b383f65ae4fefc5e02617fb76a845d7510a53 from cluster 10.21.14.251:7002
    >>> Sending CLUSTER FORGET messages to the cluster...
    >>> SHUTDOWN the node.
    [envuser@ip-10-21-14-251 redis]$ ./redis-trib.rb check 10.21.10.120:7000
    >>> Performing Cluster Check (using node 10.21.10.120:7000)
    M: 4422ab38377fa8828e0f7884570b3b482a66496b 10.21.10.120:7000
       slots:3399-4999,5026-7332 (3908 slots) master
       0 additional replica(s)
    M: 5b38e63a1091baa3a871a52275489a2aa1d28bfb 10.21.10.120:7002
       slots:894-3398,7333-15332,15949-16383 (10940 slots) master
       0 additional replica(s)
    M: bb1572074d41254e5b4d5aae5c52e54f5129d6d5 10.21.14.251:7001
       slots: (0 slots) master
       0 additional replica(s)
    M: 396a7fbd2ec61752f9e848a1d8cc7b405aef0356 10.21.14.251:7000
       slots: (0 slots) master
       0 additional replica(s)
    M: 9f215e7e4b511f3d2bbf5d734731899b71a62a3b 10.21.10.120:7001
       slots:0-893,5000-5025,15333-15948 (1536 slots) master
       0 additional replica(s)
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
     

    Status check

    [redis@ip-10-21-14-251 redis]$ ./redis-cli -p 7000 cluster nodes
    bb1572074d41254e5b4d5aae5c52e54f5129d6d5 10.21.14.251:7001 master - 0 1509611900094 6 connected
    9f215e7e4b511f3d2bbf5d734731899b71a62a3b 10.21.10.120:7001 master - 0 1509611902100 14 connected 0-893 5000-5025 15333-15948
    5b38e63a1091baa3a871a52275489a2aa1d28bfb 10.21.10.120:7002 master - 0 1509611901098 13 connected 894-3398 7333-15332 15949-16383
    396a7fbd2ec61752f9e848a1d8cc7b405aef0356 10.21.14.251:7000 myself,master - 0 0 1 connected
    4422ab38377fa8828e0f7884570b3b482a66496b 10.21.10.120:7000 master - 0 1509611899093 12 connected 3399-4999 5026-7332
     

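    Remove the remaining old nodes by following the same steps.
    In our tests, the application was not affected at any point during the migration. However, 10.21.10.120 still needs to be added to the IP list in the application's connection pool at a convenient time.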

  • Original article: https://www.cnblogs.com/mylovelulu/p/9521789.html