• RabbitMQ Concepts and Environment Setup (3): RabbitMQ cluster


    Test environment: VMS00781, VMS00782, VMS00386 (CentOS 5.8)
    1. First install RabbitMQ Server on each of the three machines

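    2. Read the cookie from one node and copy it to the other nodes (nodes use the cookie to decide whether they are allowed to communicate with each other)
    The cookie is in one of these two locations: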
    sudo vim /var/lib/rabbitmq/.erlang.cookie
    sudo vim $HOME/.erlang.cookie
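
    Instead of pasting the cookie by hand in an editor, the copied value can be installed with a small helper. The following is a minimal sketch: install_cookie is a hypothetical function name, and it assumes the cookie from the first node has already been transferred to a local file. Erlang expects the cookie file to be readable by its owner only, so the helper sets mode 400.

```shell
#!/bin/sh
# install_cookie SRC DEST  (hypothetical helper, not part of RabbitMQ)
# Copies an Erlang cookie that was fetched from another node into place
# and restricts it to owner read-only.
install_cookie() {
    src=$1
    dest=$2
    # Refuse a missing or empty source instead of installing a blank secret.
    [ -s "$src" ] || { echo "cookie source missing or empty: $src" >&2; return 1; }
    # -f lets cp replace an existing cookie, which is normally read-only.
    cp -f "$src" "$dest" || return 1
    chmod 400 "$dest"
}
```

    A call might look like install_cookie /tmp/cookie.from.781 /var/lib/rabbitmq/.erlang.cookie, run with sudo-level privileges. On a package install the destination file would typically also need to be owned by the user that runs the broker; that step is left out here because the user name is an assumption about the local setup.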

    3. Start the nodes one at a time
    sudo service rabbitmq-server start

    4. Check the RabbitMQ brokers on each node
    sudo rabbitmqctl cluster_status

    5. Build the cluster
    Run the following on VMS00386 and VMS00782 respectively.
    To join as a RAM node:
    sudo rabbitmqctl stop_app
    sudo rabbitmqctl join_cluster --ram rabbit@VMS00781
    sudo rabbitmqctl start_app
    To join as a disc node (the default):
    sudo rabbitmqctl stop_app
    sudo rabbitmqctl join_cluster rabbit@VMS00781
    sudo rabbitmqctl start_app
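
    The stop_app / join_cluster / start_app sequence above can be wrapped so that it stops at the first failing step. This is only a sketch: join_node and the RABBITMQCTL variable are names invented for illustration, and setting RABBITMQCTL=echo turns the function into a dry run that prints the subcommands instead of executing them.

```shell
#!/bin/sh
# Assumption: RABBITMQCTL names the control tool to invoke.
# Set RABBITMQCTL=echo to print the subcommands without running them.
RABBITMQCTL=${RABBITMQCTL:-rabbitmqctl}

# join_node SEED_NODE [ram|disc]  (hypothetical helper)
# Joins the local node to the cluster that SEED_NODE belongs to.
join_node() {
    seed=$1
    type=${2:-disc}
    # The application must be stopped before the node can join a cluster.
    "$RABBITMQCTL" stop_app || return 1
    if [ "$type" = ram ]; then
        "$RABBITMQCTL" join_cluster --ram "$seed" || return 1
    else
        "$RABBITMQCTL" join_cluster "$seed" || return 1
    fi
    "$RABBITMQCTL" start_app
}
```

    For example, join_node rabbit@VMS00781 ram on VMS00782 would run the RAM-node variant. The function does not call sudo itself, so it would need to be run from a shell that already has the required privileges.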

    6. Troubleshooting
    The following error came up while building the cluster:
    sudo rabbitmqctl join_cluster --ram rabbit@VMS00386
    Clustering node rabbit@VMS00782 with rabbit@VMS00386 ...
    Error: unable to connect to nodes [rabbit@VMS00386]: nodedown
    DIAGNOSTICS
    ===========
    attempted to contact: [rabbit@VMS00386]
    rabbit@VMS00386:
      * unable to connect to epmd (port 4369) on VMS00386: nxdomain (non-existing domain)
    current node details:
    - node name: 'rabbitmqctl-8666@VMS00782'
    - home dir: /var/lib/rabbitmq
    - cookie hash: 50YO3zK+HJHos0tab1vHjg==
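    Solution:
    Cluster nodes must be able to reach one another, so the hosts file on every cluster node should list all nodes in the cluster to guarantee mutual name resolution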
    vim /etc/hosts
    781's IP   VMS00781
    782's IP   VMS00782
    386's IP   vms00386
    Then restart rabbitmq on each node
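
    Before retrying join_cluster, it can help to confirm that every node name really has a hosts entry. The sketch below is a hypothetical helper (unresolved_hosts is not a standard tool) that only inspects a hosts-format file; it does not query DNS. It compares names case-insensitively, which is an assumption made here to mirror how hosts lookups usually behave.

```shell
#!/bin/sh
# unresolved_hosts HOSTS_FILE NAME...  (hypothetical helper)
# Prints each NAME that has no entry in HOSTS_FILE; prints nothing when
# every name is present.
unresolved_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        awk -v want="$name" '
            BEGIN { want = tolower(want) }
            { sub(/#.*/, "") }   # drop comments before looking at fields
            # Field 1 is the address; the remaining fields are host names.
            { for (i = 2; i <= NF; i++) if (tolower($i) == want) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$hosts_file" || echo "$name"
    done
}
```

    Running unresolved_hosts /etc/hosts VMS00781 VMS00782 vms00386 on each machine should print nothing once the entries are in place.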

    7. Other issues

    Once the cluster is up, if the Nodes section of the Overview page in the rabbitmq_management web UI shows "Node statistics not available", the management plugin has not been enabled on that node yet.

    Running sudo rabbitmq-plugins enable rabbitmq_management on the node that shows the message fixes it.

    Error: mnesia_unexpectedly_running
    Cause: stop_app was not run first
    Fix: sudo rabbitmqctl stop_app

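    If the hostname cannot be resolved, or has changed, after rabbitmq-server was first started, startup will fail
    The following steps are needed
    sudo rm -rf /var/lib/rabbitmq/mnesia (the node's information is recorded in this database; note that this also deletes the broker data stored on the node)
    Reinstall RabbitMQ Server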

    #####################################################
    RabbitMQ cluster management
    #####################################################
    1. View cluster status
    This can be run on any node in the cluster
    sudo rabbitmqctl cluster_status

    2. Change a node's type (RAM or disc)
    sudo rabbitmqctl stop_app
    sudo rabbitmqctl change_cluster_node_type disc
    or
    sudo rabbitmqctl change_cluster_node_type ram
    sudo rabbitmqctl start_app

    3. Restart nodes in the cluster
    Stopping a node, or a node going down, does not affect the remaining nodes
    [op1@vms00386 ~]$ sudo rabbitmqctl stop
    Stopping and halting node rabbit@vms00386 ...

    [op1@VMS00781 ~]$ sudo rabbitmqctl cluster_status
    Cluster status of node rabbit@VMS00781 ...
    [{nodes,[{disc,[rabbit@VMS00781,rabbit@VMS00782,rabbit@vms00386]}]},
     {running_nodes,[rabbit@VMS00782,rabbit@VMS00781]},
     {cluster_name,<<"rabbit@VMS00781">>},
     {partitions,[]}]

    [op1@VMS00782 ~]$ sudo rabbitmqctl cluster_status
    Cluster status of node rabbit@VMS00782 ...
    [{nodes,[{disc,[rabbit@VMS00781,rabbit@VMS00782,rabbit@vms00386]}]},
     {running_nodes,[rabbit@VMS00781,rabbit@VMS00782]},
     {cluster_name,<<"rabbit@VMS00781">>},
     {partitions,[]}]

    [op1@VMS00782 ~]$ sudo rabbitmqctl stop
    Stopping and halting node rabbit@VMS00782 ...

    [op1@VMS00781 ~]$ sudo rabbitmqctl cluster_status
    Cluster status of node rabbit@VMS00781 ...
    [{nodes,[{disc,[rabbit@VMS00781,rabbit@VMS00782,rabbit@vms00386]}]},
     {running_nodes,[rabbit@VMS00781]},
     {cluster_name,<<"rabbit@VMS00781">>},
     {partitions,[]}]

    After a node restarts, it automatically catches up with the other nodes
    [op1@vms00386 ~]$ sudo service rabbitmq-server start
    Starting rabbitmq-server: SUCCESS
    rabbitmq-server.

    [op1@VMS00781 ~]$ sudo rabbitmqctl cluster_status
    Cluster status of node rabbit@VMS00781 ...
    [{nodes,[{disc,[rabbit@VMS00781,rabbit@VMS00782,rabbit@vms00386]}]},
     {running_nodes,[rabbit@vms00386,rabbit@VMS00781]},
     {cluster_name,<<"rabbit@VMS00781">>},
     {partitions,[]}]

    [op1@VMS00782 ~]$ sudo service rabbitmq-server start
    Starting rabbitmq-server: SUCCESS
    rabbitmq-server.

    [op1@VMS00781 ~]$ sudo rabbitmqctl cluster_status
    Cluster status of node rabbit@VMS00781 ...
    [{nodes,[{disc,[rabbit@VMS00781,rabbit@VMS00782,rabbit@vms00386]}]},
     {running_nodes,[rabbit@VMS00782,rabbit@vms00386,rabbit@VMS00781]},
     {cluster_name,<<"rabbit@VMS00781">>},
     {partitions,[]}]

    [op1@VMS00782 ~]$ sudo rabbitmqctl cluster_status
    Cluster status of node rabbit@VMS00782 ...
    [{nodes,[{disc,[rabbit@VMS00781,rabbit@VMS00782,rabbit@vms00386]}]},
     {running_nodes,[rabbit@VMS00781,rabbit@vms00386,rabbit@VMS00782]},
     {cluster_name,<<"rabbit@VMS00781">>},
     {partitions,[]}]

    [op1@vms00386 ~]$ sudo rabbitmqctl cluster_status
    Cluster status of node rabbit@vms00386 ...
    [{nodes,[{disc,[rabbit@VMS00781,rabbit@VMS00782,rabbit@vms00386]}]},
     {running_nodes,[rabbit@VMS00782,rabbit@VMS00781,rabbit@vms00386]},
     {cluster_name,<<"rabbit@VMS00781">>},
     {partitions,[]}]

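    A few notes:
    Keep at least one disc node in the cluster to guard against data loss; be especially careful about this when changing node types.
    If the whole cluster has been stopped, the last node to go down must be the first one started; if that is not possible, use the forget_cluster_node command to remove it from the cluster
    If the nodes in the cluster went down almost simultaneously in an uncontrolled way, use the force_boot command on one of the nodes to restart it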
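
    The cluster_status listings in this section can also be compared mechanically to see which members are currently down. The helpers below are a sketch with invented names (running_nodes, cluster_nodes, down_nodes). They assume the Erlang-term output format shown in this post, with each list on a single line; other RabbitMQ versions may print a different layout, in which case the patterns would need adjusting.

```shell
#!/bin/sh
# All three helpers read cluster_status output on stdin (hypothetical names).

# Print the nodes listed under running_nodes, one per line.
running_nodes() {
    sed -n 's/.*{running_nodes,\[\([^]]*\)]}.*/\1/p' | tr ',' '\n'
}

# Print every cluster member (disc and ram) from the {nodes,...} line.
cluster_nodes() {
    grep '{nodes,' | grep -o '[A-Za-z0-9_.-]*@[A-Za-z0-9_.-]*'
}

# Print members that are in the cluster but not currently running.
down_nodes() {
    report=$(cat)
    running=$(printf '%s\n' "$report" | running_nodes)
    printf '%s\n' "$report" | cluster_nodes | while read -r node; do
        # -x matches the whole line, -F treats the node name literally.
        printf '%s\n' "$running" | grep -qxF "$node" || echo "$node"
    done
}
```

    A typical use would be sudo rabbitmqctl cluster_status | down_nodes, which prints nothing when every member is running.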

    4. Remove a node from the cluster
    [op1@vms00386 ~]$ sudo rabbitmqctl stop_app
    Stopping node rabbit@vms00386 ...
    [op1@vms00386 ~]$ sudo rabbitmqctl reset
    Resetting node rabbit@vms00386 ...
    [op1@vms00386 ~]$ sudo rabbitmqctl start_app
    Starting node rabbit@vms00386 ...

    [op1@vms00386 ~]$ sudo rabbitmqctl cluster_status
    Cluster status of node rabbit@vms00386 ...
    [{nodes,[{disc,[rabbit@vms00386]}]},
     {running_nodes,[rabbit@vms00386]},
     {cluster_name,<<"rabbit@vms00386">>},
     {partitions,[]}]

    [op1@VMS00781 ~]$ sudo rabbitmqctl cluster_status
    Cluster status of node rabbit@VMS00781 ...
    [{nodes,[{disc,[rabbit@VMS00781,rabbit@VMS00782]}]},
     {running_nodes,[rabbit@VMS00782,rabbit@VMS00781]},
     {cluster_name,<<"rabbit@VMS00781">>},
     {partitions,[]}]

    [op1@VMS00782 ~]$ sudo rabbitmqctl cluster_status
    Cluster status of node rabbit@VMS00782 ...
    [{nodes,[{disc,[rabbit@VMS00781,rabbit@VMS00782]}]},
     {running_nodes,[rabbit@VMS00781,rabbit@VMS00782]},
     {cluster_name,<<"rabbit@VMS00781">>},
     {partitions,[]}]
    As shown, rabbit@vms00386 has become a standalone node, and the original cluster now contains only rabbit@VMS00781 and rabbit@VMS00782

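    A node can also remove other nodes from the cluster (the node being removed must be stopped first)
    For example, continuing on rabbit@VMS00781, remove rabbit@VMS00782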
    [op1@VMS00781 ~]$ sudo rabbitmqctl forget_cluster_node rabbit@VMS00782
    Removing node rabbit@VMS00782 from cluster ...

    [op1@VMS00781 ~]$ sudo rabbitmqctl cluster_status
    Cluster status of node rabbit@VMS00781 ...
    [{nodes,[{disc,[rabbit@VMS00781]}]},
     {running_nodes,[rabbit@VMS00781]},
     {cluster_name,<<"rabbit@VMS00781">>},
     {partitions,[]}]

    As shown, the cluster now has only one node, rabbit@VMS00781

    There is one catch: a node that was removed remotely from another node still believes it belongs to the cluster

    [op1@VMS00782 ~]$ sudo rabbitmqctl start_app
    Starting node rabbit@VMS00782 ...
    BOOT FAILED
    ===========
    Error description:
       {error,{inconsistent_cluster,"Node rabbit@VMS00782 thinks it's clustered with node rabbit@VMS00781, but rabbit@VMS00781 disagrees"}}
    Log files (may contain more information):
       /var/log/rabbitmq/rabbit@VMS00782.log
       /var/log/rabbitmq/rabbit@VMS00782-sasl.log
    Stack trace:
       [{rabbit_mnesia,check_cluster_consistency,0},
        {rabbit,'-start/0-fun-0-',0},
        {rabbit,start_it,1},
        {rpc,'-handle_call_call/6-fun-0-',5}]
    Error: {rabbit,failure_during_boot,
               {error,
                   {inconsistent_cluster,
                       "Node rabbit@VMS00782 thinks it's clustered with node rabbit@VMS00781, but rabbit@VMS00781 disagrees"}}}
    It needs to be reset
    [op1@VMS00782 ~]$ sudo rabbitmqctl reset
    Resetting node rabbit@VMS00782 ...
    [op1@VMS00782 ~]$ sudo rabbitmqctl start_app
    Starting node rabbit@VMS00782 ...

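    At this point all three nodes are standalone
    rabbit@vms00386 and rabbit@VMS00782 have both been reset to fresh RabbitMQ brokers, while rabbit@VMS00781 still holds leftover state from the original cluster. It can be reset as follows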
    [op1@VMS00781 ~]$ sudo rabbitmqctl stop_app
    Stopping node rabbit@VMS00781 ...
    [op1@VMS00781 ~]$ sudo rabbitmqctl reset
    Resetting node rabbit@VMS00781 ...
    [op1@VMS00781 ~]$ sudo rabbitmqctl start_app
    Starting node rabbit@VMS00781 ...

    5. Configure the cluster automatically
    This is done through the configuration file rather than the command-line tool
    First reset each node
    [op1@VMS00781 ~]$ sudo rabbitmqctl stop_app
    Stopping node rabbit@VMS00781 ...
    [op1@VMS00781 ~]$ sudo rabbitmqctl reset
    Resetting node rabbit@VMS00781 ...
    ...
    Next, edit the configuration file
    [{rabbit,
      [{cluster_nodes, {['rabbit@VMS00781', 'rabbit@VMS00782', 'rabbit@vms00386'], disc}}]}].
    ...
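
    Since the same cluster_nodes term has to be placed on every node, it can be generated rather than typed by hand. The function below is a sketch; cluster_nodes_config is a hypothetical name, and it only prints the term in the shape shown above. Where the output should be written depends on the installation (the config file location is an assumption to verify locally).

```shell
#!/bin/sh
# cluster_nodes_config TYPE NODE...  (hypothetical helper)
# Prints a classic-format config term listing the cluster members.
# TYPE is disc or ram.
cluster_nodes_config() {
    type=$1
    shift
    list=
    for node in "$@"; do
        # Quote each node name as an Erlang atom and join with ", ".
        list="${list:+$list, }'$node'"
    done
    printf '[{rabbit,\n  [{cluster_nodes, {[%s], %s}}]}].\n' "$list" "$type"
}
```

    Calling cluster_nodes_config disc rabbit@VMS00781 rabbit@VMS00782 rabbit@vms00386 reproduces the term shown above.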
    Then start each node
    [op1@VMS00781 ~]$ sudo service rabbitmq-server start
    Starting rabbitmq-server: SUCCESS
    rabbitmq-server.

    Check the cluster status
    [op1@VMS00781 ~]$ sudo rabbitmqctl cluster_status

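    A few notes:
    Whether the cluster is configured from the command line or from the configuration file, make sure the Erlang and RabbitMQ versions are the same on every node
    The configuration file only takes effect on fresh nodes, that is, nodes that have been reset or are starting for the first time. Automatic clustering therefore does not happen when a node is merely restarted. It also means that changes made through rabbitmqctl take precedence over the automatic cluster configuration.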

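    Deploying a cluster on a single machine, typically used for testing cluster features
    The key here is to start multiple rabbitmq-server instances with different ports and node names; the rest of the process is similar to deploying a cluster across multiple machines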
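
    As a sketch of what "different ports and node names" looks like in practice, the function below prints one start command per local instance, using the RABBITMQ_NODE_PORT and RABBITMQ_NODENAME environment variables. The function name local_node_commands and the node names rabbit1, rabbit2, ... are illustrative assumptions; it prints the commands rather than running them so they can be reviewed first.

```shell
#!/bin/sh
# local_node_commands COUNT [BASE_PORT]  (hypothetical helper)
# Prints the commands that would start COUNT broker instances on one
# machine, each with its own AMQP port and node name.
local_node_commands() {
    count=$1
    base_port=${2:-5672}
    i=0
    while [ "$i" -lt "$count" ]; do
        port=$((base_port + i))
        echo "RABBITMQ_NODE_PORT=$port RABBITMQ_NODENAME=rabbit$((i + 1)) rabbitmq-server -detached"
        i=$((i + 1))
    done
}
```

    If plugins that open their own listeners (such as the management plugin) are enabled, each instance would likely need distinct ports for those as well; that configuration is not covered by this sketch.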

    Other considerations:
    Firewall policy, for example

    Reference:
    http://www.rabbitmq.com/clustering.html

  • Original article: https://www.cnblogs.com/ExMan/p/14445965.html