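Background:
One controller node and one compute1 node.
The hosts file on both machines already resolves both hostnames.
The two nodes had previously joined the same RabbitMQ cluster.
However, the controller node was restored to a bare-metal state because of a service problem. After rabbitmq-server was reinstalled there with yum, compute1 could no longer join the controller's RabbitMQ cluster.
The relevant errors are as follows: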
[root@compute1 ~]# rabbitmqctl join_cluster rabbit@controller
Clustering node rabbit@compute1 with rabbit@controller ...
Error: {cannot_start_mnesia,
{{shutdown,{failed_to_start_child,mnesia_kernel_sup,killed}},
{mnesia_sup,start,[normal,[]]}}}
[root@compute1 ~]# rabbitmqctl start_app
Starting node rabbit@compute1 ...
BOOT FAILED
===========
Error description:
{error,{inconsistent_cluster,"Node rabbit@compute1 thinks it's clustered with node rabbit@controller, but rabbit@controller disagrees"}}
Log files (may contain more information):
/var/log/rabbitmq/rabbit@compute1.log
/var/log/rabbitmq/rabbit@compute1-sasl.log
Stack trace:
[{rabbit_mnesia,check_cluster_consistency,0,
[{file,"src/rabbit_mnesia.erl"},{line,598}]},
{rabbit,'-start/0-fun-0-',0,[{file,"src/rabbit.erl"},{line,260}]},
{rabbit,start_it,1,[{file,"src/rabbit.erl"},{line,296}]},
{rpc,'-handle_call_call/6-fun-0-',5,[{file,"rpc.erl"},{line,206}]}]
Error: {error,{inconsistent_cluster,"Node rabbit@compute1 thinks it's clustered with node rabbit@controller, but rabbit@controller disagrees"}}
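The error means that compute1 believes it is clustered with the controller node, but the freshly reinstalled controller has no record of that cluster and disagrees.
The following error is reported as well: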
[root@compute1 ~]# rabbitmqctl join_cluster rabbit@controller
Clustering node rabbit@compute1 with rabbit@controller ...
Error: {cannot_start_mnesia,
{{shutdown,{failed_to_start_child,mnesia_kernel_sup,killed}},
{mnesia_sup,start,[normal,[]]}}}
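The controller node is a fresh install, and its Erlang cookie (.erlang.cookie) had already been copied over to compute1. stop_app had also been run on compute1. The most likely cause is therefore stale cluster information left over on compute1, which makes the cluster consistency check fail.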
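Before blaming the leftover cluster state, it is worth ruling out a cookie mismatch. The following is a minimal sketch, assuming the default cookie path /var/lib/rabbitmq/.erlang.cookie and root SSH access to the controller; the function names cookies_match and check_cookie_against are hypothetical helpers, not part of RabbitMQ.

```shell
#!/bin/sh
# Sketch only: the default cookie path and root SSH access are assumptions.

# Compare two cookie files byte for byte; exit status 0 means identical.
cookies_match() {
    cmp -s "$1" "$2"
}

# Hypothetical helper: fetch the peer's cookie into a temp file and compare
# it with the local one. Returns 0 on match, 1 on mismatch, 2 on fetch error.
check_cookie_against() {
    peer="$1"
    local_cookie="${2:-/var/lib/rabbitmq/.erlang.cookie}"
    tmp="$(mktemp)" || return 2
    if ! scp -q "root@$peer:/var/lib/rabbitmq/.erlang.cookie" "$tmp"; then
        rm -f "$tmp"
        return 2
    fi
    if cookies_match "$local_cookie" "$tmp"; then
        echo "cookie matches $peer"
        rc=0
    else
        echo "cookie differs from $peer"
        rc=1
    fi
    rm -f "$tmp"
    return "$rc"
}
```

On compute1 this would be called as check_cookie_against controller. If the cookies match and the join still fails, the stale Mnesia data is the remaining suspect.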
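Searching online confirmed that leftover Mnesia data causes this check to fail.
The Mnesia data directory is /var/lib/rabbitmq/mnesia. Move it out of the way on compute1: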
mv /var/lib/rabbitmq/mnesia /tmp
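Then scp the controller node's Erlang cookie to compute1 again.
Run rabbitmqctl join_cluster rabbit@controller once more.
compute1 now joins the cluster successfully.
This rarely comes up in day-to-day operation but is easy to run into in a lab environment, so it is recorded here for future reference.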
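The steps above can be sketched as a single script to run on compute1. This is a hedged outline, not the exact sequence used in this post: the service stop/start around the move, the cookie ownership and permission fix, the final start_app and cluster_status, and the BACKUP_DIR location are all assumptions added for completeness. By default it only prints the commands (DRY_RUN=1).

```shell
#!/bin/sh
# Sketch of the recovery sequence on compute1.
# Assumptions: systemd-managed rabbitmq-server, default paths, root SSH to the peer.
DRY_RUN="${DRY_RUN:-1}"                                  # 1 = print only, 0 = execute
MNESIA_DIR="${MNESIA_DIR:-/var/lib/rabbitmq/mnesia}"
BACKUP_DIR="${BACKUP_DIR:-/tmp/mnesia-backup}"           # hypothetical backup location
COOKIE="${COOKIE:-/var/lib/rabbitmq/.erlang.cookie}"
PEER_HOST="${PEER_HOST:-controller}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_compute1() {
    # Stop the broker so the Mnesia files are not in use while they are moved.
    run systemctl stop rabbitmq-server.service
    # Move the stale cluster state aside instead of deleting it.
    run mv "$MNESIA_DIR" "$BACKUP_DIR"
    # Copy the controller's cookie and make it readable by the rabbitmq user only.
    run scp "root@$PEER_HOST:$COOKIE" "$COOKIE"
    run chown rabbitmq:rabbitmq "$COOKIE"
    run chmod 400 "$COOKIE"
    # Start with a fresh database, then join the controller's cluster.
    run systemctl start rabbitmq-server.service
    run rabbitmqctl stop_app
    run rabbitmqctl join_cluster "rabbit@$PEER_HOST"
    run rabbitmqctl start_app
    run rabbitmqctl cluster_status
}
```

Calling recover_compute1 with the defaults prints the plan for review; setting DRY_RUN=0 executes it. Keeping the old Mnesia directory as a backup makes it possible to inspect or restore it if the join still fails.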