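At noon, a tester @-mentioned me in the company chat group: the test server had misbehaved that morning, MQ would not start, and it still failed with an error after the OS was rebooted. I logged in and saw that the broker had once again been killed by the kernel that morning because of an out-of-memory (OOM) condition. I tried starting it and it did indeed fail: the erl VM process came up, but the beam.smp process did not. The MQ startup log, startup_log, contained the following: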
=INFO REPORT==== 10-Dec-2016::11:43:03 ===
Error description:
   {could_not_start,rabbit,
    {{badmatch,
      {error,
       {{{badmatch,
          {error,
           {not_a_dets_file,
            "/var/lib/rabbitmq/mnesia/rabbit@iZ23nn1p4mjZ/recovery.dets"}}},
         [{rabbit_recovery_terms,open_table,0,[]},
          {rabbit_recovery_terms,init,1,[]},
          {gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},
          {proc_lib,init_p_do_apply,3,
           [{file,"proc_lib.erl"},{line,240}]}]},
        {child,undefined,rabbit_recovery_terms,
         {rabbit_recovery_terms,start_link,[]},
         transient,4294967295,worker,
         [rabbit_recovery_terms]}}}},
     [{rabbit_queue_index,start,1,[]},
      {rabbit_variable_queue,start,1,[]},
      {rabbit_priority_queue,start,1,[]},
      {rabbit_amqqueue,recover,0,[]},
      {rabbit,recover,0,[]},
      {rabbit,'-run_step/2-lc$^1/1-1-',1,[]},
      {rabbit,run_step,2,[]},
      {rabbit,'-run_boot_steps/1-lc$^0/1-0-',1,[]}]}}

Log files (may contain more information):
   /var/log/rabbitmq/rabbit@iZ23nn1p4mjZ.log
   /var/log/rabbitmq/rabbit@iZ23nn1p4mjZ-sasl.log
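A first look pointed to a corrupted recovery.dets file: the not_a_dets_file error names it directly, and on disk the file turned out to be 0 bytes, probably left truncated when the broker was killed. After deleting the file, the problem was resolved.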
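The check-and-remove step can be sketched in Python. This is only an illustration under stated assumptions: the helper name quarantine_empty_dets is hypothetical, the path is the one from the log above, and the file is moved aside instead of deleted so it can be restored if needed. It assumes the broker is stopped and the script runs with permission to modify the mnesia directory.

```python
import os
import shutil
import time


def quarantine_empty_dets(path):
    """Move a zero-byte file aside; return the backup path, or None if untouched.

    Hypothetical helper: a 0-byte recovery.dets cannot be a valid DETS file,
    so it is renamed rather than deleted outright.
    """
    # Leave missing files and non-empty files alone.
    if not os.path.isfile(path) or os.path.getsize(path) > 0:
        return None
    backup = "{}.bak.{}".format(path, int(time.time()))
    shutil.move(path, backup)
    return backup


if __name__ == "__main__":
    # Path taken from the startup log; adjust the node name for other hosts.
    dets = "/var/lib/rabbitmq/mnesia/rabbit@iZ23nn1p4mjZ/recovery.dets"
    moved = quarantine_empty_dets(dets)
    print("moved to " + moved if moved else "nothing to do")
```

A non-empty but still corrupted recovery.dets is deliberately left untouched by this sketch; that case needs manual inspection before anything is removed.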