The fault looks like this:
root@drbd1:~# drbd-overview
0:data/0 StandAlone Primary/Unknown UpToDate/DUnknown /data/mysql ext3 3.9G 8.1M 3.7G 1%
root@drbd2:~# drbd-overview
0:data/0 StandAlone Primary/Unknown UpToDate/DUnknown /data/mysql ext3 3.9G 8.1M 3.7G 1%
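StandAlone state: no network configuration is available (there is no usable replication or synchronization network). Either the resource has not been connected, or an administrator disconnected it with drbdadm disconnect <resource>, or the connection was dropped because authentication failed or a split brain occurred.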
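The connection state is the second field of each drbd-overview line, so it can be pulled out with a short filter. This is a minimal sketch, assuming the output layout shown above; conn_states is a hypothetical helper name, not part of DRBD.

```shell
#!/bin/sh
# conn_states (hypothetical helper): read drbd-overview style output on
# stdin and print "<minor>:<resource>/<volume> <connection state>" per line.
conn_states() {
    # Resource lines start with a minor number followed by a colon.
    awk '$1 ~ /^[0-9]+:/ { print $1, $2 }'
}

# Sample run on the line from this post; on a live node one would use:
#   drbd-overview | conn_states
printf '%s\n' '0:data/0 StandAlone Primary/Unknown UpToDate/DUnknown /data/mysql ext3 3.9G 8.1M 3.7G 1%' | conn_states
```

If a line reports StandAlone on both nodes while both are Primary, as here, the log is the next place to look.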
Check the log:
root@drbd1:~# tail -n 20 /var/log/syslog
May 23 20:34:41 drbd1 kernel: [ 4629.177175] drbd data: Peer authenticated using 20 bytes HMAC
May 23 20:34:41 drbd1 kernel: [ 4629.177389] drbd data: conn( WFConnection -> WFReportParams )
May 23 20:34:41 drbd1 kernel: [ 4629.177391] drbd data: Starting asender thread (from drbd_r_data [10450])
May 23 20:34:41 drbd1 kernel: [ 4629.186967] block drbd0: drbd_sync_handshake:
May 23 20:34:41 drbd1 kernel: [ 4629.186970] block drbd0: self B4EF9EF8D6B328BD:1E9AC6C2E7980795:4B519345CD4008DE:4B509345CD4008DE bits:1024 flags:0
May 23 20:34:41 drbd1 kernel: [ 4629.186972] block drbd0: peer 7B0DFE0CF2812103:1E9AC6C2E7980794:4B519345CD4008DE:4B509345CD4008DE bits:1 flags:2
May 23 20:34:41 drbd1 kernel: [ 4629.186973] block drbd0: uuid_compare()=100 by rule 90
May 23 20:34:41 drbd1 kernel: [ 4629.186976] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
May 23 20:34:41 drbd1 kernel: [ 4629.188312] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
May 23 20:34:41 drbd1 kernel: [ 4629.188324] block drbd0: Split-Brain detected but unresolved, dropping connection!
May 23 20:34:41 drbd1 kernel: [ 4629.189831] block drbd0: helper command: /sbin/drbdadm split-brain minor-0
May 23 20:34:41 drbd1 kernel: [ 4629.191008] block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
May 23 20:34:41 drbd1 kernel: [ 4629.191028] drbd data: conn( WFReportParams -> Disconnecting )
May 23 20:34:41 drbd1 kernel: [ 4629.191030] drbd data: error receiving ReportState, e: -5 l: 0!
May 23 20:34:41 drbd1 kernel: [ 4629.191496] drbd data: asender terminated
May 23 20:34:41 drbd1 kernel: [ 4629.191497] drbd data: Terminating drbd_a_data
May 23 20:34:41 drbd1 kernel: [ 4629.218488] drbd data: Connection closed
May 23 20:34:41 drbd1 kernel: [ 4629.218551] drbd data: conn( Disconnecting -> StandAlone )
May 23 20:34:41 drbd1 kernel: [ 4629.218553] drbd data: receiver terminated
May 23 20:34:41 drbd1 kernel: [ 4629.218554] drbd data: Terminating drbd_r_data
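The line that matters in that log is "Split-Brain detected but unresolved, dropping connection!". A rough way to find it without reading the whole file is to search for the fixed string. The function name split_brain_events is made up for this sketch, and the log path is an assumption that depends on the distribution.

```shell
#!/bin/sh
# split_brain_events (hypothetical helper): print every kernel log line
# in which DRBD reported a split brain.
# $1: log file to scan, e.g. /var/log/syslog (path varies by distribution)
split_brain_events() {
    # -F treats the pattern as a fixed string, not a regular expression.
    grep -F 'Split-Brain detected' "$1"
}

# Usage on a node:
#   split_brain_events /var/log/syslog
```

grep exits with a non-zero status when nothing matches, so the function can also be used as a condition in a script.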
Solution:
1. Make sure every DRBD device is unmounted
root@drbd1:~# umount /dev/drbd0
root@drbd2:~# umount /dev/drbd0
2. Set all nodes to Secondary
root@drbd1:~# drbdadm secondary data
root@drbd2:~# drbdadm secondary data
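3. Disconnect the nodes (the resource is already StandAlone here, so the command fails with "unknown connection"; that error can be ignored)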
root@drbd2:~# drbdadm disconnect data
??: Failure: (162) Invalid configuration request
additional info from kernel:
unknown connection
Command 'drbdsetup-84 disconnect ipv4:10.11.8.158:7789 ipv4:10.11.8.145:7789' terminated with exit code 10
4. On drbd2 (the node whose changes will be discarded), run
root@drbd2:~# drbdadm connect data --discard-my-data
root@drbd2:~# drbd-overview
0:data/0 WFConnection Secondary/Unknown UpToDate/DUnknown
WFConnection state: this node waits until the peer node becomes reachable on the network
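If this step is scripted, the node can poll its own connection state instead of checking drbd-overview by hand. The sketch below relies on drbdadm cstate <resource>, which prints the current connection state; wait_connected is a hypothetical name, and the retry count and one-second interval are arbitrary choices. While a resync is running the state is not Connected yet, so the loop keeps waiting until the resync has finished.

```shell
#!/bin/sh
# wait_connected (hypothetical helper): poll until the resource reports
# the Connected state, or give up after a number of attempts.
# $1: resource name, e.g. data
# $2: maximum attempts, one second apart (default 30, an arbitrary choice)
wait_connected() {
    res=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        state=$(drbdadm cstate "$res" 2>/dev/null)
        [ "$state" = Connected ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage on a node:
#   wait_connected data 60 || echo "data is still not connected"
```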
5. On drbd1, run
root@drbd1:~# drbdadm connect data
root@drbd1:~# drbd-overview
0:data/0 Connected Secondary/Secondary UpToDate/UpToDate
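As a recap, the commands each node ran can be written down as a small dry-run helper. This is only a sketch: recovery_plan and the role names victim and survivor are invented for illustration, and the function prints the commands rather than executing them, so the choice of which node loses its changes stays a manual decision.

```shell
#!/bin/sh
# recovery_plan (hypothetical helper): print, but do not run, the
# split-brain recovery commands for one node, in the order used above.
# $1: role, "victim" (its changes are discarded) or "survivor"
# $2: DRBD resource name, e.g. data
# $3: DRBD device, e.g. /dev/drbd0
recovery_plan() {
    role=$1
    res=$2
    dev=$3
    case $role in
        victim|survivor) ;;
        *) echo "unknown role: $role" >&2; return 1 ;;
    esac
    echo "umount $dev"
    echo "drbdadm secondary $res"
    if [ "$role" = victim ]; then
        echo "drbdadm disconnect $res"
        echo "drbdadm connect $res --discard-my-data"
    else
        echo "drbdadm connect $res"
    fi
}

# In this post drbd2 was the victim and drbd1 the survivor:
recovery_plan victim data /dev/drbd0
recovery_plan survivor data /dev/drbd0
```

Run the victim's commands first, then the survivor's, matching steps 4 and 5.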