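In an Oracle RAC cluster with 3 OCR/voting disks, if 2 of those disks fail at the same time, how can the cluster be started from the single remaining undamaged OCR/voting disk?
How can the 2 damaged OCR/voting disks be repaired so that the Oracle cluster starts again?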
Exadata : 12c CRS Clusterware Cannot Start Due To Missing Voting Disks "CRS-1637: Unable to locate configured voting file with ID", "CRS-1705: Found 1 configured voting files but 2 voting files are required" ( Doc ID 2294135.1 )
At the operational level, the emergency procedure comes down to two key steps:
[root@orcldbadm01 trace]# crsctl start crs -excl -nocrs #### This operation is executed on node #1 only.
Therefore, you need to move the voting disks from the original diskgroup (e.g. +DBFS_DG)
to another diskgroup (e.g. +DATA_ORCL) as follows:
[root@orcldbadm01 trace]# crsctl replace votedisk +DATA_ORCL #### This operation is executed on node #1 only.
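The two steps above can be wrapped in a small dry-run script for an emergency runbook. This is only a sketch under stated assumptions: the `run` wrapper, the `DRY_RUN` variable and the `relocate_voting_files` function are hypothetical names, and the `crsctl query css votedisk`, `crsctl stop crs -f` and `crsctl start crs` calls are additions not quoted in the note above, so check them against the MOS note before use.

```shell
#!/bin/sh
# Hypothetical wrapper: prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Sketch of the emergency sequence; intended to run as root on node 1 only.
# $1 = target disk group, e.g. +DATA_ORCL (the example name used in the note).
relocate_voting_files() {
    target_dg=$1
    # Step 1 (from the note): start the stack in exclusive mode without CRSD.
    run crsctl start crs -excl -nocrs || return 1
    # Assumption: inspect the current voting file state before changing it.
    run crsctl query css votedisk || return 1
    # Step 2 (from the note): recreate the voting files in another disk group.
    run crsctl replace votedisk "$target_dg" || return 1
    # Assumption: confirm the new location, then leave exclusive mode and restart.
    run crsctl query css votedisk || return 1
    run crsctl stop crs -f || return 1
    run crsctl start crs || return 1
}
```

By default the function only prints the commands, for example `relocate_voting_files +DATA_ORCL`; setting `DRY_RUN=0` would execute them, which should only be done on the affected node after review.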
Dear customer,
Regarding your first question, the MOS note Troubleshooting Clusterware Node Evictions (Reboots) (Doc ID 1050693.1), in section 3.1 - COMMON CAUSES OF OCSSD EVICTIONS, states:
Problems writing to or reading from the CSS voting disk. If the node cannot perform a disk heartbeat to the majority of its voting files, then the node will be evicted.
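The majority rule quoted above can be illustrated with a little arithmetic. The sketch below assumes that "majority" means a strict majority, that is floor(n/2) + 1 of the n configured voting files, which is consistent with the CRS-1705 message "Found 1 configured voting files but 2 voting files are required" for a 3-file configuration. The function names are hypothetical and are not part of any Oracle tool.

```shell
#!/bin/sh
# Strict majority of n configured voting files: floor(n/2) + 1.
required_voting_files() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds (exit status 0) only if the accessible count reaches the majority.
# $1 = configured voting files, $2 = voting files the node can still access.
has_voting_quorum() {
    configured=$1
    accessible=$2
    [ "$accessible" -ge "$(required_voting_files "$configured")" ]
}
```

With 3 configured voting files the required number is 2, so `has_voting_quorum 3 1` fails while `has_voting_quorum 3 2` succeeds.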
Dear customer,
The following is an example from an Exadata system; its version differs from the one you recorded when opening the SR:
Exadata : 12c CRS Clusterware Cannot Start Due To Missing Voting Disks "CRS-1637: Unable to locate configured voting file with ID", "CRS-1705: Found 1 configured voting files but 2 voting files are required" ( Doc ID 2294135.1 )
That system was also originally configured with 3 voting files, and two of them later went missing:
2017-08-04 13:46:21.330 [OCSSD(37743)]CRS-1637: Unable to locate configured voting file with ID e20b974c-95d84f0f-bffb19b9-1fb60a42; details at (:CSSNM00020:) in /u01/app/grid/diag/crs/orcldbadm01/crs/trace/ocssd.trc
2017-08-04 13:46:21.330 [OCSSD(37743)]CRS-1637: Unable to locate configured voting file with ID 58c6462b-fed64f66-bfaca266-1a8ddc25; details at (:CSSNM00020:) in /u01/app/grid/diag/crs/orcldbadm01/crs/trace/ocssd.trc
2017-08-04 13:46:21.330 [OCSSD(37743)]CRS-1705: Found 1 configured voting files but 2 voting files are required, terminating to ensure data integrity; details at (:CSSNM00021:) in /u01/app/grid/diag/crs/orcldbadm01/crs/trace/ocssd.trc
2017-08-04 13:46:22.330 [OCSSD(37743)]CRS-1656: The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/grid/diag/crs/orcldbadm01/crs/trace/ocssd.trc
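When checking an alert log for this condition, the relevant values can be pulled out of messages like the ones above with standard text tools. The following is a minimal sketch that assumes the message wording is exactly as shown in this excerpt; the function names are hypothetical, and the log is read from standard input.

```shell
#!/bin/sh
# Print the IDs of voting files reported missing by CRS-1637, one per line.
missing_voting_file_ids() {
    sed -n 's/.*CRS-1637: Unable to locate configured voting file with ID \([0-9a-f-]*\);.*/\1/p'
}

# Print "<found> <required>" for each CRS-1705 message.
parse_crs1705() {
    sed -n 's/.*CRS-1705: Found \([0-9][0-9]*\) configured voting files but \([0-9][0-9]*\) voting files are required.*/\1 \2/p'
}
```

Both functions are filters, so they would typically be fed the Clusterware alert log, for example `missing_voting_file_ids < alert.log`, where `alert.log` stands in for the actual log path on the node.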
Although the note was written with information taken from an Exadata environment, the underlying principle is the same. Could you check whether it helps in your case?
At the operational level, the emergency procedure comes down to two key steps:
[root@orcldbadm01 trace]# crsctl start crs -excl -nocrs #### This operation is executed on node #1 only.
Therefore, you need to move the voting disks from the original diskgroup (e.g. +DBFS_DG)
to another diskgroup (e.g. +DATA_ORCL) as follows:
[root@orcldbadm01 trace]# crsctl replace votedisk +DATA_ORCL #### This operation is executed on node #1 only.
Best Regards,
Oracle Support - Sunday [Call - Outbound]
Call - Outbound
Called customer.
The customer needs an answer on this point in order to define a workable way of using and configuring the Oracle product.
Sunday [Update from Customer]
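Hello,
We need to prepare an emergency response plan for this kind of failure and define a detailed recovery strategy.
Failure scenario:
In a cluster with 3 OCR/voting disks, if 2 of the voting disks are damaged at the same time and become inaccessible to all nodes, how does the cluster react?
In that situation, how can the cluster be started from the single remaining undamaged OCR/voting disk?
Also, how can the 2 damaged OCR/voting disks be repaired so that the Oracle cluster starts normally?
Thank you!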
Oracle Support - Sunday [ODM Action Plan]
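Dear customer,
So far I have not found a documented description of this case. After all nodes lose access to some of the voting files at the same time,
those nodes would all access the one that remains, and each node would still be able to obtain a disk heartbeat.
Has this situation actually occurred in your environment? Or could you reproduce and test such a failure in your own environment?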
Best Regards,
Sunday [Update from Customer]
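Hello,
Yes, the question is exactly this: if 2 voting disks are damaged at the same time and become inaccessible to all nodes, how does the cluster react?
In that situation, how can the cluster be started from the single remaining undamaged OCR/voting disk?
Also, how can the 2 damaged OCR/voting disks be repaired so that the Oracle cluster starts again?
Thank you!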
Saturday [Update from Customer]
Do you mean the case where 2 voting disks are damaged at the same time and inaccessible to all nodes?
<<<<<<<<<<<
Yes. I mean exactly the case where 2 voting disks are damaged at the same time and inaccessible to all nodes.