• 换了网线异常了,CRS无法正常启动,clssnmSendingThread: sending status msg to all nodes


    换了网线异常了,CRS无法正常启动,clssnmSendingThread: sending status msg to all nodes
    同事换网线前我将节点2正常关闭了,换完网线告诉我,发现节点2死活起不来了,看上面的日志和一些帖子最后也没解决,尝试过重启、网线拔掉重新插上、查看过存储是否正常和存储重新挂载。。。。看过一个帖子说可能是OCR信息发生了改变,不过之前没备份,也没忘这方面深入考虑。
    最后还是没搞定,主要是技术有限,没准确的定位出具体问题也不敢轻易乱动。。。
    20xx-12-16 19:01:05.792: [ CSSD][3786819328]clssnmSendingThread: sending join msg to all nodes
    20xx-12-16 19:01:05.792: [ CSSD][3786819328]clssnmSendingThread: sent 5 join msgs to all nodes
    20xx-12-16 19:01:06.295: [GIPCHALO][3811858176] gipchaLowerProcessNode: no valid interfaces found to node for 7286464 ms, node 0x7fecd0028450 { host 'myrac1', haName 'CSS_myrac-cluster', srcLuid fac66ea4-f1a960af, dstLuid 00000000-00000000 numInf 0, contigSeq 0, lastAck 0, lastValidAck 0, sendSeq [249 : 249], createTime 7037424, sentRegister 1, localMonitor 1, flags 0x4 }
    20xx-12-16 19:01:06.303: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
    20xx-12-16 19:01:06.420: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618800, LATS 7286584, lastSeqNo 211618797, uniqueness 1576485880, timestamp 1576494065/8540734
    20xx-12-16 19:01:06.435: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618802, LATS 7286594, lastSeqNo 211618799, uniqueness 1576485880, timestamp 1576494066/8541524
    20xx-12-16 19:01:07.304: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
    20xx-12-16 19:01:07.421: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618803, LATS 7287584, lastSeqNo 211618800, uniqueness 1576485880, timestamp 1576494066/8541734
    20xx-12-16 19:01:07.435: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618805, LATS 7287604, lastSeqNo 211618802, uniqueness 1576485880, timestamp 1576494067/8542524
    20xx-12-16 19:01:08.304: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
    20xx-12-16 19:01:08.422: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618806, LATS 7288584, lastSeqNo 211618803, uniqueness 1576485880, timestamp 1576494067/8542734
    20xx-12-16 19:01:08.436: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618808, LATS 7288604, lastSeqNo 211618805, uniqueness 1576485880, timestamp 1576494068/8543524
    20xx-12-16 19:01:09.304: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
    20xx-12-16 19:01:09.422: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618809, LATS 7289584, lastSeqNo 211618806, uniqueness 1576485880, timestamp 1576494068/8543744
    20xx-12-16 19:01:09.437: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618811, LATS 7289604, lastSeqNo 211618808, uniqueness 1576485880, timestamp 1576494069/8544524
    20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmRcfgMgrThread: Local Join
    20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: begin on node(2), waittime 193000
    20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: set curtime (7289964) for my node
    20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: scanning 32 nodes
    20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: Node myrac1, number 1, is in an existing cluster with disk state 3
    20xx-12-16 19:01:09.803: [ CSSD][3785242368]clssnmLocalJoinEvent: takeover aborted due to cluster member node found on disk
    20xx-12-16 19:01:10.305: [ CSSD][3789973248]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
    20xx-12-16 19:01:10.423: [ CSSD][3799754496]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618812, LATS 7290584, lastSeqNo 211618809, uniqueness 1576485880, timestamp 1576494069/8544744
    20xx-12-16 19:01:10.437: [ CSSD][3804591872]clssnmvDHBValidateNcopy: node 1, myrac1, has a disk HB, but no network HB, DHB has rcfg 471981092, wrtcnt, 211618814, LATS 7290604, lastSeqNo 211618811, uniqueness 1576485880, timestamp 1576494070/8545524
    20xx-12-16 19:01:10.794: [ CSSD][3786819328]clssnmSendingThread: sending join msg to all nodes
    20xx-12-16 19:01:10.794: [ CSSD][3786819328]clssnmSendingThread: sent 5 join msgs to all nodes


    20xx-12-16 20:36:02.919: [ CSSD][2756265728]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), commissioner(-1/0)
    20xx-12-16 20:36:02.919: [ CSSD][2756265728]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(118), status(0), sendresp(1)
    20xx-12-16 20:36:02.920: [ CSSD][2756265728]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(118) msgseq(119), lastupdt<0x7fbb58031e10>, ignoreseq(0)
    20xx-12-16 20:36:02.920: [ CSSD][2756265728]clssgmGrockOpTagProcess: Request to commission member(1) using key(1) for grock(CLSN.ONSNETPROC.MASTER)
    20xx-12-16 20:36:02.920: [ CSSD][2756265728]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), commissioner(1/1)
    20xx-12-16 20:36:02.920: [ CSSD][2756265728]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(119), status(0), sendresp(1)
    20xx-12-16 20:36:02.921: [ CSSD][2756265728]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(119) msgseq(120), lastupdt<0x7fbb5804d490>, ignoreseq(0)
    20xx-12-16 20:36:02.921: [ CSSD][2756265728]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), private data(2052), incarn(40)
    20xx-12-16 20:36:02.921: [ CSSD][2756265728]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(120), status(0), sendresp(1)
    20xx-12-16 20:36:02.922: [ CSSD][2756265728]clssgmTestSetLastGrockUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(120) msgseq(121), lastupdt<0x7fbb5803dee0>, ignoreseq(0)
    20xx-12-16 20:36:02.922: [ CSSD][2756265728]clssgmGrockOpTagProcess: Request to commission member(-1) using key(1) for grock(CLSN.ONSNETPROC.MASTER)
    20xx-12-16 20:36:02.922: [ CSSD][2756265728]clssgmUpdateGrpData: grock(CLSN.ONSNETPROC.MASTER), commissioner(-1/0)
    20xx-12-16 20:36:02.922: [ CSSD][2756265728]clssgmHandleGrockRcfgUpdate: grock(CLSN.ONSNETPROC.MASTER), updateseq(121), status(0), sendresp(1)
    20xx-12-16 20:36:05.064: [ CSSD][2753111808]clssnmSendingThread: sending status msg to all nodes
    20xx-12-16 20:36:05.064: [ CSSD][2753111808]clssnmSendingThread: sent 5 status msgs to all nodes
    20xx-12-16 20:36:09.065: [ CSSD][2753111808]clssnmSendingThread: sending status msg to all nodes
    20xx-12-16 20:36:09.065: [ CSSD][2753111808]clssnmSendingThread: sent 4 status msgs to all nodes
    20xx-12-16 20:36:14.066: [ CSSD][2753111808]clssnmSendingThread: sending status msg to all nodes
    ...

    根据日志能判断出bond信息变了吗?我当时没发现也没分析出来,最后同事说改了bond!当时不是说只换根网线重新排下线吗?我说改回去试试,果然如此,重启一切正常了

    胡乱重启了下,没起来。。。
    [root@myrac2 bin]# ./crsctl query crs activeversion
    Oracle Cluster Registry initialization failed accessing Oracle Cluster Registry device: PROC-26: Error while accessing the physical storage
    ORA-15077: could not locate ASM instance serving a required diskgroup

    [root@myrac2 bin]# ./ocrcheck
    PROT-602: Failed to retrieve data from the cluster registry
    PROC-26: Error while accessing the physical storage
    ORA-15077: could not locate ASM instance serving a required diskgroup

    [grid@myrac2 ~]$ cd /u01/app/11.2.0/grid/bin/
    [grid@myrac2 bin]$ srvctl start nodeapps -n myrac2
    PRCR-1070 : Failed to check if resource ora.gsd is registered
    Cannot communicate with crsd
    PRCR-1070 : Failed to check if resource ora.net1.network is registered
    Cannot communicate with crsd
    PRCR-1035 : Failed to look up CRS resource myrac2 for ora.cluster_vip.type
    PRCR-1068 : Failed to query resources
    Cannot communicate with crsd
    PRCR-1070 : Failed to check if resource ora.ons is registered
    Cannot communicate with crsd


    [grid@myrac2 bin]$ srvctl start asm -n myrac2
    PRCR-1070 : Failed to check if resource ora.asm is registered
    Cannot communicate with crsd


    [grid@myrac2 bin]$ srvctl start database -d testdb2
    PRCD-1027 : Failed to retrieve database testdb2
    PRCR-1115 : Failed to find entities of type resource that match filters ((NAME == ora.testdb2.db) && (TYPE == ora.database.type)) and contain attributes VERSION,ORACLE_HOME,DATABASE_TYPE
    Cannot communicate with crsd
    [grid@myrac2 bin]$

    节点2被修改的bond,明显跟1不一样
    [root@myrac2 11.2.0]# service network status
    Configured devices:
    lo bond0 bond1 em1 em2 em3 em4
    Currently active devices:
    lo em1 em2 em3 em4 bond0 bond1
    [root@myrac2 11.2.0]#

    节点1
    [root@myrac1 ~]# service network status
    Configured devices:
    lo bond0 em1 em2 em3 em4 idrac
    Currently active devices:
    lo em1 em2 em3 bond0

    抛开技术行不行先不说,单这件事来说,同事之间的合作有时候更重要。一不小心你就会给别人挖个坑或掉到别人给你挖的坑

  • 相关阅读:
    解决:导出excel身份证号码显示为科学计数法
    6个出色的基于JQuery的Tab选项卡实例2010/01/29 16:261. jQuery 选项卡界面 / 选项卡结构菜单教程
    dhl:禁用firefox缓存
    artDialog4.0.5
    dhl:解除ASP.NET上传文件大小限制
    dhl:样式在ie不同浏览器下呈现不出来的原因分析
    dhl:artDialog 3.0.4 跨框架下 穿越的问题
    jQuery 1.7 正式版已经可以下载使用。jQuery是一个JavaScript库,它简化了HTML文档遍历,事件处理,动画和为网络快速发展的Ajax交互。jQuery 1.7 版本加入了新的事件API .on() 和 .off(),提
    无法向会话状态服务器发出会话状态请求。请确保 ASP.NET State Service (ASP.NET 状态服务)已启动,并且客户端端口与服务器端口相同。
    专家视角看IT与架构
  • 原文地址:https://www.cnblogs.com/ritchy/p/12056084.html
Copyright © 2020-2023  润新知