• Oracle 11.2.0.1 RAC Grid fails to start: Oracle High Availability Services startup failed



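    I installed an 11.2.0.1 RAC on virtual machines. I chose 11.2.0.1 because of an issue with the public IP and private network segments. While I was creating the instance the computer froze, and after the reboot CRS would not start.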

    [root@rac1 bin]# ./crsctl start crs

    CRS-4124: Oracle High Availability Services startup failed.

    CRS-4000: Command Start failed, or completed with errors.

     

    [root@rac1 bin]# ps -ef|grep has

    root     8081     1  0 03:14 ?        00:00:00 /u01/app/grid/11.2.0/bin/ohasd.bin reboot

    root     8137  4230  1 03:23 pts/0    00:00:00 grep has

    [root@rac1 bin]# kill -9 8081

    [root@rac1 bin]# ./crsctl start crs

    CRS-4124: Oracle High Availability Services startup failed.

    CRS-4000: Command Start failed, or completed with errors.

    Check the logs:

    [grid@rac2 rac2]$ ll

    total 72

    drwxr-x--- 2 grid oinstall 4096 Nov 21 00:38 admin

    drwxrwxr-t 4 root oinstall 4096 Nov 21 00:38 agent

    -rw-rw-r-- 1 root root     9693 Nov 21 02:26 alertrac2.log

    drwxr-x--- 2 grid oinstall 4096 Nov 21 00:43 client

    drwxr-x--- 2 root oinstall 4096 Nov 21 00:42 crsd

    drwxr-x--- 2 grid oinstall 4096 Nov 21 00:39 cssd

    drwxr-x--- 2 root oinstall 4096 Nov 21 00:41 ctssd

    drwxr-x--- 2 grid oinstall 4096 Nov 21 00:39 diskmon

    drwxr-x--- 2 grid oinstall 4096 Nov 21 00:42 evmd

    drwxr-x--- 2 grid oinstall 4096 Nov 21 00:38 gipcd

    drwxr-x--- 2 root oinstall 4096 Nov 21 00:38 gnsd

    drwxr-x--- 2 grid oinstall 4096 Nov 21 00:40 gpnpd

    drwxr-x--- 2 grid oinstall 4096 Nov 21 00:38 mdnsd

    drwxr-x--- 2 root oinstall 4096 Nov 21 00:39 ohasd

    drwxrwxr-t 5 grid oinstall 4096 Nov 21 00:38 racg

    drwxr-x--- 2 grid oinstall 4096 Nov 21 00:42 srvm

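    Apart from alertrac2.log, which was updated around the time of the crash, none of the other files had changed. I then retried the restart on node 1: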

    [root@rac1 client]# ll

    total 124

    -rw-r--r-- 1 root root       193 Nov 21 00:31 clscfg.log

    -rw-rw-rw- 1 root root     28635 Nov 21 00:32 crsctl.log

    -rw-r--r-- 1 root root       114 Nov 21 00:32 crsctl.trc

    -rw-r--r-- 1 grid oinstall   663 Nov 21 03:08 css.log

    -rw-r--r-- 1 grid oinstall  1051 Nov 21 00:28 gpnptool_11653.log

    -rw-r--r-- 1 grid oinstall   114 Nov 21 00:28 gpnptool_11653.trc

    -rw-r--r-- 1 grid oinstall  1461 Nov 21 00:28 gpnptool_11660.log

    -rw-r--r-- 1 grid oinstall   114 Nov 21 00:28 gpnptool_11660.trc

    -rw-r--r-- 1 grid oinstall   551 Nov 21 00:35 oclskd.log

    -rw-r----- 1 root root      6100 Nov 21 00:27 ocrconfig_11312.log

    -rw-r--r-- 1 root root      3170 Nov 21 00:31 ocrconfig_12191.log

    -rw-r----- 1 root root       342 Nov 21 00:37 ocrconfig_13798.log

    -rw-r--r-- 1 grid oinstall 33862 Nov 21 00:45 oifcfg.log

    -rw-r--r-- 1 grid oinstall   114 Nov 21 00:45 oifcfg.trc

    -rw-r--r-- 1 root root      1067 Nov 21 00:36 olsnodes.log

    -rw-r--r-- 1 grid oinstall   114 Nov 21 00:37 olsnodes.trc

    -- css.log contains nothing but the following errors:

    [root@rac1 client]# cat css.log

    Oracle Database 11g Clusterware Release 11.2.0.1.0 - Production Copyright 1996, 2009 Oracle. All rights reserved.

    2012-11-21 03:08:22.764: [ CSSCLNT][4171966208]clssscConnect: gipc request failed with 29 (0x13)

    2012-11-21 03:08:22.764: [ CSSCLNT][4171966208]clsssInitNative: connect failed, rc 29

    2012-11-21 03:08:28.140: [ CSSCLNT][4171966208]clssscConnect: gipc request failed with 29 (0x13)

    2012-11-21 03:08:28.140: [ CSSCLNT][4171966208]clsssInitNative: connect failed, rc 29

    2012-11-21 03:08:37.908: [ CSSCLNT][4171966208]clssscConnect: gipc request failed with 29 (0x13)

    2012-11-21 03:08:37.908: [ CSSCLNT][4171966208]clsssInitNative: connect failed, rc 29

    According to the MOS note:

    How to Troubleshoot Grid Infrastructure Startup Issues [ID 1050908.1]

    http://blog.csdn.net/tianlesoftware/article/details/6013763

    1. ocssd is fully up

    If ocssd.bin is not fully up, crsd.log will show messages like following:

    2010-02-03 22:37:51.638: [ CSSCLNT][1548456880]clssscConnect: gipc request failed with 29 (0x16)
    2010-02-03 22:37:51.638: [ CSSCLNT][1548456880]clsssInitNative: connect failed, rc 29
    2010-02-03 22:37:51.639: [  CRSRTI][1548456880] CSS is not ready. Received status 3 from CSS. Waiting for good status ..
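
As a rough aid for spotting this signature, the sketch below counts the `clsssInitNative: connect failed, rc 29` lines in a log file. The function name `count_css_connect_failures` is made up for illustration and is not an Oracle tool; the pattern deliberately tolerates the missing spaces that show up in pasted logs.

```shell
#!/bin/sh
# Hypothetical helper (not part of Oracle Clusterware): count the CSS client
# connect failures quoted above in the log file given as the first argument.
count_css_connect_failures() {
    # grep -c prints the number of matching lines (0 when there are none).
    # " *" allows for zero or more spaces after the colon and the comma.
    grep -cE 'clsssInitNative: *connect failed, *rc 29' "$1"
}
```

For example, `count_css_connect_failures css.log` run in the client log directory would print how many times the connect attempt failed.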

    So the OCSSD process is failing to start. But why? Let's run strace against the ohasd process:

    [root@rac1 client]# ps -ef|grep has

    root    12192     1  0 12:44 ?        00:00:00 /u01/app/grid/11.2.0/bin/ohasd.bin reboot

    root    12281  8085  0 13:05 pts/2    00:00:00 grep has

    [root@rac1 client]# strace -p 12192 -o dave.log

    Process 12192 attached - interrupt to quit

    quit

    Process 12192 detached

    [root@rac1 client]#

    [root@rac1 client]# ls

    clscfg.log dave.log           gpnptool_11660.trc  ocrconfig_13798.log  olsnodes.trc

    crsctl.log gpnptool_11653.log oclskd.log           oifcfg.log

    crsctl.trc gpnptool_11653.trc ocrconfig_11312.log  oifcfg.trc

    css.log    gpnptool_11660.log ocrconfig_12191.log  olsnodes.log

    [root@rac1 client]# cat dave.log

    open("/var/tmp/.oracle/npohasd", O_WRONLY <unfinished ...>

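    This is the key clue: ohasd is stuck in open() on the file /var/tmp/.oracle/npohasd. That file is a named pipe, and opening a pipe write-only blocks until some other process opens it for reading. The same file also causes trouble when installing an 11.2.0.1 RAC, and it is arguably a bug in 11.2.0.1.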

    Reference:

    Oracle 11g RAC: ohasd failed to start at /u01/app/11.2.0/grid/crs/install/rootcrs.pl line 443 (solution)

    http://blog.csdn.net/tianlesoftware/article/details/7697366

    So before starting CRS, first run the following dd command on both nodes:

    [root@rac1 client]# /bin/dd if=/var/tmp/.oracle/npohasd of=/dev/null bs=1024 count=1
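
The step above can be wrapped in a small script. This is only a sketch under stated assumptions: the pipe path is the one from the strace output, the names `drain_pipe_once` and `DRAIN_PID` are hypothetical, and running dd in the background (so it can be started before crsctl) is a choice made here, whereas the command above runs it in the foreground. Whether a single read is enough in every case is not something this write-up establishes.

```shell
#!/bin/sh
# Hypothetical helper: start one background reader on a named pipe so that a
# process blocked in open(..., O_WRONLY) on that pipe can proceed.
# Sets DRAIN_PID to the PID of the background dd.
drain_pipe_once() {
    pipe=$1
    # Refuse to touch anything that is not a named pipe (FIFO).
    if [ ! -p "$pipe" ]; then
        echo "not a named pipe: $pipe" >&2
        return 1
    fi
    # Same dd invocation as in the article; it blocks until a writer opens
    # the pipe, which is why it is sent to the background here.
    dd if="$pipe" of=/dev/null bs=1024 count=1 2>/dev/null &
    DRAIN_PID=$!
}
```

On each node, as root, this could be used as `drain_pipe_once /var/tmp/.oracle/npohasd && ./crsctl start crs`.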

    Then start CRS; this time it works:

    [root@rac1 bin]# ./crsctl start crs

    CRS-4123: Oracle High Availability Services has been started.

    [root@rac2 bin]# ./crsctl start crs

    CRS-4123: Oracle High Availability Services has been started.

    [root@rac2 bin]# ./crsctl check crs

    CRS-4638: Oracle High Availability Services is online

    CRS-4535: Cannot communicate with Cluster Ready Services

    CRS-4530: Communications failure contacting Cluster Synchronization Services daemon

    CRS-4534: Cannot communicate with Event Manager

    [root@rac1 bin]# ./crsctl check crs

    CRS-4638: Oracle High Availability Services is online

    CRS-4535: Cannot communicate with Cluster Ready Services

    CRS-4530: Communications failure contacting Cluster Synchronization Services daemon

    CRS-4534: Cannot communicate with Event Manager

    [root@rac1 bin]# ./crsctl start cluster -all

    CRS-5702: Resource 'ora.crsd' is already running on 'rac1'

    CRS-5702: Resource 'ora.crsd' is already running on 'rac2'

    [root@rac1 bin]# ./crsctl check crs

    CRS-4638: Oracle High Availability Services is online

    CRS-4535: Cannot communicate with Cluster Ready Services

    CRS-4529: Cluster Synchronization Services is online

    CRS-4533: Event Manager is online

    [root@rac2 bin]# ./crsctl check crs

    CRS-4638: Oracle High Availability Services is online

    CRS-4535: Cannot communicate with Cluster Ready Services

    CRS-4529: Cluster Synchronization Services is online

    CRS-4533: Event Manager is online
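
Because the stack comes up gradually, repeating `crsctl check crs` by hand can be replaced with a polling loop. The sketch below is an illustration, not an Oracle-provided tool: `crs_all_online` and `wait_for_crs` are hypothetical names, and "healthy" is assumed to mean that CRS-4638 is present and none of the failure codes seen above (CRS-4535, CRS-4530, CRS-4534) appear.

```shell
#!/bin/sh
# Hypothetical helper: read `crsctl check crs` output on stdin and succeed
# only if HAS is online (CRS-4638) and no failure code from the transcript
# above (CRS-4535, CRS-4530, CRS-4534) is present.
crs_all_online() {
    out=$(cat)
    printf '%s\n' "$out" | grep -q 'CRS-4638' || return 1
    if printf '%s\n' "$out" | grep -Eq 'CRS-(4535|4530|4534)'; then
        return 1
    fi
    return 0
}

# Hypothetical helper: wait_for_crs TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND up to TRIES times, sleeping DELAY_SECONDS after each failed
# attempt, and returns 0 as soon as its output passes crs_all_online.
wait_for_crs() {
    tries=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        if "$@" 2>/dev/null | crs_all_online; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

From the Grid home's bin directory, `wait_for_crs 30 10 ./crsctl check crs` would poll every 10 seconds for up to 30 attempts; the numbers are arbitrary.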

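    -- Check the resources: everything has come up. Note that the 11g processes are somewhat slow to start, so give them a little while.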

    [root@rac2 u01]# sh crs_stat.sh

    Name                           Target     State     Host     

    ------------------------------ ---------- --------- -------

    ora.DATA.dg                    ONLINE     ONLINE    rac1     

    ora.FRA.dg                     ONLINE     ONLINE    rac1     

    ora.LISTENER.lsnr              ONLINE     ONLINE    rac1     

    ora.LISTENER_SCAN1.lsnr        ONLINE     ONLINE    rac2      

    ora.OCRVOTING.dg               ONLINE     ONLINE    rac1     

    ora.asm                        ONLINE     ONLINE    rac1     

    ora.dave.db                    OFFLINE    OFFLINE             

    ora.eons                       ONLINE     ONLINE    rac1     

    ora.gsd                        OFFLINE    OFFLINE             

    ora.net1.network               ONLINE     ONLINE    rac1     

    ora.oc4j                       OFFLINE    OFFLINE             

    ora.ons                        ONLINE     ONLINE    rac1     

    ora.rac1.ASM1.asm              ONLINE     ONLINE    rac1     

    ora.rac1.LISTENER_RAC1.lsnr    ONLINE     ONLINE    rac1

    ora.rac1.gsd                   OFFLINE    OFFLINE             

    ora.rac1.ons                   ONLINE     ONLINE    rac1     

    ora.rac1.vip                   ONLINE     ONLINE    rac1     

    ora.rac2.ASM2.asm              ONLINE     ONLINE    rac2     

    ora.rac2.LISTENER_RAC2.lsnr    ONLINE     ONLINE    rac2

    ora.rac2.gsd                   OFFLINE    OFFLINE             

    ora.rac2.ons                   ONLINE     ONLINE    rac2     

    ora.rac2.vip                   ONLINE     ONLINE    rac2     

    ora.scan1.vip                  ONLINE     ONLINE    rac2     
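
To pick out what is still down from a listing like this, a short awk filter is enough. This is a sketch that assumes the four-column layout shown above (Name, Target, State, Host); `crs_stat.sh` is the author's own wrapper and is not shown, and `list_not_online` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: read a crs_stat-style table on stdin and print the
# names of resources whose State column (3rd field) is not ONLINE.
list_not_online() {
    # Only rows whose first field starts with "ora." are resources; this
    # skips the header, the dashed separator and any blank lines.
    awk '$1 ~ /^ora\./ && $3 != "ONLINE" { print $1 }'
}
```

For example, `sh crs_stat.sh | list_not_online` would list only the resources that are not online, such as the gsd, oc4j and database resources in the output above.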

    Now I can get back to the instance. Once that is sorted out I will upgrade to 11.2.0.3.4, so that I don't run into this problem every time.

    ---------------------------------------------------------------------------------------

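    All rights reserved. This article may be reproduced, but only with a link to the original source; otherwise legal action will be pursued.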

    Skype:    tianlesoftware

    QQ:       tianlesoftware@gmail.com

    Email:    tianlesoftware@gmail.com

    Blog:     http://blog.csdn.net/tianlesoftware

    Weibo:    http://weibo.com/tianlesoftware

    Twitter:  http://twitter.com/tianlesoftware

    Facebook: http://www.facebook.com/tianlesoftware

    Linkedin: http://cn.linkedin.com/in/tianlesoftware


  • Original article: https://www.cnblogs.com/tianlesoftware/p/3609160.html