• Oracle 10g RAC root.sh: solution for "Failure at final check of Oracle CRS stack 10"


    I. Problem description

    Installation environment: Oracle Linux 6.1

    Database: 10.2.0.1

    While installing Oracle 10g RAC, running root.sh on the first node fails as follows:

    [root@rac1 ~]# /u01/app/10.2.0/grid/root.sh

    WARNING: directory '/u01/app/10.2.0' is not owned by root

    WARNING: directory '/u01/app' is not owned by root

    WARNING: directory '/u01' is not owned by root

    Checking to see if Oracle CRS stack is already configured

    Setting the permissions on OCR backup directory

    Setting up NS directories

    Oracle Cluster Registry configuration upgraded successfully

    WARNING: directory '/u01/app/10.2.0' is not owned by root

    WARNING: directory '/u01/app' is not owned by root

    WARNING: directory '/u01' is not owned by root

    Successfully accumulated necessary OCR keys.

    Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.

    node <nodenumber>: <nodename> <private interconnect name> <hostname>

    node 1: rac1 rac1-priv rac1

    node 2: rac2 rac2-priv rac2

    Creating OCR keys for user 'root', privgrp 'root'..

    Operation successful.

    Now formatting voting device: /dev/raw/raw3

    Now formatting voting device: /dev/raw/raw4

    Now formatting voting device: /dev/raw/raw5

    Format of 3 voting devices complete.

    Startup will be queued to init within 90 seconds.

    Adding daemons to inittab

    Expecting the CRS daemons to be up within 600 seconds.

    Failure at final check of Oracle CRS stack.

    10

    II. MOS (My Oracle Support) notes that describe this problem:

    2.1 Note 1:

    Root.sh failed at Failure at final check of Oracle CRS stack 10 [ID 725878.1]

    Case

    This particular case is caused by the OS init system not working.

    "Failure at final check of Oracle CRS stack. 10" means the CRS daemons did not start up within the 600-second period.

    The root.sh script adds CRS-related entries to /etc/inittab, runs "init q", and expects 3 CRS-related daemon processes to start, e.g.:

    init.cssd
    init.crsd
    init.evmd

    With an init system problem, none of these daemon processes are spawned. This causes the CRS process startup to fail, as the CRS processes rely on the CRS daemon processes starting first.
    -- In other words, a problem with the init system keeps the processes from starting. It can be verified as follows:


    This can be verified by adding a simple entry in /etc/inittab:

    test:2:once:/usr/bin/echo "HELLOTEST" > /tmp/test.log


    Run "init q" as the root user. If init is working, a file /tmp/test.log should be generated.
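
    The same idea can be applied directly to the CRS entries. The following is a minimal sketch, not part of Oracle's tooling: the function names are hypothetical, and it only assumes that the inittab entries and the process listing mention the three script names quoted above (init.cssd, init.crsd, init.evmd). As far as I know, Enterprise Linux 6 releases use Upstart, which reads /etc/inittab only for the default runlevel, so entries present in the file but never spawned would match the symptom described in this note.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; the names are not Oracle's.

# Count the non-comment inittab lines that reference a CRS init script.
# $1: path to the inittab file (normally /etc/inittab).
count_crs_inittab_entries() {
    grep -v '^[[:space:]]*#' "$1" | grep -c -E 'init\.(cssd|crsd|evmd)'
}

# Read a process listing on stdin and print every CRS init script
# that does not appear in it, one name per line.
missing_crs_daemons() {
    listing=$(cat)
    for daemon in init.cssd init.crsd init.evmd; do
        if ! printf '%s\n' "$listing" | grep -q -F "$daemon"; then
            printf '%s\n' "$daemon"
        fi
    done
    return 0
}
```

    A possible use is `count_crs_inittab_entries /etc/inittab` followed by `ps -ef | missing_crs_daemons`. If the first reports three entries while the second still lists all three names, init has the entries but is not acting on them.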

    Solution

    -- MOS gives a solution only for AIX, as follows:

    Please consult with the system administrator to fix the init issue.

    Here the solution is only valid for the AIX platform:

    1. Starting the script install_assist (AIX GUI utility Installation Assistance)
    2. Updating for example the date, then exit install_assist properly
    3. Reboot the system
    After that daemon process in /etc/inittab started, CRS installation completed.

    2.2 Note 2:

    Clusterware Fails To Start During Root.sh -- "Failure at final check of Oracle CRS stack 10" [ID 329450.1]

    The Oracle Clusterware runs as root, but for some operations it needs to run as the oracle user, and uses "su -l", which invokes the oracle user's shell login/profile script. If that shell profile script has interactive or CPU-bound operations or prompts, this may affect the Clusterware operation.

    -- This note says that interactive commands in .bash_profile cause the problem; removing them fixes it.
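
    A quick way to look for such commands is to scan the profile for statements that usually wait on a terminal. This is only a heuristic sketch: the function name and the list of commands (read, stty, select, clear, tput) are my own assumptions, not taken from the MOS note.

```shell
#!/bin/sh
# Hypothetical helper: print (with line numbers) the lines of a shell
# profile that start with a command which typically needs a terminal.
# The command list is an assumption and is certainly not exhaustive.
# $1: profile file, e.g. /home/oracle/.bash_profile
find_interactive_profile_lines() {
    grep -n -E '^[[:space:]]*(read|stty|select|clear|tput)([[:space:]]|$)' "$1"
}
```

    For example, `find_interactive_profile_lines /home/oracle/.bash_profile` (the path is an assumption about where the oracle user's home directory is). Running `time su -l oracle -c true` as root is another simple check: it should return immediately and without prompting.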

    Other notes:

    Troubleshooting 10g or 11.1 Oracle Clusterware Root.sh Problems [ID 240001.1]

    III. Problem analysis

    Check the relevant logs:

    [oracle@rac1 client]$ pwd

    /u01/app/10.2.0/grid/log/rac1/client

    [oracle@rac1 client]$ ls

    clscfg_6337.log  clsc.log css.log  ocrconfig_6285.log

    [oracle@rac1 client]$ tail -30 css.log

    2012-07-12 23:23:15.565: [CSSCLNT][3681171200]clsssInitNative: connect failed, rc 9

    2012-07-12 23:23:16.977: [CSSCLNT][3681171200]clsssInitNative: connect failed, rc 9

    2012-07-12 23:23:18.390: [CSSCLNT][3681171200]clsssInitNative: connect failed, rc 9

    2012-07-12 23:23:19.885: [CSSCLNT][3681171200]clsssInitNative: connect failed, rc 9

    [oracle@rac1 client]$ tail -10 clsc.log

    Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.

    2012-07-12 23:24:51.389: [default][4093163264]Terminating clsd session

    2012-07-12 23:25:00.274: [default][135894784]Terminating clsd session

    [oracle@rac1 client]$ tail clscfg_6337.log

    Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle. All rights reserved.

    2012-07-12 23:12:24.477: [  CLSCFG][1566725888]clscfg: Nodelist is [rac1 rac2 ]

    [oracle@rac1 rac1]$ cat alertrac1.log

    2012-07-12 10:06:10.703

    [client(6285)]CRS-1006:The OCR location /dev/raw/raw2 is inaccessible. Details in /u01/app/10.2.0/grid/log/rac1/client/ocrconfig_6285.log.

    2012-07-12 10:06:11.076

    [client(6285)]CRS-1001:The OCR was formatted using version 2.

    2012-07-12 10:12:24.479

    [client(6337)]CRS-1801:Cluster crs configured with nodes rac1 rac2 .
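
    The repeated "connect failed, rc 9" lines in css.log suggest that the client never reaches the CSS daemon, which would be consistent with the daemons not being started at all. A small sketch for summarizing such a log follows; the function name is hypothetical and it assumes the line format shown in the css.log excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: count the "connect failed" lines in a CSS client
# log and report the return code from the last one.
# $1: path to the log, e.g. .../log/rac1/client/css.log
summarize_css_log() {
    awk '/clsssInitNative: connect failed/ { n++; rc = $NF }
         END { printf "failures=%d last_rc=%s\n", n, (n ? rc : "none") }' "$1"
}
```

    Running it before and after a change (for example, after cleaning up and rerunning root.sh) shows whether the client is still failing to connect.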

    -- On one node, run the following command as root to clear the information in the OCR:

    [root@rac1 ~]# sh /u01/app/10.2.0/grid/install/rootdeinstall.sh

    Removing contents from OCR mirror device

    2560+0 records in

    2560+0 records out

    10485760 bytes (10 MB) copied, 2.46509 s, 4.3 MB/s

    Removing contents from OCR device

    2560+0 records in

    2560+0 records out

    10485760 bytes (10 MB) copied, 1.18886 s, 8.8 MB/s

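    Running root.sh again afterwards gave the same failure.

    I then tried the following:

    1.     Disable the firewall

    I had already disabled the firewall before the installation, so this is only a check.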

    [root@rac1 tmp]# service iptables status

    iptables: Firewall is not running.

    [root@rac1 tmp]# chkconfig iptables --list

    iptables        0:off  1:off   2:off   3:off  4:off   5:off   6:off
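
    When the same check has to be repeated on every node, the chkconfig line can be tested mechanically. A minimal sketch, assuming the "N:on" / "N:off" output format shown above; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a "chkconfig --list" line shows
# the service switched off in every runlevel (no "N:on" field present).
# $1: one line of chkconfig output
all_runlevels_off() {
    ! printf '%s\n' "$1" | grep -q -E '[0-6]:on'
}
```

    For example, `all_runlevels_off "$(chkconfig iptables --list)" && echo "iptables disabled at boot"`.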

    2.     Commented out the entries in the following file:

    [root@rac1 tmp]# cat /etc/pam.d/other

    #%PAM-1.0

    auth    required       pam_deny.so

    account required       pam_deny.so

    password required       pam_deny.so

    session required       pam_deny.so

    3.     Delete the related socket files

    # rm -f /usr/tmp/.oracle/*

    # rm -f /tmp/.oracle/*

    # rm -f /var/tmp/.oracle/*

    Unable To Connect To Cluster Manager Ora-29701 as Network Socket Files are Removed [ID 391790.1]
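
    Since the title of the note above attributes ORA-29701 to removed socket files, it seems safer to look at what the rm commands would delete before running them, and to run them only while the CRS stack is down. The following dry-run sketch only lists files; the function name and the optional prefix argument are hypothetical (the prefix exists purely so the function can be tried against a scratch directory).

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the files that the three
# "rm -f <dir>/.oracle/*" commands would remove, without removing them.
# $1: optional root prefix for testing; empty means the real directories.
list_oracle_sockets() {
    prefix=${1:-}
    for dir in /usr/tmp/.oracle /tmp/.oracle /var/tmp/.oracle; do
        [ -d "$prefix$dir" ] || continue
        for f in "$prefix$dir"/*; do
            # An unmatched glob stays literal; skip it.
            if [ -e "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    done
    return 0
}
```

    Calling `list_oracle_sockets` with no argument inspects the real directories; an empty result means there is nothing to clean up.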

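    After cleaning up with sh /u01/app/10.2.0/grid/install/rootdeinstall.sh and installing again, the problem remained, so it may still be a compatibility problem.

    Later I switched the OS to Red Hat 5.4 and the installation succeeded, so this is probably a compatibility problem between Oracle 10g and Oracle Linux 6. I have also tested Oracle 11.2.0.3 RAC on Oracle Linux 6, and it installs without problems.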


  • Original article: https://www.cnblogs.com/tianlesoftware/p/3609218.html