• Oracle 11g R2 (11.2.0.4.0) + udev RAC Setup


    Preparation:
    Hardware requirements
    1 . Determine whether the OS platform is 64-bit or 32-bit.
    Check the system platform:

    # uname -m
    2 . At least 1.5 GB of memory is required to install GI; at least 2.5 GB to install GI plus a RAC database.
    Check the physical memory size:

    # grep MemTotal /proc/meminfo
    3 . Required swap space relative to physical memory (per the Oracle 11.2 installation guide):
    RAM between 1 GB and 2 GB: swap 1.5 times RAM
    RAM between 2 GB and 16 GB: swap equal to RAM
    RAM more than 16 GB: 16 GB of swap

    Check the swap size:


    # grep SwapTotal /proc/meminfo
    or

    # free
    4 . At least 1 GB of free /tmp space.
    Check the available temporary space:

    # df -k /tmp
    If /tmp has less than 1 GB free, set the TEMP and TMPDIR environment variables to point to a temporary directory on a filesystem that has enough space.

    Bourne, Bash, or Korn shell:

    $ TEMP=/mount_point/tmp
    $ TMPDIR=/mount_point/tmp
    $ export TEMP TMPDIR
    C shell:

    % setenv TEMP /mount_point/tmp
    % setenv TMPDIR /mount_point/tmp
    5 . At least 4.5 GB of disk space for the GI software.
    6 . On x86 (32-bit) Linux, at least 4 GB of disk space for the RAC database software.
    7 . On x86_64 Linux, at least 4.6 GB of disk space for the RAC database software.

    Network requirements
    1 . At least two NICs: one public, one private.
    2 . The public and private interface names must be identical on every node.
    3 . The public network must support TCP/IP.
    4 . The private network must support UDP and TCP/IP, at 1 Gb/s or faster.

    Set the hostname (required on node1 and node2)
    [root@node1 ~]# vi /etc/sysconfig/network
    NETWORKING=yes
    HOSTNAME=node1.localdomain
    GATEWAY=10.37.2.1
    IP configuration
    [root@slave2 ~]# vi /etc/sysconfig/network-scripts/ifcfg-eth1
    DEVICE=eth1
    TYPE=Ethernet
    UUID=684a1a37-17ff-4450-965a-22f1c5ff7594
    ONBOOT=yes
    NM_CONTROLLED=yes
    BOOTPROTO=none
    HWADDR=00:0C:29:BE:F6:A5
    IPADDR=192.168.52.170
    PREFIX=24
    DEFROUTE=yes
    IPV4_FAILURE_FATAL=yes
    IPV6INIT=no
    NAME="System eth1"
    IP planning (required on node1 and node2)
    If you resolve the SCAN through /etc/hosts, add the SCAN name and its VIP address there (one SCAN VIP per SCAN name).
    If you resolve the SCAN through DNS (up to three SCAN VIPs can map to one SCAN name), configure the corresponding DNS server in /etc/resolv.conf on every cluster node.
    Oracle also supports GNS (Grid Naming Service) resolution, but it only resolves within a single subnet, so this approach is not recommended.

    Using GNS (Grid Naming Service) resolution:

    Using DNS resolution (the recommended approach):


    This walkthrough uses local hosts-file resolution:

    [root@slave2 ~]# vi /etc/hosts
    #127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
    #node1
    10.37.2.170 node1.localdomain node1
    192.168.52.170 node1-pri.localdomain node1-pri
    10.37.2.174 node1-vip.localdomain node1-vip

    #node2
    10.37.2.171 node2.localdomain node2
    192.168.52.171 node2-pri.localdomain node2-pri
    10.37.2.173 node2-vip.localdomain node2-vip

    #scan-ip
    10.37.2.175 rac-scan.localdomain rac-scan
    Configure kernel parameters (required on node1 and node2)
    [root@node1 ~]# echo '
    fs.aio-max-nr = 1048576
    fs.file-max = 6815744
    kernel.shmmni = 4096
    kernel.sem = 250 32000 100 128
    net.ipv4.ip_local_port_range = 9000 65500
    net.core.rmem_default = 262144
    net.core.rmem_max = 4194304
    net.core.wmem_default = 262144
    net.core.wmem_max = 1048576
    '>> /etc/sysctl.conf
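    Apply the new settings without rebooting (standard sysctl usage):

    # /sbin/sysctl -p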
    User resource limits
    1 . To improve performance, set resource limits in /etc/security/limits.conf for the grid and oracle users:

    [root@node1 ~]# echo '
    oracle soft nproc 2047
    oracle hard nproc 16384
    oracle soft nofile 1024
    oracle hard nofile 65536
    oracle soft stack 10240

    grid soft nproc 2047
    grid hard nproc 16384
    grid soft nofile 1024
    grid hard nofile 65536
    grid soft stack 10240
    ' >> /etc/security/limits.conf
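    These limits take effect at the next login. A quick sanity check (not in the original) is to open a fresh session per user and compare against the values above:

    # su - grid -c 'ulimit -u -n -s'
    # su - oracle -c 'ulimit -u -n -s'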
    Create users (required on node1 and node2)
    # /usr/sbin/groupadd -g 1000 oinstall
    # /usr/sbin/groupadd -g 1100 asmadmin
    # /usr/sbin/groupadd -g 1200 dba
    # /usr/sbin/groupadd -g 1201 oper
    # /usr/sbin/groupadd -g 1300 asmdba
    # /usr/sbin/groupadd -g 1301 asmoper

    useradd -u 5001 -g oinstall -G dba,asmdba,oper -d /home/oracle oracle
    useradd -u 5002 -g oinstall -G dba,asmadmin,asmdba,asmoper -d /home/grid grid
    [root@node1 ~]# passwd grid
    [root@node1 ~]# passwd oracle
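    You can verify that the group memberships match the intent (compare with the id output shown near the end of this article):

    # id grid
    # id oracle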
    Seeing this many groups can be confusing, so here is a brief rundown of how the groups divide the work and privileges:
    oinstall : the owner group of both the GI and database software.
    Database groups:
    1 . OSDBA group (typically dba)
    This group is required to install the Oracle database software. It defines the operating system users that hold database administrator (SYSDBA) privileges. For an ASM instance, if no separate OSDBA, OSOPER, and OSASM groups are created, members of the OSDBA group also receive the SYSOPER and SYSASM privileges. In Oracle code examples this group is called dba. If you do not designate a separate group as the OSASM group, the OSDBA group you define also defaults to the OSASM group.
    To use a group name other than the default dba, you must choose the Advanced installation type.
    Members of the OSDBA group were formerly granted the SYSASM privilege on Oracle ASM instances, allowing them to mount and dismount disk groups. In 11gR2 this grant is removed when different operating system groups are designated as OSDBA and OSASM, and retained when the same group serves as both.
    2 . Oracle database OSOPER group (typically oper)
    This group is optional. Use it if you want a separate operating system group to hold a limited subset of database administration privileges. Members of the OSDBA group have all OSOPER privileges by default.
    To create a database administrator group with fewer privileges than the default dba group via OSOPER, you must choose the Advanced installation type.
    ASM groups:
    SYSASM is a system privilege that separates Oracle ASM storage administration from SYSDBA. In ASM 11gR2 (11.2), members of the database OSDBA group are not granted SYSASM.
    3 . OSASM group (typically asmadmin)
    This group is required. Create it as a separate group so that administration of Oracle ASM and of the Oracle database stay separate. In the Oracle documentation the operating system group granted this privilege is called the OSASM group; in code examples it is called asmadmin.
    Members of the OSASM group can connect to the ASM instance with SQL as SYSASM using operating system authentication. The SYSASM role can administer disk groups but cannot access the RDBMS.
    4 . ASM database administrator group (OSDBA for ASM, typically asmdba)
    Members of this group can read and write files stored in Oracle ASM. The GI and database software owners must be members, as must any OSDBA members that need access to ASM. Members can start and stop the instance and mount and dismount ASM disk groups.
    5 . ASM operator group (OSOPER for ASM, typically asmoper)
    This group is optional. It carries a limited subset of ASM administration privileges, such as starting and stopping the ASM instance. Members of the OSASM group hold all asmoper privileges by default.
    The asmoper group has fewer privileges than asmadmin and is only offered during an Advanced installation.
    If you want an OSOPER for ASM group, the GI software owner must be a member of it.

    Set user environment variables (required on node1 and node2)
    Shell configuration
    1 . Determine the current shell:

    $ echo $SHELL
    Bash shell (bash):

    $ vi .bash_profile
    Bourne shell (sh) or Korn shell (ksh):

    $ vi .profile
    C shell (csh or tcsh):

    % vi .login
    2 . On each node, run:

    echo 'session required pam_limits.so' >>/etc/pam.d/login

    3 . Configure shell limits for the oracle and grid users.
    For the Bourne, Bash, or Korn shell, add lines similar to the following to /etc/profile (or, on SUSE systems, to /etc/profile.local):

    if [ $USER = "oracle" ] || [ $USER = "grid" ]; then
        if [ $SHELL = "/bin/ksh" ]; then
            ulimit -p 16384
            ulimit -n 65536
        else
            ulimit -u 16384 -n 65536
        fi
        umask 022
    fi
    For the C shell (csh or tcsh), add the following lines to /etc/csh.login on Red Hat, OEL, or Asianux, or to /etc/csh.login.local on SUSE systems:

    if ( $USER == "oracle" || $USER == "grid" ) then
        limit maxproc 16384
        limit descriptors 65536
    endif
    4 . User profiles

    oracle user:

    [root@node1 ~]# vi /home/oracle/.bash_profile

    # .bash_profile

    # Get the aliases and functions
    if [ -f ~/.bashrc ]; then
    . ~/.bashrc
    fi

    # User specific environment and startup programs

    PATH=$PATH:$HOME/bin

    export PATH
    export PS1="[\u@\H \$]"
    export TMP=/tmp
    export TMPDIR=$TMP
    export ORACLE_HOSTNAME=node1.localdomain
    export ORACLE_SID=jhdb1
    export ORACLE_BASE=/u01/app/oracle
    export ORACLE_HOME=$ORACLE_BASE/product/11.2.0/db_1
    export ORACLE_UNQNAME=devdb
    export TNS_ADMIN=$ORACLE_HOME/network/admin
    export ORACLE_TERM=xterm
    export PATH=/usr/sbin:$PATH
    export PATH=$ORACLE_HOME/bin:$PATH
    export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib
    export CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib
    export EDITOR=vi
    export LANG=en_US
    export NLS_LANG=american_america.AL32UTF8
    export NLS_DATE_FORMAT='yyyy/mm/dd hh24:mi:ss'
    umask 022
    grid user:

    [root@node1 ~]# vi /home/grid/.bash_profile

    # .bash_profile

    # Get the aliases and functions
    if [ -f ~/.bashrc ]; then
    . ~/.bashrc
    fi

    # User specific environment and startup programs

    PATH=$PATH:$HOME/bin

    export PATH
    export PS1="[\u@\H \$]"
    export TMP=/tmp
    export TMPDIR=$TMP
    export ORACLE_SID=+ASM1
    export ORACLE_BASE=/u01/app/grid
    export ORACLE_HOME=/u01/app/11.2.0/grid
    export ORACLE_TERM=xterm
    export NLS_DATE_FORMAT='yyyy/mm/dd hh24:mi:ss'
    export TNS_ADMIN=$ORACLE_HOME/network/admin
    export PATH=/usr/sbin:$PATH
    export PATH=$ORACLE_HOME/bin:$PATH
    export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib
    export CLASSPATH=$ORACLE_HOME/JRE:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib
    export EDITOR=vi
    export LANG=en_US
    export NLS_LANG=american_america.AL32UTF8
    umask 022
    Load the user's default shell environment:
    Bash shell:

    $ . ./.bash_profile
    Bourne, Bash, or Korn shell:

    $ . ./.profile
    C shell:

    % source ./.login
    Tip
    The escape sequences PS1 understands, i.e. what the command prompt can display:

    \d : the date, in "weekday month date" format, e.g. "Mon Aug 1"
    \H : the full hostname; e.g. if my machine is named fc4.linux, this is fc4.linux
    \h : the hostname up to the first dot; in the example above, fc4, with .linux dropped
    \t : the time in 24-hour format, e.g. HH:MM:SS
    \T : the time in 12-hour format
    \A : the time in 24-hour HH:MM format
    \u : the current user's account name
    \v : the bash version
    \w : the full working directory path, with the home directory shown as ~
    \W : the basename of the working directory, i.e. only the last path component
    \# : the sequence number of the command in this session
    \$ : the prompt character: # for root, $ for ordinary users
    Create directories and set permissions (required on node1 and node2)
    mkdir -p /u01/app/11.2.0/grid
    chown -R grid:oinstall /u01
    mkdir /u01/app/oracle
    chown oracle:oinstall /u01/app/oracle
    chmod -R 775 /u01/
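    A quick check that the ownership landed as intended (a simple ls, not in the original):

    # ls -ld /u01 /u01/app/oracle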
    Configure SSH user equivalence (required on node1 and node2)
    [oracle@node2$]ssh-keygen -t rsa
    [oracle@node2$]ssh-keygen -t dsa
    [oracle@node2$]cat ~/.ssh/*.pub >>~/.ssh/authorized_keys
    [oracle@node2$]su - grid
    [grid@node2$]ssh-keygen -t rsa
    [grid@node2$]ssh-keygen -t dsa
    [grid@node2$]cat ~/.ssh/*.pub >>~/.ssh/authorized_keys

    [grid@node2$]scp ~/.ssh/authorized_keys node1:/home/grid/.ssh/keys

    [oracle@node1$]ssh-keygen -t rsa
    [oracle@node1$]ssh-keygen -t dsa
    [oracle@node1$]cat ~/.ssh/*.pub >>~/.ssh/authorized_keys
    [oracle@node1$]su - grid
    [grid@node1$]ssh-keygen -t rsa
    [grid@node1$]ssh-keygen -t dsa
    [grid@node1$]cat ~/.ssh/*.pub >>~/.ssh/authorized_keys
    [grid@node1$]cat ~/.ssh/keys >>~/.ssh/authorized_keys
    [grid@node1$]scp ~/.ssh/authorized_keys node2:/home/grid/.ssh/authorized_keys
    [oracle@node1$]ssh node1.localdomain date
    [oracle@node1$]ssh node1 date
    [oracle@node1$]ssh node1-pri date
    [oracle@node1$]ssh node1-pri.localdomain date
    [oracle@node1$]ssh node2.localdomain date
    [oracle@node1$]ssh node2 date
    [oracle@node1$]ssh node2-pri date
    [oracle@node1$]ssh node2-pri.localdomain date
    ## If the scp command is missing, install it with 'yum -y install openssh-clients.x86_64'
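    A compact equivalent of the ssh ... date checks above, to be run as each user on each node (hostname list per the /etc/hosts file earlier):

    $ for h in node1 node2 node1-pri node2-pri; do ssh $h date; done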
    After SSH equivalence is set up, it remains valid even if you later change the oracle and grid system passwords or the IP addresses in /etc/hosts.

    DNS configuration (optional):
    If you resolve IPs through /etc/hosts, no DNS/GNS configuration is needed.
    If you configure multiple SCAN IPs, it is better to use DNS: bind the SCAN IPs to a single domain name and have users connect to the database through that SCAN name.

    1. Packages:
    bind-9.3.6-4.Pl.el5_4.2.x86_64.rpm
    bind-chroot-9.3.6-4.Pl.el5_4.2.x86_64.rpm
    caching-nameserver-9.3.6-4.Pl.el5_4.2.x86_64.rpm

    2. Configure the /var/named/chroot/etc/named.conf file
    Copy it from the caching-nameserver template:
    cp -p named.caching-nameserver.conf named.conf
    Change 127.0.0.1 to "any;" so that any IP may query the server.
    3. Configure the zone files
    Edit /var/named/chroot/etc/named.rfc1912.zones,
    mainly so that the SCAN IP resolves correctly.
    Forward zone entry:

    zone "localdomain" IN {
    type master;
    file "localdomain.zone";
    allow-update { none; };
    };

    # scan
    192.168.56.140 rac-scan.localdomain rac-scan
    Reverse zone entry:

    zone "56.168.192.in-addr.arpa" IN {
    type master;
    file "56.168.192.in-addr.arpa";
    allow-update { none; };
    };
    Configure the forward and reverse zone database files in
    /var/named/chroot/var/named:
    Forward zone database file localdomain.zone:
    rac-scan IN A 192.168.56.140

    Reverse zone database file (cp -p named.local 56.168.192.in-addr.arpa):
    140 IN PTR rac-scan.localdomain.

    Start the DNS server: /etc/init.d/named start

    Verification
    rac1:
    configure /etc/resolv.conf:
    search localdomain
    nameserver 192.168.56.120

    rac2:
    configure /etc/resolv.conf:
    search localdomain
    nameserver 192.168.56.120

    Verify with nslookup rac-scan, nslookup rac-scan.localdomain, and nslookup 10.37.4.173.
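    If you do configure three SCAN VIPs, they are simply three A records for the same name, and the DNS server hands them out round-robin (addresses here are illustrative):

    rac-scan IN A 10.37.4.173
    rac-scan IN A 10.37.4.174
    rac-scan IN A 10.37.4.175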

    CentOS 6 DNS configuration:
    1: yum -y install bind-chroot.x86_64 bind.x86_64
    2: vi /etc/named.conf
    Change 127.0.0.1 and localhost to any.

    Reverse zone:

    zone "4.37.10.in-addr.arpa" IN {
    type master;
    file "4.37.10.in-addr.arpa.zone";
    allow-update { none; };
    };
    Forward zone:

    zone "localdomain" IN {
    type master;
    file "named.localhost";
    allow-update { none; };
    };
    3: Forward zone database

    vi /var/named/named.localhost
    $TTL 86400
    @ IN SOA @ root.localdomain. (
    42 ; serial
    3H ; refresh
    15M ; retry
    15W ; expire
    1D ) ; minimum

    NS @
    A 10.37.4.170
    rac-scan IN A 10.37.4.173
    4: Reverse zone database

    vi /var/named/4.37.10.in-addr.arpa.zone
    $TTL 86400
    @ IN SOA 4.37.10.in-addr.arpa. localhost.localdomain. (
    0 ; serial
    1D ; refresh
    1H ; retry
    1W ; expire
    3H ) ; minimum

    NS @
    A 10.37.4.170
    173 IN PTR rac-scan.localdomain.
    Restart the DNS server:
    service named restart

    **Why resolv.conf edits on CentOS do not survive a network restart
    Note that editing /etc/resolv.conf directly is not enough: when the network service restarts, it regenerates the configuration from /etc/sysconfig/network-scripts/ifcfg-eth0, and if no DNS is configured there, resolv.conf gets overwritten and emptied.
    Fix: add a line such as "DNS1=10.37.4.170" to /etc/sysconfig/network-scripts/ifcfg-eth0.

    Check the listening ports: netstat -ltunp | grep named

    ASM shared storage configuration:
    Before Oracle 11g you could still use raw devices for ASM shared disks; since 11g, raw devices are deprecated.
    For network-attached (NAS) storage, the disks can be presented over the iSCSI protocol. Because device names under Linux are not stable, you cannot build shared disks directly on device names. There are three common approaches:
    1. udev
    2. multipath
    3. asmlib
    This article focuses on udev: it distinguishes disks by the UUID of the disk or partition, and by binding a device name to a UUID it gives each disk a persistent identity.
    To quote a passage:

    Why Linux UUIDs matter
    Reason 1: they are a genuinely unique identifier.
    A UUID uniquely identifies a storage device in the system, whatever the device type. Adding a new storage device such as a disk can otherwise cause trouble, for example a boot failure because a device cannot be found; with UUIDs this does not happen.
    Reason 2: device names are not stable.
    Automatically assigned device names are not always consistent; they depend on the order in which the kernel loads modules at boot. If you boot with a USB stick plugged in and boot again with it removed, device names can shift.
    UUIDs are also very convenient for mounting removable media: with a 24-in-1 card reader, for example, a UUID lets the same card always mount in the same place.
    Note that a UUID only exists once a filesystem has been created.
    For installing and configuring iSCSI, see http://blog.csdn.net/a743044559/article/details/77841069.
    For shared disk configuration, see http://blog.csdn.net/a743044559/article/details/77840960.
    Note: more and more Oracle RAC clusters are built on virtual machines, including many production systems. By default a VM does not expose disk UUIDs; either enable them, or use udevadm info to bind the disks by the start sector of the disk or partition.
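    For example, on VMware the usual way to expose disk UUIDs is to add a line to the VM's .vmx file and power-cycle the VM (a common workaround, not part of the original setup):

    disk.EnableUUID = "TRUE"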

    Given limited resources, this setup shares the disks over iSCSI and pins them with udev:
    node1 runs the iSCSI target and exports sdb1 and sdb2; node2 runs the iSCSI initiator and accesses node1's shared sdb1 and sdb2 through the IQN.

    Install the iSCSI target (server side)
    [root@node1 ~]# yum -y install scsi-target-utils.x86_64
    [root@node1 ~]# service tgtd start
    Starting SCSI target daemon: [ OK ]
    [root@node1 ~]# chkconfig tgtd on
    [root@node1 ~]# vi /etc/tgt/targets.conf

    <target iqn.1994-05.com.redhat:5cd487a6635d>
    backing-store /dev/sdb1
    backing-store /dev/sdb2
    </target>
    [root@node1 ~]# service tgtd restart
    Stopping SCSI target daemon: [ OK ]
    Starting SCSI target daemon: [ OK ]

    Check the tgt bindings:

    [root@node1 ~]# tgtadm --lld iscsi --mode target --op show

    Target 1: iqn.1994-05.com.redhat:5cd487a6635d
    System information:
    Driver: iscsi
    State: ready
    I_T nexus information:
    LUN information:
    LUN: 0
    Type: controller
    SCSI ID: IET 00010000
    SCSI SN: beaf10
    Size: 0 MB, Block size: 1
    Online: Yes
    Removable media: No
    Prevent removal: No
    Readonly: No
    Backing store type: null
    Backing store path: None
    Backing store flags:
    LUN: 1
    Type: disk
    SCSI ID: IET 00010001
    SCSI SN: beaf11
    Size: 107381 MB, Block size: 512
    Online: Yes
    Removable media: No
    Prevent removal: No
    Readonly: No
    Backing store type: rdwr
    Backing store path: /dev/sdb1
    Backing store flags:
    LUN: 2
    Type: disk
    SCSI ID: IET 00010002
    SCSI SN: beaf12
    Size: 107365 MB, Block size: 512
    Online: Yes
    Removable media: No
    Prevent removal: No
    Readonly: No
    Backing store type: rdwr
    Backing store path: /dev/sdb2
    Backing store flags:
    Account information:
    ACL information:
    10.37.2.171/24
    As shown above, sdb1 and sdb2 are now exported.

    Install the iSCSI initiator (on node2)

    [root@node2 ~]# yum -y install iscsi-initiator-utils.x86_64
    Check the initiator IQN:

    [root@node2 ~]# vi /etc/iscsi/initiatorname.iscsi

    InitiatorName=iqn.1994-05.com.redhat:5cd487a6635d
    Start the iSCSI services:

    [root@node2 bak]# service iscsi start
    [root@node2 bak]# service iscsid start
    From node2, discover the disks exported by node1:

    [root@node2 bak]# iscsiadm -m discovery -t sendtargets -p 10.37.2.170
    10.37.2.170:3260,1 iqn.1994-05.com.redhat:5cd487a6635d
    Log in to the target:

    [root@node2 bak]# iscsiadm -m node -d 1 -T iqn.1994-05.com.redhat:5cd487a6635d -l
    Logging in to [iface: default, target: iqn.1994-05.com.redhat:5cd487a6635d, portal: 10.37.2.170,3260] (multiple)
    Login to [iface: default, target: iqn.1994-05.com.redhat:5cd487a6635d, portal: 10.37.2.170,3260] successful.
    fdisk -l now shows that node1's exported sdb1 and sdb2 are visible on node2.

    Make the iSCSI target log in automatically:

    [root@node2 bak]# iscsiadm -m node -T iqn.1994-05.com.redhat:5cd487a6635d -p 10.37.2.170 --op update -n node.startup -v automatic
    Enable the iSCSI initiator services at boot:

    [root@node2 bak]# chkconfig iscsi on
    [root@node2 bak]# chkconfig iscsid on
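    To confirm the session is re-established, for example after a reboot, list the active sessions (standard iscsiadm usage):

    [root@node2 bak]# iscsiadm -m session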
    Pin the disks with udev
    On node1, fetch the UUIDs:
    [root@node1 ~]# /sbin/scsi_id /dev/sdb1
    [root@node1 ~]# /sbin/scsi_id /dev/sdb2
    On RHEL 6 the scsi_id invocation changed; use "/sbin/scsi_id -g -u /dev/sdc".
    The result comes back empty: in a virtual environment UUIDs are disabled by default, so instead we use udevadm info to read each disk or partition's start sector.

    [root@node2 ~]# fdisk -l

    Disk /dev/sdb: 107.4 GB, 107380998144 bytes
    255 heads, 63 sectors/track, 13054 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000


    Disk /dev/sdc: 107.4 GB, 107364579840 bytes
    255 heads, 63 sectors/track, 13053 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000
    On node1, run udevadm info to get each partition's start sector:

    [root@node1 ~]# udevadm info -a -p /sys/block/sdb/sdb1
    looking at device '/devices/pci0000:00/0000:00:10.0/host0/target0:0:1/0:0:1:0/block/sdb/sdb1':
    KERNEL=="sdb1"
    SUBSYSTEM=="block"
    DRIVER==""
    ATTR{partition}=="1"
    ATTR{start}=="63"
    ATTR{size}=="209728512"
    ATTR{alignment_offset}=="0"
    ATTR{discard_alignment}=="0"
    ATTR{stat}==" 342 180 4176 77 0 0 0 0 0 77 77"
    ATTR{inflight}==" 0 0"

    [root@node1 ~]# udevadm info -a -p /sys/block/sdb/sdb2
    looking at device '/devices/pci0000:00/0000:00:10.0/host0/target0:0:1/0:0:1:0/block/sdb/sdb2':
    KERNEL=="sdb2"
    SUBSYSTEM=="block"
    DRIVER==""
    ATTR{partition}=="2"
    ATTR{start}=="209728575"
    ATTR{size}=="209696445"
    ATTR{alignment_offset}=="0"
    ATTR{discard_alignment}=="0"
    ATTR{stat}==" 348 2523 2883 28 0 0 0 0 0 23 27"
    ATTR{inflight}==" 0 0"
    On node1, create the udev rules file:

    [root@node1 ~]# vi /etc/udev/rules.d/99-oracle-asmdevices.rules
    KERNEL=="sdb1",SUBSYSTEM=="block",SYSFS{size}=="209728512",SYSFS{start}=="63",NAME="asm-disk1", OWNER="grid", GROUP="asmadmin", MODE="0660"
    KERNEL=="sdb2",SUBSYSTEM=="block",SYSFS{size}=="209696445",SYSFS{start}=="209728575",NAME="asm-disk2", OWNER="grid", GROUP="asmadmin", MODE="0660"
    The sdb1 and sdb2 exported from node1 appear on node2 as the devices sdb and sdc. On node2, however, udevadm info cannot report a partition start: each of node1's partitions looks like a whole disk to node2, as shown below:
    [root@node2 ~]# fdisk -l

    Disk /dev/sdb: 107.4 GB, 107380998144 bytes
    255 heads, 63 sectors/track, 13054 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000

    Disk /dev/sdc: 107.4 GB, 107364579840 bytes
    255 heads, 63 sectors/track, 13053 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0x00000000
    On node2, run udevadm info to get the disk attributes:

    [root@node2 ~]# udevadm info -a -p /sys/block/sdb/
    looking at device '/devices/platform/host3/session1/target3:0:0/3:0:0:1/block/sdb':
    KERNEL=="sdb"
    SUBSYSTEM=="block"
    DRIVER==""
    ATTR{range}=="16"
    ATTR{ext_range}=="256"
    ATTR{removable}=="0"
    ATTR{ro}=="0"
    ATTR{size}=="209728512"
    ATTR{alignment_offset}=="0"
    ATTR{discard_alignment}=="0"
    ATTR{capability}=="52"
    ATTR{stat}==" 174 13 1496 65 0 0 0 0 0 65 65"
    ATTR{inflight}==" 0 0"

    [root@node2 ~]# udevadm info -a -p /sys/block/sdc
    looking at device '/devices/platform/host3/session1/target3:0:0/3:0:0:2/block/sdc':
    KERNEL=="sdc"
    SUBSYSTEM=="block"
    DRIVER==""
    ATTR{range}=="16"
    ATTR{ext_range}=="256"
    ATTR{removable}=="0"
    ATTR{ro}=="0"
    ATTR{size}=="209696445"
    ATTR{alignment_offset}=="0"
    ATTR{discard_alignment}=="0"
    ATTR{capability}=="52"
    ATTR{stat}==" 187 1582 1769 72 0 0 0 0 0 72 72"
    ATTR{inflight}==" 0 0"
    On node2, on the other hand, scsi_id does return a UUID:

    [root@node2 ~]# /sbin/scsi_id -g -u /dev/sdc
    1IET_00010002
    [root@node2 ~]# /sbin/scsi_id -g -u /dev/sdb
    1IET_00010001
    As seen above, udevadm info reports no start offset for these disks on node2, although the sizes match node1's partitions.
    This was unexpected. Two ideas:
    1. Since the disks look identical from both nodes, bind them on node2 using node1's attribute values.
    2. Bind them on node2 by UUID.

    Method 1: try binding the disks on node2 using node1's attribute values:

    [root@node2 ~]# vi /etc/udev/rules.d/99-oracle-asmdevices.rules

    KERNEL=="sdb",SUBSYSTEM=="block",SYSFS{size}=="209728512",SYSFS{start}=="63",NAME="asm-disk1", OWNER="grid", GROUP="asmadmin", MODE="0660"
    KERNEL=="sdc",SUBSYSTEM=="block",SYSFS{size}=="209696445",SYSFS{start}=="209728575",NAME="asm-disk2", OWNER="grid", GROUP="asmadmin", MODE="0660"
    Reload udev:

    On SLES10:

    # /etc/init.d/boot.udev stop
    # /etc/init.d/boot.udev start
    On RHEL5/OEL5/OL5:

    # /sbin/udevcontrol reload_rules
    # /sbin/start_udev
    On RHEL6/OL6:

    # /sbin/udevadm control --reload-rules
    # /sbin/start_udev
    Verify:
    node1:

    [root@node1 ~]# ls -al /dev/asm-*
    brw-rw----. 1 grid asmadmin 8, 17 Sep 13 08:58 /dev/asm-disk1
    brw-rw----. 1 grid asmadmin 8, 18 Sep 13 08:58 /dev/asm-disk2
    node2

    [root@node2 ~]# ls -al /dev/asm-*
    ls: cannot access /dev/asm-*: No such file or directory
    So on node2, binding via udevadm info attributes fails.

    Method 2:
    Bind on node2 by UUID:

    [root@node2 ~]# /sbin/scsi_id -g -u /dev/sdd
    1IET_00010002
    [root@node2 ~]# /sbin/scsi_id -g -u /dev/sdc
    1IET_00010001

    [root@node2 ~]# vi /etc/udev/rules.d/99-oracle-asmdevices.rules
    KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/$name", RESULT=="1IET_00010001", NAME="asm-disk1", OWNER="grid", GROUP="asmadmin", MODE="0660"
    KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/$name", RESULT=="1IET_00010002", NAME="asm-disk2", OWNER="grid", GROUP="asmadmin", MODE="0660"
    Reload and verify:

    [root@node2 ~]# /sbin/udevadm control --reload-rules
    [root@node2 ~]# /sbin/start_udev
    Starting udev: [ OK ]

    [root@node2 ~]# ls -al /dev/asm-*
    brw-rw----. 1 grid asmadmin 8, 16 Sep 13 09:15 /dev/asm-disk1
    brw-rw----. 1 grid asmadmin 8, 32 Sep 13 09:15 /dev/asm-disk2
    Conclusion: node2 binds successfully by UUID, and node1 by udevadm info attributes. However the device letters shift, Oracle always sees asm-disk1 and asm-disk2, which is exactly the persistence we wanted.

    Two open questions remain:
    1 . For disks shared over iSCSI, udevadm info cannot report the partition start attributes on the second node.
    2 . For disks shared over iSCSI, /sbin/scsi_id -g -u cannot return a UUID for a partition.

    Configure yum
    [root@node1 ~]# mount -t iso9660 /dev/cdrom /mnt
    [root@node1 ~]# mount /dev/cdrom /mnt
    [root@node1 ~]# vi /etc/yum.repos.d/rhel-source.repo

    [rhel-source]
    name=Red Hat Enterprise Linux $releasever - Source
    baseurl=file:///mnt
    enabled=1
    gpgcheck=0
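    Refresh the repo metadata to confirm the repository is usable (standard yum commands):

    [root@node1 ~]# yum clean all
    [root@node1 ~]# yum repolist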
    Alternatively, mount an ISO file:
    [root@node1 ~]# mount -o loop CentOS-6.5-x86_64-bin-DVD1.iso /mnt
    Install dependency packages
    For x86 (32-bit):
    1 . Linux x86 (32-bit) kernel requirements:

    2 . Linux x86 (32-bit) required packages:

    2.1 For Asianux 2, Enterprise Linux 4, and Red Hat Enterprise Linux 4:

    The following packages (or later versions) must be installed:
    binutils-2.15.92.0.2
    compat-libstdc++-33.2.3
    elfutils-libelf-0.97
    elfutils-libelf-devel-0.97
    gcc-3.4.6
    gcc-c++-3.4.6
    glibc-2.3.4-2.41
    glibc-common-2.3.4
    glibc-devel-2.3.4
    glibc-headers-2.3.4
    libaio-devel-0.3.105
    libaio-0.3.105
    libgcc-3.4.6
    libstdc++-3.4.6
    libstdc++-devel-3.4.6
    make-3.80
    pdksh-5.2.14
    sysstat-5.0.5
    unixODBC-2.2.11
    unixODBC-devel-2.2.11
    2.2 For Asianux Server 3, Enterprise Linux 5, and Red Hat Enterprise Linux 5:

    The following packages (or later versions) must be installed:
    binutils-2.17.50.0.6
    compat-libstdc++-33-3.2.3
    elfutils-libelf-0.125
    elfutils-libelf-devel-0.125
    elfutils-libelf-devel-static-0.125
    gcc-4.1.2
    gcc-c++-4.1.2
    glibc-2.5-24
    glibc-common-2.5
    glibc-devel-2.5
    glibc-headers-2.5
    kernel-headers-2.6.18
    ksh-20060214
    libaio-0.3.106
    libaio-devel-0.3.106
    libgcc-4.1.2
    libgomp-4.1.2
    libstdc++-4.1.2
    libstdc++-devel-4.1.2
    make-3.81
    sysstat-7.0.2
    unixODBC-2.2.11
    unixODBC-devel-2.2.11
    2.3 SUSE 10:

    The following packages (or later versions) must be installed:
    binutils-2.16.91.0.5
    compat-libstdc++-5.0.7
    gcc-4.1.2
    gcc-c++-4.1.2
    glibc-2.5-24
    glibc-devel-2.4
    ksh-93r-12.9
    libaio-0.3.104
    libaio-devel-0.3.104
    libelf-0.8.5
    libgcc-4.1.2
    libstdc++-4.1.2
    libstdc++-devel-4.1.2
    make-3.80
    sysstat-8.0.4
    2.4 SUSE 11:

    The following packages (or later versions) must be installed:
    binutils-2.19
    gcc-4.3
    gcc-c++-4.3
    glibc-2.9
    glibc-devel-2.9
    ksh-93t
    libstdc++33-3.3.3
    libstdc++43-4.3.3_20081022
    libstdc++43-devel-4.3.3_20081022
    libaio-0.3.104
    libaio-devel-0.3.104
    libgcc43-4.3.3_20081022
    libstdc++-devel-4.3
    make-3.81
    sysstat-8.1.5
    For x86-64:
    1 . Linux x86-64 kernel requirements:

    # cat /proc/version
    2 . Linux x86-64 required packages:

    2.1 For Asianux 2, Enterprise Linux 4, and Red Hat Enterprise Linux 4:

    The following packages (or later versions) must be installed:
    binutils-2.15.92.0.2
    compat-libstdc++-33-3.2.3
    compat-libstdc++-33-3.2.3 (32 bit)
    elfutils-libelf-0.97
    elfutils-libelf-devel-0.97
    expat-1.95.7
    gcc-3.4.6
    gcc-c++-3.4.6
    glibc-2.3.4-2.41
    glibc-2.3.4-2.41 (32 bit)
    glibc-common-2.3.4
    glibc-devel-2.3.4
    glibc-headers-2.3.4
    libaio-0.3.105
    libaio-0.3.105 (32 bit)
    libaio-devel-0.3.105
    libaio-devel-0.3.105 (32 bit)
    libgcc-3.4.6
    libgcc-3.4.6 (32-bit)
    libstdc++-3.4.6
    libstdc++-3.4.6 (32 bit)
    libstdc++-devel 3.4.6
    make-3.80
    pdksh-5.2.14
    sysstat-5.0.5
    unixODBC-2.2.11
    unixODBC-2.2.11 (32 bit)
    unixODBC-devel-2.2.11
    unixODBC-devel-2.2.11 (32 bit)
    2.2 For Asianux Server 3, Enterprise Linux 5, and Red Hat Enterprise Linux 5:

    The following packages (or later versions) must be installed:
    binutils-2.17.50.0.6
    compat-libstdc++-33-3.2.3
    compat-libstdc++-33-3.2.3 (32 bit)
    elfutils-libelf-0.125
    elfutils-libelf-devel-0.125
    gcc-4.1.2
    gcc-c++-4.1.2
    glibc-2.5-24
    glibc-2.5-24 (32 bit)
    glibc-common-2.5
    glibc-devel-2.5
    glibc-devel-2.5 (32 bit)
    glibc-headers-2.5
    ksh-20060214
    libaio-0.3.106
    libaio-0.3.106 (32 bit)
    libaio-devel-0.3.106
    libaio-devel-0.3.106 (32 bit)
    libgcc-4.1.2
    libgcc-4.1.2 (32 bit)
    libstdc++-4.1.2
    libstdc++-4.1.2 (32 bit)
    libstdc++-devel 4.1.2
    make-3.81
    sysstat-7.0.2
    unixODBC-2.2.11
    unixODBC-2.2.11 (32 bit)
    unixODBC-devel-2.2.11
    unixODBC-devel-2.2.11 (32 bit)
    2.3 SUSE 10:

    The following packages (or later versions) must be installed:
    binutils-2.16.91.0.5
    compat-libstdc++-5.0.7
    gcc-4.1.0
    gcc-c++-4.1.2
    glibc-2.5-24
    glibc-devel-2.4
    glibc-devel-32bit-2.4
    ksh-93r-12.9
    libaio-0.3.104
    libaio-32bit-0.3.104
    libaio-devel-0.3.104
    libaio-devel-32bit-0.3.104
    libelf-0.8.5
    libgcc-4.1.2
    libstdc++-4.1.2
    make-3.80
    sysstat-8.0.4
    2.4 SUSE 11:

    The following packages (or later versions) must be installed:
    binutils-2.19
    gcc-4.3
    gcc-32bit-4.3
    gcc-c++-4.3
    glibc-2.9
    glibc-32bit-2.9
    glibc-devel-2.9
    glibc-devel-32bit-2.9
    ksh-93t
    libaio-0.3.104
    libaio-32bit-0.3.104
    libaio-devel-0.3.104
    libaio-devel-32bit-0.3.104
    libstdc++33-3.3.3
    libstdc++33-32bit-3.3.3
    libstdc++43-4.3.3_20081022
    libstdc++43-32bit-4.3.3_20081022
    libstdc++43-devel-4.3.3_20081022
    libstdc++43-devel-32bit-4.3.3_20081022
    libgcc43-4.3.3_20081022
    libstdc++-devel-4.3
    make-3.81
    sysstat-8.1.5
    Time synchronization
    11gR2 (11.2) supports two time synchronization methods: NTP (Network Time Protocol) and CTSS (Cluster Time Synchronization Service).
    To use CTSS, the NTP service must be stopped:

    # /sbin/service ntpd stop
    # chkconfig ntpd off
    # rm /etc/ntp.conf
    or
    # mv /etc/ntp.conf /etc/ntp.conf.org
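    If you prefer to keep NTP instead of CTSS, Oracle requires it to run with the slewing option. On RHEL/CentOS this means adding -x to the daemon options and restarting (per the 11.2 installation guide; the exact OPTIONS line may differ on your system):

    # vi /etc/sysconfig/ntpd
    OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"
    # service ntpd restart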
    During GI installation, if NTP is detected as inactive, CTSS is configured automatically.

    Install the CVU package
    CVU is the verification tool shipped with Oracle clusterware, used to check the cluster environment.

    # CVUQDISK_GRP=oinstall; export CVUQDISK_GRP
    # rpm -iv cvuqdisk-1.0.7-1.rpm
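    Confirm the package registered (a routine rpm query, not in the original):

    # rpm -qa | grep cvuqdisk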
    Install pdksh
    rpm -ivh pdksh-5.2.14-37.el5_8.1.x86_64.rpm --force --nodeps
    IPMI (Intelligent Platform Management Interface) (optional)
    IPMI requires hardware support; if that is not available, skip it.

    For convenience, I usually just install the following dependency packages:

    [root@node1 ~]# yum -y install binutils compat-libstdc++-33 elfutils-libelf elfutils-libelf-devel gcc gcc-c++ glibc-2.5 glibc-common glibc-devel glibc-headers ksh libaio libaio-devel libgcc libstdc++ libstdc++-devel make sysstat
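    You can then spot-check that the packages are present (a quick rpm query, not in the original):

    [root@node1 ~]# rpm -q binutils gcc gcc-c++ glibc glibc-devel ksh libaio libaio-devel make sysstat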
    Verify the GI environment
    [root@node1 ~]# su - grid
    [grid@node1.localdomain$]./runcluvfy.sh stage -pre crsinst -n node1,node2 -fixup -verbose
    Problem 1:
    [root@node1 CVU_11.2.0.4.0_grid]# sh /tmp/CVU_11.2.0.4.0_grid/runfixup.sh
    Response file being used is :/tmp/CVU_11.2.0.4.0_grid/fixup.response
    Enable file being used is :/tmp/CVU_11.2.0.4.0_grid/fixup.enable
    Log file location: /tmp/CVU_11.2.0.4.0_grid/orarun.log
    Installing Package /tmp/CVU_11.2.0.4.0_grid//cvuqdisk-1.0.9-1.rpm
    Preparing... ########################################### [100%]
    ls: cannot access /usr/sbin/smartctl: No such file or directory
    /usr/sbin/smartctl not found.
    error: %pre(cvuqdisk-1.0.9-1.x86_64) scriptlet failed, exit status 1
    error: install: %pre scriptlet failed (2), skipping cvuqdisk-1.0.9-1
    Fix: the error shows /usr/sbin/smartctl is missing, so a package is absent; install smartmontools and retry:
    [root@node1 CVU_11.2.0.4.0_grid]# yum install -y smartmontools
    [root@node1 CVU_11.2.0.4.0_grid]# rpm -ivh /home/pdksh-5.2.14-37.el5_8.1.x86_64.rpm --force --nodeps
    Install the grid software

    [root@node1 ~]# export DISPLAY=:0.0    (map the GUI to the local display)
    [root@node1 ~]# xhost +

    [root@node1 ~]# su - grid
    [grid@node1.localdomain$]./runInstaller
    Problem 2:
    On CentOS 6.6 with Oracle 11.2.0.4.0, running root.sh hits a bug:

    Failed to create keys in the OLR, rc = 127, Message:
    /u01/app/11.2.0/grid/bin/clscfg.bin: error while loading shared libraries: libcap.so.1: cannot open shared object file: No such file or directory
    Fix:

    [root@node2 CVU_11.2.0.4.0_grid]# find / -name "libcap.so*"
    /lib64/libcap.so.2.16
    /lib64/libcap.so.2
    [root@node2 CVU_11.2.0.4.0_grid]# ln /lib64/libcap.so.2 /lib64/libcap.so.1
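    On distributions that ship it, installing the compatibility package achieves the same result as the symlink (an equivalent alternative):

    [root@node2 CVU_11.2.0.4.0_grid]# yum -y install compat-libcap1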
    Problem 3:
    root.sh fails on node2:

    Disk Group DATA creation failed with the following message:
    ORA-15018: diskgroup cannot be created
    ORA-15080: synchronous I/O operation to a disk failed
    ORA-27061: waiting for async I/Os failed
    root.sh succeeds on node1. But checking the disks on node2 afterwards, the shared disks are gone. Gone??!!
    Can an iSCSI-shared disk only be accessed by one side at a time?
    Looking at the tgt bindings on node1, "Backing store path: None" is empty, which is why node2 cannot see the shared disks. It turns out the IQN in targets.conf was configured incorrectly. Embarrassing!

    [root@node1 CVU_11.2.0.4.0_grid]# tgtadm --lld iscsi --mode target --op show
    Target 1: iqn.1994-05.com.redhat:5cd487a6635d
    System information:
    Driver: iscsi
    State: ready
    I_T nexus information:
    I_T nexus: 1
    Initiator: iqn.1994-05.com.redhat:fcdisk.sdb
    Connection: 0
    IP Address: 10.37.2.170
    I_T nexus: 2
    Initiator: iqn.1994-05.com.redhat:11dd4f35feeb
    Connection: 0
    IP Address: 10.37.2.171
    LUN information:
    LUN: 0
    Type: controller
    SCSI ID: IET 00010000
    SCSI SN: beaf10
    Size: 0 MB, Block size: 1
    Online: Yes
    Removable media: No
    Prevent removal: No
    Readonly: No
    Backing store type: null
    Backing store path: None
    Backing store flags:
    Account information:
    ACL information:
    ALL
    Edit the GI response file
    vi /home/grid/grid_install.rsp

    oracle.install.responseFileVersion=/oracle/install/rspfmt_crsinstall_response_schema_v11_2_0
    ORACLE_HOSTNAME=node1.localdomain
    INVENTORY_LOCATION=/u01/app/oraInventory
    SELECTED_LANGUAGES=en
    oracle.install.option=CRS_CONFIG
    ORACLE_BASE=/u01/app/grid
    ORACLE_HOME=/u01/app/11.2.0/grid
    oracle.install.asm.OSDBA=asmdba
    oracle.install.asm.OSOPER=asmoper
    oracle.install.asm.OSASM=asmadmin
    oracle.install.crs.config.gpnp.scanName=rac-scan
    oracle.install.crs.config.gpnp.scanPort=1521
    oracle.install.crs.config.clusterName=node-cluster
    oracle.install.crs.config.gpnp.configureGNS=false
    oracle.install.crs.config.gpnp.gnsSubDomain=
    oracle.install.crs.config.gpnp.gnsVIPAddress=
    oracle.install.crs.config.autoConfigureClusterNodeVIP=false
    oracle.install.crs.config.clusterNodes=node1.localdomain:node1-vip.localdomain,node2.localdomain:node2-vip.localdomain
    oracle.install.crs.config.networkInterfaceList=eth0:10.37.2.0:1,eth1:192.168.52.0:2
    oracle.install.crs.config.storageOption=ASM_STORAGE
    oracle.install.crs.config.sharedFileSystemStorage.diskDriveMapping=
    oracle.install.crs.config.sharedFileSystemStorage.votingDiskLocations=
    oracle.install.crs.config.sharedFileSystemStorage.votingDiskRedundancy=NORMAL
    oracle.install.crs.config.sharedFileSystemStorage.ocrLocations=
    oracle.install.crs.config.sharedFileSystemStorage.ocrRedundancy=NORMAL
    oracle.install.crs.config.useIPMI=false
    oracle.install.crs.config.ipmi.bmcUsername=
    oracle.install.crs.config.ipmi.bmcPassword=
    oracle.install.asm.SYSASMPassword=oracle
    oracle.install.asm.diskGroup.name=DATA
    oracle.install.asm.diskGroup.redundancy=EXTERNAL
    oracle.install.asm.diskGroup.AUSize=1
    oracle.install.asm.diskGroup.disks=/dev/asm-disk1
    oracle.install.asm.diskGroup.diskDiscoveryString=/dev/asm-*
    oracle.install.asm.monitorPassword=oracle
    oracle.install.crs.upgrade.clusterNodes=
    oracle.install.asm.upgradeASM=false
    oracle.installer.autoupdates.option=SKIP_UPDATES
    Notes:
    1 . SCAN listener port: 1521 by default; SCAN listeners correspond one to one with SCAN VIPs.
    2 . GNS (Grid Naming Service): GNS configures DHCP to assign IP addresses to the cluster's VIPs and SCAN VIPs automatically. To use GNS, you must give GNS a fixed VIP, add the GNS name and GNS VIP to DNS, and the GNS VIP must be on the same subnet as the cluster's public network. In general, GNS is best avoided.

    Silent GI installation

    ./runInstaller -responsefile /home/grid/grid_install.rsp -silent -ignoreprereq -showprogress

    Starting Oracle Universal Installer...

    Checking Temp space: must be greater than 120 MB. Actual 23362 MB Passed
    Checking swap space: must be greater than 150 MB. Actual 3743 MB Passed
    Preparing to launch Oracle Universal Installer from /tmp/OraInstall2017-09-14_02-19-16PM. Please wait ...[grid@node1.localdomain$][WARNING] [INS-30011] The SYS password entered does not conform to the Oracle recommended standards.
    CAUSE: Oracle recommends that the password entered should be at least 8 characters in length, contain at least 1 uppercase character, 1 lower case character and 1 digit [0-9].
    ACTION: Provide a password that conforms to the Oracle recommended standards.
    [WARNING] [INS-30011] The ASMSNMP password entered does not conform to the Oracle recommended standards.
    CAUSE: Oracle recommends that the password entered should be at least 8 characters in length, contain at least 1 uppercase character, 1 lower case character and 1 digit [0-9].
    ACTION: Provide a password that conforms to the Oracle recommended standards.
    You can find the log of this install session at:
    /u01/app/oraInventory/logs/installActions2017-09-14_02-19-16PM.log

    Prepare in progress.
    .................................................. 9% Done.

    Prepare successful.

    Copy files in progress.
    .................................................. 15% Done.
    .................................................. 20% Done.
    .................................................. 25% Done.
    .................................................. 30% Done.
    .................................................. 35% Done.
    .................................................. 40% Done.
    .................................................. 45% Done.
    ........................................
    Copy files successful.

    Link binaries in progress.

    Link binaries successful.
    .................................................. 62% Done.

    Setup files in progress.

    Setup files successful.
    .................................................. 76% Done.

    Perform remote operations in progress.
    .................................................. 89% Done.

    Perform remote operations successful.
    SEVERE:Remote 'AttachHome' failed on nodes: 'node2'. Refer to '/u01/app/oraInventory/logs/installActions2017-09-14_02-19-16PM.log' for details.
    It is recommended that the following command needs to be manually run on the failed nodes:
    /u01/app/11.2.0/grid/oui/bin/runInstaller -attachHome -noClusterEnabled ORACLE_HOME=/u01/app/11.2.0/grid ORACLE_HOME_NAME=Ora11g_gridinfrahome1 CLUSTER_NODES=node1,node2 "INVENTORY_LOCATION=/u01/app/oraInventory" LOCAL_NODE=<node on which command is to be run>.
    Please refer 'AttachHome' logs under central inventory of remote nodes where failure occurred for more details.
    The installation of Oracle Grid Infrastructure 11g was successful on the local node but failed on remote nodes.
    Please check '/u01/app/oraInventory/logs/silentInstall2017-09-14_02-19-16PM.log' for more details.
    .................................................. 94% Done.

    Execute Root Scripts in progress.

    As a root user, execute the following script(s):
    1. /u01/app/oraInventory/orainstRoot.sh
    2. /u01/app/11.2.0/grid/root.sh

    Execute /u01/app/oraInventory/orainstRoot.sh on the following nodes:
    [node1, node2]
    Execute /u01/app/11.2.0/grid/root.sh on the following nodes:
    [node1, node2]

    .................................................. 100% Done.

    Execute Root Scripts successful.
    As install user, execute the following script to complete the configuration.
    1. /u01/app/11.2.0/grid/cfgtoollogs/configToolAllCommands RESPONSE_FILE=<response_file>

    Note:
    1. This script must be run on the same host from where installer was run.
    2. This script needs a small password properties file for configuration assistants that require passwords (refer to install guide documentation).


    Successfully Setup Software.
    Edit the cfgrsp.properties file
    [grid@node1$]vi cfgrsp.properties

    oracle.assistants.server|S_SYSPASSWORD=oracle123
    oracle.assistants.server|S_SYSTEMPASSWORD=oracle123
    oracle.assistants.server|S_SYSMANPASSWORD=oracle123
    oracle.assistants.server|S_DBSNMPPASSWORD=oracle123
    oracle.assistants.server|S_HOSTUSERPASSWORD=oracle123
    oracle.assistants.server|S_ASMSNMPPASSWORD=oracle123
    Run configToolAllCommands:

    [root@node1 ~]# /u01/app/11.2.0/grid/cfgtoollogs/configToolAllCommands RESPONSE_FILE=/home/grid/cfgrsp.properties
    Setting the invPtrLoc to /u01/app/11.2.0/grid/oraInst.loc

    perform - mode is starting for action: configure


    perform - mode finished for action: configure

    You can see the log file: /u01/app/11.2.0/grid/cfgtoollogs/oui/configActions2017-09-28_05-34-14-PM.log
    Test GI
    [grid@node2$]crs_stat -t -v
    Name Type R/RA F/FT Target State Host
    ----------------------------------------------------------------------
    ora.DATA.dg ora....up.type 0/5 0/ ONLINE ONLINE node2
    ora....N1.lsnr ora....er.type 0/5 0/0 ONLINE ONLINE node2
    ora.asm ora.asm.type 0/5 0/ ONLINE ONLINE node1
    ora.cvu ora.cvu.type 0/5 0/0 ONLINE ONLINE node2
    ora.gsd ora.gsd.type 0/5 0/ OFFLINE OFFLINE
    ora....network ora....rk.type 0/5 0/ ONLINE ONLINE node1
    ora....SM1.asm application 0/5 0/0 ONLINE ONLINE node1
    ora.node1.gsd application 0/5 0/0 OFFLINE OFFLINE
    ora.node1.ons application 0/3 0/0 ONLINE ONLINE node1
    ora.node1.vip ora....t1.type 0/0 0/0 ONLINE OFFLINE
    ora....SM2.asm application 0/5 0/0 ONLINE ONLINE node2
    ora.node2.gsd application 0/5 0/0 OFFLINE OFFLINE
    ora.node2.ons application 0/3 0/0 ONLINE ONLINE node2
    ora.node2.vip ora....t1.type 0/0 0/0 ONLINE ONLINE node2
    ora.oc4j ora.oc4j.type 0/1 0/2 ONLINE ONLINE node2
    ora.ons ora.ons.type 0/3 0/ ONLINE ONLINE node1
    ora.scan1.vip ora....ip.type 0/0 0/0 ONLINE ONLINE node2
    The node1 VIP is not online; it can be started with:
    srvctl start vip -n node1

    [grid@node2$]srvctl start vip -n node1
    PRCR-1079 : Failed to start resource ora.node1.vip
    CRS-5017: The resource action "ora.node1.vip start" encountered the following error:
    CRS-5005: IP Address: 10.37.4.201 is already in use in the network
    . For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0/grid/log/node1/agent/crsd/orarootagent_root/orarootagent_root.log".

    CRS-2674: Start of 'ora.node1.vip' on 'node1' failed
    CRS-5017: The resource action "ora.node1.vip start" encountered the following error:
    CRS-5005: IP Address: 10.37.4.201 is already in use in the network
    . For details refer to "(:CLSN00107:)" in "/u01/app/11.2.0/grid/log/node2/agent/crsd/orarootagent_root/orarootagent_root.log".

    CRS-2674: Start of 'ora.node1.vip' on 'node2' failed
    CRS-2632: There are no more servers to try to place resource 'ora.node1.vip' on that would satisfy its placement policy
    The error says the node1 VIP address is already in use on the network; release the address and the VIP will start.
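    To find out which machine is holding the address before releasing it, ordinary network tools suffice (generic commands, not from the original; the interface name is an assumption):

    # ping -c 2 10.37.4.201
    # arping -c 2 -I eth0 10.37.4.201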

    Install the RAC database
    ./runInstaller

    Problem 1:
    The installer cannot find the node information when installing the RAC database. Possible causes:

    1. incorrect hosts-file resolution
    2. DNS misconfiguration
    3. broken SSH equivalence
    4. a problem with the inventory.xml product registry
    Analysis: GI installed successfully, which rules out SSH equivalence and hosts resolution; this setup uses a single SCAN IP and therefore no DNS resolution, so that is ruled out as well.
    Comparison shows that /u01/app/oraInventory/ContentsXML/inventory.xml is wrong.
    Fix 1:

    [root@node2 ~]# vi /u01/app/oraInventory/ContentsXML/inventory.xml

    <?xml version="1.0" standalone="yes" ?>
    <!-- Copyright (c) 1999, 2013, Oracle and/or its affiliates.
    All rights reserved. -->
    <!-- Do not modify the contents of this file by hand. -->
    <INVENTORY>
    <VERSION_INFO>
    <SAVED_WITH>11.2.0.4.0</SAVED_WITH>
    <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
    </VERSION_INFO>
    <HOME_LIST>
    <HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11.2.0/grid" TYPE="O" IDX="1" >
    <NODE_LIST>
    <NODE NAME="node1"/>
    <NODE NAME="node2"/>
    </NODE_LIST>
    </HOME>
    </HOME_LIST>
    <COMPOSITEHOME_LIST>
    </COMPOSITEHOME_LIST>
    </INVENTORY>
    The CRS="true" attribute is missing; add it to the entry on both nodes (<HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11.2.0/grid" TYPE="O" IDX="1" CRS="true">) and install again. It is not obvious why the inventory ended up wrong, given that GI had installed successfully. In fact, when you install RAC after CRS, whether the environment is treated as a cluster is determined from inventory.xml (though this may not be the only criterion). See MOS Note 798203.1: if the entry in inventory.xml lacks CRS="true", the RAC installation fails with this same error.
    The root cause is inaccurate Clusterware inventory information; refreshing it resolves the issue.

    Fix 2:

    Switch to the grid user and go into $ORACLE_HOME/oui/bin:

    [grid@rac1 bin]$ pwd

    /u01/app/11.2.0/grid/oui/bin/

    [grid@rac1 bin]$ ls

    addLangs.sh attachHome.sh filesList.bat filesList.sh resource runInstaller runSSHSetup.sh

    addNode.sh detachHome.sh filesList.properties lsnodes runConfig.sh runInstaller.sh

    [grid@rac1 bin]$


    [grid@rac1 bin]$ ./runInstaller -silent -ignoreSysPrereqs -updateNodeList ORACLE_HOME="/u01/app/11.2.0/grid" LOCAL_NODE="node1" CLUSTER_NODES="{node1,node2}" CRS=true

    Starting Oracle Universal Installer...

    Checking swap space: must be greater than 500 MB. Actual 15999 MB Passed

    The inventory pointer is located at /etc/oraInst.loc

    The inventory is located at /u01/oraInventory

    'UpdateNodeList' was successful.
    Continue the installation.

    Problem 2:


    1. First check the disk permissions:
    [grid@node1$]ls -l /dev/oracleasm/disks/
    total 0
    brw-rw---- 1 grid oinstall 8, 17 Oct 9 11:19 VOL1
    2. Check the oracle user's NLS_LANG environment variable: if it is set incorrectly, Oracle may be unable to read the ASM disk groups. It is usually left unset until after installation.
    3. Check the permissions of the oracle executable in the GI home:

    [root@node1 ~]# ls -l /u01/app/11.2.0/grid/bin/oracle
    -rwsr-s--x 1 grid oinstall 209914479 Oct 12 12:19 /u01/app/11.2.0/grid/bin/oracle
    4. Check the ASM instance's asm_diskstring parameter.

    Testing:
    crs_stat -t -v                        // list the resources managed by CRS
    srvctl status database -d <db_name>   // check the running state of the database managed by CRS

    srvctl start|stop database -d <db_name>: start or stop the database service
    srvctl start|stop instance -d <db_name> -n <node_name>: start or stop an instance

    Note:
    When RAC reads the server parameter file it searches locally first and then ASM; if a local parameter file exists, the database starts from it. Since RAC is one shared database, the server parameter file should be shared as well, so keep it on shared storage.
    In a RAC database the two instances work simultaneously; a single database service faces the clients, and connections to that service are distributed across the underlying instances.

    A review of Oracle 11g RAC disk group and file permission issues:

    [root@node1 ~]# id oracle
    uid=5001(oracle) gid=501(oinstall) groups=501(oinstall),502(dba),503(oper),506(asmdba)
    [root@node1 ~]# id grid
    uid=5002(grid) gid=501(oinstall) groups=501(oinstall),502(dba),504(asmadmin),505(asmoper),506(asmdba)
    [root@node1 ~]#
    Check the permissions of the oracle executables in the GI home and the RAC database home:

    [root@node1 ~]# ls -l /u01/app/oracle/product/11.2.0/db_1/bin/oracle
    -rwsr-s--x 1 oracle asmadmin 239626641 Oct 12 13:50 /u01/app/oracle/product/11.2.0/db_1/bin/oracle

    [root@node1 ~]# ls -l /u01/app/11.2.0/grid/bin/oracle
    -rwsr-s--x 1 grid oinstall 209914479 Oct 12 12:19 /u01/app/11.2.0/grid/bin/oracle
    oracle belongs to the oinstall group, so the oracle user can execute grid's oracle binary to reach the ASM instance.
    grid belongs to the asmadmin group, so the grid user can access the Oracle database files.
    If the permissions on the oracle executable under ORACLE_HOME get broken, you will see failures to log in to the database, ORA-600 errors, errors accessing the ASM disk groups, and so on. The fix is simple: run <GI_HOME>/bin/setasmgidwrap o=<ORACLE_HOME>/bin/oracle.
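    A concrete invocation with the paths used in this article, run as the grid software owner (a sketch; the o= argument form is how setasmgidwrap takes the target binary):

    [grid@node1$]/u01/app/11.2.0/grid/bin/setasmgidwrap o=/u01/app/oracle/product/11.2.0/db_1/bin/oracle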
    ————————————————
    Copyright notice: this is an original article by CSDN blogger 返璞归真素闲人, licensed under CC 4.0 BY-SA; include the original source link and this notice when reposting.
    Original link: https://blog.csdn.net/a743044559/article/details/77940832
