• A problem encountered while using ASMFD


    Symptom: while starting CRS, once startup reached the OSYSMOND process, that process drove the server's CPU usage to 100%. The following errors appeared in /var/log/messages.

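    Initial diagnosis: ASMFD failed to load, so ASM did not start, and as a result crsd.bin did not start either. The fix was to stop the ACFS and AFD drivers, remove ASMFD, switch to udev, and restart the cluster.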

    Contents of /var/log/messages:

    Mar 12 09:04:29 lxtrac04 journal: Oracle Clusterware: 2018-03-12 09:04:29.909#012[(27517)]CRS-8500:Oracle Clusterware OSYSMOND process is starting with operating system process ID 27517
    Mar 12 09:04:31 lxtrac04 kernel: blk_update_request: I/O error, dev fd0, sector 0
    Mar 12 09:04:31 lxtrac04 kernel: floppy: error -5 while reading block 0
    Mar 12 09:04:31 lxtrac04 kernel: loop: module loaded
    Mar 12 09:04:31 lxtrac04 kernel: Unknown ioctl -2146954638
    Mar 12 09:04:31 lxtrac04 kernel: Unknown ioctl 4731
    Mar 12 09:04:31 lxtrac04 kernel: Unknown ioctl 4712
    Mar 12 09:04:31 lxtrac04 kernel: Unknown ioctl 4712
    Mar 12 09:04:31 lxtrac04 kernel: PPP generic driver version 2.4.2
    Mar 12 09:04:31 lxtrac04 kernel: Bluetooth: Core ver 2.20
    Mar 12 09:04:31 lxtrac04 kernel: NET: Registered protocol family 31
    Mar 12 09:04:31 lxtrac04 kernel: Bluetooth: HCI device and connection manager initialized
    Mar 12 09:04:31 lxtrac04 kernel: Bluetooth: HCI socket layer initialized
    Mar 12 09:04:31 lxtrac04 kernel: Bluetooth: L2CAP socket layer initialized
    Mar 12 09:04:31 lxtrac04 kernel: Bluetooth: SCO socket layer initialized
    Mar 12 09:04:31 lxtrac04 kernel: Bluetooth: Virtual HCI driver ver 1.5
    Mar 12 09:04:31 lxtrac04 kernel: lp0: using parport0 (interrupt-driven).
    Mar 12 09:04:31 lxtrac04 kernel: lp0: console ready
    Mar 12 09:04:31 lxtrac04 systemd: Reached target Printer.
    Mar 12 09:04:31 lxtrac04 systemd: Starting Printer.
    Mar 12 09:04:35 lxtrac04 journal: Oracle Clusterware: 2018-03-12 09:04:35.070#012[(27757)]CRS-8500:Oracle Clusterware OLOGGERD process is starting with operating system process ID 27757
    Mar 12 09:04:45 lxtrac04 journal: Oracle Clusterware: 2018-03-12 09:04:45.059#012[(27794)]CRS-8500:Oracle Clusterware OLOGGERD process is starting with operating system process ID 27794
    Mar 12 09:04:58 lxtrac04 kernel: NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [osysmond.bin:27570]
    Mar 12 09:04:58 lxtrac04 kernel: Modules linked in: cuse vhost_net vhost macvtap macvlan lp uinput hci_vhci bluetooth rfkill uhid ppp_generic slhc loop rds oracleacfs(PO) oracleadvm(PO) oracleoks(PO) tcp_lp fuse oracleafd(PO) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter vmw_vsock_vmci_transport vsock coretemp crct10dif_pclmul crc32_pclmul aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ppdev vmw_balloon pcspkr sg shpchp vmw_vmci i2c_piix4 parport_pc parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc binfmt_misc ip_tables xfs libcrc32c sd_mod sr_mod cdrom ata_generic pata_acpi vmwgfx
    Mar 12 09:04:58 lxtrac04 kernel: drm_kms_helper crc32c_intel ttm serio_raw ata_piix mptspi drm scsi_transport_spi mptscsih libata e1000 mptbase i2c_core floppy dm_mirror dm_region_hash dm_log dm_mod
    Mar 12 09:04:58 lxtrac04 kernel: CPU: 3 PID: 27570 Comm: osysmond.bin Tainted: P O 4.1.12-61.1.18.el7uek.x86_64 #2
    Mar 12 09:04:58 lxtrac04 kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/17/2015
    Mar 12 09:04:58 lxtrac04 kernel: task: ffff880eff728000 ti: ffff880eff770000 task.ti: ffff880eff770000
    Mar 12 09:04:58 lxtrac04 kernel: RIP: 0010:[<ffffffff81720fd8>] [<ffffffff81720fd8>] _raw_spin_lock+0x38/0x60
    Mar 12 09:04:58 lxtrac04 kernel: RSP: 0018:ffff880eff773ca0 EFLAGS: 00000202
    Mar 12 09:04:58 lxtrac04 kernel: RAX: 000000000000711b RBX: ffff880ff7946cc0 RCX: 0000000000000002
    Mar 12 09:04:58 lxtrac04 kernel: RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff880f657d4810
    Mar 12 09:04:58 lxtrac04 kernel: RBP: ffff880eff773d38 R08: 0000000000000002 R09: 0000000000000000
    Mar 12 09:04:58 lxtrac04 kernel: R10: ffff880fefdede58 R11: ffff880f0541d010 R12: ffffffff81213120
    Mar 12 09:04:58 lxtrac04 kernel: R13: ffff880eff773c18 R14: ffffffff8120bcdb R15: ffff880eff773c48
    Mar 12 09:04:58 lxtrac04 kernel: FS: 00007f5f899ab700(0000) GS:ffff88103fcc0000(0000) knlGS:0000000000000000
    Mar 12 09:04:58 lxtrac04 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Mar 12 09:04:58 lxtrac04 kernel: CR2: 00007fece1138240 CR3: 0000000eff5be000 CR4: 00000000000006e0
    Mar 12 09:04:58 lxtrac04 kernel: Stack:
    Mar 12 09:04:58 lxtrac04 kernel: ffffffffa059fef1 ffff88103e427000 0000000000000000 ffff880f657d4810
    Mar 12 09:04:58 lxtrac04 kernel: 0000000000000024 ffff880ff2ba8000 ffff880f0541d000 ffff880eff773d8c
    Mar 12 09:04:58 lxtrac04 kernel: ffff881000000000 ffff880ff2ba8000 0000000000000041 ffff880fefdede58
    Mar 12 09:04:58 lxtrac04 kernel: Call Trace:
    Mar 12 09:04:58 lxtrac04 kernel: [<ffffffffa059fef1>] ? fuse_abort_conn+0x31/0x270 [fuse]
    Mar 12 09:04:58 lxtrac04 kernel: [<ffffffffa0bf23c0>] ? cuse_read_iter+0x70/0x70 [cuse]
    Mar 12 09:04:58 lxtrac04 kernel: [<ffffffffa0bf2414>] cuse_process_init_reply+0x54/0x490 [cuse]
    Mar 12 09:04:58 lxtrac04 kernel: [<ffffffffa0bf23c0>] ? cuse_read_iter+0x70/0x70 [cuse]
    Mar 12 09:04:58 lxtrac04 kernel: [<ffffffffa059dbbf>] request_end+0xbf/0x170 [fuse]
    Mar 12 09:04:58 lxtrac04 kernel: [<ffffffffa059fd16>] end_queued_requests.isra.19+0x86/0x160 [fuse]
    Mar 12 09:04:58 lxtrac04 kernel: [<ffffffffa059fe8f>] fuse_dev_release+0x9f/0xd0 [fuse]
    Mar 12 09:04:58 lxtrac04 kernel: [<ffffffffa0bf211a>] cuse_channel_release+0x8a/0xa0 [cuse]
    Mar 12 09:04:58 lxtrac04 kernel: [<ffffffff81210224>] __fput+0xe4/0x220
    Mar 12 09:04:58 lxtrac04 kernel: [<ffffffff812103ae>] ____fput+0xe/0x10
    Mar 12 09:04:58 lxtrac04 kernel: [<ffffffff810a3ba7>] task_work_run+0xb7/0xf0
    Mar 12 09:04:58 lxtrac04 kernel: [<ffffffff81017c6d>] do_notify_resume+0x8d/0xa0
    Mar 12 09:04:58 lxtrac04 kernel: [<ffffffff8172147c>] int_signal+0x12/0x17
    Mar 12 09:04:58 lxtrac04 kernel: Code: 07 89 c2 c1 ea 10 66 39 c2 75 01 c3 89 d1 0f b7 f2 b8 00 80 00 00 eb 0a 0f 1f 00 f3 90 83 e8 01 74 20 0f b7 17 41 89 d0 41 31 c8 <41> 81 e0 fe ff 00 00 75 e7 55 0f b7 f2 48 89 e5 e8 8b 39 ff ff
    Mar 12 09:04:59 lxtrac04 sh: abrt-dump-oops: Found oopses: 1
    Mar 12 09:04:59 lxtrac04 sh: abrt-dump-oops: Creating problem directories
    Mar 12 09:04:59 lxtrac04 sh: abrt-dump-oops: Not going to make dump directories world readable because PrivateReports is on
    Mar 12 09:04:59 lxtrac04 abrt-server: Duplicate: core backtrace

    Searching MOS (My Oracle Support) for "blk_update_request: I/O error, dev fd0, sector 0" did turn up ASMFD-related documents.
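When checking whether another node shows the same symptom, it can help to pull only the soft-lockup lines out of the log. The following is a minimal sketch, not part of the original procedure: `find_soft_lockups` is a hypothetical helper name, and it assumes the traditional syslog layout seen above, with a 15-character timestamp at the start of the line and the `[process:pid]` tag at the end.

```shell
#!/bin/sh
# find_soft_lockups (hypothetical helper): print "timestamp process:pid"
# for every kernel soft-lockup line in a syslog-style file.
# Assumes a 15-character timestamp prefix such as "Mar 12 09:04:58".
find_soft_lockups() {
    # Example of a matching line:
    #   ... kernel: NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [osysmond.bin:27570]
    grep 'soft lockup' "$1" |
        sed -n 's/^\(.\{15\}\).*\[\([^]]*\)]$/\1 \2/p'
}
```

Usage: `find_soft_lockups /var/log/messages` (reading that file usually requires root).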

    1. Start the cluster in exclusive mode without CRS (-excl -nocrs)
    [root@lxtrac04 bin]# ./crsctl start crs -excl -nocrs
    CRS-4123: Oracle High Availability Services has been started.
    CRS-2672: Attempting to start 'ora.cssdmonitor' on 'lxtrac04'
    CRS-2672: Attempting to start 'ora.evmd' on 'lxtrac04'
    CRS-2672: Attempting to start 'ora.mdnsd' on 'lxtrac04'
    CRS-2676: Start of 'ora.cssdmonitor' on 'lxtrac04' succeeded
    CRS-2676: Start of 'ora.mdnsd' on 'lxtrac04' succeeded
    CRS-2676: Start of 'ora.evmd' on 'lxtrac04' succeeded
    CRS-2672: Attempting to start 'ora.gpnpd' on 'lxtrac04'
    CRS-2676: Start of 'ora.gpnpd' on 'lxtrac04' succeeded
    CRS-2672: Attempting to start 'ora.gipcd' on 'lxtrac04'
    CRS-2676: Start of 'ora.gipcd' on 'lxtrac04' succeeded
    CRS-2672: Attempting to start 'ora.cssd' on 'lxtrac04'
    CRS-2672: Attempting to start 'ora.diskmon' on 'lxtrac04'
    CRS-2676: Start of 'ora.diskmon' on 'lxtrac04' succeeded
    CRS-2676: Start of 'ora.cssd' on 'lxtrac04' succeeded
    CRS-2672: Attempting to start 'ora.drivers.acfs' on 'lxtrac04'
    CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'lxtrac04'
    CRS-2672: Attempting to start 'ora.ctssd' on 'lxtrac04'
    CRS-2676: Start of 'ora.drivers.acfs' on 'lxtrac04' succeeded
    CRS-2676: Start of 'ora.ctssd' on 'lxtrac04' succeeded
    CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'lxtrac04' succeeded
    CRS-2672: Attempting to start 'ora.asm' on 'lxtrac04'
    CRS-2676: Start of 'ora.asm' on 'lxtrac04' succeeded
    2. Check the cluster's asm_diskstring
    [root@lxtrac04 bin]# ./asmcmd dsget
    parameter: AFD:*
    profile:AFD:*
    3. Change asm_diskstring
    [root@lxtrac04 bin]# ./asmcmd dsset "/dev/sd*"
    [root@lxtrac04 bin]# ./asmcmd dsget
    parameter:/dev/sd*
    profile:/dev/sd*
    [root@lxtrac04 bin]#
    4. Bind the disks with udev, then restart udev ([root@lxtrac04 ~]# systemctl restart systemd-udevd.service)
    (Server version:
    [grid@lxtrac04 ~]$ uname -a
    Linux lxtrac04 4.1.12-61.1.18.el7uek.x86_64 #2 SMP Fri Nov 4 15:48:30 PDT 2016 x86_64 x86_64 x86_64 GNU/Linux
    [grid@lxtrac04 ~]$
    )
    [grid@lxtrac04 ~]$ cat /etc/udev/rules.d/99-oracle-asm.rules
    KERNEL=="sdd[1-9]",ACTION=="add",OWNER="grid", GROUP="asmadmin", MODE="0660"
    KERNEL=="sde[1-9]",ACTION=="add",OWNER="grid", GROUP="asmadmin", MODE="0660"
    [grid@lxtrac04 ~]$
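The two rules above differ only in the disk name, so they can be generated rather than typed. The sketch below is illustrative only: `asm_udev_rule` is a hypothetical helper, and the commented commands are an assumed follow-up, not something the original post ran. Because the rules match `ACTION=="add"`, and `udevadm trigger` sends `change` events unless told otherwise, re-triggering with `--action=add` is one way to apply the ownership to disks that are already present.

```shell
#!/bin/sh
# asm_udev_rule (hypothetical helper): print one udev rule line covering
# partitions 1-9 of a disk.
# Arguments: kernel disk name (e.g. sdd), owner, group.
asm_udev_rule() {
    printf 'KERNEL=="%s[1-9]",ACTION=="add",OWNER="%s", GROUP="%s", MODE="0660"\n' \
        "$1" "$2" "$3"
}

# Assumed usage, as root (left commented out so sourcing this file changes nothing):
#   asm_udev_rule sdd grid asmadmin >  /etc/udev/rules.d/99-oracle-asm.rules
#   asm_udev_rule sde grid asmadmin >> /etc/udev/rules.d/99-oracle-asm.rules
#   udevadm control --reload-rules
#   udevadm trigger --action=add --subsystem-match=block
#   ls -l /dev/sdd* /dev/sde*    # partitions should now show grid:asmadmin, mode 0660
```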
    5. Stop the cluster
    [root@lxtrac04 bin]# ./crsctl stop crs
    CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'lxtrac04'
    CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'lxtrac04'
    CRS-2673: Attempting to stop 'ora.mdnsd' on 'lxtrac04'
    CRS-2673: Attempting to stop 'ora.gpnpd' on 'lxtrac04'
    CRS-2673: Attempting to stop 'ora.ctssd' on 'lxtrac04'
    CRS-2673: Attempting to stop 'ora.evmd' on 'lxtrac04'
    CRS-2673: Attempting to stop 'ora.asm' on 'lxtrac04'
    CRS-2677: Stop of 'ora.drivers.acfs' on 'lxtrac04' succeeded
    CRS-2677: Stop of 'ora.mdnsd' on 'lxtrac04' succeeded
    CRS-2677: Stop of 'ora.gpnpd' on 'lxtrac04' succeeded
    CRS-2677: Stop of 'ora.ctssd' on 'lxtrac04' succeeded
    CRS-2677: Stop of 'ora.evmd' on 'lxtrac04' succeeded
    CRS-2677: Stop of 'ora.asm' on 'lxtrac04' succeeded
    CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'lxtrac04'
    CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'lxtrac04' succeeded
    CRS-2673: Attempting to stop 'ora.cssd' on 'lxtrac04'
    CRS-2677: Stop of 'ora.cssd' on 'lxtrac04' succeeded
    CRS-2673: Attempting to stop 'ora.gipcd' on 'lxtrac04'
    CRS-2677: Stop of 'ora.gipcd' on 'lxtrac04' succeeded
    CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'lxtrac04' has completed
    CRS-4133: Oracle High Availability Services has been stopped.
    [root@lxtrac04 bin]#
    6. Stop ACFS and AFD
    # acfsload stop # stop the ACFS driver stack
    # afdload stop # stop the ASMFD driver
    7. Clear the AFD labels
    [root@lxtrac04 bin]# ./asmcmd afd_unlabel /dev/sdd1 -f
    [root@lxtrac04 bin]# ./asmcmd afd_unlabel /dev/sdd2 -f
    [root@lxtrac04 bin]# ./asmcmd afd_unlabel /dev/sdd3 -f
    …………
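With many partitions, the repeated unlabel commands can be generated from a device list. This is a sketch only: `afd_unlabel_cmds` is a hypothetical helper that prints the commands instead of running them, so the list can be reviewed first. The devices to pass are whatever was labeled on the system in question; the three shown in the comment are just the ones that appear in the transcript above.

```shell
#!/bin/sh
# afd_unlabel_cmds (hypothetical helper): print one "asmcmd afd_unlabel"
# command per device path given as an argument. Nothing is executed.
afd_unlabel_cmds() {
    for dev in "$@"; do
        printf './asmcmd afd_unlabel %s -f\n' "$dev"
    done
}

# Assumed usage, from the Grid home bin directory as root:
#   afd_unlabel_cmds /dev/sdd1 /dev/sdd2 /dev/sdd3        # review the output
#   afd_unlabel_cmds /dev/sdd1 /dev/sdd2 /dev/sdd3 | sh   # then run it
```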

    8. Remove ASMFD
    # ./asmcmd afd_deconfigure
    AFD-632:Existing AFD installation detected.
    AFD-634:Removing previous AFD installation.
    AFD-635:Previous AFD components successfully removed.
    Modifying resource dependencies - this may take some time.
    # ls -ltr /dev/oracleafd/disks/
    ls: cannot access /dev/oracleafd/disks/: No such file or directory
    9. Restart CRS; startup succeeds
    [root@lxtrac04 bin]# ./crsctl start crs
    CRS-4123: Oracle High Availability Services has been started.
    [root@lxtrac04 bin]#

  • Original article: https://www.cnblogs.com/erwadba/p/8601787.html