Oracle 19c AFD does not support storage presented over multiple paths
Oracle version:
SQL*Plus: Release 19.0.0.0.0 - Production on Mon May 31 09:11:51 2021
Version 19.11.0.0.0
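Testing shows that when a storage LUN is visible over multiple paths and no multipath software aggregates them, AFD ignores the storage disks during afd_scan.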
[root@dev-devdb ~]# lsscsi --scsi_id -s
[0:1:0:0] disk HP LOGICAL VOLUME 1.64 /dev/sda 3600508b100103035424b323031370300 299GB
[0:3:0:0] storage HP P712m 1.64 - - -
[1:0:0:0] storage HP HSV300 1100 - - -
[1:0:0:1] disk HP HSV300 1100 /dev/sdb 36001438009b025a00000b000002b0000 5.36GB
[1:0:0:2] disk HP HSV300 1100 /dev/sdc 36001438009b025a00000b00000300000 5.36GB
[1:0:0:3] disk HP HSV300 1100 /dev/sdd 36001438009b025a00000b00000350000 5.36GB
[1:0:0:4] disk HP HSV300 1100 /dev/sde 36001438009b025a00000b000003a0000 322GB
[1:0:0:5] disk HP HSV300 1100 /dev/sdf 36001438009b025a00000b000003f0000 214GB
[1:0:0:6] disk HP HSV300 1100 /dev/sdg 36001438009b025a00000b00000470000 53.6GB
[1:0:1:0] storage HP HSV300 1100 - - -
[1:0:1:1] disk HP HSV300 1100 /dev/sdh 36001438009b025a00000b000002b0000 5.36GB
[1:0:1:2] disk HP HSV300 1100 /dev/sdi 36001438009b025a00000b00000300000 5.36GB
[1:0:1:3] disk HP HSV300 1100 /dev/sdj 36001438009b025a00000b00000350000 5.36GB
[1:0:1:4] disk HP HSV300 1100 /dev/sdk 36001438009b025a00000b000003a0000 322GB
[1:0:1:5] disk HP HSV300 1100 /dev/sdl 36001438009b025a00000b000003f0000 214GB
[1:0:1:6] disk HP HSV300 1100 /dev/sdm 36001438009b025a00000b00000470000 53.6GB
[2:0:0:0] storage HP HSV300 1100 - - -
[2:0:0:1] disk HP HSV300 1100 /dev/sdn 36001438009b025a00000b000002b0000 5.36GB
[2:0:0:2] disk HP HSV300 1100 /dev/sdo 36001438009b025a00000b00000300000 5.36GB
[2:0:0:3] disk HP HSV300 1100 /dev/sdp 36001438009b025a00000b00000350000 5.36GB
[2:0:0:4] disk HP HSV300 1100 /dev/sdq 36001438009b025a00000b000003a0000 322GB
[2:0:0:5] disk HP HSV300 1100 /dev/sdr 36001438009b025a00000b000003f0000 214GB
[2:0:0:6] disk HP HSV300 1100 /dev/sds 36001438009b025a00000b00000470000 53.6GB
[2:0:1:0] storage HP HSV300 1100 - - -
[2:0:1:1] disk HP HSV300 1100 /dev/sdt 36001438009b025a00000b000002b0000 5.36GB
[2:0:1:2] disk HP HSV300 1100 /dev/sdu 36001438009b025a00000b00000300000 5.36GB
[2:0:1:3] disk HP HSV300 1100 /dev/sdv 36001438009b025a00000b00000350000 5.36GB
[2:0:1:4] disk HP HSV300 1100 /dev/sdw 36001438009b025a00000b000003a0000 322GB
[2:0:1:5] disk HP HSV300 1100 /dev/sdx 36001438009b025a00000b000003f0000 214GB
[2:0:1:6] disk HP HSV300 1100 /dev/sdy 36001438009b025a00000b00000470000 53.6GB
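In this listing every HSV300 LUN shows up four times; SCSI ID 36001438009b025a00000b000002b0000, for example, appears as /dev/sdb, /dev/sdh, /dev/sdn and /dev/sdt. A minimal sketch for counting the paths per SCSI ID follows. The function name count_paths_per_wwid is made up for illustration, and the script assumes the column layout of lsscsi --scsi_id -s seen here (device, SCSI ID and size as the last three fields).

```shell
# Count how many /dev/sdX paths exist for each SCSI ID.
# Reads `lsscsi --scsi_id -s` output on stdin and looks only at "disk" rows.
# Assumption: the last three fields are device, SCSI ID, size.
count_paths_per_wwid() {
    awk '$2 == "disk" && $(NF-1) != "-" {
             id = $(NF-1)
             n[id]++
             devs[id] = devs[id] " " $(NF-2)
         }
         END { for (id in n) printf "%s %d%s\n", id, n[id], devs[id] }' | sort
}
```

It would be fed with lsscsi --scsi_id -s | count_paths_per_wwid; any SCSI ID with a count above 1 is a LUN that AFD will see more than once when scanning /dev/sd*.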
Before GI starts, running afdload start scans for disks according to afd_diskstring in /etc/oracleafd.conf.
At this point, the log /u01/app/grid/diag/afdboot/user_root/host_3462916399_110/trace/alert.log shows the following:
2021-05-31T08:56:04.752260+08:00
afdb_scandisk: Start
afdb_getdiscstr: Start
afd_discovery string: /dev/sd*
afdb_getdiscstr: end
Tokenized diskstring /dev/sd*
No label in disk /dev/sda
No label in disk /dev/sda1
No label in disk /dev/sda2
No label in disk /dev/sda3
Ignoring dup disk: /dev/sdi label: OCR2
Ignoring dup disk: /dev/sdc label: OCR2
Ignoring dup disk: /dev/sdh label: OCR1
Ignoring dup disk: /dev/sdb label: OCR1
Ignoring dup disk: /dev/sdk label: DATA1
Ignoring dup disk: /dev/sde label: DATA1
Ignoring dup disk: /dev/sdj label: OCR3
Ignoring dup disk: /dev/sdd label: OCR3
Ignoring dup disk: /dev/sdm label: MGMT1
Ignoring dup disk: /dev/sdg label: MGMT1
Ignoring dup disk: /dev/sdl label: ARCH1
Ignoring dup disk: /dev/sdf label: ARCH1
Ignoring dup disk: /dev/sdn label: OCR1
Ignoring dup disk: /dev/sds label: MGMT1
Ignoring dup disk: /dev/sdo label: OCR2
Ignoring dup disk: /dev/sdp label: OCR3
Ignoring dup disk: /dev/sdu label: OCR2
Ignoring dup disk: /dev/sdt label: OCR1
Ignoring dup disk: /dev/sdq label: DATA1
Ignoring dup disk: /dev/sdv label: OCR3
Ignoring dup disk: /dev/sdw label: DATA1
Ignoring dup disk: /dev/sdx label: ARCH1
Ignoring dup disk: /dev/sdy label: MGMT1
Ignoring dup disk: /dev/sdr label: ARCH1
There are no devices to discover.
afdb_scandisk: Successful
afdb_scandisk: end
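Note that all four paths of every label are rejected as duplicates, so not a single device is left to discover. A small sketch for summarizing this from the alert log is shown below; summarize_dup_labels is a hypothetical helper name, and it assumes the log holds one "Ignoring dup disk: <device> label: <name>" message per line.

```shell
# Count "Ignoring dup disk" messages per AFD label.
# Reads the afdboot alert.log on stdin.
summarize_dup_labels() {
    awk '/Ignoring dup disk:/ {
             # the label name is the field right after "label:"
             for (i = 1; i < NF; i++) if ($i == "label:") n[$(i+1)]++
         }
         END { for (l in n) printf "%s %d\n", l, n[l] }' | sort
}
```

Used as summarize_dup_labels < alert.log, it prints one line per label with the number of ignored paths; for the log excerpt shown here that would be 4 for each of the six labels.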
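At this point asmcmd afd_scan cannot discover the disks either, which means /dev/oracleafd/disks/ stays empty.
Note: when run without arguments, asmcmd afd_scan also reads /etc/oracleafd.conf by default.
For details, see the log /u01/app/grid/diag/asmcmd/user_root/dev-devdb/alert/alert.log.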
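After multipath binding is set up so that the extra per-path disks are collapsed into one device per LUN, and /etc/oracleafd.conf is changed to afd_diskstring='/dev/mapper/asm*',
run afdload start again. The log /u01/app/grid/diag/afdboot/user_root/host_3462916399_110/trace/alert.log now shows the following: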
2021-05-31T08:58:10.319379+08:00
afdb_scandisk: Start
afdb_getdiscstr: Start
afd_discovery string: /dev/mapper/asm*
afdb_getdiscstr: end
Tokenized diskstring /dev/mapper/asm*
Scan count 6
Scanned disk /dev/mapper/asm-ocr1
Scanned disk /dev/mapper/asm-ocr2
Scanned disk /dev/mapper/asm-ocr3
Scanned disk /dev/mapper/asm-data1
Scanned disk /dev/mapper/asm-arch1
Scanned disk /dev/mapper/asm-mgmt1
afdb_scandisk: Successful
afdb_scandisk: end
This shows that AFD currently does not support storage that is presented over multiple raw paths.
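The multipath configuration itself is not shown in this post. As a rough sketch only, and assuming dm-multipath aliases are what produce the /dev/mapper/asm-* names, the function below prints a multipaths section for /etc/multipath.conf. The SCSI ID to alias pairing is inferred from the lsscsi output and the labels in the first log (for example /dev/sdb carries label OCR1); print_asm_multipaths is a made-up name, and device permissions for the grid user are not covered.

```shell
# Print a multipath.conf "multipaths" section with one alias per LUN.
# Assumption: the wwid/alias pairs below are inferred from the lsscsi
# listing and the AFD labels in the log, not taken from the real config.
print_asm_multipaths() {
    echo 'multipaths {'
    while read -r wwid alias; do
        printf '    multipath {\n        wwid  %s\n        alias %s\n    }\n' "$wwid" "$alias"
    done <<'EOF'
36001438009b025a00000b000002b0000 asm-ocr1
36001438009b025a00000b00000300000 asm-ocr2
36001438009b025a00000b00000350000 asm-ocr3
36001438009b025a00000b000003a0000 asm-data1
36001438009b025a00000b000003f0000 asm-arch1
36001438009b025a00000b00000470000 asm-mgmt1
EOF
    echo '}'
}
```

The output is meant to be reviewed and merged into the existing multipath configuration by hand rather than written to the file automatically.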
If you insist on keeping afd_diskstring='/dev/sd*', then after afdload start you have to scan the disks manually, one at a time, with commands such as asmcmd afd_scan '/dev/sdb'.
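The one-by-one scan can be wrapped in a small loop. This is only a sketch: afd_scan_each and the DRY_RUN switch are hypothetical, and which device to pick for each LUN (for instance the first path, /dev/sdb through /dev/sdg in the listing earlier) is an assumption, since only /dev/sdb is given as an example.

```shell
# Scan the given devices with asmcmd afd_scan, one device per call.
# By default the commands are only printed; set DRY_RUN=0 to execute them.
afd_scan_each() {
    for dev in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "asmcmd afd_scan '$dev'"
        else
            asmcmd afd_scan "$dev"
        fi
    done
}
```

For example, afd_scan_each /dev/sdb /dev/sdc prints the two commands for review, and DRY_RUN=0 afd_scan_each /dev/sdb /dev/sdc would run them.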
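Finally, once the cluster has started, afd_diskstring in /etc/oracleafd.conf is reset to the AFD_DISKSTRING path recorded in the OLR.
asmcmd dsget only succeeds while the cluster is up, because the command reads the OLR; at the same time,
it updates afd_diskstring in /etc/oracleafd.conf to the AFD_DISKSTRING path recorded in the OLR.
To change from /dev/sd* to /dev/mapper/asm*, run asmcmd afd_dsset '/dev/mapper/asm*' after the cluster has started so that the OLR is updated.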
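Because the value in /etc/oracleafd.conf can be overwritten from the OLR, it may help to read it back before and after the cluster starts. The sketch below assumes the file contains a line of the form afd_diskstring='...' as quoted earlier; get_afd_diskstring is a hypothetical helper name.

```shell
# Print the afd_diskstring value from an oracleafd.conf-style file.
# $1 = path to the file (defaults to /etc/oracleafd.conf).
# Surrounding single or double quotes are stripped from the value.
get_afd_diskstring() {
    sed -n "s/^afd_diskstring=['\"]\{0,1\}\([^'\"]*\)['\"]\{0,1\}[[:space:]]*$/\1/p" \
        "${1:-/etc/oracleafd.conf}"
}
```

If the value printed after cluster startup is still /dev/sd*, the OLR has not yet been updated with asmcmd afd_dsset.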
Command to dump the OLR: ocrdump -local
Part of the OLR dump file is attached below:
[SYSTEM.ASM.AFD_DISKSTRING]
ORATEXT : /dev/mapper/asm*
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS, GROUP_PERMISSION : PROCR_READ, OTHER_PERMISSION : PROCR_READ, USER_NAME : grid, GROUP_NAME : oinstall}
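To pull just this value out of a dump, something like the following sketch could be used. It assumes the dump was written to a text file in the layout of this excerpt (the key in square brackets on one line, ORATEXT on a following line); olr_afd_diskstring is a made-up helper name and the file path is whatever ocrdump -local produced.

```shell
# Print the ORATEXT value stored under [SYSTEM.ASM.AFD_DISKSTRING].
# $1 = path to a text dump produced by `ocrdump -local`.
olr_afd_diskstring() {
    awk '/^\[SYSTEM\.ASM\.AFD_DISKSTRING\]/ { found = 1; next }
         found && /^ORATEXT/ {
             sub(/^ORATEXT[ \t]*:[ \t]*/, "")
             print
             exit
         }
         /^\[/ { found = 0 }' "$1"
}
```

Comparing this output with afd_diskstring in /etc/oracleafd.conf shows whether the two are in sync.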