• Oracle归档日志所在目录时间不对&&Oracle集群日志时间显示错误


    Oracle归档日志所在目录时间不对&&Oracle集群日志时间显示错误

    前言

    这个问题在18年的时候遇到了,基本不注意并且集群或者数据库运行正常是很难注意到的。

    忘记当时怎么发现的了,最近拿出来重现一下问题。

    具体描述一下问题:

    比如当前时间是2020年07月03日0点,数据库对redo log发起归档。

    在OMF下,归档会按照日期进行分类归档。

    但是发起归档之后,新的归档文件会放在的2020_07_02的目录中。

    另外,db和asm和集群的警告日志的时间也和操作系统的date时间不一致。

    现象

    版本:11.2.0.4.0

    在我的测试环境中,关闭所有节点的集群之后,更改时间:

    [root@rac1 ~]# date -s "2020-07-03 00:00:00"
    Fri Jul  3 00:00:00 CST 2020

    然后,启动所有节点的集群,可以依次观察

    集群的警告日志,grid用户下,$ORACLE_HOME/log/rac1/alertrac1.log

    2020-07-02 09:00:26.667:
    [ohasd(24600)]CRS-2112:The OLR service started on node rac1.
    2020-07-02 09:00:26.675:
    [ohasd(24600)]CRS-1301:Oracle High Availability Service started on node rac1.
    2020-07-02 09:00:26.676:
    [ohasd(24600)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
    2020-07-02 09:00:30.080:
    [/u01/app/11.2.0/grid/bin/orarootagent.bin(24643)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
    2020-07-02 09:00:34.563:
    [gpnpd(24737)]CRS-2328:GPNPD started on node rac1.
    2020-07-02 09:00:36.992:
    [cssd(24806)]CRS-1713:CSSD daemon is started in clustered mode
    2020-07-02 09:00:38.798:
    [ohasd(24600)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
    2020-07-02 09:01:01.319:
    [cssd(24806)]CRS-1707:Lease acquisition for node rac1 number 1 completed
    2020-07-02 09:01:02.612:
    [cssd(24806)]CRS-1605:CSSD voting file is online: /dev/mapper/asm-ocr02; details in /u01/app/11.2.0/grid/log/rac1/cssd/ocssd.log.
    2020-07-02 09:01:02.620:
    [cssd(24806)]CRS-1605:CSSD voting file is online: /dev/mapper/asm-ocr03; details in /u01/app/11.2.0/grid/log/rac1/cssd/ocssd.log.
    2020-07-02 09:01:02.623:
    [cssd(24806)]CRS-1605:CSSD voting file is online: /dev/mapper/asm-ocr01; details in /u01/app/11.2.0/grid/log/rac1/cssd/ocssd.log.
    2020-07-02 09:01:15.917:
    [cssd(24806)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1 rac2 .
    2020-07-02 09:01:18.061:
    [ctssd(25418)]CRS-2403:The Cluster Time Synchronization Service on host rac1 is in observer mode.
    2020-07-02 09:01:18.421:
    [ctssd(25418)]CRS-2407:The new Cluster Time Synchronization Service reference node is host rac1.
    2020-07-02 09:01:18.422:
    [ctssd(25418)]CRS-2401:The Cluster Time Synchronization Service started on host rac1.
    2020-07-02 09:01:20.023:
    [ohasd(24600)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
    [client(25475)]CRS-10001:02-Jul-20 09:01 ACFS-9391: Checking for existing ADVM/ACFS installation.
    [client(25480)]CRS-10001:02-Jul-20 09:01 ACFS-9392: Validating ADVM/ACFS installation files for operating system.
    [client(25482)]CRS-10001:02-Jul-20 09:01 ACFS-9393: Verifying ASM Administrator setup.
    [client(25485)]CRS-10001:02-Jul-20 09:01 ACFS-9308: Loading installed ADVM/ACFS drivers.
    [client(25488)]CRS-10001:02-Jul-20 09:01 ACFS-9154: Loading 'oracleoks.ko' driver.
    [client(25521)]CRS-10001:02-Jul-20 09:01 ACFS-9154: Loading 'oracleadvm.ko' driver.
    [client(25548)]CRS-10001:02-Jul-20 09:01 ACFS-9154: Loading 'oracleacfs.ko' driver.
    ........

    ASM实例警告日志,

    Thu Jul 02 09:01:26 2020
    NOTE: No asm libraries found in the system
    NOTE: No asm libraries found in the system
    MEMORY_TARGET defaulting to 1128267776.
    * instance_number obtained from CSS = 1, checking for the existence of node 0...
    * node 0 does not exist. instance_number = 1
    Starting ORACLE instance (normal)
    WARNING: You are trying to use the MEMORY_TARGET feature. This feature requires the /dev/shm file system to be mounted for at least 1140850688 bytes. /dev/shm is either not mounted or is mounted with available space less than this size. Please fix this so that MEMORY_TARGET can work as expected. Current available is 976789504 and used is 73728 bytes. Ensure that the mount point is /dev/shm for this directory.
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Initial number of CPU is 4
    Number of processor cores in the system is 4
    Number of processor sockets in the system is 2
    Private Interface 'eth1:1' configured from GPnP for use as a private interconnect.
      [name='eth1:1', type=1, ip=169.254.146.181, mac=00-0c-29-60-50-8b, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]
    Public Interface 'eth0' configured from GPnP for use as a public interface.
      [name='eth0', type=1, ip=192.168.180.100, mac=00-0c-29-60-50-81, net=192.168.180.0/24, mask=255.255.255.0, use=public/1]
    Picked latch-free SCN scheme 3
    Using LOG_ARCHIVE_DEST_1 parameter default value as /u01/app/11.2.0/grid/dbs/arch
    Autotune of undo retention is turned on.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    NOTE: Volume support  enabled
    Starting up:
    Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
    With the Real Application Clusters and Automatic Storage Management options.
    ORACLE_HOME = /u01/app/11.2.0/grid
    System name:    Linux
    Node name:      rac1.example.com
    Release:        2.6.32-573.el6.x86_64
    Version:        #1 SMP Wed Jul 1 18:23:37 EDT 2015
    Machine:        x86_64
    VM name:        VMWare Version: 6
    Using parameter settings in server-side spfile +OCR/rac-cluster/asmparameterfile/registry.253.1034884609

    DB的警告日志,

    Thu Jul 02 09:02:47 2020
    Adjusting the default value of parameter parallel_max_servers
    from 160 to 120 due to the value of parameter processes (150)
    Starting ORACLE instance (normal)
    LICENSE_MAX_SESSION = 0
    LICENSE_SESSIONS_WARNING = 0
    Initial number of CPU is 4
    Number of processor cores in the system is 4
    Number of processor sockets in the system is 2
    Private Interface 'eth1:1' configured from GPnP for use as a private interconnect.
      [name='eth1:1', type=1, ip=169.254.146.181, mac=00-0c-29-60-50-8b, net=169.254.0.0/16, mask=255.255.0.0, use=haip:cluster_interconnect/62]
    Public Interface 'eth0' configured from GPnP for use as a public interface.
      [name='eth0', type=1, ip=192.168.180.100, mac=00-0c-29-60-50-81, net=192.168.180.0/24, mask=255.255.255.0, use=public/1]
    Public Interface 'eth0:2' configured from GPnP for use as a public interface.
      [name='eth0:2', type=1, ip=192.168.180.111, mac=00-0c-29-60-50-81, net=192.168.180.0/24, mask=255.255.255.0, use=public/1]
    Picked latch-free SCN scheme 3
    Autotune of undo retention is turned on.
    LICENSE_MAX_USERS = 0
    SYS auditing is disabled
    Starting up:
    Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
    With the Partitioning, Real Application Clusters, OLAP, Data Mining
    and Real Application Testing options.
    ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1
    System name:    Linux
    Node name:      rac1.example.com
    Release:        2.6.32-573.el6.x86_64

    可以看到,日志开头全都是2020年07月02日的9点初。

    我们再看看归档的情况,

    在此之前,先删除之前的所有归档便于观察。

    [oracle@rac2 ~]$ rman target /
    
    Recovery Manager: Release 11.2.0.4.0 - Production on Fri Jul 3 00:20:01 2020
    
    Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
    
    connected to target database: ZKM (DBID=4210149549)
    
    RMAN> delete force noprompt archivelog all;
    
    using target database control file instead of recovery catalog
    allocated channel: ORA_DISK_1
    channel ORA_DISK_1: SID=200 instance=zkm2 device type=DISK
    List of Archived Log Copies for database with db_unique_name ZKM
    =====================================================================
    
    Key     Thrd Seq     S Low Time 
    ------- ---- ------- - ---------
    91      1    56      A 02-JUL-20
            Name: +ARCH/zkm/archivelog/2020_07_03/thread_1_seq_56.303.1044695873
    
    92      2    21      A 02-JUL-20
            Name: +ARCH/zkm/archivelog/2020_07_02/thread_2_seq_21.299.1044695875
    
    deleted archived log
    archived log file name=+ARCH/zkm/archivelog/2020_07_03/thread_1_seq_56.303.1044695873 RECID=91 STAMP=1044749873
    deleted archived log
    archived log file name=+ARCH/zkm/archivelog/2020_07_02/thread_2_seq_21.299.1044695875 RECID=92 STAMP=1044695874
    Deleted 2 objects

    其中一个节点发出命令归档redo log,

    SYS@zkm1> alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss';
    
    Session altered.
    
    SYS@zkm1> 
    SYS@zkm1> select sysdate from dual;
    
    SYSDATE
    ------------------------
    2020-07-03 00:20:27
    
    SYS@zkm1> alter system archive log current;
    
    System altered.

    观察警告日志,显示的时间是“Thu Jul 02 09:20:32 2020”。

    Thu Jul 02 09:20:32 2020
    Thread 1 advanced to log sequence 58 (LGWR switch)
      Current log# 2 seq# 58 mem# 0: +DATA/zkm/onlinelog/group_2.258.1034889845
      Current log# 2 seq# 58 mem# 1: +ARCH/zkm/onlinelog/group_2.258.1034889845
    Thu Jul 02 09:20:32 2020
    Archived Log entry 93 added for thread 1 sequence 57 ID 0xfaf1eead dest 1:

    进去ASM磁盘查看文件情况,

    ASMCMD> ls -l
    Type  Redund  Striped  Time             Sys  Name
                                            Y    2020_07_02/
    ASMCMD> ls -l *
    Type        Redund  Striped  Time             Sys  Name
    ARCHIVELOG  UNPROT  COARSE   JUL 02 09:00:00  Y    thread_1_seq_57.299.1044696033
    ARCHIVELOG  UNPROT  COARSE   JUL 02 09:00:00  Y    thread_2_seq_22.303.1044696033
    ASMCMD> pwd
    +arch/zkm/ARCHIVELOG

    至此,问题完全重现成功。

    原因

    不知道各位发现没有,实际上最开始启动集群(crsctl start crs)那一瞬间(大概是2020-07-03 00:00:00多,刚改完时间不到1分钟),

    集群的警告日志最开始出现的时间为“2020-07-02 09:00:26.667:”,基本相差15小时。

    因此很容易也可以猜出,是由于时区的原因。

    系统的时区为:

    [grid@rac1 ~]$ cat /etc/sysconfig/clock 
    # The time zone of the system is defined by the contents of /etc/localtime.
    # This file is only for evaluation by system-config-date, do not rely on its
    # contents elsewhere.
    ZONE="Asia/Shanghai"

    那么,集群应该也有自己的配置文件对应的时区,也就是集群不依赖OS的时区(-_-||)。

    mos可以找到这么一篇文章:How To Change Timezone for Grid Infrastructure (Doc ID 1209444.1)

    因此可以发现Grid Infrastructure的时区配置文件位置在$GRID_HOME/crs/install/s_crsconfig_<nodename>_env.txt。

    所以,现象的模拟其实,就是我把该文件的时区改成如下了:

    [grid@rac1 ~]$ cd $ORACLE_HOME/crs/install
    [grid@rac1 install]$ ls
    cmdllroot.sh              crsconfig_lib.pm      crsdelete.pm   inittab       installRemove.excl  oracle-ohasd.conf     paramfile.crs           rootcrs.pl  s_crsconfig_defs          s_crsconfig_rac1_env.txt.bak
    crsconfig_addparams.sbs   crsconfig_params      crspatch.pm    install.excl  onsconfig           oracle-ohasd.service  ParentDirPerm_rac1.txt  roothas.pl  s_crsconfig_lib.pm        tfa_setup.sh
    crsconfig_fileperms.excl  crsconfig_params.sbs  hasdconfig.pl  install.incl  oraacfs.pm          oracss.pm             preupdate.sh            rootofs.sh  s_crsconfig_rac1_env.txt
    [grid@rac1 install]$ cat s_crsconfig_rac1_env.txt
    ### This file can be used to modify the NLS_LANG environment variable, which determines the charset to be used for messages.
    ### For example, a new charset can be configured by setting NLS_LANG=JAPANESE_JAPAN.UTF8 
    ### Do not modify this file except to change NLS_LANG, or under the direction of Oracle Support Services
    
    TZ=US/Pacific
    NLS_LANG=AMERICAN_AMERICA.AL32UTF8
    TNS_ADMIN=
    ORACLE_BASE=

    网上找找看是不是刚好和上海时间相差15个小时。

    https://time.is/PT

     得证!!!

  • 相关阅读:
    linux驱动开发学习一:创建一个字符设备
    如何高效的对有序数组去重
    找到缺失的第一个正整数
    .NET不可变集合已经正式发布
    中国人唯一不认可的成功——就是家庭的和睦,人生的平淡【转】
    自己动手搭建 MongoDB 环境,并建立一个 .NET HelloWorld 程序测试
    ASP.NET MVC 中如何用自定义 Handler 来处理来自 AJAX 请求的 HttpRequestValidationException 错误
    自己动手搭建 Redis 环境,并建立一个 .NET HelloWorld 程序测试
    ServiceStack 介绍
    一步一步实战扩展 ASP.NET Route,实现小写 URL、个性化 URL
  • 原文地址:https://www.cnblogs.com/PiscesCanon/p/13227764.html
Copyright © 2020-2023  润新知