• 案例:记录一则强制开库遭遇ORA-16433的处理过程


    客户的一套开发环境,大概了解到的背景是清理空间时redo被运维人员当作log误删除,一线同事先接手处理,过程中遇到问题升级到我这里继续分析。
    接手后,数据库处于mount状态,之前恢复过程中已经做过resetlogs的操作,也设置了"_allow_resetlogs_corruption"隐藏参数为true,目前直接开库会提示需要恢复,重新进行resetlogs时报错ORA-600 [2662],起初看到这个错误心中略有些放松,根据经验,推下SCN就好了:

    SQL> alter database open;
    alter database open
    *
    ERROR at line 1:
    ORA-01113: file 1 needs media recovery
    ORA-01110: data file 1:
    '/oradata/ODSDB/datafile/data_D-ODSDB_I-3532120983_TS-SYSTEM_FNO-1_1bujd4ob.dbf'
    
    
    SQL> recover database ;
    ORA-00283: recovery session canceled due to errors
    ORA-00399: corrupt change description in redo log
    ORA-00353: log corruption near block 3 change 32581427074 time 08/25/2020
    11:39:04
    ORA-00312: online log 5 thread 1: '/oradata/ODSDB/datafile/redo05.log'
    
    
    SQL> recover datafile 1;
    ORA-00283: recovery session canceled due to errors
    ORA-00399: corrupt change description in redo log
    ORA-00353: log corruption near block 3 change 32581427074 time 08/25/2020
    11:39:04
    ORA-00312: online log 5 thread 1: '/oradata/ODSDB/datafile/redo05.log'
    
    SQL> alter database open resetlogs;
    alter database open resetlogs
    *
    ERROR at line 1:
    ORA-01139: RESETLOGS option only valid after an incomplete database recovery
    
    
    SQL> recover database using backup controlfile until cancel;
    ORA-00279: change 32581427072 generated at  needed for thread 1
    
    
    Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
    auto
    ORA-00266: name of archived log file needed
    
    
    ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
    ORA-01194: file 1 needs more recovery to be consistent
    ORA-01110: data file 1:
    '/oradata/ODSDB/datafile/data_D-ODSDB_I-3532120983_TS-SYSTEM_FNO-1_1bujd4ob.dbf'
    
    
    SQL>  alter database open resetlogs;
     alter database open resetlogs
    *
    ERROR at line 1:
    ORA-00603: ORACLE server session terminated by fatal error
    ORA-00600: internal error code, arguments: [2662], [7], [2516656013], [7],
    [2516660859], [327174016], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [2662], [7], [2516656012], [7],
    [2516660859], [327174016], [], [], [], [], [], []
    ORA-01092: ORACLE instance terminated. Disconnection forced
    ORA-00600: internal error code, arguments: [2662], [7], [2516656008], [7],
    [2516660859], [327174016], [], [], [], [], [], []
    Process ID: 3997864
    Session ID: 580 Serial number: 5
    

    重新启动到mount状态,根据上面报错计算要推进的SCN值:

    SQL> select open_mode from v$database;
    
    OPEN_MODE
    --------------------
    MOUNTED
    SQL> select current_scn from v$database;
    SQL> set num 30
    30217546463
    select 7*power(2,32) + 2516660859 from dual;
    
    select 7*power(2,32) + 2516661000 from dual;
    32581432072
    

    使用oradebug poke进行SCN推进:

    oradebug setmypid
    oradebug dumpvar sga kcsgscn_
    oradebug poke 0x700000000019B70 8 32581432072
    oradebug dumpvar sga kcsgscn_
    

    推进SCN后,尝试直接开库报错ORA-00399、ORA-00353、ORA-00312:

    SQL> oradebug setmypid
    Statement processed.
    SQL> oradebug dumpvar sga kcsgscn_
    kcslf kcsgscn_ [700000000019B70, 700000000019BA0) = 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 07000000 00019850
    SQL> oradebug poke 0x700000000019B70 8 32581432072
    BEFORE: [700000000019B70, 700000000019B78) = 00000000 00000000
    AFTER:  [700000000019B70, 700000000019B78) = 00000007 96013308
    SQL> oradebug dumpvar sga kcsgscn_
    kcslf kcsgscn_ [700000000019B70, 700000000019BA0) = 00000007 96013308 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 07000000 00019850
    SQL> alter database open;
    alter database open
    *
    ERROR at line 1:
    ORA-00399: corrupt change description in redo log
    ORA-00353: log corruption near block 3 change 32581427074 time 08/25/2020
    11:39:04
    ORA-00312: online log 5 thread 1: '/oradata/ODSDB/datafile/redo05.log'
    

    看起来有些麻烦,根据报错推断之前的resetlogs其实都没有完全成功。
    此时查询各个checkpoint_change#都是一致的,正常应该可以open成功:

    select distinct checkpoint_change# from v$database;
    select distinct checkpoint_change# from v$datafile;
    select distinct checkpoint_change# from v$datafile_header;
    

    可事实上,即使尝试read only开库都会报错ORA-16433:

    SQL> alter database open read only;
    alter database open read only
    *
    ERROR at line 1:
    ORA-16433: The database must be opened in read/write mode.
    

    查了下这个错误,看起来控制文件有些问题,于是想重建下控制文件,结果发现命令同样报错ORA-16433:

    --recreate controlfile
    alter database backup controlfile to trace;
    SQL> alter database backup controlfile to trace;
    alter database backup controlfile to trace
    *
    ERROR at line 1:
    ORA-16433: The database must be opened in read/write mode.
    

    而正常mount下,是可以执行此命令的。
    没办法,只能根据控制文件的语法,查询必要信息,然后手工构建控制文件的创建脚本:

    --dbname
    ODSDB
    --select userenv('language') from dual;
    AMERICAN_AMERICA.AL32UTF8
    --nomount
    STARTUP NOMOUNT
    --create_controlfile.sql
    CREATE CONTROLFILE REUSE DATABASE "ODSDB" NORESETLOGS  NOARCHIVELOG
        MAXLOGFILES 16
        MAXLOGMEMBERS 3
        MAXDATAFILES 1024
        MAXINSTANCES 8
        MAXLOGHISTORY 292
    LOGFILE
      GROUP 1 '/oradata/ODSDB/datafile/redo01.log' size 1024M blocksize 512,
      GROUP 2 '/oradata/ODSDB/datafile/redo02.log' size 1024M blocksize 512,
      GROUP 3 '/oradata/ODSDB/datafile/redo03.log' size 1024M blocksize 512,
      GROUP 4 '/oradata/ODSDB/datafile/redo04.log' size 1024M blocksize 512,
      GROUP 5 '/oradata/ODSDB/datafile/redo05.log' size 1024M blocksize 512,
      GROUP 6 '/oradata/ODSDB/datafile/redo06.log' size 1024M blocksize 512
    -- STANDBY LOGFILE
    DATAFILE
      '/oradata/ODSDB/datafile/data_D-ODSDB_I-3532120983_TS-SYSTEM_FNO-1_1bujd4ob.dbf',
      '/oradata/ODSDB/datafile/data_D-ODSDB_I-3532120983_TS-SYSAUX_FNO-2_1aujd4oa.dbf',
      '/oradata/ODSDB/datafile/data_D-ODSDB_I-3532120983_TS-ODS_FDM_P11_FNO-3_0fujd0al.dbf',
      '/oradata/ODSDB/datafile/data_D-ODSDB_I-3532120983_TS-USERS_FNO-4_30ujd66a.dbf',
    ..此处省略大量数据文件输出..
      '/oradata/ODSDB/datafile/data_D-ODSDB_I-3532120983_TS-ODS_FDM_FNO-89_08ujcris1.dbf',
      '/oradata/ODSDB/datafile/data_D-ODSDB_I-3532120983_TS-ODS_FDM_FNO-00_08ujcg7n.dbf'
    CHARACTER SET AL32UTF8
    ;
    

    重建控制文件后,再次推SCN,最终恢复成功:

    select 7*power(2,32)+2616661492 from dual;
    32681432564
    
    --recover again..
    conn /as sysdba
    startup nomount;
    @create_controlfile.sql
    oradebug setmypid
    oradebug dumpvar sga kcsgscn_
    oradebug poke 0x700000000019B70 8 32681432564
    oradebug dumpvar sga kcsgscn_
    
    recover database using backup controlfile until cancel;
    auto
    alter database open resetlogs;
    

    最后将隐藏参数"_allow_resetlogs_corruption"设置为false,正常重启数据库正常:

    alter system set "_allow_resetlogs_corruption" = false scope=spfile;
    shut immediate
    startup
    

    总结:主要还是推进SCN,只是这次遇到报错ORA-16433需要重建控制文件。

  • 相关阅读:
    Django 两张表的正向查找和反向查找
    表结构基类写法
    vue绑定用户页面--新浪
    新浪微博绑定用户接口
    vue新浪微博回调空页面
    新浪微博回调接口
    vue获取授权url
    Android AsyncTask完全解析,带你从源码的角度彻底理解
    Android 3D滑动菜单完全解析,实现推拉门式的立体特效
    Android中轴旋转特效实现,制作别样的图片浏览器
  • 原文地址:https://www.cnblogs.com/jyzhao/p/13632846.html
Copyright © 2020-2023  润新知