• troubleshooting ORA-600[6302] & ORA-600 [6200] corrupted index block


    今天同事的一套数据库遇到了alert 日志 一堆的ora-600,这是一套10.2.0.5 db 2nodes RAC on AIX , 找我帮着看看, 最终确认为一个索引出现了block corrupted 勘误块导致, 重新创建的了索引后解决, 下面简单记录一下我的诊断过程 。

    # db alert log file #

    Sun Feb 01 07:14:46 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
    ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
    Sun Feb 01 07:15:07 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
    Sun Feb 01 07:15:07 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
    ORA-00600: internal error code, arguments: [ksu_register_tac-1], [], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
    Sun Feb 01 07:15:09 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
    ORA-00600: internal error code, arguments: [ksu_register_tac-1], [], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
    Sun Feb 01 07:15:38 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
    Sun Feb 01 07:15:46 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
    ORA-00600: internal error code, arguments: [17182], [0x1105B14C8], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
    Sun Feb 01 07:16:09 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [17182], [0x1105B14C8], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
    Sun Feb 01 07:16:40 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [17182], [0x1105B14C8], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
    Sun Feb 01 07:16:46 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_16449844.trc:
    ORA-07445: exception encountered: core dump [kghrst+020c] [SIGSEGV] [Address not mapped to object] [0xF08C704140910] [] []
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [17182], [0x1105B14C8], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kghrcdepth:ds], [0x1105B9D20], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []

    ===============
    Sun Feb 01 12:26:54 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_56230004.trc:
    ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x110583508], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x110583508], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x110583508], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x110583508], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x110583508], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x110583508], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x110583508], [], [], [], [], [], []
    ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
    Sun Feb 01 12:27:53 BEIST 2015
    SMON: Parallel transaction recovery tried
    Sun Feb 01 12:28:04 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_53215708.trc:
    ORA-00600: internal error code, arguments: [6200], [18446744073709551615], [], [], [], [], [], []
    Sun Feb 01 12:28:04 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_53215708.trc:
    ORA-07445: exception encountered: core dump [leftmost+000c] [SIGSEGV] [Address not mapped to object] [0x105C4172A2B1034] [] []
    ORA-00600: internal error code, arguments: [6200], [18446744073709551615], [], [], [], [], [], []
    Sun Feb 01 12:28:04 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_53215708.trc:
    ORA-07445: exception encountered: core dump [leftmost+000c] [SIGSEGV] [Address not mapped to object] [0x105C4172A2B1034] [] []
    ORA-07445: exception encountered: core dump [leftmost+000c] [SIGSEGV] [Address not mapped to object] [0x105C4172A2B1034] [] []
    ORA-00600: internal error code, arguments: [6200], [18446744073709551615], [], [], [], [], [], []
    Sun Feb 01 12:28:04 BEIST 2015

    ==============
    Sun Feb 01 14:28:37 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_51249546.trc:
    ORA-00600: internal error code, arguments: [6300], [], [], [], [], [], [], []
    Sun Feb 01 14:28:37 BEIST 2015
    Errors in file /oracle/app/oracle/admin/acctc/udump/acctc2_ora_51249546.trc:
    ORA-07445: exception encountered: core dump [leftmost+000c] [SIGSEGV] [Address not mapped to object] [0xD2007308C704241D] [] []
    ORA-00600: internal error code, arguments: [6300], [], [], [], [], [], [], []
    TIP:
    错误从早上7点开始,一直到下午都在不断有ORA-600 ORA-7445 的错误信息写入alert,错误信息眼花缭乱,当多个错误出现是通常是从最早产生的错误查起,也许后面是附加的错误, 我们先从第一个ora-600 开始,ora-600 6302 是验证index key 相关的错误。

    # ora-600 6302 trace file #

    key1 (15104):
    ab 08 c7 04 24 15 01 42 59 46 05 c4 11 04 5f 34 00 00 18 4d f0 ef 00 0f 08
    c7 04 24 15 01 42 59 46 05 c4 11 3d 45 2e 00 00 5b 47 7d 8b 00 0e 08 c7 04
    24 15 01 42 59 46 05 c4 12 32 33 0a 00 00 5c 4b 0b 1c 00 82 08 c7 04 24 15
    01 42 59 46 05 c4 12 3f 37 13 00 00 5e 4d 44 e3 00 e4 08 c7 04 24 15 01 42
    59 46 05 c4 12 41 4c 2d 00 00 5d 46 cb 5f 00 59 08 c7 04 24 15 01 42 59 46
    ...
    04 24 15 01 42 58 61 05 c4 12 45 25 3c 00 00 19 0d 8b 5c 00 4b 08 c7 04 24
    15 01 42 58
    key2 (4): 24 15 01 42
    *** 2015-02-01 07:14:46.313
    ksedmp: internal or fatal error
    ORA-00600: internal error code, arguments: [6302], [1], [], [], [], [], [], []
    Current SQL statement for this session:
    MERGE INTO TEST t1
    USING (SELECT :mincycle as MIN_CYCLE_ID,
    :maxcycle as MAX_CYCLE_ID,
    :freevol as FREE_VOLUME,
    :usagevol as USAGE_VOLUME,
    :overvol as OVER_VOLUME,
    :tariffplan as TARIFF_PLAN_ID,
    to_date(:endtime , 'YYYYMMDDhh24miss') as CDR_END_TIME,
    :key_id as KEY_ID,
    :counterid as COUNTER_ID,
    :attrid as ATTR_ID,
    :subsid as SUBS_ID
    from dual) t2
    ON (t1.KEY_ID = t2.KEY_ID and t1.COUNTER_ID = t2.COUNTER_ID and t1.ATTR_ID = t2.ATTR_ID and t1.SUBS_ID = t2.SUBS_ID)
    WHEN MATCHED THEN
    update
    set t1.MIN_CYCLE_ID = t2.MIN_CYCLE_ID,
    t1.MAX_CYCLE_ID = t2.MAX_CYCLE_ID,
    t1.FREE_VOLUME = t2.FREE_VOLUME,
    t1.USAGE_VOLUME = t2.USAGE_VOLUME,
    t1.OVER_VOLUME = t2.OVER_VOLUME,
    t1.TARIFF_PLAN_ID = t2.TARIFF_PLAN_ID,
    t1.CDR_END_TIME = t2.CDR_END_TIME
    WHEN NOT MATCHED THEN
    insert
    (t1.MIN_CYCLE_ID,
    t1.MAX_CYCLE_ID,
    t1.FREE_VOLUME,
    t1.USAGE_VOLUME,
    t1.OVER_VOLUME,
    t1.TARIFF_PLAN_ID,
    t1.CDR_END_TIME,
    t1.KEY_ID,
    t1.COUNTER_ID,
    t1.ATTR_ID,
    t1.SUBS_ID)
    values
    (t2.MIN_CYCLE_ID,
    t2.MAX_CYCLE_ID,
    t2.FREE_VOLUME,
    t2.USAGE_VOLUME,
    t2.OVER_VOLUME,
    t2.TARIFF_PLAN_ID,
    t2.CDR_END_TIME,
    t2.KEY_ID,
    t2.COUNTER_ID,
    t2.ATTR_ID,
    t2.SUBS_ID)
    ----- Call Stack Trace -----
    kgeriv kgeasi kdksep kdxtms kdisle kdiins0 kdiinsp kauxsin insidx insflush qerusfetch updaul upsexe

    ...
    Object id on Block? Y
    seg/obj: 0x1adb65 csc: 0xce8.dbee5287 itc: 3 flg: E typ: 1 - DATA
    brn: 0 bdba: 0x5c4e4d05 ver: 0x01 opc: 0
    inc: 0 exflg: 0

    Itl Xid Uba Flag Lck Scn/Fsc
    0x01 0x0f47.03a.000b6058 0x5e027858.67b7.06 --U- 3 fsc 0x0000.dbee5ae5
    0x02 0x1074.040.000858a2 0x010b3d2f.4bab.06 ---- 5 fsc 0x0000.00000000
    0x03 0x0ff1.000.000cda32 0x5e04bd1d.6e0b.8d C--- 0 scn 0x0ce8.dbe90591
    ...
    Tip:
    可以看到是有一条merge insert 引起,在MOS 中search一圈没有找到相关信息,我们知道是与index相关的,那么trace 中的key 是不是就是index key呢?先查一下trace中的错误对象

    SQL> select to_number('1adb65','xxxxxxxxxxx') from dual;

    TO_NUMBER('1ADB65','XXXXXXXXXXX')
    ---------------------------------
    1760101

    select owner,object_name,object_type from dba_objects where object_id=1760101;
    OWNER OBJECT_NAME OBJECT_TYPE
    -------------------- ------------------------------ -------------------------
    USERINFO TEST TABLE PARTITION
    TIP:
    确定是我们SQL语句中的表,那下一步查询表上的所有索引,重点是两列的组合索引

    SQL> select
    2 c.table_owner ind_table_owner1,
    3 c.table_name ind_table_name1,
    4 c.index_name ind_index_name1,
    5 c.column_position ind_column_position1,
    6 c.column_name ind_column_name1,
    7 decode(c.descend,'DESC','DESC',null) ind_desc1,i.index_type,t.data_type,i.status
    8 from
    9 dba_ind_columns c
    10 join dba_indexes i on c.index_name=i.index_name
    11 join dba_tab_columns t on t.owner=i.table_owner and t.table_name=i.table_name and t.column_name=c.column_name
    12 where
    13 i.table_owner='USERINFO' and i.table_name='TEST' order by 2,3;

    IND_TABLE_OWNER IND_TABLE_NAME1 IND_INDEX_NAME1 IND_COLUMN_POSITION1 IND_COLUMN_NAME IND_ INDEX_TY DATA_TYPE STATUS
    --------------- --------------- -------------------- -------------------- --------------- ---- -------- ---------- ----------
    USERINFO TEST IDX1_TEST 1 MIN_CYCLE_ID NORMAL NUMBER N/A
    USERINFO TEST IDX1_TEST 2 MAX_CYCLE_ID NORMAL NUMBER N/A
    USERINFO TEST INX_TEST 1 KEY_ID NORMAL NUMBER N/A
    USERINFO TEST INX_TEST 2 COUNTER_ID NORMAL NUMBER N/A
    Tip:
    表上有两个两列的组合索引,不过列上都是number 类型, 因为第二个key 值比较短,那我们比较容易转换成真实的十进制数值,然后确认是哪个索引,转换方法参考我以前的笔记还原data block dumps实际值

    key2 (4): 24 15 01 42

    SQL> select to_number('24','xxxxxxxxxxxxx') from dual;
    TO_NUMBER('24','XXXXXXXXXXXXX')
    -------------------------------
    36
    SQL> c/24/15
    1* select to_number('15','xxxxxxxxxxxxx') from dual
    SQL> /

    TO_NUMBER('15','XXXXXXXXXXXXX')
    -------------------------------
    21
    SQL> c/15/01
    1* select to_number('01','xxxxxxxxxxxxx') from dual
    SQL> /
    TO_NUMBER('01','XXXXXXXXXXXXX')
    -------------------------------
    1
    SQL> c/01/42
    1* select to_number('42','xxxxxxxxxxxxx') from dual
    SQL> /
    TO_NUMBER('42','XXXXXXXXXXXXX')
    -------------------------------
    66
    TIP:
    那拼进来就是3621166, 然后这也就是索引第二列的值,因为目前我们还没确认是哪个索引,所以尝试把两个索引的第二列都查一下,看哪个值存在?

    select min_cycle_id from userinfo.TEST where max_cycle_id=3621166 ;

    SQL> select key_id from userinfo.TEST where counter_id=3621166;

    KEY_ID
    ----------
    3.1508E+12
    3.1908E+12
    3.3520E+12
    Tip:
    那到这里我们就可以初步判断是INX_TEST这个索引出了问题。我们再找另一个ora-600 确认下,ora-600 6300的trace文件没有可用信息,看ora-600 6200这是一个index block corrupt相关的错误

    # ora-600 6200 trace file #

    Block Checking: DBA = 526172808, Block Type = KTB-managed data block
    **** row 19: key out of order
    **** row 40: key out of order
    **** row 101: key out of order
    **** row 123: row length 65543 past end of block
    **** row 123: row skipped so other stats may be wrong
    **** row 196: row length 65543 past end of block
    **** row 196: row skipped so other stats may be wrong
    **** row 197: key out of order
    **** row 243: row length 65543 past end of block
    **** row 243: row skipped so other stats may be wrong
    **** row 250: row length 65543 past end of block
    **** row 250: row skipped so other stats may be wrong
    **** row 251: row length 65543 past end of block
    **** row 251: row skipped so other stats may be wrong
    **** row 252: row length 65543 past end of block
    **** row 252: row skipped so other stats may be wrong
    **** row 253: row length 65543 past end of block
    **** row 253: row skipped so other stats may be wrong
    **** row 254: row length 65543 past end of block
    **** row 254: row skipped so other stats may be wrong
    **** row 255: row length 65543 past end of block
    **** row 255: row skipped so other stats may be wrong
    **** row 256: row length 65543 past end of block
    **** row 256: row skipped so other stats may be wrong
    **** row 257: key out of order
    **** row 518: row length 65543 past end of block
    **** row 518: row skipped so other stats may be wrong
    **** row 519: row length 65543 past end of block
    **** row 519: row skipped so other stats may be wrong
    **** row 520: row length 65543 past end of block
    **** row 520: row skipped so other stats may be wrong
    **** actual rows marked deleted = 1 != kdxlende = 0
    **** key (begin=0x23fd, len=71) overlaps with another
    begin = 0x2414 len = 23
    ---- end index block validation
    Block header dump: 0x1f5cc288
    Object id on Block? Y
    seg/obj: 0x1b0086 csc: 0xce8.dbee5f7f itc: 3 flg: E typ: 2 - INDEX
    brn: 3 bdba: 0x1d8d9e05 ver: 0x01 opc: 0
    inc: 0 exflg: 0

    Itl Xid Uba Flag Lck Scn/Fsc
    0x01 0x0032.056.01c81b4c 0x01070ee9.78ea.01 CBU- 0 scn 0x0ca6.df9c8ce1
    0x02 0x1090.00a.0004c828 0x5e015267.174a.32 C--- 0 scn 0x0ce8.db5c9770
    0x03 0x0ff4.062.00107720 0x0100e967.7562.4e C--- 0 scn 0x0ce8.d449ede4

    TIP:
    这是一个索引块,下面我们来确认错误对象

    SQL> select to_number('1b0086','xxxxxxxxxx') from dual;
    TO_NUMBER('1B0086','XXXXXXXXXX')
    --------------------------------
    1769606

    SQL> select owner,object_name,object_type from dba_objects where object_id=1769606;
    OWNER OBJECT_NAME OBJECT_TYPE
    ---------- ------------------------------ -------------------
    USERINFO INX_TEST INDEX PARTITION
    Note:
    和上面我们用ora-600 6302 trace 推断的对象是一样的,下面我们根据dba 确认一下数据块的位置

    SQL> select to_char(526172808,'xxxxxxxxxxxx') from dual;
    TO_CHAR(52617
    -------------
    1f5cc288

    SQL> @dba 1f5cc288
    RFILE# BLOCK# BIGFILE_BLOCK# DUMP_CMD
    ---------- ---------- -------------- ---------------
    125 1884808 526172808 -- alter system dump datafile 125 block 1884808

    SQL> select name from v$datafile where RFILE#=125;
    NAME
    --------------------------------------------
    /dev/rzwc_lv30_088

    > dbv file=/dev/rzwc_lv30_088 blocksize=16384

    DBVERIFY: Release 10.2.0.5.0 - Production on Sun Feb 1 15:58:23 2015

    Copyright (c) 1982, 2007, Oracle. All rights reserved.

    DBVERIFY - Verification starting : FILE = /dev/rzwc_lv30_088
    Block Checking: DBA = 526172808, Block Type = KTB-managed data block
    **** row 19: key out of order
    **** row 40: key out of order
    **** row 101: key out of order
    **** row 123: row length 65543 past end of block
    **** row 123: row skipped so other stats may be wrong
    **** row 196: row length 65543 past end of block
    **** row 196: row skipped so other stats may be wrong
    **** row 197: key out of order
    **** row 243: row length 65543 past end of block
    **** row 243: row skipped so other stats may be wrong
    **** row 250: row length 65543 past end of block
    **** row 250: row skipped so other stats may be wrong
    **** row 251: row length 65543 past end of block
    **** row 251: row skipped so other stats may be wrong
    **** row 252: row length 65543 past end of block
    **** row 252: row skipped so other stats may be wrong
    **** row 253: row length 65543 past end of block
    **** row 253: row skipped so other stats may be wrong
    **** row 254: row length 65543 past end of block
    **** row 254: row skipped so other stats may be wrong
    **** row 255: row length 65543 past end of block
    **** row 255: row skipped so other stats may be wrong
    **** row 256: row length 65543 past end of block
    **** row 256: row skipped so other stats may be wrong
    **** row 257: key out of order
    **** row 518: row length 65543 past end of block
    **** row 518: row skipped so other stats may be wrong
    **** row 519: row length 65543 past end of block
    **** row 519: row skipped so other stats may be wrong
    **** row 520: row length 65543 past end of block
    **** row 520: row skipped so other stats may be wrong
    **** actual rows marked deleted = 1 != kdxlende = 0
    **** key (begin=0x23fd, len=71) overlaps with another
    begin = 0x2414 len = 23
    ---- end index block validation
    Page 1884808 failed with check code 6401


    DBVERIFY - Verification complete

    Total Pages Examined : 1966016
    Total Pages Processed (Data) : 0
    Total Pages Failing (Data) : 0
    Total Pages Processed (Index): 1923445
    Total Pages Failing (Index): 1
    Total Pages Processed (Other): 24369
    Total Pages Processed (Seg) : 0
    Total Pages Failing (Seg) : 0
    Total Pages Empty : 18202
    Total Pages Marked Corrupt : 0
    Total Pages Influx : 0
    Highest block SCN : 4043020581 (3304.4043020581)
    kdzwc2_bds:/home/oracle>

    Note:
    最终我们确认了是数据文件/dev/rzwc_lv30_088 block# 1884808上存的index block corrupt导致的这一系列ora-600 ora-7445, 解决方法可以考虑drop index, re-create index。 后来也得到确认索引重建后错误没有再出现。这类corrupte block可能是因为突然断电,内存勘误,写错误,oracle bug等原因,bug问题可以提SR 确认。

  • 相关阅读:
    【尺取法】
    [USACO12MAR]花盆Flowerpot [单调队列]
    数据库笔记
    NYOJ 91 阶乘之和(贪心)
    NYOJ 71 独木舟上的旅行(贪心)
    水池数目(DFS)
    poj 1164城堡问题(DFS)
    NYOJ 12 喷水装置(二)( 贪心)
    NYOJ 6(贪心)
    NYOJ 45( 分治,大数)
  • 原文地址:https://www.cnblogs.com/travel6868/p/5016801.html
Copyright © 2020-2023  润新知