• Oracle ADG备库的2节点查询wrong result问题


    Oracle ADG备库的2节点查询wrong result问题

    应用组的向我反馈了一个很奇葩的问题。

    在备库查询同样的SQL,结果和主库不一样。如下:

    19:56:19 SYS@xxxxxx2(483)> select * from user.tablename t where t.seqse_no in ('22020002401');
    
    no rows selected
    
    Elapsed: 00:00:00.01
    19:57:18 SYS@xxxxxx2(483)> select count(*) from user.tablename t where t.seqse_no='22020002401';
    
      COUNT(*)
    ----------
             8
    
    Elapsed: 00:00:00.01
    19:57:38 SYS@xxxxxx2(483)> select * from user.tablename t where t.seqse_no = '22020002401';
    
    no rows selected
    
    Elapsed: 00:00:00.01
    19:58:43 SYS@xxxxxx2(483)> set pagesize 9999 long 9999 line 500 timing on
    19:58:50 SYS@xxxxxx2(483)> alter session set statistics_level=all;
    
    Session altered.
    
    Elapsed: 00:00:00.00
    19:58:50 SYS@xxxxxx2(483)> select count(*) from user.tablename t where t.seqse_no='22020002401';
    
      COUNT(*)
    ----------
             8
    
    Elapsed: 00:00:00.01
    19:58:52 SYS@xxxxxx2(483)> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
    
    PLAN_TABLE_OUTPUT
    -------------------------------------------------------------------------------------------------------
    SQL_ID  903nj19yx1ku0, child number 1
    -------------------------------------
    select count(*) from user.tablename t where
    t.seqse_no='22020002401'
    
    Plan hash value: 533856606
    
    ------------------------------------------------------------------------------------------------
    | Id  | Operation         | Name             | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
    ------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |                  |      1 |        |      1 |00:00:00.01 |       4 |
    |   1 |  SORT AGGREGATE   |                  |      1 |      1 |      1 |00:00:00.01 |       4 |
    |*  2 |   INDEX RANGE SCAN| IDX_TDH_SEQSE_NO |      1 |      4 |      8 |00:00:00.01 |       4 |
    ------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("T"."SEQSE_NO"='22020002401')
    
    
    20 rows selected.
    
    Elapsed: 00:00:00.08
    19:58:53 SYS@xxxxxx2(483)> select * from user.tablename t where t.seqse_no = '22020002401';
    
    no rows selected
    
    Elapsed: 00:00:00.01
    19:59:01 SYS@xxxxxx2(483)> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
    
    PLAN_TABLE_OUTPUT
    --------------------------------------------------------------------------------------------------------------------
    SQL_ID  dtj0crstr2tgb, child number 1
    -------------------------------------
    select * from user.tablename t where t.seqse_no ='22020002401'
    
    Plan hash value: 176717312
    
    -----------------------------------------------------------------------------------------------------------------
    | Id  | Operation                          | Name             | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
    -----------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                   |                  |      1 |        |      0 |00:00:00.01 |       4 |
    |   1 |  TABLE ACCESS BY GLOBAL INDEX ROWID| TABLENAME        |      1 |      4 |      0 |00:00:00.01 |       4 |
    |*  2 |   INDEX RANGE SCAN                 | IDX_TDH_SEQSE_NO |      1 |      4 |      8 |00:00:00.01 |       4 |
    -----------------------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       2 - access("T"."SEQSE_NO"='22020002401')
    
    
    19 rows selected.
    
    Elapsed: 00:00:00.04

    同时,警告日志可能伴有ORA-600的报错。

    Mon Jan 03 11:43:12 2022
    Errors in file /u01/app/oracle/diag/rdbms/xxxxxxstb/xxxxxx2/trace/xxxxxx2_ora_28759.trc  (incident=25033):
    ORA-00600: 内部错误代码, 参数: [12406], [], [], [], [], [], [], [], [], [], [], []
    Incident details in: /u01/app/oracle/diag/rdbms/xxxxxxstb/xxxxxx2/incident/incdir_25033/xxxxxx2_ora_28759_i25033.trc
    Errors in file /u01/app/oracle/diag/rdbms/xxxxxxstb/xxxxxx2/incident/incdir_25033/xxxxxx2_ora_28759_i25033.trc:

    xxxxxx2_ora_28759.trc部分内容:

    Incident 25033 created, dump file: /u01/app/oracle/diag/rdbms/xxxxxxstb/xxxxxx2/incident/incdir_25033/xxxxxx2_ora_28759_i25033.trc
    ORA-00600: 内部错误代码, 参数: [12406], [], [], [], [], [], [], [], [], [], [], []

    xxxxxx2_ora_28759_i25033.trc部分内容:

    *** 2022-01-03 11:43:12.994
    *** SESSION ID:(861.5305) 2022-01-03 11:43:12.994
    *** CLIENT ID:() 2022-01-03 11:43:12.994
    *** SERVICE NAME:() 2022-01-03 11:43:12.994
    *** MODULE NAME:(PL/SQL Developer) 2022-01-03 11:43:12.994
    *** ACTION NAME:(SQL Window - 常用脚本.sql) 2022-01-03 11:43:12.994
     
    Dump continued from file: /u01/app/oracle/diag/rdbms/xxxxxxstb/xxxxxx2/trace/xxxxxx2_ora_28759.trc
    ORA-00600: 内部错误代码, 参数: [12406], [], [], [], [], [], [], [], [], [], [], []
    
    ========= Dump for incident 25033 (ORA 600 [12406]) ========
    
    *** 2022-01-03 11:43:12.995
    dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
    ----- Current SQL Statement for this session (sql_id=du0p5v7647sbs) -----

    这种一般都是BUG导致的问题,而且这类查询结果有问题的叫“wrong result”,查询MOS,根据Known issues of Wrong Results in Active Data Guard(ADG) (文档 ID 2490469.1)

    unpublished Bug 19614585 - Wrong Results / ORA-600 [kksgaGetNoAlloc_Int0] / ORA-600 [12406] / ORA-7445 / ORA-8103 / ORA-1555 from query on RAC ADG Physical Standby Database (Note:19614585.8)比较符合:

    Affects versions: Versions BELOW 12.2
    Versions confirmed as being affected

    12.1.0.2 (Server Patch Set)
    11.2.0.4

    Symptoms:
    This bug is only relevant when using Real Application Clusters (RAC) and Partitioned Tables

    This problem is caused by the fix of bug 15969429.

    If the fix of bug 15969429 is present (which is the case of 11.2.0.4.0 and 12.1.0.1) and if running a query against
    a PARTITION object on a RAC Active Dataguard (ADG) Physical Standby database causes Wrong Results or errors like:

    ORA-600 [kksgaGetNoAlloc_Int0]
    ORA-600 [qesmaGetIdxSeg2]
    ORA-600 [kdsgrp1]
    ORA-600 [12406]
    ORA-7445 [kqrLocalInvalidateByHash]
    ORA-7445 [kslgetl]
    ORA-8103
    ORA-1555

    and the same query runs without problems on the primary database, then it could be a case of this bug.

    As an example of wrong results, for a RAC ADG database with two instances, the ADG instance 1 may see a different number of rows for the PARTITION table than ADG instance 2.

    fixed:
    12.2.0.1
    12.1.0.2 Bundle Patch 3 for Engineered Systems/DB in-Memory(DBBP)
    11.2.0.4 Bundle Patch 17 for Exadata Database(Jul 2015)
    11.2.0.4.160229 (Feb 2016) Bundle Patch for Windows Platforms

    Solution: Check for availability of One off patch using Patch 19614585.

    ORA-600[12406]以及备库1查询和2查询是不一样的,本次有问题的SQL涉及的表也是一个分区表。

    下载补丁p19614585_11204200114_Linux-x86-64.zip后在测试库上做冲突检查后发现有冲突补丁存在没法打......

    [oracle@dev-testdb 19614585]$ opatch lspatches
    18034737;
    13446078;
    14202396;
    31668908;OJVM PATCH SET UPDATE 11.2.0.4.201020
    31537677;Database Patch Set Update : 11.2.0.4.201020 (31537677)
    29938455;OCW Patch Set Update : 11.2.0.4.191015 (29938455)
    
    OPatch succeeded.
    [oracle@dev-testdb 19614585]$ opatch prereq CheckConflictAgainstOHWithDetail -ph ./
    Oracle Interim Patch Installer version 11.2.0.3.28
    Copyright (c) 2022, Oracle Corporation.  All rights reserved.
    
    PREREQ session
    
    Oracle Home       : /u01/app/oracle/product/11.2.0/db_1
    Central Inventory : /u01/app/oraInventory
       from           : /u01/app/oracle/product/11.2.0/db_1/oraInst.loc
    OPatch version    : 11.2.0.3.28
    OUI version       : 11.2.0.4.0
    Log file location : /u01/app/oracle/product/11.2.0/db_1/cfgtoollogs/opatch/opatch2022-01-11_17-30-44PM_1.log
    
    Invoking prereq "checkconflictagainstohwithdetail"
    
    ZOP-40: The patch(es) has conflicts with other patches installed in the Oracle Home (or) among themselves.
    
    Prereq "checkConflictAgainstOHWithDetail" failed.
    
    Summary of Conflict Analysis:
    
    There are no patches that can be applied now.
    
    Following patches have conflicts. Please contact Oracle Support and get the merged patch of the patches : 
    19614585, 31537677
    
    Whole composite patch Conflicts/Supersets are:
    
    Patch : 19614585
    
            Conflict with Composite Patch 31537677
    
    Detail Conflicts/Supersets for each patch are:
    
    Patch : 19614585
    
            Conflict with Sub-Patch 30670774
            Conflict details:
            /u01/app/oracle/product/11.2.0/db_1/lib/libgeneric11.a:kgl.o
    
    OPatch succeeded.

      

    由于目前备库作用除了容灾外,也是给内部人员做只读操作的分流分压的。

    像是PLSQL连接是长时间保持连着的,全部加起来可以达到100多个,连到备库可以分连接数。

    另外有相当一部分的统计需要全表扫描表,连到备库可以分摊IO和CPU的压力。

    无法应用补丁,最后决定将2节点实例关闭在观察。

  • 相关阅读:
    docker的安装
    Linux的常用命令
    HTTP协议,HTTPS协议,Websocket协议
    常用排序
    go的数组,切片,map
    if-else,switch,for循环
    go的函数,包以及mode的补充
    Android学习笔记——从源码看Handler的处理机制
    ElementUI
    关于IO的理解
  • 原文地址:https://www.cnblogs.com/PiscesCanon/p/15789281.html
Copyright © 2020-2023  润新知