• oracle隐式转换问题


    之前碰到一个隐式转换的问题,这里回顾总结下 

    业务需求是每天晚上要从生产环境同步数据到分析数据库,两个库的表按照系统设计之处表结构是要一样的,现在只模拟两个字段。

    创建测试表

    1 create table source_tab as select  a.col1 pc_id,a.col2 pc_name from sreal_tab a;--模拟生产环境表
    2 create table target_tab as select  col1 tpc_id,col2 tpc_name from treal_tab where 1=2;--模拟分析数据库表 
    3 create index idx_tpc_id  on target_tab (tpc_id);

    如果记录已经在分析环境中存在的情况下,则更新tpc_name(名称字段);
    以下是存储过程实现代码

     1 create procedure proc_data_sync is  
     2 n_num number :=0;
     3 n_cnt number;
     4 begin
     5 for s in (select pc_id,pc_name from source_tab)
     6     loop
     7       n_num := n_num + 1;
     8       begin
     9         n_cnt := 0;
    10         select count(*)
    11           into n_cnt
    12           from target_tab t
    13          where s.pc_id = tpc_id;
    14 
    15         if n_cnt = 0 then
    16           insert into target_tab
    17             (tpc_id,
    18              tpc_name
    19              )
    20           values
    21             (s.pc_id,
    22              s.pc_name
    23              );
    24         else
    25           update target_tab
    26              set tpc_name = s.pc_name
    27            where tpc_id = s.pc_id;
    28         end if;
    29       exception
    30         when others then 
    31         dbms_output.put_line('执行到pc_id='||s.pc_id||'时出错'||sqlerrm);
    32       end;
    33       if mod(n_num, 5000) = 0 then
    34         --每5000条提交一次
    35         commit;
    36       end if;
    37     end loop;
    38     commit;
    39 end proc_data_sync;

    在存储过程中为了让
       select count(*)
              into n_cnt
              from target_tab t
             where s.pc_id = tpc_id;
    以及update
              update target_tab
                 set tpc_name = s.pc_name
               where tpc_id = s.pc_id;

    的查询速度创建了下列索引

    create index idx_tpc_id  on target_tab (tpc_id);

    等待测试很长时间没有结果
    看看会话在做什么

    1 select sid from v$session where machine='pc-wxc' and status='ACTIVE'  and module='PL/SQL Developer';
    2 select c.SAMPLE_TIME,c.SQL_EXEC_ID,c.SQL_OPNAME,c.SESSION_ID,c.SQL_ID,c.SQL_PLAN_OPERATION,c.SQL_PLAN_OPTIONS from v$active_session_history c  where session_id='396' ;

    看到几乎都是全表扫描

    10    05-2月 -16 11.24.23.784 上午    20791275    SELECT    396    1pt181n76fvzc    TABLE ACCESS    FULL
    11    05-2月 -16 11.24.22.774 上午    20791180    SELECT    396    1pt181n76fvzc    TABLE ACCESS    FULL
    12    05-2月 -16 11.24.21.774 上午    20791074    SELECT    396    1pt181n76fvzc    TABLE ACCESS    FULL
    13    05-2月 -16 11.24.20.774 上午    20790967    SELECT    396    1pt181n76fvzc    TABLE ACCESS    FULL
    14    05-2月 -16 11.24.19.774 上午    20790870    SELECT    396    1pt181n76fvzc    TABLE ACCESS    FULL
    15    05-2月 -16 11.24.18.761 上午    20790766    SELECT    396    1pt181n76fvzc    TABLE ACCESS    FULL
    16    05-2月 -16 11.24.17.761 上午    20790663    SELECT    396    1pt181n76fvzc    TABLE ACCESS    FULL
    17    05-2月 -16 11.24.16.761 上午    20790558    SELECT    396    1pt181n76fvzc    TABLE ACCESS    FULL
    18    05-2月 -16 11.24.15.751 上午    20790452    SELECT    396    1pt181n76fvzc    TABLE ACCESS    FULL

    检查sql 执行计划

    select * from TABLE(dbms_xplan.display_cursor('1pt181n76fvzc',1,'ADVANCED +PEEKED_BINDS'));

    SQL_ID  1pt181n76fvzc, child number 0
    -------------------------------------
    SELECT COUNT(*) FROM TARGET_TAB T WHERE :B1 = TPC_ID
     
    Plan hash value: 3416452088
     
    ---------------------------------------------------------------------------------
    | Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |            |       |       |     2 (100)|          |
    |   1 |  SORT AGGREGATE    |            |     1 |    34 |            |          |
    |*  2 |   TABLE ACCESS FULL| TARGET_TAB |     1 |    34 |     2   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------
     
    Query Block Name / Object Alias (identified by operation id):
    -------------------------------------------------------------
     
       1 - SEL$1
       2 - SEL$1 / T@SEL$1
     
    Outline Data
    -------------
     
      /*+
          BEGIN_OUTLINE_DATA
          IGNORE_OPTIM_EMBEDDED_HINTS
          OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
          DB_VERSION('11.2.0.3')
          ALL_ROWS
          OUTLINE_LEAF(@"SEL$1")
          FULL(@"SEL$1" "T"@"SEL$1")
          END_OUTLINE_DATA
      */
     
    Peeked Binds (identified by position):
    --------------------------------------
     
       1 - :B1 (NUMBER): 27629632
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
       2 - filter(TO_NUMBER("TPC_ID")=:B1)
     
    Column Projection Information (identified by operation id):
    -----------------------------------------------------------
     
       1 - (#keys=0) COUNT(*)[22]
     
    Note
    -----
       - dynamic sampling used for this statement (level=2)
     

    开始以为优化器选择执行计划应该是有它的道理吧,那评估下全部执行总时间
    加入dbms_utility.get_time来测量时间

     1 create or replace procedure proc_data_sync is
     2 n_num number :=0;
     3 n_cnt number;
     4 start_time number;
     5 end_time number;
     6 begin
     7 for s in (select pc_id,pc_name from source_tab where rownum <=10)
     8     loop
     9       n_num := n_num + 1;
    10       start_time := dbms_utility.get_time;
    11       begin
    12         n_cnt := 0;
    13         select count(*)
    14           into n_cnt
    15           from target_tab t
    16          where s.pc_id = tpc_id;
    17 
    18         if n_cnt = 0 then
    19           insert into target_tab
    20             (tpc_id,
    21              tpc_name
    22              )
    23           values
    24             (s.pc_id,
    25              s.pc_name
    26              );
    27         else
    28           update target_tab
    29              set tpc_name = s.pc_name
    30            where tpc_id = s.pc_id;
    31         end if;
    32       exception
    33         when others then
    34         dbms_output.put_line('执行到pc_id='||s.pc_id||'时出错'||sqlerrm);
    35       end;
    36       end_time := dbms_utility.get_time;
    37       dbms_output.put_line(' 单次执行时间为' ||to_char((end_time - start_time) / 100,
    38                                      'fm999990.999') || 's.');
    39       if mod(n_num, 5000) = 0 then
    40         --每5000条提交一次
    41         commit;
    42       end if;
    43     end loop;
    44     commit;
    45 end proc_data_sync;

    end proc_data_sync;
    结果是
     单次执行时间为0.06s.
     单次执行时间为0.02s.
     单次执行时间为0.02s.
     单次执行时间为0.02s.
     单次执行时间为0.02s.
     单次执行时间为0.02s.
     单次执行时间为0.02s.
     单次执行时间为0.02s.
     单次执行时间为0.02s.
     单次执行时间为0.02s.

    单次执行时间为0.02s,
    select count(*)*0.02/60/60 from source_tab
    结果是10个小时可以同步完,这个结果显然不太能接受
    是不是因为没有收集优化器统计信息?
    那就收一下

    1 begin
    2  dbms_stats.delete_table_stats(ownname=>'user_name',tabname=>'target_tab',cascade_indexes=>true);
    3 end;


    收完后重复上述 过程问题依旧,什么情况?看执行计划,没什么发现,后注意到有这样的一行
       2 - filter(TO_NUMBER("TPC_ID")=:B1)
    这里做了函数的隐式类型转换,原来是这里的问题。
    再去看看表结构 souce_tab的PC_ID列是number 类型 ,而target_tab的 TPC_ID是varrchar2的
    至此问题就清晰了,在游标中做转换

    for s in (select to_char(pc_id) pc_id,pc_name from source_tab)
    再测试

     单次执行时间为0.s.
     单次执行时间为0.s.
     单次执行时间为0.s.
     单次执行时间为0.s.
     单次执行时间为0.s.
     单次执行时间为0.s.
     单次执行时间为0.s.
     单次执行时间为0.s.
     单次执行时间为0.s.

    再看执行计划

     1 SQL_ID  1pt181n76fvzc, child number 2
     2 -------------------------------------
     3 SELECT COUNT(*) FROM TARGET_TAB T WHERE :B1 = TPC_ID
     4  
     5 Plan hash value: 4054714533
     6  
     7 --------------------------------------------------------------------------------
     8 | Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
     9 --------------------------------------------------------------------------------
    10 |   0 | SELECT STATEMENT  |            |       |       |     1 (100)|          |
    11 |   1 |  SORT AGGREGATE   |            |     1 |    34 |            |          |
    12 |*  2 |   INDEX RANGE SCAN| IDX_TPC_ID |     1 |    34 |     1   (0)| 00:00:01 |
    13 --------------------------------------------------------------------------------
    14  
    15 Query Block Name / Object Alias (identified by operation id):
    16 -------------------------------------------------------------
    17  
    18    1 - SEL$1
    19    2 - SEL$1 / T@SEL$1
    20  
    21 Outline Data
    22 -------------
    23  
    24   /*+
    25       BEGIN_OUTLINE_DATA
    26       IGNORE_OPTIM_EMBEDDED_HINTS
    27       OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
    28       DB_VERSION('11.2.0.3')
    29       ALL_ROWS
    30       OUTLINE_LEAF(@"SEL$1")
    31       INDEX(@"SEL$1" "T"@"SEL$1" ("TARGET_TAB"."TPC_ID"))
    32       END_OUTLINE_DATA
    33   */
    34  
    35 Peeked Binds (identified by position):
    36 --------------------------------------
    37  
    38    1 - :B1 (VARCHAR2(30), CSID=852): '27629632'
    39  
    40 Predicate Information (identified by operation id):
    41 ---------------------------------------------------
    42  
    43    2 - access("TPC_ID"=:B1)
    44  
    45 Column Projection Information (identified by operation id):
    46 -----------------------------------------------------------
    47  
    48    1 - (#keys=0) COUNT(*)[22]
    49  
    50 Note
    51 -----
    52    - dynamic sampling used for this statement (level=2)
    53  

    "TPC_ID"=:B1 这个已经从过滤条件变成了访问条件,也就是用上tpc_id的列的索引了,隐式类型转换虽然让编写sql变得简单了,但是很多人不会注意到,这成了潜在的性能问题.

    再跑测试出来后3分钟就搞定了,这个速度相差太远了吧.

  • 相关阅读:
    HSF原理
    Spring IOC 容器源码分析
    Spring Bean注册和加载
    CAP和BASE理论
    Java内存模型
    Java线程模型
    IO复用、多进程和多线程三种并发编程模型
    无锁编程本质论
    An Introduction to Lock-Free Programming
    安装与配置ironic
  • 原文地址:https://www.cnblogs.com/wangxingc/p/5183079.html
Copyright © 2020-2023  润新知