• oracle 优化——索引与组合索引


    1、索引结构。第一张图是索引的官方图解,右侧是存储方式的图解。

    图中很清晰的展示了索引存储的状况。

    在leaf 节点中存储了一列,索引所对应项的 :值,rowId,长度,头信息(控制信息)

    这样我们就能很清楚、如果通过索引查找数据,而只需要这个索引的值的时候,写上列名,就可以不需要回表。

    2、索引在一般的数据量情况下,只有三层。leaf 是目录,branch 是目录的目录。可以做一个测试

     1 drop table t1 purge;
     2 drop table t2 purge;
     3 drop table t3 purge;
     4 drop table t4 purge;
     5 drop table t5 purge;
     6 drop table t6 purge;
     7 drop table t7 purge;
     8 
     9 
    10 create table t1 as select rownum as id ,rownum+1 as id2,rpad('*',1000,'*') as contents from dual connect by level<=1;
    11 create table t2 as select rownum as id ,rownum+1 as id2,rpad('*',1000,'*') as contents from dual connect by level<=10;
    12 create table t3 as select rownum as id ,rownum+1 as id2,rpad('*',1000,'*') as contents from dual connect by level<=100;
    13 create table t4 as select rownum as id ,rownum+1 as id2,rpad('*',1000,'*') as contents from dual connect by level<=1000;
    14 create table t5 as select rownum as id ,rownum+1 as id2,rpad('*',1000,'*') as contents from dual connect by level<=10000;
    15 create table t6 as select rownum as id ,rownum+1 as id2,rpad('*',1000,'*') as contents from dual connect by level<=100000;
    16 create table t7 as select rownum as id ,rownum+1 as id2,rpad('*',1000,'*') as contents from dual connect by level<=1000000;
    17 
    18 
    19 create index idx_id_t1 on t1(id);
    20 create index idx_id_t2 on t2(id);
    21 create index idx_id_t3 on t3(id);
    22 create index idx_id_t4 on t4(id);
    23 create index idx_id_t5 on t5(id);
    24 create index idx_id_t6 on t6(id);
    25 create index idx_id_t7 on t7(id);
    26 
    27 set linesize 1000
    28 set autotrace off
    29 select index_name,
    30           blevel,
    31           leaf_blocks,
    32           num_rows,
    33           distinct_keys,
    34           clustering_factor
    35      from user_ind_statistics
    36     where table_name in( 'T1','T2','T3','T4','T5','T6','T7');
    37 
    38 索引的名字          层级     leaf 块
    39 INDEX_NAME  BLEVEL LEAF_BLOCKS   NUM_ROWS DISTINCT_KEYS CLUSTERING_FACTOR
    40 ------------------ ----------- ---------- ------------- -----------------
    41 IDX_ID_T1        0           1          1             1                 1
    42 IDX_ID_T2        0           1         10            10                 2
    43 IDX_ID_T3        0           1        100           100                15
    44 IDX_ID_T4        1           3       1000          1000               143
    45 IDX_ID_T5        1          21      10000         10000              1429
    46 IDX_ID_T6        1         222     100000        100000             14286
    47 IDX_ID_T7        2        2226    1000000       1000000            142858
    View Code

    数据在一千万条的时候也只有三层(查到数据只需要四个逻辑读),所以即便上亿数据,也无非十个左右的逻辑读,充分了体现了索引的优点。如果不建索引,那么全表扫描是很恐怖的。

    3、索引在函数运算上能起到优化作用,比如sum,avg,count

      在没有主键,只有索引的表中count*的时候,由于索引不记录空值,所以它不会走索引,如果对索引列加上 not null“select count(*) from user where userId not null”,就能走索引、当然如果有主键自然就不必了。

     各种函数,运算列如果是索引列的话效率就会大幅度提高,可以看下面例子

     1 SUM/AVG的优化
     2 drop table t purge;
     3 create table t as select * from dba_objects;
     4 create index idx1_object_id on t(object_id);
     5 set autotrace on
     6 set linesize 1000
     7 set timing on 
     8 
     9 select sum(object_id) from t; 
    10 执行计划
    11 ----------------------------------------------------------------------------------------
    12 | Id  | Operation             | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
    13 ----------------------------------------------------------------------------------------
    14 |   0 | SELECT STATEMENT      |                |     1 |    13 |    49   (0)| 00:00:01 |
    15 |   1 |  SORT AGGREGATE       |                |     1 |    13 |            |          |
    16 |   2 |   INDEX FAST FULL SCAN| IDX1_OBJECT_ID | 92407 |  1173K|    49   (0)| 00:00:01 |
    17 ----------------------------------------------------------------------------------------
    18 统计信息
    19 ----------------------------------------------------------
    20           0  recursive calls
    21           0  db block gets
    22         170  consistent gets
    23           0  physical reads
    24           0  redo size
    25         432  bytes sent via SQL*Net to client
    26         415  bytes received via SQL*Net from client
    27           2  SQL*Net roundtrips to/from client
    28           0  sorts (memory)
    29           0  sorts (disk)
    30           1  rows processed
    31           
    32 --比较一下假如不走索引的代价,体会一下这个索引的重要性
    33 select /*+full(t)*/ sum(object_id) from t;  
    34 SUM(OBJECT_ID)
    35 --------------
    36   2732093100         
    37 执行计划
    38 ---------------------------------------------------------------------------
    39 | Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    40 ---------------------------------------------------------------------------
    41 |   0 | SELECT STATEMENT   |      |     1 |    13 |   292   (1)| 00:00:04 |
    42 |   1 |  SORT AGGREGATE    |      |     1 |    13 |            |          |
    43 |   2 |   TABLE ACCESS FULL| T    | 92407 |  1173K|   292   (1)| 00:00:04 |
    44 ---------------------------------------------------------------------------
    45 统计信息
    46 ----------------------------------------------------------
    47           0  recursive calls
    48           0  db block gets
    49        1047  consistent gets
    50           0  physical reads
    51           0  redo size
    52         432  bytes sent via SQL*Net to client
    53         415  bytes received via SQL*Net from client
    54           2  SQL*Net roundtrips to/from client
    55           0  sorts (memory)
    56           0  sorts (disk)
    57           1  rows processed     
    58           
    59 --起来类似的比如AVG,和SUM是一样的,如下:
    60 select avg(object_id) from t; 
    61 AVG(OBJECT_ID)
    62 --------------
    63   37365.5338
    64 执行计划
    65 ----------------------------------------------------------------------------------------
    66 | Id  | Operation             | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
    67 ----------------------------------------------------------------------------------------
    68 |   0 | SELECT STATEMENT      |                |     1 |    13 |    49   (0)| 00:00:01 |
    69 |   1 |  SORT AGGREGATE       |                |     1 |    13 |            |          |
    70 |   2 |   INDEX FAST FULL SCAN| IDX1_OBJECT_ID | 92407 |  1173K|    49   (0)| 00:00:01 |
    71 ----------------------------------------------------------------------------------------
    72 统计信息
    73 ----------------------------------------------------------
    74           0  recursive calls
    75           0  db block gets
    76         170  consistent gets
    77           0  physical reads
    78           0  redo size
    79         448  bytes sent via SQL*Net to client
    80         415  bytes received via SQL*Net from client
    81           2  SQL*Net roundtrips to/from client
    82           0  sorts (memory)
    83           0  sorts (disk)
    84           1  rows processed
    85 
    86 --不知大家注意到没,这里的试验已经告诉我们了,OBJECT_ID列是否为空,也不影响SUM/AVG等聚合的结果。          
    Oracle 索引列函数运算的高效测试

    可以从执行计划中看到   consistent gets   走索引是 一百多, 而不走索引就上千,效率成倍增加。

    4、索引本身有序

    通过索引列排序,不会产生排序 ,但是io多,代价小。排序代价是很大的。 由于本身有序,获取最大值,最小值,也是高效的。

    二、组合索引。(索引只适合返回少量记录)

      1、适用在单独查询返回记录很多,组合查询后忽然返回记录很少的情况

      比如:1某地区的贷款条数很多,2涉农贷款条数很多,3某地区贷款金额在某区间的贷款数很多

          但是,某地区,涉农贷款,金额某区间的 贷款条数 就很少。

        当涉及某些组合条件检索数据时,就可以采用组合索引的形式,

        create index area_loanType_  on t(area_id,loan_type);

         而时间是做了分区表的,所以在这种情况下、检索速度会大大提高。

        2.组合索引第一个索引的重要性。

      由于组合索引是仅仅相对于前一个索引有意义,所以,当查询时,只通过第二个索引列查询,是不会走索引的,它的存在的意义只能是在使用第一个索引的情况下。

    但是,只通过第一个索引列 查询、是会走索引的。所以第一个索引也是常用,并且有单独查寻需要的列。

    当然如果前一个索引的值很少,比如loanType(涉农,个体,工业,慈善..) 共六七个而已,那么Oracle 收集了统计信息后,可以索引跳跃扫描,类似与  SELECT * from loans where areaId="" and loanType in (慈善,个体,工业...)and(第三个组合索引 的条件)

    3.仅等值无范围查询时,组合索引顺序不影响性能

    比如:

     1 --3.仅等值无范围查询时,组合索引顺序不影响性能(比如where col1=xxx and col2=xxx,无论COL1+COL2组合还是COL2+COL1组合)
     2 
     3 drop table t purge;
     4 create table t as select * from dba_objects;
     5 insert into t select * from t;
     6 insert into t select * from t;
     7 insert into t select * from t;
     8 update t set object_id=rownum ;
     9 commit;
    10 create index idx_id_type on t(object_id,object_type);
    11 create index idx_type_id on t(object_type,object_id);
    12 set autotrace off
    13 alter session set statistics_level=all ;
    14 set linesize 366
    15 
    16 select /*+index(t,idx_id_type)*/ * from  t  where object_id=20  and object_type='TABLE';
    17 select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
    18 -----------------------------------------------------------------------------------------------------
    19 | Id  | Operation                   | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
    20 -----------------------------------------------------------------------------------------------------
    21 |   0 | SELECT STATEMENT            |             |      1 |        |      1 |00:00:00.01 |       5 |
    22 |   1 |  TABLE ACCESS BY INDEX ROWID| T           |      1 |     57 |      1 |00:00:00.01 |       5 |
    23 |*  2 |   INDEX RANGE SCAN          | IDX_ID_TYPE |      1 |      9 |      1 |00:00:00.01 |       4 |
    24 -----------------------------------------------------------------------------------------------------
    25 
    26 select /*+index(t,idx_type_id)*/ * from  t  where object_id=20  and object_type='TABLE';
    27 select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
    28 Plan hash value: 3420768628
    29 
    30 -----------------------------------------------------------------------------------------------------
    31 | Id  | Operation                   | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
    32 -----------------------------------------------------------------------------------------------------
    33 |   0 | SELECT STATEMENT            |             |      1 |        |      1 |00:00:00.01 |       5 |
    34 |   1 |  TABLE ACCESS BY INDEX ROWID| T           |      1 |     57 |      1 |00:00:00.01 |       5 |
    35 |*  2 |   INDEX RANGE SCAN          | IDX_TYPE_ID |      1 |      9 |      1 |00:00:00.01 |       4 |
    36 -----------------------------------------------------------------------------------------------------
    37 
    38 
    39 --4.组合索引最佳顺序一般是将列等值查询的列置前。(测试组合索引在条件是不等的情况下的情况,条件经常是不等的,要放在后面,让等值的在前面)
    40 
    41 select /*+index(t,idx_id_type)*/ *  from   t where object_id>=20 and object_id<2000 and object_type='TABLE';
    42 select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
    43 -----------------------------------------------------------------------------------------------------
    44 | Id  | Operation                   | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
    45 -----------------------------------------------------------------------------------------------------
    46 |   0 | SELECT STATEMENT            |             |      1 |        |    469 |00:00:00.01 |      86 |
    47 |   1 |  TABLE ACCESS BY INDEX ROWID| T           |      1 |     14 |    469 |00:00:00.01 |      86 |
    48 |*  2 |   INDEX RANGE SCAN          | IDX_ID_TYPE |      1 |      1 |    469 |00:00:00.01 |      40 |
    49 -----------------------------------------------------------------------------------------------------
    50 
    51 
    52 
    53 
    54 select /*+index(t,idx_type_id)*/ *  from  t  where object_id>=20 and object_id<2000   and object_type='TABLE';
    55 -----------------------------------------------------------------------------------------------------
    56 | Id  | Operation                   | Name        | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
    57 -----------------------------------------------------------------------------------------------------
    58 |   0 | SELECT STATEMENT            |             |      1 |        |    469 |00:00:00.01 |      81 |
    59 |   1 |  TABLE ACCESS BY INDEX ROWID| T           |      1 |    469 |    469 |00:00:00.01 |      81 |
    60 |*  2 |   INDEX RANGE SCAN          | IDX_TYPE_ID |      1 |    469 |    469 |00:00:00.01 |      35 |
    61 -----------------------------------------------------------------------------------------------------
    自己感悟吧

    4、组合索引排序 如果对组合索引中列进行排序,可以走索引。

    三、创建索引的一些规则

    1、权衡索引个数与DML之间关系,DML也就是插入、删除数据操作。

    这里需要权衡一个问题,建立索引的目的是为了提高查询效率的,但建立的索引过多,会影响插入、删除数据的速度,因为我们修改的表数据,索引也要跟着修改。这里需要权衡我们的操作是查询多还是修改多。

    2、把索引与对应的表放在不同的表空间。

         当读取一个表时表与索引是同时进行的。如果表与索引和在一个表空间里就会产生资源竞争,放在两个表这空就可并行执行。

    3、最好使用一样大小是块。

         Oracle默认五块,读一次I/O,如果你定义6个块或10个块都需要读取两次I/O。最好是5的整数倍更能提高效率。

    4、如果一个表很大,建立索引的时间很长,因为建立索引也会产生大量的redo信息,所以在创建索引时可以设置不产生或少产生redo信息。只要表数据存在,索引失败了大不了再建,所以可以不需要产生redo信息。

  • 相关阅读:
    clearfix
    css浮动
    css常识
    给数组排序方法2
    定时器
    数组
    redhat 7.6 iptables 配置
    redhat 7.6 流量监控命令、软件(3)nethogs 监控进程实时流量
    redhat 7.6 流量监控命令、软件(2) iftop 监控网络IP实时流量
    redhat 7.6 流量监控命令、软件(1) ethstatus
  • 原文地址:https://www.cnblogs.com/javaMan/p/3991889.html
Copyright © 2020-2023  润新知