• Oracle 优化——位图、函数等索引介绍


    一、位图索引

    我将使用一个例子,来描述位图索引的存储,并分析它的优点。

    Table :Loans 放贷信息

    ID userId 行业投向 币种 证件类型 还本付息方式 状态
    1 1 农业 人民币 身份证 等额本息还款法 已上报
    2 2 农业 人民币 身份证 等本还款法 未上报
    3 1 工业 人民币 护照 按季计息到期还本法 已上报
    4 2 个体 人民币 身份证 等本还款法 已上报
    5 5 其他 人民币 身份证 按月计息到期还本法 未上报

     

     

    我对行业投向,和还本付息方式添加了位图索引

    create bitmap index index_投向 on loans(行业投向);
    create bitmap index index_还本付息方式 on loans(还本付息方式);

    那么它会这么对位图索引进行存储:当前列的每一种值,存放在一个块中,通过0和1来标示改rownumber是否存在改值。

    行业投向位图索引/还本付息方式  

    值/行 第一行 第二行 第三行 第四行
    农业  1  1  0  0
    工业  0  0  1  0
    个体  0  0  0  1
    其他  0  0  0  0
    值/行 第一行 第二行 第三行 第四行
    等额本息还款法  1  0  0  0
    等本还款法  0  1  0  0
    按季计息到期还本法  0  0  1  0
    按月计息到期还本法  0  0  0 1

    有图可以看出, 农业、工业、个体都各以一个块来存放 所有列“自己是否为真”。 

    所以暂时可以得出:

    1、位图索引,必须创建在“仅仅几种值的情况”。 

      如果在低重复度的列上创建位图索引是很恐怖的,他将创建N多个块来存储。不论创建,还是查询,都是不聪明的。

    2、位图索引,不适合放在常修改的字段列(如状态列)容易发生死锁。   

      位图索引死锁情况举例

    --SESSION 1(持有者)
    DELETE FROM LOANS WHERE 行业投向='农业' AND status=1;
    
    ---SESSION 2(其他会话) 插入带'农业'的记录就立即被阻挡,以下三条语句都会被阻止
    insert into loans(Id,投向.....) values (1,'农业',....);
    update t set 投向='工业' WHERE id=25;
    delete from loans WHERE 行业投向='农业';
    
    --以下是可以进行不受阻碍的
    insert into loans(Id,投向.....) values (1,'工业',....);
    delete from t where gender='工业' ;
    UPDATE T SET status='aa' WHERE ROWID NOT IN ( SELECT ROWID FROM T WHERE 投向='工业' ) ; 
    
    --update只要不更新位图索引所在的列即可

    3、索引通过 比特位 存储01,来标示真假,占用内存很小,检索效率极高。 

      count(*) where 行业投向 = 农业,效率是很高的, 

    当采集平台完成这些金融数据采集后,金融监管部门要对信息进行分析、统计,形成报表。有位图索引效率是很好的。

    具体案例

      1 /*
      2 总结:本质原因:其实就是位图索引存放的是0,1的比特位,占字节数特别少。
      3 */
      4 
      5 --位图索引跟踪前准备
      6 drop table t purge;
      7 set autotrace off
      8 create table t as select * from dba_objects;
      9 insert into t select * from t;
     10 insert into t select * from t;
     11 insert into t select * from t;
     12 insert into t select * from t;
     13 insert into t select * from t;
     14 insert into t select * from t;
     15 update t set object_id=rownum;
     16 commit;
     17 
     18 --观察COUNT(*)全表扫描的代价
     19 set autotrace on
     20 set linesize 1000
     21 select count(*) from t;
     22 
     23 
     24 
     25 ------------------------------------------
     26   COUNT(*)
     27 ----------
     28   4684992
     29 执行计划
     30 ----------------------------------------------------------
     31 Plan hash value: 2966233522
     32 
     33 -------------------------------------------------------------------
     34 | Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
     35 -------------------------------------------------------------------
     36 |   0 | SELECT STATEMENT   |      |     1 | 20420  (11)| 00:04:06 |
     37 |   1 |  SORT AGGREGATE    |      |     1 |            |          |
     38 |   2 |   TABLE ACCESS FULL| T    |   294M| 20420  (11)| 00:04:06 |
     39 -------------------------------------------------------------------
     40 统计信息
     41 ----------------------------------------------------------
     42           0  recursive calls
     43           0  db block gets
     44       66731  consistent gets
     45           0  physical reads
     46           0  redo size
     47         426  bytes sent via SQL*Net to client
     48         415  bytes received via SQL*Net from client
     49           2  SQL*Net roundtrips to/from client
     50           0  sorts (memory)
     51           0  sorts (disk)
     52           1  rows processed
     53           
     54           
     55 
     56 
     57 
     58 --观察COUNT(*)用普通索引的代价
     59 create index idx_t_obj on t(object_id);
     60 alter table T modify object_id not null;
     61 set autotrace on
     62 select count(*) from t;
     63 
     64 
     65 
     66 
     67 
     68   COUNT(*)
     69 ----------
     70   4684992
     71 普通索引的执行计划
     72 ---------------------------------------------------------------------------
     73 | Id  | Operation             | Name      | Rows  | Cost (%CPU)| Time     |
     74 ---------------------------------------------------------------------------
     75 |   0 | SELECT STATEMENT      |           |     1 |  3047   (2)| 00:00:37 |
     76 |   1 |  SORT AGGREGATE       |           |     1 |            |          |
     77 |   2 |   INDEX FAST FULL SCAN| IDX_T_OBJ |  4620K|  3047   (2)| 00:00:37 |
     78 ---------------------------------------------------------------------------
     79 普通索引的统计信息
     80 ----------------------------------------------------------
     81           0  recursive calls
     82           0  db block gets
     83       10998  consistent gets
     84           0  physical reads
     85           0  redo size
     86         426  bytes sent via SQL*Net to client
     87         415  bytes received via SQL*Net from client
     88           2  SQL*Net roundtrips to/from client
     89           0  sorts (memory)
     90           0  sorts (disk)
     91           1  rows processed
     92 
     93 
     94 
     95 
     96 --观察COUNT(*)用位图索引的代价(注意,这里我们特意取了status这个重复度很高的列做索引)
     97 create bitmap index idx_bitm_t_status on t(status);
     98 select count(*) from t;
     99 
    100 SQL> select count(*) from t;
    101 
    102 
    103 
    104 
    105 
    106 
    107   COUNT(*)
    108 ----------
    109   4684992
    110 
    111 位图索引的执行计划
    112 -------------------------------------------------------------------------------------------
    113 | Id  | Operation                     | Name              | Rows  | Cost (%CPU)| Time     |
    114 -------------------------------------------------------------------------------------------
    115 |   0 | SELECT STATEMENT              |                   |     1 |   115   (0)| 00:00:02 |
    116 |   1 |  SORT AGGREGATE               |                   |     1 |            |          |
    117 |   2 |   BITMAP CONVERSION COUNT     |                   |  4620K|   115   (0)| 00:00:02 |
    118 |   3 |    BITMAP INDEX FAST FULL SCAN| IDX_BITM_T_STATUS |       |            |          |
    119 -------------------------------------------------------------------------------------------
    120 位图索引的统计信息
    121 ----------------------------------------------------------
    122           0  recursive calls
    123           0  db block gets
    124         125  consistent gets
    125           0  physical reads
    126           0  redo size
    127         426  bytes sent via SQL*Net to client
    128         415  bytes received via SQL*Net from client
    129           2  SQL*Net roundtrips to/from client
    130           0  sorts (memory)
    131           0  sorts (disk)
    132           1  rows processed
    133          
    134  
    位图索引与普通索引比较以及执行计划

    二、反向索引

     假如 我现在有些ID  100001,100002,100003,100004,100005 ,那么反向索引 ,他的索引创建的就是  100001,200001,300001,400001,500001。 由于序列本身有序,会根据范围放在不同的叶子块

    详见:索引,组合索引篇      那么索引就被放在不同的快中,有效的减少了热快争用。

    再看一下这张图, 最下面就是叶子块  ,100001 和200001 和300001 会放在不同的块中,而一般常常会频繁的访问近期的数据,那么由于他们在不同的块中,在索引进行检索的时候,能够有效的减少资源竞争。

    创建反向索引的sql

    ---反向索引
    create index rev_index on t(column) reverse;
    
    ---将反向索引转换成普通索引。
    alter index rev_index  rebuild noreverse;

    2、反向索引,在进行范围查询的时候无效,  

    3、反向索引无序了,所以无法走索引排序,

    三、函数索引

    我们现在,有一个场景:有一列数据是有大小写的,但是查询的时候,不需要区分大小写。

    那么语句只能这么写    select * from t  where upper(object_name)='T' ;

    首先有一个常识,就是  走了函数查询,不会走索引。 就像有些查询 列的类型与值类型不匹,会进行值类型函数转换,然后无法进行索引查询

    eg: id为varchar类型  而查询语句为:select * from t where id = 1。  由于数据字段为varchar类型,而参数为number 类型,故会进行值类型转换。检索就走了全表扫描。

    那么如何实现场景需求呢? 只能让函数索引一展身手:

    create index idx_func_ojbnam on t( upper(object_name) );

    upper()是Oracle内部函数

    现在   select * from t  where upper(object_name)='T' ;   这条语句就能走上索引。

    请看具体案例

     1 --测函数索引前准备
     2 drop table t purge;
     3 create table t as select * from dba_objects;
     4 create index idx_object_id on t(object_id);
     5 create index idx_object_name on t(object_name);
     6 create index idx_created on t(created);
     7 
     8 
     9 
    10 
    11 --对列做UPPER操作,无法用到索引
    12 set autotrace traceonly
    13 set linesize 1000
    14 ---以下语句由于列运算,所以走的是全表扫描
    15 select * from t  where upper(object_name)='T' ;
    16 执行计划
    17 --------------------------------------------------------------------------
    18 | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    19 --------------------------------------------------------------------------
    20 |   0 | SELECT STATEMENT  |      |    12 |  2484 |   293   (1)| 00:00:04 |
    21 |*  1 |  TABLE ACCESS FULL| T    |    12 |  2484 |   293   (1)| 00:00:04 |
    22 --------------------------------------------------------------------------
    23 统计信息
    24 ----------------------------------------------------------
    25           0  recursive calls
    26           0  db block gets
    27        1049  consistent gets
    28           0  physical reads
    29           0  redo size
    30        1500  bytes sent via SQL*Net to client
    31         415  bytes received via SQL*Net from client
    32           2  SQL*Net roundtrips to/from client
    33           0  sorts (memory)
    34           0  sorts (disk)
    35           2  rows processed
    36           
    37 --去掉列的UPPER操作后立即用索引
    38 select * from t  where  object_name='T' ;
    39 执行计划
    40 ----------------------------------------------------------
    41 Plan hash value: 1138138579
    42 
    43 -----------------------------------------------------------------------------------------------
    44 | Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
    45 -----------------------------------------------------------------------------------------------
    46 |   0 | SELECT STATEMENT            |                 |     2 |   414 |     4   (0)| 00:00:01 |
    47 |   1 |  TABLE ACCESS BY INDEX ROWID| T               |     2 |   414 |     4   (0)| 00:00:01 |
    48 |*  2 |   INDEX RANGE SCAN          | IDX_OBJECT_NAME |     2 |       |     3   (0)| 00:00:01 |
    49 -----------------------------------------------------------------------------------------------
    50 统计信息
    51 ----------------------------------------------------------
    52           0  recursive calls
    53           0  db block gets
    54           6  consistent gets
    55           0  physical reads
    56           0  redo size
    57        1506  bytes sent via SQL*Net to client
    58         415  bytes received via SQL*Net from client
    59           2  SQL*Net roundtrips to/from client
    60           0  sorts (memory)
    61           0  sorts (disk)
    62           2  rows processed      
    63           
    64               
    65 --如果必须用upper的条件,那你想用到索引,就得去建函数索引
    66 create index idx_func_ojbnam on t(upper(object_name));
    67 --继续执行,终于走索引了。
    68 
    69 select * from t  where upper(object_name)='T' ;
    70 执行计划
    71 -----------------------------------------------------------------------------------------------
    72 | Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
    73 -----------------------------------------------------------------------------------------------
    74 |   0 | SELECT STATEMENT            |                 |   775 |   206K|   152   (0)| 00:00:02 |
    75 |   1 |  TABLE ACCESS BY INDEX ROWID| T               |   775 |   206K|   152   (0)| 00:00:02 |
    76 |*  2 |   INDEX RANGE SCAN          | IDX_FUNC_OJBNAM |   310 |       |     3   (0)| 00:00:01 |
    77 -----------------------------------------------------------------------------------------------
    78 统计信息
    79 ----------------------------------------------------------
    80           0  recursive calls
    81           0  db block gets
    82           6  consistent gets
    83           0  physical reads
    84           0  redo size
    85        1500  bytes sent via SQL*Net to client
    86         415  bytes received via SQL*Net from client
    87           2  SQL*Net roundtrips to/from client
    88           0  sorts (memory)
    89           0  sorts (disk)
    90           2  rows processed
    91 
    92 
    93 
    94 
    95 
    96  
    97  
    场景中使用函数索引的案例实战(可以直接运行)
     1 /*
     2   结论:什么类型就放什么值,否则会发生类型转换,导致系能问题!
     3 (是存放字符的字段就设varchar2类型,是存放数值的字段就设置number类型,是存放日期的字段就设置date类型)
     4  这里的案例宏中
     5  select * from t_col_type where id=6; 用不到索引,要改成select * from t_col_type where id='6';
     6  如果送来的参数无法保证是'6',只能写成select * from t_col_type where to_number(id)=6;并且建to_number(id)的函数索引
     7  方可,这是很无奈的事。
     8   
     9 */
    10 
    11 
    12 --举例说明:
    13 drop table t_col_type purge;
    14 create table t_col_type(id varchar2(20),col2 varchar2(20),col3 varchar2(20));
    15 insert into t_col_type select rownum,'abc','efg' from dual connect by level<=10000;
    16 commit;
    17 create index idx_id on t_col_type(id);
    18 set linesize 1000
    19 set autotrace traceonly
    20 
    21 select * from t_col_type where id=6;
    22 
    23 执行计划
    24 --------------------------------------------------------------------------------
    25 | Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
    26 --------------------------------------------------------------------------------
    27 |   0 | SELECT STATEMENT  |            |     1 |    36 |     9   (0)| 00:00:01 |
    28 |*  1 |  TABLE ACCESS FULL| T_COL_TYPE |     1 |    36 |     9   (0)| 00:00:01 |
    29 --------------------------------------------------------------------------------
    30    1 - filter(TO_NUMBER("ID")=6)
    31 统计信息
    32 ----------------------------------------------------------
    33           0  recursive calls
    34           0  db block gets
    35          32  consistent gets
    36           0  physical reads
    37           0  redo size
    38         540  bytes sent via SQL*Net to client
    39         415  bytes received via SQL*Net from client
    40           2  SQL*Net roundtrips to/from client
    41           0  sorts (memory)
    42           0  sorts (disk)
    43           1  rows processed
    44 
    45 --实际上只有如下写法才可以用到索引,这个很不应该,是什么类型的取值就设置什么样的字段。
    46           
    47 select * from t_col_type where id='6';
    48 执行计划
    49 ------------------------------------------------------------------------------------------
    50 | Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
    51 ------------------------------------------------------------------------------------------
    52 |   0 | SELECT STATEMENT            |            |     1 |    36 |     2   (0)| 00:00:01 |
    53 |   1 |  TABLE ACCESS BY INDEX ROWID| T_COL_TYPE |     1 |    36 |     2   (0)| 00:00:01 |
    54 |*  2 |   INDEX RANGE SCAN          | IDX_ID     |     1 |       |     1   (0)| 00:00:01 |
    55 ------------------------------------------------------------------------------------------
    56    2 - access("ID"='6')
    57 统计信息
    58 ----------------------------------------------------------
    59           0  recursive calls
    60           0  db block gets
    61           4  consistent gets
    62           0  physical reads
    63           0544  bytes sent via SQL*Net to client
    65         415  bytes received via SQL*Net from client
    66           2  SQL*Net roundtrips to/to/from client
    67           0  sorts (memory)
    68           0  sorts (disk)
    69           1  rows processed          
    70 
    71 create index idx_func_tonumber_id on t_col_type(to_number(id));         
    72 select * from t_col_type where to_number(id)=6;        
    73lt;span style="color: #008080;">75 ----------------------------------------------------------------------------------------------------
    76 | Id  | Operation                   | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
    77t;>| Time     |
    77 ----------------------------------------------------------------------------------------------------
    78 |   0 | SELECT STATEMENT            |                      |   100 |  4900 |     2   (0)| 00:00:01 |
    79 |   1 |  TABLE ACCESS BY INDEX ROWID| T_COL_TYPE           |   100 |  4900 |     2   (0)| 00:00:01 |
    80 |*  2 |   INDEX RANGE SCAN          | IDX_FUNC_TONUMBER_ID |    40 |       |     1   (0)| 00:00:01 |
    81 ----------------------------------------------------------------------------------------------------
    82 2 - access(TO_NUMBER("ID")=6)
    83 统计信息
    84 ----------------------------------------------------------
    85           0  recursive calls
    86           0  db block gets
    87           4  consistent gets
    88           0  physical reads
    89           0  redo size
    90         540  bytes sent via SQL*Net to client
    91         416  bytes received via SQL*Net from client
    92           2  SQL*Net roundtrips to/from client
    93           0  sorts (memory)
    94           0  sorts (disk)
    95           1  rows processed  
    列类型不匹配导致无法走索引的案例实战

    2、自定义函数索引

    首先创建一个自定义函数让 id-1 的形式创建序列,当然没什么意义。

    create or replace function f_minus1(i int)
       return int DETERMINISTIC
        is
        begin
        return(i-1);
       end;
    create index idx_test on test (f_minus1(object_id));
    DETERMINISTIC关键字很重要

    四、全文检索

    干、、写了四个小时,Google 浏览器崩溃了。

  • 相关阅读:
    练习题
    作业2.6-2.15 两次作业
    11.13(2)
    11.13
    11.6
    10.30
    10.15
    10.9
    9.25号作业
    9.18号
  • 原文地址:https://www.cnblogs.com/javaMan/p/4012355.html
Copyright © 2020-2023  润新知