• 索引的Clustering Factor


    Clustering Factor:集群因子(聚簇因子)用来描述一个表中的列是否是规则排序的

    Clustering Factor的算法如下:我们知道可以通过dbms_rowid.rowid_block_number(rowid)找到记录对应的block号。索引中记录了rowid,因此oracle就可以根据索引中的rowid来判断记录是否是在同一个block中。举个例子,比如说索引中有a,b,c,d,e五个记录,首先比较a,b是否在同一个block,如果不在同一个block那么Clustering Factor +1,然后继续比较b,c同理,如果b,c不在同一个block,那么Clustering Factor+1,这样一直进行下去,直到比较了所有的记录。

        根据算法我们就可以知道clustering factor的值介于block数和表行数之间。如果clustering factor接近block数,说明表的存储和索引存储排序接近,也就是说表中的记录很有序,这样在做index range scan 的时候能,读取少量的data block就能得到我们想要的数据,代价比较小。如果clustering factor接近表记录数,说明表的存储和索引排序差异很大,在做index range scan的时候,会额外读取多个block,因为表记录分散,代价较高。

    下面来看一个例子:

    SQL> desc test;
     名称                                                                                                      是否为空? 类型
     ----------------------------------------------------------------------------------------------------------------- -------- ----------------

     OWNER                                                                                                              VARCHAR2(30)
     OBJECT_NAME                                                                                                        VARCHAR2(128)
     SUBOBJECT_NAME                                                                                                     VARCHAR2(30)
     OBJECT_ID                                                                                                 NOT NULL NUMBER
     DATA_OBJECT_ID                                                                                                     NUMBER
     OBJECT_TYPE                                                                                                        VARCHAR2(19)
     CREATED                                                                                                            DATE
     LAST_DDL_TIME                                                                                                      DATE
     TIMESTAMP                                                                                                          VARCHAR2(19)
     STATUS                                                                                                             VARCHAR2(7)
     TEMPORARY                                                                                                          VARCHAR2(1)
     GENERATED                                                                                                          VARCHAR2(1)
     SECONDARY                                                                                                          VARCHAR2(1)

    SQL> select index_name,clustering_factor from user_indexes where table_name='TEST';

    INDEX_NAME                     CLUSTERING_FACTOR
    ------------------------------ -----------------
    NAMETYPE                                   25007
    OBJID                                        730

    SQL> select index_name,column_name,column_position from user_ind_columns;

    INDEX_NAME                     COLUMN_NAME          COLUMN_POSITION
    ------------------------------ -------------------- ---------------
    OBJID                          OBJECT_ID                          1
    NAMETYPE                       OBJECT_TYPE                        2
    NAMETYPE                       OBJECT_NAME                        1

    SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) from test;

    COUNT(DISTINCTDBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID))
    ---------------------------------------------------
                                                    684

    SQL> select count(*) from test;

      COUNT(*)
    ----------
         49947

    可以看到test表一共有49949行,存储在684个block,索引OBJID建立在object_id列上,它的值与块数接近,说明表中的object_id很有可能是排过序的,我们查一下前10行验证下

    SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) from test where object_id<11;

    COUNT(DISTINCTDBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID))
    ---------------------------------------------------
                                                      1

    我们看到从object_id=1到object_id=10都在同一个block,说明object_id这列经过了适当排序,那么我们在对object_id这一列进行index range scan的时候就可以少读很多block,就能把结果检索出来。

    索引NAMETYPE建立在object_name,object_type上,并且以object_name列为主导列,它的clustering factor接近行数,说明object_name没有经过适当排序,是分散的,在进行index range scan的时候 会读取较多的block.

    现在我将OBJID索引反序

    SQL> alter index objid rebuild reverse;

    索引已更改。

    SQL> select index_name,clustering_factor from user_indexes;

    INDEX_NAME                     CLUSTERING_FACTOR
    ------------------------------ -----------------
    SYS_IL0000052108C00036$$
    OBJID                                      49945
    NAMETYPE                                   25007

    由于反序了索引,此时clustering factor必然很高,因为表中objid是有序的,而索引是反序的

    这里也得到了一点提示,如果索引发生了热点块(cache buffers chains)竞争,可以建立一个reverse索引,不过带来的却是index range scan的花销,需要权衡利弊,看看是否有其他解决方案。

  • 相关阅读:
    EntityFramework 启用迁移 EnableMigrations 报异常 "No context type was found in the assembly"
    JAVA 访问FTP服务器示例(2)
    NuGet Package Manager 更新错误解决办法
    JAVA 访问FTP服务器示例(1)
    RemoteAttribute 的使用问题
    诡异的 javascript 变量
    javascript apply用法
    Babun 中文乱码
    GSM呼叫过程
    转站博客园
  • 原文地址:https://www.cnblogs.com/hehe520/p/6330653.html
Copyright © 2020-2023  润新知