今天下午一哥们遇到个case.他说如下SQL语句
SELECT WORKITEMID
FROM WFWIPARTICIPANT
WHERE PARTICIPANT IN ('771', '99999', '41', '146', '李锦');
用DBMS_STATS收集统计信息之后会走全表扫描,而用
analyze table WFWIPARTICIPANT compute statistics ---收集统计信息会走索引
analyze table WFWIPARTICIPANT delete statistics ----删除统计信息也会走索引,Oracle采用动态采样
表结构如下:
SQL> desc WFWIPARTICIPANT;
Name Type Nullable Default Comments
--------------- ------------- -------- ------- --------
WIPARTICID NUMBER
WORKITEMID NUMBER Y
PARTICIPANTTYPE VARCHAR2(20) Y
PARTICIPANT VARCHAR2(256) Y
PARTICIPANT2 VARCHAR2(64) Y
WORKITEMSTATE NUMBER(2) Y
PARTIINTYPE VARCHAR2(20) Y
EXTEND1 VARCHAR2(64) Y
并且在列partintype 上面有个索引。
我叫那位哥们把表用exp导出,然后我自己导入到我电脑测试一下,测试过程:
首先用DBMS_STATS收集统计信息:
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'TSSA',TABNAME=>'WFWIPARTICIPANT',ESTIMATE_PERCENT=>100,CASCADE=>TRUE);
PL/SQL procedure successfully completed
执行该SQL
SQL> SELECT WORKITEMID
2 FROM WFWIPARTICIPANT
3 WHERE PARTICIPANT IN ('771', '99999', '41', '146', '李锦');
已选择60行。
执行计划
----------------------------------------------------------
Plan hash value: 1708170390
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 530 | 7420 | 103 (5)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| WFWIPARTICIPANT | 530 | 7420 | 103 (5)| 00:00:02 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("PARTICIPANT"='146' OR "PARTICIPANT"='41' OR
"PARTICIPANT"='771' OR "PARTICIPANT"='99999' OR "PARTICIPANT"='李锦')
发现确实是走了全表扫描,于是改用ANALYZE 收集统计信息
SQL> analyze table WFWIPARTICIPANT delete statistics;
表已分析。
SQL> analyze table WFWIPARTICIPANT compute statistics;
表已分析。
SQL> SELECT WORKITEMID
2 FROM WFWIPARTICIPANT
3 WHERE PARTICIPANT IN ('771', '99999', '41', '146', '李锦');
已选择60行。
执行计划
----------------------------------------------------------
Plan hash value: 1217134846
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 71 | 852 | 64 (0)| 00:00:01 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| WFWIPARTICIPANT | 71 | 852 | 64 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | WF_IDX_PARTICIPANT | 71 | | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("PARTICIPANT"='146' OR "PARTICIPANT"='41' OR "PARTICIPANT"='771' OR
"PARTICIPANT"='99999' OR "PARTICIPANT"='李锦')
用ANALYZE命令分析之后,发现确实走了索引扫描
解决问题的思路:遇到这类问题,首先应该查询索引的选择率:
SQL> select a.owner,a.index_name,a.index_type,partitioned,b.num_rows,b.distinct_keys,b.num_rows/b.distinct_keys avg_row_per_key
2 ,b.distinct_keys/b.num_rows SELECTIVITY,b.last_analyzed,b.stale_stats from dba_indexes a,dba_ind_statistics b
3 where a.owner=b.owner and a.index_name=b.index_name and a.index_name='WF_IDX_PARTICIPANT';
OWNER INDEX_NAME INDEX_TYPE PARTITIONED NUM_ROWS DISTINCT_KEYS AVG_ROW_PER_KEY SELECTIVITY LAST_ANALYZED STALE_STATS
----- -------------------- ---------- ----------- ---------- ------------- --------------- ----------- ------------- -----------
TSSA WF_IDX_PARTICIPANT NORMAL NO 57820 4073 14.195924380063 0.070442753 2010-4-15 17: NO
注意观察,表一共有57820列,但是索引的列上面只有4073个不同的值,也就是说索引选择性为7%,很显然,这列数据分布不均衡。于是猜测该问题和直方图有关。现在查看列上面有没有直方图:
SQL> select owner,table_name,column_name,num_distinct,histogram,num_buckets from dba_tab_col_statistics
2 where table_name='WFWIPARTICIPANT' and column_name='PARTICIPANT';
OWNER TABLE_NAME COLUMN_NAME NUM_DISTINCT HISTOGRAM NUM_BUCKETS
----- ------------------------------ ------------------------------ ------------ --------------- -----------
TSSA WFWIPARTICIPANT PARTICIPANT 4073 NONE 1
NUM_BUCKETS为1表示该列没有直方图,恩,这里用ANALYZE命令收集的统计信息里面没有直方图信息,于是改用
DBMS_STATS收集统计信息,看看是否有直方图统计信息:
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'TSSA',TABNAME=>'WFWIPARTICIPANT',ESTIMATE_PERCENT=>100,CASCADE=>TRUE);
PL/SQL procedure successfully completed
SQL> select owner,table_name,column_name,num_distinct,histogram,num_buckets from dba_tab_col_statistics
2 where table_name='WFWIPARTICIPANT' and column_name='PARTICIPANT';
OWNER TABLE_NAME COLUMN_NAME NUM_DISTINCT HISTOGRAM NUM_BUCKETS
----- ------------------------------ ------------------------------ ------------ --------------- -----------
TSSA WFWIPARTICIPANT PARTICIPANT 4073 HEIGHT BALANCED 254
注意观察,NUM_BUCKETS=254,Oracle自动的对该列收集了直方图统计信息,于是怀疑直方图的存在影响了执行计划
,现在我删掉直方图的统计信息:
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'TSSA',TABNAME=>'WFWIPARTICIPANT',ESTIMATE_PERCENT=>100,DEGREE=>16,method_opt=>'for columns size 1 PARTICIPANT',CASCADE=>TRUE);
PL/SQL procedure successfully completed
SQL> select owner,table_name,column_name,num_distinct,histogram,num_buckets from dba_tab_col_statistics
2 where table_name='WFWIPARTICIPANT' and column_name='PARTICIPANT';
OWNER TABLE_NAME COLUMN_NAME NUM_DISTINCT HISTOGRAM NUM_BUCKETS
----- ------------------------------ ------------------------------ ------------ --------------- -----------
TSSA WFWIPARTICIPANT PARTICIPANT 4073 NONE 1
再次运行该查询语句
SQL> SELECT WORKITEMID
2 FROM WFWIPARTICIPANT
3 WHERE PARTICIPANT IN ('771', '99999', '41', '146', '李锦');
已选择60行。
执行计划
----------------------------------------------------------
Plan hash value: 1217134846
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 71 | 994 | 64 (0)| 00:00:01 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| WFWIPARTICIPANT | 71 | 994 | 64 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | WF_IDX_PARTICIPANT | 71 | | 6 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("PARTICIPANT"='146' OR "PARTICIPANT"='41' OR "PARTICIPANT"='771' OR
"PARTICIPANT"='99999' OR "PARTICIPANT"='李锦')
当删掉直通图统计信息之后,优化器选择了我们期望的访问路径
这个案例提醒我们,对表收集统计信息的时候,要写好参数,另外,我们在对访问路径做优化的时候,首先应该查看的就是索引的选择率,索引的类型,以及列上面关于直方图的统计信息,有了直方图的统计信息并不总是会给我们带来好处,当然也不总是会对我们带来坏处,具体问题具体对待。另外一个值得注意的就是,关于统计信息收集的方式,一定要写好参数,如果Oracle自动去收集了直方图统计信息,而我们不知道,这样对于性能诊断会带来麻烦的。