数据来源
根据博客:某社区600万用户数据导入MYSQL、MSSQL、Oracle数据库方法,我们得到了一个含有600多万条用户数据的oracle数据库。本文就是根据这个来验证数据库索引的特性。
1.测试数据库CSDNUSER
CREATE TABLE "SCOTT"."CSDNUSER" ( "ID" int primary key not null, "USERNAME" VARCHAR2(256), "PASSWORD" VARCHAR2(256), "EMAIL" VARCHAR2(256) )
数据导入方法参考某社区600万用户数据导入MYSQL、MSSQL、Oracle数据库方法。
上述数据库包含主键索引,但是没有为主键命名,因此搜索该索引的等级BLEVEL,可以通过以下查询语句求出:
select index_name, blevel, num_rows from user_indexes where table_name = 'CSDNUSER';
查询结果如下图所示:
从上述查询结果中我们发现没有BLEVEL和NUM_ROWS。
1.未命名的主键
接下来我们查询ID>700万数据,之所以是700万是因为我们总共数据时600多万,这样可以更加明显的看出来有没有索引的查询效率。查询语句如下:
SET AUTOTRACE ON SELECT * FROM CSDNUSER2 WHERE ID>7000000;
查询执行计划如下:
执行计划
----------------------------------------------------------
Plan hash value: 2317779687
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 403 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| CSDNUSER2 | 1 | 403 | 1 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | SYS_C0038672 | 1 | | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID">7000000)
Note
-----
- dynamic sampling used for this statement
统计信息
----------------------------------------------------------
9 recursive calls
0 db block gets
92 consistent gets
2 physical reads
116 redo size
450 bytes sent via SQL*Net to client
405 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
从统计信息中我们看出一共有“92 consistent gets”,相当于有92次IO。
2.无主键
然后我们删除上述主键SYS_COO38672,删除语句如下:
--删除主键 alter table csdnuser2 drop constraint SYS_C0038672;
再次执行上述查询语句,查询执行计划如下:
执行计划
----------------------------------------------------------
Plan hash value: 1618046436
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 498 | 195K| 12348 (1)| 00:02:29 |
|* 1 | TABLE ACCESS FULL| CSDNUSER2 | 498 | 195K| 12348 (1)| 00:02:29 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID">7000000)
Note
-----
- dynamic sampling used for this statement
统计信息
----------------------------------------------------------
182 recursive calls
0 db block gets
45129 consistent gets
45015 physical reads
0 redo size
450 bytes sent via SQL*Net to client
405 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
5 sorts (memory)
0 sorts (disk)
0 rows processed
从统计信息中我们看出一共有“45129 consistent gets”,相当于有45129次IO。
3.添加命名主键
添加主键语句如下:
--添加主键 alter table csdnuser add constraint pk_csdnuser primary key(ID);
查询该索引的等级BLEVEL,可以通过以下查询语句求出:
select index_name, blevel, num_rows from user_indexes where table_name = 'CSDNUSER';
查询结果如下图所示:
从上述查询结果中我们发现BLEVEL:2和NUM_ROWS=6428632,为表中记录数。
再次执行上述查询语句,查询执行计划如下:
执行计划
----------------------------------------------------------
Plan hash value: 2702699671
--------------------------------------------------------------------------------
-----------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 403 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| CSDNUSER | 1 | 403 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | PK_CSDNUSER | 1 | | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID">7000000)
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
3 consistent gets
0 physical reads
0 redo size
450 bytes sent via SQL*Net to client
405 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
发现是“ 3 consistent gets”,表明添加命名索引以后,只需要3次IO就可以结束查询。
PS:2012-6-13解释前面错误理解
上述的命名主键与非命名主键的说法是错误的,第二次consistent gets子所以很大是因为删除了主键索引,这是没有错的。而第三次的consistent gets为3,而第一次consistent gets为92,并不说明自定义命名的索引效率比系统命名的索引效率高。之所以第三次只需要3次consistent gets是因为执行完第一次以后有缓存存在。假设在第一次查询以后再一次查询,那么统计结果跟第三次一模一样。
2.测试数据库USERINFO
http://www.itpub.net/thread-1313899-1-1.html
2.1创建数据库USERINFO
CREATE TABLE "USERINFO" ( "NO" INT , "NAME" VARCHAR2(50), "BIRTHDAY" DATE )
2.2创建序列
CREATE SEQUENCE SEQ_USERINFO_NO INCREMENT BY 1 -- 每次加几个 START WITH 1 -- 从1开始计数
2.3插入100000条数据
begin for i in 1..100000 loop INSERT INTO USERINFO VALUES(SEQ_USERINFO_NO.nextval,'XUWEI',SYSDATE); end loop; end; /
2.4统计结果
查询NO=5000,查询语句如下:
set autotrace traceonly select * from userinfo where no = 5000;
第一次查询得到的统计信息:
未选定行
执行计划
----------------------------------------------------------
Plan hash value: 3576123897
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 49 | 102 (0)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| USERINFO | 1 | 49 | 102 (0)| 00:00:02 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("NO"=5000)
Note
-----
- dynamic sampling used for this statement
统计信息
----------------------------------------------------------
5 recursive calls
0 db block gets
119023 consistent gets
0 physical reads
25048 redo size
391 bytes sent via SQL*Net to client
405 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
第二次查询得到的统计信息:
未选定行
执行计划
----------------------------------------------------------
Plan hash value: 3576123897
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 49 | 102 (0)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| USERINFO | 1 | 49 | 102 (0)| 00:00:02 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("NO"=5000)
Note
-----
- dynamic sampling used for this statement
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
100704 consistent gets
0 physical reads
21164 redo size
391 bytes sent via SQL*Net to client
405 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
第三次查询得到的统计信息:
未选定行
执行计划
----------------------------------------------------------
Plan hash value: 3576123897
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 49 | 102 (0)| 00:00:02 |
|* 1 | TABLE ACCESS FULL| USERINFO | 1 | 49 | 102 (0)| 00:00:02 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("NO"=5000)
Note
-----
- dynamic sampling used for this statement
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
100704 consistent gets
0 physical reads
21208 redo size
391 bytes sent via SQL*Net to client
405 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
为NO字段添加索引IX_USERINFO_NO
--创建索引 create index IX_USERINFO_NO on USERINFO(NO);
第四次查询得到的统计信息:
执行计划
----------------------------------------------------------
Plan hash value: 1066629497
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 49 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| USERINFO | 1 | 49 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IX_USERINFO_NO | 1 | | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("NO"=5000)
Note
-----
- dynamic sampling used for this statement
统计信息
----------------------------------------------------------
9 recursive calls
0 db block gets
76 consistent gets
1 physical reads
0 redo size
543 bytes sent via SQL*Net to client
416 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
第五次查询得到的统计信息:
执行计划
----------------------------------------------------------
Plan hash value: 1066629497
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 49 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| USERINFO | 1 | 49 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IX_USERINFO_NO | 1 | | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("NO"=5000)
Note
-----
- dynamic sampling used for this statement
统计信息
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
543 bytes sent via SQL*Net to client
416 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
删除索引IX_USERINFO_NO
--删除索引 drop index IX_USERINFO_NO;
添加主键PK_USERINFO_NO
--添加主键 alter table USERINFO add constraint PK_USERINFO_NO primary key(NO);
第六次查询得到的统计信息:
执行计划 ---------------------------------------------------------- Plan hash value: 4161181038 ---------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 49 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| USERINFO | 1 | 49 | 2 (0)| 00:00:01 | |* 2 | INDEX UNIQUE SCAN | PK_USERINFO_NO | 1 | | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("NO"=5000) 统计信息 ---------------------------------------------------------- 192 recursive calls 0 db block gets 33 consistent gets 1 physical reads 0 redo size 447 bytes sent via SQL*Net to client 405 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 6 sorts (memory) 0 sorts (disk) 1 rows processed
第七次查询得到的统计信息:
执行计划 ---------------------------------------------------------- Plan hash value: 4161181038 -------------------------------------------------------------------------------- -------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 49 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| USERINFO | 1 | 49 | 2 (0)| 00:00:01 | |* 2 | INDEX UNIQUE SCAN | PK_USERINFO_NO | 1 | | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("NO"=5000) 统计信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 3 consistent gets 0 physical reads 0 redo size 447 bytes sent via SQL*Net to client 405 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
2.5统计分析
从第一次到第三次查询,都是无索引状态下的查询,我们可以发现:
- 第一次查询时recursive calls=5>0,而后面两次recursive calls都为0,这个也适用于后期有索引的情况
- consistent gets在不断缩小,知道最后不变。例如第一次查询的consistent gets最大,而第二次和第三次的consistent gets相等。
- 在三次查询过程中,physical reads都为0。(physical reads=0表明物理IO次数为0,也就是说没有从硬盘上读取数据到内存中。之所以physical reads=0,是因为前面insert的100000数据已经在buffer_cache里了)
从第四次到第五次查询,都是有索引状态下的插叙,我们可以发现:
- consistent gets次数在下降,consistent gets最后变到4。
- recursive calls次数在下降,recursive calls最后变到0。
- 相对于第一次到第三次查询,consistent gets明显下降,这是因为添加了索引,前面第一次到第三次是全表扫描,而第四次跟第五次查询是索引扫描。逻辑读(consistent gets)之所以最后会是4,是因此需要读取根节点一个,分支一个,叶子一个,表块一个,共4个。
- physical reads同上。
从六次到第七次查询,是将原来索引换成了主键,我们可以发现:
- consistent gets最后降为3,而不是4,这个是不明白的地方。
- consistent gets同上
- recursive calls同上
- physical reads同上
主键也是索引的一种,只要加了索引,那么逻辑读(consistent gets)就明显降低,查询效率大大提高。这也是索引的作用。
2.6清空缓存后的physical reads
假设我们运行如下命令清空缓存:
alter system flush buffer_cache; alter system flush shared_pool;
然后再次执行查询语句,得到的统计信息如下:
执行计划 ---------------------------------------------------------- Plan hash value: 1066629497 ---------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 49 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| USERINFO | 1 | 49 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IX_USERINFO_NO | 1 | | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("NO"=5000) Note ----- - dynamic sampling used for this statement 统计信息 ---------------------------------------------------------- 401 recursive calls 0 db block gets 128 consistent gets 308 physical reads 0 redo size 543 bytes sent via SQL*Net to client 416 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 6 sorts (memory) 0 sorts (disk) 1 rows processed
可以看到physical reads=308。
再次执行查询语句,,得到的统计信息如下:
执行计划 ---------------------------------------------------------- Plan hash value: 1066629497 ---------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 49 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| USERINFO | 1 | 49 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IX_USERINFO_NO | 1 | | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("NO"=5000) Note ----- - dynamic sampling used for this statement 统计信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 543 bytes sent via SQL*Net to client 416 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed