首先创建一个简单的测试环境:
SQL> CREATE TABLE T (ID NUMBER, START_TIME DATE);
Table created.
SQL> INSERT INTO T (ID, START_TIME) SELECT ROWNUM, CREATED FROM DBA_OBJECTS;
36001 rows created.
SQL> COMMIT;
Commit complete.
SQL> CREATE INDEX IND_T_TIME ON T (TO_CHAR(START_TIME, 'YYYY-MM-DD'));
Index created.
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T', METHOD_OPT => 'FOR ALL INDEXED COLUMNS')
PL/SQL procedure successfully completed.
SQL> ALTER SESSION SET NLS_DATE_LANGUAGE = AMERICAN;
Session altered.
SQL> ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD HH24:MI:SS';
Session altered.
SQL> SET AUTOT ON EXP
SQL> SELECT * FROM T WHERE TO_CHAR(START_TIME, 'YYYY-MM-DD') = '2006-01-18';
ID START_TIME
---------- -------------------
33873 2006-01-18 11:20:47
34280 2006-01-18 11:20:48
35990 2006-01-18 14:40:26
35991 2006-01-18 14:45:20
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=342 Bytes=7524)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=9 Card=342 Bytes=7524)
2 1 INDEX (RANGE SCAN) OF 'IND_T_TIME' (NON-UNIQUE) (Cost=1 Card=342)
SQL> SELECT * FROM T WHERE TO_CHAR(START_TIME, 'yyyy-mm-dd') = '2006-01-18';
ID START_TIME
---------- -------------------
33873 2006-01-18 11:20:47
34280 2006-01-18 11:20:48
35990 2006-01-18 14:40:26
35991 2006-01-18 14:45:20
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=10 Card=360 Bytes=7920)
1 0 TABLE ACCESS (FULL) OF 'T' (Cost=10 Card=360 Bytes=7920)
SQL> SELECT /*+ INDEX (T IND_T_TIME) */ * FROM T
2 WHERE TO_CHAR(START_TIME, 'yyyy-mm-dd') = '2006-01-18';
ID START_TIME
---------- -------------------
33873 2006-01-18 11:20:47
34280 2006-01-18 11:20:48
35990 2006-01-18 14:40:26
35991 2006-01-18 14:45:20
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=10 Card=360 Bytes=7920)
1 0 TABLE ACCESS (FULL) OF 'T' (Cost=10 Card=360 Bytes=7920)
问题产生了。由于函数索引一个常数输入参数的不同,造成了Oracle无法使用函数索引。这时,Oracle认为索引扫描得到的结果是不正确的,因此,即使使用HINT,Oracle也不会采用索引扫描。使这个查询语句使用索引的办法就是在增加日期转换格式为小写的函数索引。
SQL> CREATE INDEX IND_T_LOWER ON T (TO_CHAR(START_TIME, 'yyyy-mm-dd'));
Index created.
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T', METHOD_OPT => 'FOR ALL INDEXED COLUMNS')
PL/SQL procedure successfully completed.
SQL> SELECT * FROM T WHERE TO_CHAR(START_TIME, 'yyyy-mm-dd') = '2006-01-18';
ID START_TIME
---------- -------------------
33873 2006-01-18 11:20:47
34280 2006-01-18 11:20:48
35990 2006-01-18 14:40:26
35991 2006-01-18 14:45:20
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=342 Bytes=7524)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=9 Card=342 Bytes=7524)
2 1 INDEX (RANGE SCAN) OF 'IND_T_LOWER' (NON-UNIQUE) (Cost=1 Card=342)
问题似乎解决了,但是新的问题出现了,如果输入的日期格式是大小写混写怎么样。
而且lfree的测试发现了一个更有趣的现象。
SQL> SELECT * FROM T WHERE TO_CHAR(START_TIME, 'YYYY-MM-Dd') = '2006-01-18';
ID START_TIME
---------- -------------------
33873 2006-01-18 11:20:47
34280 2006-01-18 11:20:48
35990 2006-01-18 14:40:26
35991 2006-01-18 14:45:20
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=10 Card=360 Bytes=7920)
1 0 TABLE ACCESS (FULL) OF 'T' (Cost=10 Card=360 Bytes=7920)
SQL> SELECT * FROM T WHERE TO_CHAR(START_TIME, 'yyyy-mm-dD') = '2006-01-18';
ID START_TIME
---------- -------------------
33873 2006-01-18 11:20:47
34280 2006-01-18 11:20:48
35990 2006-01-18 14:40:26
35991 2006-01-18 14:45:20
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=342 Bytes=7524)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=9 Card=342 Bytes=7524)
2 1 INDEX (RANGE SCAN) OF 'IND_T_LOWER' (NON-UNIQUE) (Cost=1 Card=342)
仔细观察上面两个SQL语句,当格式为YYYY-MM-Dd时,无法使用YYYY-MM-DD格式的索引。这个很容易理解,输入不同吗。但是格式为yyyy-mm-dD时,却可以使用yyyy-mm-dd的索引,这是为什么呢?
经过一系列测试后,总结出了日期转换函数格式参数的大小写规则(终于引入正题了)。并发现一开始对CBO函数索引的理解还不算很清楚,对CBO的聪明程度有点低估了(当然,总体来说,还不算太聪明)。
刚开始测试时发现,即使是下面这种查询也可以使用索引:
SQL> SELECT * FROM T WHERE TO_CHAR(START_TIME, 'yYYY-mM-dD') = '2006-01-18';
ID START_TIME
---------- -------------------
33873 2006-01-18 11:20:47
34280 2006-01-18 11:20:48
35990 2006-01-18 14:40:26
35991 2006-01-18 14:45:20
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=342 Bytes=7524)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T' (Cost=9 Card=342 Bytes=7524)
2 1 INDEX (RANGE SCAN) OF 'IND_T_LOWER' (NON-UNIQUE) (Cost=1 Card=342)
似乎Oracle只关系第一个字母。但是综合考虑YYYY-MM-Dd的情况,似乎又不是这样。于是得到了Oracle会找到第一个小写字母为止的推论。
在测试过程中,找到了一个更好的方法来确定Oracle使用字符串转换的格式。
SQL> SET AUTOT OFF
SQL> EXPLAIN PLAN FOR
2 SELECT * FROM T WHERE TO_CHAR(START_TIME, 'yYYY-mM-dD') = '2006-01-18';
Explained.
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY());
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 342 | 7524 | 9 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 342 | 7524 | 9 |
|* 2 | INDEX RANGE SCAN | IND_T_LOWER | 342 | | 1 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(TO_CHAR("T"."START_TIME",'yyyy-mm-dd')='2006-01-18')
Note: cpu costing is off
15 rows selected.
通过看执行计划后面的Predicate Information,可以清楚的看到Oracle使用的格式是yyyy-mm-dd而不是yYYY-mM-dD。
继续测试:
SQL> EXPLAIN PLAN FOR
2 SELECT * FROM T WHERE TO_CHAR(START_TIME, 'YyYY-MM-DD') = '2006-01-18';
Explained.
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY());
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------
--------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 360 | 7920 | 10 |
|* 1 | TABLE ACCESS FULL | T | 360 | 7920 | 10 |
--------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_CHAR("T"."START_TIME",'Yyyy-MM-DD')='2006-01-18')
Note: cpu costing is off
14 rows selected.
上面这种情况也是对的,但是下面列出的情况又不符合了。
SQL> EXPLAIN PLAN FOR
2 SELECT * FROM T WHERE TO_CHAR(START_TIME, 'YYyY-MM-DD') = '2006-01-18';
Explained.
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY());
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 342 | 7524 | 9 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 342 | 7524 | 9 |
|* 2 | INDEX RANGE SCAN | IND_T_TIME | 342 | | 1 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(TO_CHAR("T"."START_TIME",'YYYY-MM-DD')='2006-01-18')
Note: cpu costing is off
15 rows selected.
显然,这种情况又和刚才得到的推论冲突了。似乎这种规律只对前两位有效。
突然我明白了Oracle所定义的规则。由于英文会出现首字母大写的情况,所以Oracle在判定的时候首先判断第一位的大小写状态,如果是小写,就不继续判断后面的内容了,而把整个字符串都当作小写。如果第一位是大写,Oracle继续判断第二位,如果也是大写,那么Oracle就把整个字符串都当作大写,如果第二位是小写,则Oracle认为这种情况是首字母大写,就把第一个字母当作大写,而剩余字符做小写出来。
如果有兴趣可以对SELECT TO_CHAR(SYSDATE, 'Month') FROM DUAL;进行测试,不管给出的Month的大小写如何变,Oracle的输出结果只有三种,大写、小写和首字母大写。
SQL> SELECT TO_CHAR(SYSDATE, 'Month') FROM DUAL;
TO_CHAR(S
---------
January
SQL> SELECT TO_CHAR(SYSDATE, 'MOnth') FROM DUAL;
TO_CHAR(S
---------
JANUARY
SQL> SELECT TO_CHAR(SYSDATE, 'MoNTH') FROM DUAL;
TO_CHAR(S
---------
January
SQL> SELECT TO_CHAR(SYSDATE, 'mONTH') FROM DUAL;
TO_CHAR(S
---------
january