今天同事碰到一个SQL的性能问题,主要是MERGE INTO的性能问题,执行脚本的时候,居然耗时50多分钟,汗!简直让人抓狂,脚本如下:
MERGE INTO EDS.TW_DP_B_TDTERM_IMEI_DAY DM
USING T_IMEI_DAY_1111 TEMP
ON(
DM.DATE_CD = TEMP.DATE_CD AND
DM.CITY_ID = TEMP.CITY_ID AND
DM.IMEI = TEMP.IMEI AND
DM.USR_NBR = TEMP.USR_NBR
)
WHEN MATCHED THEN UPDATE SET
DM.GSM_FLUX = TEMP.GSM_FLUX ,
DM.TD_FLUX = TEMP.TD_FLUX ,
DM.GPRS_FLUX = TEMP.GPRS_FLUX
WHEN NOT MATCHED THEN INSERT(
DM.DATE_CD ,
DM.CITY_ID ,
DM.IMEI ,
DM.BUSS_CITY_ID ,
DM.TYPE_ID ,
DM.USR_NBR ,
DM.GSM_FLUX ,
DM.TD_FLUX ,
DM.GPRS_FLUX
)
VALUES(
TEMP.DATE_CD ,
TEMP.CITY_ID ,
TEMP.IMEI ,
TEMP.BUSS_CITY_ID ,
TEMP.TYPE_ID ,
TEMP.USR_NBR ,
TEMP.GSM_FLUX ,
TEMP.TD_FLUX ,
TEMP.GPRS_FLUX
);
这两个表的数据量大致如下: T_IMEI_DAY_1111 三十多万, EDS.TW_DP_B_TDTERM_IMEI_DAY 三百多万数据,跑数前都做过表的相关信息收集。 EDS.TW_DP_B_TDTERM_IMEI_DAY表的索引有效,碎片很少:
SQL> select count(1) from T_IMEI_DAY_1111;
COUNT(1)
----------
333575
SQL> select count(1) from EDS.TW_DP_B_TDTERM_IMEI_DAY;
COUNT(1)
----------
3731336
SQL>
脚本的执行计划如下:
SQL> MERGE INTO EDS.TW_DP_B_TDTERM_IMEI_DAY DM
USING T_IMEI_DAY_1111 TEMP
ON(
DM.DATE_CD = TEMP.DATE_CD AND
DM.CITY_ID = TEMP.CITY_ID AND
DM.IMEI = TEMP.IMEI AND
DM.USR_NBR = TEMP.USR_NBR
)
WHEN MATCHED THEN UPDATE SET
DM.GSM_FLUX = TEMP.GSM_FLUX ,
DM.TD_FLUX = TEMP.TD_FLUX ,
DM.GPRS_FLUX = TEMP.GPRS_FLUX
WHEN NOT MATCHED THEN INSERT(
DM.DATE_CD ,
DM.CITY_ID ,
DM.IMEI ,
DM.BUSS_CITY_ID ,
DM.TYPE_ID ,
DM.USR_NBR ,
DM.GSM_FLUX ,
DM.TD_FLUX ,
DM.GPRS_FLUX
)
VALUES(
TEMP.DATE_CD ,
TEMP.CITY_ID ,
TEMP.IMEI ,
TEMP.BUSS_CITY_ID ,
TEMP.TYPE_ID ,
TEMP.USR_NBR ,
TEMP.GSM_FLUX ,
TEMP.TD_FLUX ,
TEMP.GPRS_FLUX
);
333575 rows merged.
Execution Plan
----------------------------------------------------------
Plan hash value: 3661285687
--------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 264K| 54M| | 3576 (2)| 00:01:20 | | |
| 1 | MERGE | TW_DP_B_TDTERM_IMEI_DAY | | | | | | | |
| 2 | VIEW | | | | | | | | |
|* 3 | HASH JOIN OUTER | | 264K| 44M| 31M| 3576 (2)| 00:01:20 | | |
| 4 | TABLE ACCESS FULL | T_IMEI_DAY_1111 | 264K| 28M| | 100 (2)| 00:00:03 | | |
| 5 | PARTITION RANGE ALL| | 2128K| 125M| | 765 (7)| 00:00:17 | 1 | 365 |
| 6 | TABLE ACCESS FULL | TW_DP_B_TDTERM_IMEI_DAY | 2128K| 125M| | 765 (7)| 00:00:17 | 1 | 365 |
--------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("DM"."USR_NBR"(+)="TEMP"."USR_NBR" AND "DM"."IMEI"(+)="TEMP"."IMEI" AND
"DM"."CITY_ID"(+)="TEMP"."CITY_ID" AND "DM"."DATE_CD"(+)="TEMP"."DATE_CD")
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
9975 recursive calls
1212324 db block gets
111135 consistent gets
2447 physical reads
228686840 redo size
822 bytes sent via SQL*Net to client
2571 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
333575 rows processed
SQL>
在优化脚本期间,我添加了USE_MERGE提示,结果执行脚本的时间一下子缩短到几十秒。但是让我很纳闷的是, 无论怎么看,添加USE_MERGE提示的脚本的执行计划怎么比较都比上面没添加提示的脚本的执行计划查(执行计划如下所示)
SQL> MERGE /*+ USE_MERGE(DM TEMP) */ INTO EDS.TW_DP_B_TDTERM_IMEI_DAY DM
USING T_IMEI_DAY_1111 TEMP
ON(
DM.DATE_CD = TEMP.DATE_CD AND
DM.CITY_ID = TEMP.CITY_ID AND
DM.IMEI = TEMP.IMEI AND
DM.USR_NBR = TEMP.USR_NBR
)
WHEN MATCHED THEN UPDATE SET
DM.GSM_FLUX = TEMP.GSM_FLUX ,
DM.TD_FLUX = TEMP.TD_FLUX ,
DM.GPRS_FLUX = TEMP.GPRS_FLUX
WHEN NOT MATCHED THEN INSERT(
DM.DATE_CD ,
DM.CITY_ID ,
DM.IMEI ,
DM.BUSS_CITY_ID ,
DM.TYPE_ID ,
DM.USR_NBR ,
DM.GSM_FLUX ,
DM.TD_FLUX ,
DM.GPRS_FLUX
)
VALUES(
TEMP.DATE_CD ,
TEMP.CITY_ID ,
TEMP.IMEI ,
TEMP.BUSS_CITY_ID ,
TEMP.TYPE_ID ,
TEMP.USR_NBR ,
TEMP.GSM_FLUX ,
TEMP.TD_FLUX ,
TEMP.GPRS_FLUX
)
;
333575 rows merged.
Execution Plan
----------------------------------------------------------
Plan hash value: 2993229602
---------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 1688K| 346M| | 14060 (1)| 00:05:13 | | |
| 1 | MERGE | TW_DP_B_TDTERM_IMEI_DAY | | | | | | | |
| 2 | VIEW | | | | | | | | |
| 3 | MERGE JOIN OUTER | | 1688K| 283M| | 14060 (1)| 00:05:13 | | |
| 4 | SORT JOIN | | 264K| 28M| 68M| 2424 (1)| 00:00:54 | | |
| 5 | TABLE ACCESS FULL | T_IMEI_DAY_1111 | 264K| 28M| | 100 (2)| 00:00:03 | | |
|* 6 | SORT JOIN | | 2128K| 125M| 586M| 11636 (2)| 00:04:19 | | |
| 7 | PARTITION RANGE ALL| | 2128K| 125M| | 765 (7)| 00:00:17 | 1 | 365 |
| 8 | TABLE ACCESS FULL | TW_DP_B_TDTERM_IMEI_DAY | 2128K| 125M| | 765 (7)| 00:00:17 | 1 | 365 |
---------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("DM"."USR_NBR"(+)="TEMP"."USR_NBR" AND "DM"."IMEI"(+)="TEMP"."IMEI" AND
"DM"."CITY_ID"(+)="TEMP"."CITY_ID" AND "DM"."DATE_CD"(+)="TEMP"."DATE_CD")
filter("DM"."DATE_CD"(+)="TEMP"."DATE_CD" AND "DM"."CITY_ID"(+)="TEMP"."CITY_ID" AND
"DM"."IMEI"(+)="TEMP"."IMEI" AND "DM"."USR_NBR"(+)="TEMP"."USR_NBR")
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
1884 recursive calls
1217983 db block gets
90427 consistent gets
654 physical reads
259129380 redo size
814 bytes sent via SQL*Net to client
2667 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
11 sorts (memory)
0 sorts (disk)
333575 rows processed
SQL> exit