• [Oracle 10g]Partition Outer Join


    Partition Outer Join是Oracle 10g新增加的,用来方便“补足”在outer join的时候“漏掉”(NULL)的数据。

    当两个用来做Join的表中的数据不“相当”的时候,如果我们希望在最终结果中其中一张表的所有records都不落下,一般会用到outer join。另外一个表中“缺失”的那部分数据自然都是以NULL来补足。但是如果我们希望缺失的那部分数据(NULL)能正常"补足'’ 出来,这个时候用Partiton Outer Join就会很方便,而且性能很好。

    举个例子,来源这里

    首先建立一个测试表customer_orders,

    SQL>
      1  CREATE TABLE customer_orders(name, dt, amt)
      2  AS
      3     SELECT *
      4     FROM (
      5            SELECT owner,
      6                   TRUNC(created) + MOD(ROWNUM, 6),
      7                   TRUNC(object_id/ROWNUM)
      8            FROM   all_objects
      9            WHERE  owner IN ('ORDSYS', 'WKSYS')
     10            ORDER BY
     11                   DBMS_RANDOM.RANDOM
     12           )
     13*    WHERE ROWNUM <= 1000
    SQL> / 
     
    Table created. 

     

      生成一个报表,显示每个customer在2007年每个月的花销.

    SQL> SELECT name,
      2         TO_CHAR(dt, 'YYYYMM') as year_month,
      3         SUM(amt) as total_amt
      4  FROM  customer_orders
      5  GROUP BY
      6         name,
      7         TO_CHAR(dt, 'YYYYMM')
      8  ORDER BY 2, 1;


     

    NAME                           YEAR_M  TOTAL_AMT
    ------------------------------ ------ ----------
    ORDSYS                         200710       224973
    WKSYS                          200710       3830

    SQL>

    这条简单的SQL只会返回每个customer实际的花销,并不能把07年其他月份的数据显示出来(尽管其他月份都没有任何花销)。

    那么如何把其他月份的数据(0)也都显示出来呢,最容易想到做法就是用外连接!

    SQL> 
      1  WITH year_month AS
      2  (  SELECT TO_CHAR(
      3                ADD_MONTHS(
      4                     DATE '2007-01-01', ROWNUM-1), 'YYYYMM') as year_month
      5     FROM dual
      6     CONNECT BY ROWNUM <= 12
      7  )
      8  SELECT co.name,
      9         ym.year_month,
     10         NVL(SUM(co.amt), 0) AS total_amount
     11  FROM year_month ym
     12  LEFT OUTER JOIN  customer_orders co
     13  ON   (TO_CHAR(co.dt, 'YYYYMM') = ym.year_month)
     14  GROUP BY
     15         co.name,
     16         ym.year_month
     17  ORDER BY
     18         co.name,
     19*        ym.year_month
     

    NAME                           YEAR_M TOTAL_AMOUNT
    ------------------------------ ------ ------------
    ORDSYS                         200710           224973
    WKSYS                          200710            3830
                                         200701            0
                                         200702            0
                                         200703            0
                                         200704            0
                                         200705            0
                                         200706            0
                                         200707            0
                                         200708            0
                                         200709            0
                                         200711            0

                                          200712            0

    12 rows selected.

    SQL>

    虽然把12个月份都显示出来了,但是customer的名字除了10月份之外都没有显示在记录里面,这个显然不是那么友好。

    那么该如何让customer的名字也都列出来呢,先来看看不用partiton outer join如何来实现,

    SQL>  
      1  WITH year_months AS
      2  (  SELECT TO_CHAR(
      3                ADD_MONTHS(
      4                     DATE '2007-01-01', ROWNUM-1), 'YYYYMM') as year_month
      5     FROM dual
      6     CONNECT BY ROWNUM <= 12
      7  ),
      8  customer_names AS
      9  (
     10     SELECT DISTINCT name
     11     FROM custormer_orders
     12  )
     13  SELECT sq.name,
     14         sq.year_month,
     15         NVL(co.total_amount, 0) AS total_amount
     16  FROM
     17  (
     18     SELECT name,
     19            TO_CHAR(dt, 'YYYYMM') as year_month,
     20            SUM(amt)  as total_amount
     21     FROM  custormer_orders
     22     GROUP BY
     23            name,
     24            TO_CHAR(dt, 'YYYYMM')
     25  )co,
     26  (
     27     SELECT *
     28     FROM customer_names,
     29          year_months
     30  ) sq
     31  WHERE
     32      sq.year_month = co.year_month(+)
     33  AND sq.name = co.name(+)
     34  ORDER BY
     35      sq.name,
     36*     sq.year_month
     

     

    NAME                           YEAR_M TOTAL_AMOUNT
    ------------------------------ ------ ------------
    ORDSYS                         200701            0
    ORDSYS                         200702            0
    ORDSYS                         200703            0
    ORDSYS                         200704            0
    ORDSYS                         200705            0
    ORDSYS                         200706            0
    ORDSYS                         200707            0
    ORDSYS                         200708            0
    ORDSYS                         200709            0
    ORDSYS                         200710       224973
    ORDSYS                         200711            0

    ORDSYS                         200712            0


    WKSYS                          200701            0
    WKSYS                          200702            0
    WKSYS                          200703            0
    WKSYS                          200704            0
    WKSYS                          200705            0
    WKSYS                          200706            0
    WKSYS                          200707            0
    WKSYS                          200708            0
    WKSYS                          200709            0
    WKSYS                          200710         3830
    WKSYS                          200711            0

    WKSYS                          200712            0

    24 rows selected.

    SQL>

     

    这种方式用了两个临时表,然后这两个临时表进行了笛卡尔乘积,然后再同custormer_orders表做外连接。 注意,为了得到所有的customer,customer_orders表被访问了两次,显然会对性能造成影响。 通过查看dbms_xplan.display_cursor来看看这条SQL的执行计划及代价, 

     

    SQL> SELECT plan_table_output
      2  FROM  TABLE(dbms_xplan.display_cursor);
     
    PLAN_TABLE_OUTPUT
    ------------------------------------------------------------------------------------------------------------------------------
    ------------------------------------------------------------------------------------------------------------------------------
     
    SQL_ID  2tn2b3sdyhxn4, child number 0
    -------------------------------------
        WITH year_months AS     (  SELECT TO_CHAR(
    ADD_MONTHS(                        DATE '2007-01-01', ROWNUM-1),
    'YYYYMM') as year_month        FROM dual        CONNECT BY ROWNUM <= 12
        ),     customer_names AS     (       SELECT DISTINCT name
    FROM custormer_orders    )    SELECT sq.name,           sq.year_month,
            NVL(co.total_amount, 0) AS total_amount    FROM    (
    SELECT name,              TO_CHAR(dt, 'YYYYMM') as year_month,
        SUM(amt)  as total_amount       FROM  custormer_orders       GROUP
    BY              name,              TO_CHAR(dt, 'YYYYMM')    )co,    (
        SELECT *       FROM customer_names,            year_months    ) sq
      WHERE        sq.year_month = co.year_month(+)    AND sq.name =
    co.name(+)    ORDER BY        sq.name,       sq.year_month
     
    Plan hash value: 1191063733
     
    ---------------------------------------------------------------------------------------------------------------
    | Id  | Operation                          | Name             | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                   |                  |       |       |       |   154 (100)|       |
    |   1 |  SORT ORDER BY                     |                  | 10000 |   556K|  1352K|   154   (3)| 00:00:02 |
    |*  2 |   HASH JOIN OUTER                  |                  | 10000 |   556K|       |    13  (24)| 00:00:01 |
    |   3 |    VIEW                            |                  |  1000 | 22000 |       |     7  (15)| 00:00:01 |
    |   4 |     MERGE JOIN CARTESIAN           |                  |  1000 | 22000 |       |     7  (15)| 00:00:01 |
    |   5 |      VIEW                          |                  |     1 |     5 |       |     2   (0)| 00:00:01 |
    |   6 |       COUNT                        |                  |       |       |       |            |       |
    |   7 |        CONNECT BY WITHOUT FILTERING|                  |       |       |       |            |       |
    |   8 |         FAST DUAL                  |                  |     1 |       |       |     2   (0)| 00:00:01 |
    |   9 |      BUFFER SORT                   |                  |  1000 | 17000 |       |     7  (15)| 00:00:01 |
    |  10 |       VIEW                         |                  |  1000 | 17000 |       |     5  (20)| 00:00:01 |
    |  11 |        HASH UNIQUE                 |                  |  1000 | 17000 |       |     5  (20)| 00:00:01 |
    |  12 |         TABLE ACCESS FULL          | CUSTORMER_ORDERS |  1000 | 17000 |       |     4   (0)| 00:00:01 |
    |  13 |    VIEW                            |                  |  1000 | 35000 |       |     5  (20)| 00:00:01 |
    |  14 |     HASH GROUP BY                  |                  |  1000 | 39000 |       |     5  (20)| 00:00:01 |
    |  15 |      TABLE ACCESS FULL             | CUSTORMER_ORDERS |  1000 | 39000 |       |     4   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------------------------------
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - access("SQ"."YEAR_MONTH"="CO"."YEAR_MONTH" AND "SQ"."NAME"="CO"."NAME")
     
    Note
    -----
       - dynamic sampling used for this statement
     
     
    47 rows selected.
     
    SQL>
     
     
    从执行计划看,这个SQL的cost是154!

    接下来看看用Partition Outer Join如何实现,

    SQL>
    1  WITH year_months AS
    2  (
    3      SELECT to_char
    4              ( ADD_MONTHS(
    5                    DATE'2007-01-01', ROWNUM - 1), 'YYYYMM') AS year_month
    6      FROM dual
    7      CONNECT BY ROWNUM <= 12
    8  )
    9  SELECT co.name,
    0         ym.year_month,
    1         NVL(SUM(co.amt), 0) as total_amount
    2  FROM  year_months ym
    3  LEFT OUTER JOIN
    4        custormer_orders co
    5  PARTITION BY (co.name)
    6  ON   (TO_CHAR(co.dt, 'YYYYMM') = ym.year_month)
    7  GROUP BY
    8        co.name,
    9        ym.year_month
    0  ORDER BY
    1        co.name,
    2*       ym.year_month

    NAME                           YEAR_M TOTAL_AMOUNT
    ------------------------------ ------ ------------
    ORDSYS                         200701            0
    ORDSYS                         200702            0
    ORDSYS                         200703            0
    ORDSYS                         200704            0
    ORDSYS                         200705            0
    ORDSYS                         200706            0
    ORDSYS                         200707            0
    ORDSYS                         200708            0
    ORDSYS                         200709            0
    ORDSYS                         200710       224973
    ORDSYS                         200711            0

    ORDSYS                         200712            0


    WKSYS                          200701            0
    WKSYS                          200702            0
    WKSYS                          200703            0
    WKSYS                          200704            0
    WKSYS                          200705            0
    WKSYS                          200706            0
    WKSYS                          200707            0
    WKSYS                          200708            0
    WKSYS                          200709            0
    WKSYS                          200710         3830
    WKSYS                          200711            0

    WKSYS                          200712            0

    24 rows selected.

    SQL>

     

    注意PARTITION OUTER JOIN 的用法,

    (xx: table 1) LEFT OUTER JOIN (xxx: table 2) PARTITION BY (xxx: column name) ON (xxx: join condition)

    相应的执行计划如下,

    SQL> select plan_table_output
      2  from table(dbms_xplan.display_cursor); 
     
    PLAN_TABLE_OUTPUT
    ------------------------------------------------------------------------------------------------------------
    ------------------------------------------------------------------------------------------------------------ 
     
    SQL_ID  4np7tnbh0pqa2, child number 0
    -------------------------------------
    WITH year_months AS (     SELECT to_char             ( ADD_MONTHS(
                 DATE'2007-01-01', ROWNUM - 1), 'YYYYMM') AS year_month
    FROM dual     CONNECT BY ROWNUM < 12 ) SELECT co.name,
    ym.year_month,        NVL(SUM(co.amt), 0) as total_amount FROM
    year_months ym LEFT OUTER JOIN       custormer_orders co  PARTITION BY
    (co.name) ON   (TO_CHAR(co.dt, 'YYYYMM') = ym.year_month) GROUP BY
     co.name,       ym.year_month ORDER BY        co.name,
    ym.year_month 
     
    Plan hash value: 4042059898 
     
    -------------------------------------------------------------------------------------------------------
    | Id  | Operation                          | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                   |                  |       |       |    10 (100)|          |
    |   1 |  SORT GROUP BY                     |                  |     2 |   138 |    10  (20)| 00:00:01 |
    |   2 |   VIEW                             |                  |    10 |   690 |     9  (12)| 00:00:01 |
    |   3 |    NESTED LOOPS PARTITION OUTER    |                  |    10 |   230 |     9  (12)| 00:00:01 |
    |   4 |     BUFFER SORT                    |                  |       |       |            |          |
    |   5 |      VIEW                          |                  |     1 |     5 |     2   (0)| 00:00:01 |
    |   6 |       COUNT                        |                  |       |       |            |          |
    |   7 |        CONNECT BY WITHOUT FILTERING|                  |       |       |            |          |
    |   8 |         FAST DUAL                  |                  |     1 |       |     2   (0)| 00:00:01 |
    |*  9 |     FILTER                         |                  |       |       |            |          |
    |  10 |      SORT PARTITION JOIN           |                  |    10 |   180 |     5  (20)| 00:00:01 |
    |  11 |       TABLE ACCESS FULL            | CUSTORMER_ORDERS |    10 |   180 |     4   (0)| 00:00:01 |
    ------------------------------------------------------------------------------------------------------- 
     
    Predicate Information (identified by operation id):
    --------------------------------------------------- 
     
       9 - filter("YM"."YEAR_MONTH"=TO_CHAR(INTERNAL_FUNCTION("CO"."DT"),'YYYYMM')) 
     
    35 rows selected. 
     
    SQL>
     
     
     
     
     
     
     

    从执行计划上看,应用Partition Outer Join之后, cost 从154降到了10!性能得到大大提升!

  • 相关阅读:
    [题解] [JSOI2011] 任务调度
    [题解] [JSOI2011] 棒棒糖
    [题解] [JSOI2011] 柠檬
    [题解] [JSOI2010] 排名
    [湖南集训] 谈笑风生
    BZOJ 4695 最假女选手 线段树
    HNOI 2010 物品调度 并查集 置换
    Luogu P4299 首都 LCT
    BZOJ 2738 矩阵乘法 整体二分
    51nod 1175 区间第k大 整体二分
  • 原文地址:https://www.cnblogs.com/fangwenyu/p/1639230.html
Copyright © 2020-2023  润新知