• OLAP 大表和小表并行hash join


    
    一个表50MB 
    
    一个表10GB 
    
    
    50M表做驱动表,放在PGA里
    
    这时候慢在对对 10g 的全表扫描
    
    
    对10个G扫描块 需要开并行
    
    
    我有这样一个算法 
    
    一个进程 读 50mb 
    
    8进程 来 扫描 10gb 
    
    一个 进程扫描 1.25gb 
    
    
    50MB 都分发到 8个进程
    
    超大表和小表之间做HASH JOIN,一般会启用用并行,ORACLE在并行HASH JOIN的时候会用到很多技术,比如 HASH HASH, 或者BROADCAST
    
    对于超大表和小表做HASH JOIN,一定要让小表进行广播(Broadcast),通常情况下CBO会选择正确,但是如果统计信息不准,或者基数计算错误CBO选择了 HASH HASH join,这个时
    
    候就很慢,观察现象就是它在做direct path write temp,这个时候就可以用HINT PQ_DISTRIBUTE 进行调整
    
    PQ_DISTRIBUTE(驱动表 None, Broadcast) 如果外层表很小(HASH_AJ),
    这个时候可以用 PQ_DISTRIBUTE(驱动表 Broadcast,None)
    
    
    下面就是一个具体的例子, F 是一个超大表 T 是一个小表 
    
    SQL&get; explain plan for select /*+ parallel(f 8) parallel(t 8) use_hash(t,f) full(f) full(t) PQ_DISTRIBUTE(f HASH, HASH) */ *
      2    from crs_data_fct f
      3    JOIN crs_time_perd_fdim t ON t.TIME_PERD_ID = f.TIME_PERD_ID;
    
    Explained.
    
    Elapsed: 00:00:00.83
    SQL&get; select * from table(dbms_xplan.display);
    
    PLAN_TABLE_OUTPUT
    ----------------------------------------------------------------------------------------------------------------------------------------------------
    Plan hash value: 353396990
    
    ---------------------------------------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                       | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
    ---------------------------------------------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                |                    |   239M|   696G|  8371  (20)| 00:01:58 |       |       |        |      |            |
    |   1 |  PX COORDINATOR                 |                    |       |       |            |          |       |       |        |      |            |
    |   2 |   PX SEND QC (RANDOM)           | :TQ10002           |   239M|   696G|  8371  (20)| 00:01:58 |       |       |  Q1,02 | P-&get;S | QC (RAND)  |
    |*  3 |    HASH JOIN BUFFERED           |                    |   239M|   696G|  8371  (20)| 00:01:58 |       |       |  Q1,02 | PCWP |            |
    |   4 |     PX RECEIVE                  |                    | 15808 |  1636K|    10   (0)| 00:00:01 |       |       |  Q1,02 | PCWP |            |
    |   5 |      PX SEND HASH               | :TQ10000           | 15808 |  1636K|    10   (0)| 00:00:01 |       |       |  Q1,00 | P-&get;P | HASH       |
    |   6 |       PX BLOCK ITERATOR         |                    | 15808 |  1636K|    10   (0)| 00:00:01 |       |       |  Q1,00 | PCWC |            |
    |   7 |        TABLE ACCESS STORAGE FULL| CRS_TIME_PERD_FDIM | 15808 |  1636K|    10   (0)| 00:00:01 |       |       |  Q1,00 | PCWP |            |
    |   8 |     PX RECEIVE                  |                    |   239M|   673G|  8267  (19)| 00:01:56 |       |       |  Q1,02 | PCWP |            |
    |   9 |      PX SEND HASH               | :TQ10001           |   239M|   673G|  8267  (19)| 00:01:56 |       |       |  Q1,01 | P-&get;P | HASH       |
    |  10 |       PX PARTITION LIST ALL     |                    |   239M|   673G|  8267  (19)| 00:01:56 |     1 |   951 |  Q1,01 | PCWC |            |
    |  11 |        TABLE ACCESS STORAGE FULL| CRS_DATA_FCT       |   239M|   673G|  8267  (19)| 00:01:56 |     1 |   951 |  Q1,01 | PCWP |            |
    ---------------------------------------------------------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       3 - access("T"."TIME_PERD_ID"="F"."TIME_PERD_ID")
    
    Note
    -----
       - dynamic sampling used for this statement (level=6)
    
    27 rows selected.
    
    SQL&get; explain plan for select /*+ parallel(f,8) parallel(t 8) use_hash(t,f) full(f) full(t) PQ_DISTRIBUTE(t None, Broadcast) */ *
      2    from crs_data_fct f
      3    JOIN crs_time_perd_fdim t ON t.TIME_PERD_ID = f.TIME_PERD_ID;
    
    Explained.
    
    SQL&get; select * from table(dbms_xplan.display);
    
    PLAN_TABLE_OUTPUT
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    Plan hash value: 271674260
    
    ---------------------------------------------------------------------------------------------------------------------------------------------------
    | Id  | Operation                       | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
    ---------------------------------------------------------------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT                |                    |   239M|   696G|  8371  (20)| 00:01:58 |       |       |        |      |            |
    |   1 |  PX COORDINATOR                 |                    |       |       |            |          |       |       |        |      |            |
    |   2 |   PX SEND QC (RANDOM)           | :TQ10001           |   239M|   696G|  8371  (20)| 00:01:58 |       |       |  Q1,01 | P-&get;S | QC (RAND)  |
    |*  3 |    HASH JOIN                    |                    |   239M|   696G|  8371  (20)| 00:01:58 |       |       |  Q1,01 | PCWP |            |
    |   4 |     PX RECEIVE                  |                    | 15808 |  1636K|    10   (0)| 00:00:01 |       |       |  Q1,01 | PCWP |            |
    |   5 |      PX SEND BROADCAST          | :TQ10000           | 15808 |  1636K|    10   (0)| 00:00:01 |       |       |  Q1,00 | P-&get;P | BROADCAST  |
    |   6 |       PX BLOCK ITERATOR         |                    | 15808 |  1636K|    10   (0)| 00:00:01 |       |       |  Q1,00 | PCWC |            |
    |   7 |        TABLE ACCESS STORAGE FULL| CRS_TIME_PERD_FDIM | 15808 |  1636K|    10   (0)| 00:00:01 |       |       |  Q1,00 | PCWP |            |
    |   8 |     PX PARTITION LIST ALL       |                    |   239M|   673G|  8267  (19)| 00:01:56 |     1 |   951 |  Q1,01 | PCWC |            |
    |   9 |      TABLE ACCESS STORAGE FULL  | CRS_DATA_FCT       |   239M|   673G|  8267  (19)| 00:01:56 |     1 |   951 |  Q1,01 | PCWP |            |
    ---------------------------------------------------------------------------------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
       3 - access("T"."TIME_PERD_ID"="F"."TIME_PERD_ID")
    
    Note
    -----
       - dynamic sampling used for this statement (level=6)
    
    25 rows selected.
    
    hints:PQ_DISTRIBUTE(小表 None, Broadcast)
    
    
    
    

  • 相关阅读:
    【机器学习】关于判别模型和生成模型
    Delphi新手跟我学写CALL,附完整原程序
    QT事件研究的文章
    杂烩:QWidget、QGraphics、QtQuick
    Golang全接触
    学会使用git
    代码创建 WPF 旋转动画
    值得推荐的C/C++框架和库 very good
    可恶的QT隐式共享
    Notes on OpenSSL and Qt(ssl.pri,qsslocket_openssl_symbols.cpp)
  • 原文地址:https://www.cnblogs.com/hzcya1995/p/13352400.html
Copyright © 2020-2023  润新知