• Oracle: analyzing a Hash Join with event 10104


    A couple of days ago I worked through a SQL tuning case. The statement is shown below: BIG_TABLE is about 150 GB, while SMALL_TABLE is tiny, just over 9,000 rows and under 1 MB. hash_area_size and sort_area_size were both set large enough to allow an optimal (fully in-memory) hash join and an in-memory sort.

    select /*+ leading(b) use_hash(a b) */ distinct a.ID
    from BIG_TABLE a, SMALL_TABLE b
    where (a.category  = b.from_cat or
           a.category2 = b.from_cat) and
           a.site_id  = b.site_id and
           a.sale_end >= sysdate;
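
    For reference, a minimal sketch of the session setup the paragraph above implies; the sizes here are illustrative assumptions, not the values from the original case:

    -- Manual workarea sizing, so hash_area_size / sort_area_size take effect
    alter session set workarea_size_policy = manual;
    alter session set hash_area_size = 104857600;  -- 100 MB, illustrative
    alter session set sort_area_size = 104857600;  -- 100 MB, illustrative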

    The execution plan:

    --------------------------------------------------------------------------
    | Id  | Operation            |  Name        | Rows  | Bytes | Cost (%CPU)|
    --------------------------------------------------------------------------
    |   0 | SELECT STATEMENT     |              |     2 |   174 |    18  (17)|
    |   1 |  SORT UNIQUE         |              |     2 |   174 |    18  (17)|
    |*  2 |   HASH JOIN          |              |     2 |   174 |    17  (12)|
    |   3 |    TABLE ACCESS FULL | SMALL_TABLE  |  1879 | 48854 |    14   (8)|
    |*  4 |    TABLE ACCESS FULL | BIG_TABLE    |     4 |   244 |     3  (34)|
    --------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       2 - access("A"."SITE_ID"="B"."SITE_ID")
           filter("A"."CATEGORY"="B"."FROM_CAT" OR
                  "A"."CATEGORY2"="B"."FROM_CAT")
       4 - filter("A"."SALE_END">=SYSDATE@!)
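
    As an aside, plans like this can be reproduced with DBMS_XPLAN; a generic sketch, not the exact commands from the original case:

    explain plan for
    select /*+ leading(b) use_hash(a b) */ distinct a.ID
    from BIG_TABLE a, SMALL_TABLE b
    where (a.category  = b.from_cat or
           a.category2 = b.from_cat) and
           a.site_id  = b.site_id and
           a.sale_end >= sysdate;

    select * from table(dbms_xplan.display);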

    At first glance the plan looks perfect and the hints are spot on: the hash table is built on the small table, and the big table probes it. Experience says the whole query should take about as long as a full table scan (FTS) of BIG_TABLE.

    But an FTS of BIG_TABLE takes roughly 8 minutes, while the whole query ran for 3 to 4 hours.

    So where exactly is the problem?

    The FTS time should not change much, so the problem has to be in the hash join. Set an event to trace the hash join internals:

    alter session set events '10104 trace name context forever, level 2';
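
    (A hedged aside, not from the original post: the trace can be switched off again with the same event syntax, and on 11g and later the trace file for the current session can be located through V$DIAG_INFO.)

    alter session set events '10104 trace name context off';
    -- 11g+: full path of the current session's trace file
    select value from v$diag_info where name = 'Default Trace File';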
    The resulting trace file describes the hash table that was built:
    ### Hash table ###
    # NOTE: The calculated number of rows in non-empty buckets may be smaller
    #       than the true number.
    Number of buckets with   0 rows:      16373
    Number of buckets with   1 rows:          0
    Number of buckets with   2 rows:          0
    Number of buckets with   3 rows:          1
    Number of buckets with   4 rows:          0
    Number of buckets with   5 rows:          0
    Number of buckets with   6 rows:          0
    Number of buckets with   7 rows:          1
    Number of buckets with   8 rows:          0
    Number of buckets with   9 rows:          0
    Number of buckets with between  10 and  19 rows:          1
    Number of buckets with between  20 and  29 rows:          1
    Number of buckets with between  30 and  39 rows:          3
    Number of buckets with between  40 and  49 rows:          0
    Number of buckets with between  50 and  59 rows:          0
    Number of buckets with between  60 and  69 rows:          0
    Number of buckets with between  70 and  79 rows:          0
    Number of buckets with between  80 and  89 rows:          0
    Number of buckets with between  90 and  99 rows:          0
    Number of buckets with 100 or more rows:          4
    ### Hash table overall statistics ###
    Total buckets: 16384 Empty buckets: 16373 Non-empty buckets: 11
    Total number of rows: 9232
    Maximum number of rows in a bucket: 2531
    Average number of rows in non-empty buckets: 839.272705

    Look closely: the fullest bucket holds as many as 2,531 rows. A bucket is a linked list, so those thousands of rows are all chained onto a single list.
    This points to the hash key of this hash table having too few distinct values and far too many duplicates; otherwise that many rows should never land in the same bucket.
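
    That hypothesis is easy to check with a quick look at the distinct values on the build table; the query below is an illustrative sketch added here, not part of the original case:

    -- If d_site_id is tiny relative to total_rows, a hash table keyed on
    -- site_id alone is bound to have long bucket chains.
    select count(*)                                   as total_rows,
           count(distinct site_id)                    as d_site_id,
           count(distinct from_cat)                   as d_from_cat,
           count(distinct site_id || ',' || from_cat) as d_both
    from SMALL_TABLE;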

    The join condition involves two columns, from_cat and site_id, so by simple enumeration there are three candidate hash keys, checked one by one below:

    1. Build hash table based on (site_id, from_cat):
     
    SQL> select site_id,from_cat,count(*) from SMALL_TABLE group by site_id,from_cat having count(*)>100;
     
    no rows selected
     
    2. Build hash table based on (from_cat):
     
    SQL> select from_cat,count(*) from SMALL_TABLE group by from_cat having count(*)>100;
     
    no rows selected
     
    3. Build hash table based on (site_id):
     
    SQL> select site_id,count(*) from SMALL_TABLE group by site_id having count(*)>100;
     
       SITE_ID   COUNT(*)
    ---------- ----------
             0       2531
             2       2527
           146       1490
           210       2526

    Case 3, building on site_id alone, matches these two lines from the trace file exactly:

    Number of buckets with 100 or more rows: 4
    Maximum number of rows in a bucket: 2531

    Note: this can also be read straight from the "Predicate Information" section of the plan, where the hash-join access predicate is on site_id only:

    access("A"."SITE_ID"="B"."SITE_ID")

    So the hash table must have been built on site_id alone. BIG_TABLE contains a huge number of rows with site_id = 0, and every one of them probes the bucket with the longest chain; most of those rows walk the whole linked list only to be thrown away at the end. The hash join is therefore extremely inefficient and degenerates into something close to a nested-loop join.
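
    The skew on the probe side can be confirmed the same way; again an illustrative sketch rather than a query from the original post (on a 150 GB table a SAMPLE clause may be kinder):

    -- If site_id = 0 dominates, most probe rows hit the longest chain
    select site_id, count(*)
    from BIG_TABLE
    where sale_end >= sysdate
    group by site_id
    order by count(*) desc;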

    With the root cause found, the problem practically solves itself.

    Ideally the hash table should be built on (site_id, from_cat), so the culprit has to be the OR. Rewrite the OR as a UNION:

    select /*+ leading(b) use_hash(a b) */ distinct a.ID
    from BIG_TABLE a, SMALL_TABLE b
    where  a.category  = b.from_cat and
           a.site_id  = b.site_id and
           a.sale_end >= sysdate
    UNION
    select /*+ leading(b) use_hash(a b) */ distinct a.ID
    from BIG_TABLE a, SMALL_TABLE b
    where  a.category2 = b.from_cat and
           a.site_id  = b.site_id and
           a.sale_end >= sysdate;
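
    (A side note beyond the original post: many Oracle versions can perform this OR-expansion themselves, and the USE_CONCAT hint can encourage it; whether it actually fires should be verified from the resulting plan. A hypothetical variant:)

    select /*+ use_concat leading(b) use_hash(a b) */ distinct a.ID
    from BIG_TABLE a, SMALL_TABLE b
    where (a.category  = b.from_cat or
           a.category2 = b.from_cat) and
           a.site_id  = b.site_id and
           a.sale_end >= sysdate;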
    The execution plan for the UNION rewrite:
    --------------------------------------------------------------------------
    | Id  | Operation            |  Name        | Rows  | Bytes | Cost (%CPU)|
    --------------------------------------------------------------------------
    |   0 | SELECT STATEMENT     |              |     2 |   148 |    36  (59)|
    |   1 |  SORT UNIQUE         |              |     2 |   148 |    36  (59)|
    |   2 |   UNION-ALL          |              |       |       |            |
    |*  3 |    HASH JOIN         |              |     1 |    74 |    17  (12)|
    |   4 |     TABLE ACCESS FULL| SMALL_TABLE  |  1879 | 48854 |    14   (8)|
    |*  5 |     TABLE ACCESS FULL| BIG_TABLE    |     4 |   192 |     3  (34)|
    |*  6 |    HASH JOIN         |              |     1 |    74 |    17  (12)|
    |   7 |     TABLE ACCESS FULL| SMALL_TABLE  |  1879 | 48854 |    14   (8)|
    |*  8 |     TABLE ACCESS FULL| BIG_TABLE    |     4 |   192 |     3  (34)|
    --------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
       3 - access("A"."CATEGORY"="B"."FROM_CAT" AND
                  "A"."SITE_ID"="B"."SITE_ID")
       5 - filter("A"."SALE_END">=SYSDATE@!)
       6 - access("A"."CATEGORY2"="B"."FROM_CAT" AND
                  "A"."SITE_ID"="B"."SITE_ID")
       8 - filter("A"."SALE_END">=SYSDATE@!)

    At first glance this plan looks worse than the first one, since BIG_TABLE is now full-scanned twice. But look at the hash table structure this time:

    ### Hash table ###
    # NOTE: The calculated number of rows in non-empty buckets may be smaller
    #       than the true number.
    Number of buckets with   0 rows:       9306
    Number of buckets with   1 rows:       5310
    Number of buckets with   2 rows:       1436
    Number of buckets with   3 rows:        285
    Number of buckets with   4 rows:         43
    Number of buckets with   5 rows:          4
    Number of buckets with   6 rows:          0
    Number of buckets with   7 rows:          0
    Number of buckets with   8 rows:          0
    Number of buckets with   9 rows:          0
    Number of buckets with between  10 and  19 rows:          0
    Number of buckets with between  20 and  29 rows:          0
    Number of buckets with between  30 and  39 rows:          0
    Number of buckets with between  40 and  49 rows:          0
    Number of buckets with between  50 and  59 rows:          0
    Number of buckets with between  60 and  69 rows:          0
    Number of buckets with between  70 and  79 rows:          0
    Number of buckets with between  80 and  89 rows:          0
    Number of buckets with between  90 and  99 rows:          0
    Number of buckets with 100 or more rows:          0
    ### Hash table overall statistics ###
    Total buckets: 16384 Empty buckets: 9306 Non-empty buckets: 7078
    Total number of rows: 9232
    Maximum number of rows in a bucket: 5
    Average number of rows in non-empty buckets: 1.304323

    This is the hash table we want: the longest chain holds only 5 rows, and the 9,232 rows are spread across 7,078 non-empty buckets, about 1.3 rows per non-empty bucket on average, matching the reported 1.304323.

    The total execution time dropped from three or four hours to 16 minutes, far exceeding the developer's expectations.

    The problem is hard to spot from the plan alone; it takes an understanding of how a hash join works under the hood and a deeper level of analysis.

    Source: http://www.itpub.net/thread-955209-1-1.html
