• PostgreSQL 13支持增量排序(Incremental Sorting)


    PostgreSQL 13支持增量排序(Incremental Sorting)

    PostgreSQL 13一个重要的功能是支持增量排序,使用order by 时可以加速排序,SQL如下

    select * from test order by a,b limit 10;

    如果在字段a上面建立了索引,需要对字段a、b进行排序,如果一个结果已经按几个前导键排序,这就允许对附加的b进行批量排序。

    enable_incremental_sort

    PostgreSQL新增了配置enable_incremental_sort用于控制是否开启增量排序,此参数默认开启

    测试准备

    在PostgreSQL 13中创建测试表进行测试

    postgres=# create table test(id int,c1 int ,c2 int,info varchar(300),crt_time timestamp);
    CREATE TABLE
    postgres=# insert into test select t,t,2,'test',clock_timestamp()  from generate_series(1,1000000)t;
    INSERT 0 1000000
    postgres=# create index i_test_id on test(id);
    CREATE INDEX
    --查看数据如下
    postgres=# select * from test order by id,c1 limit 10;
     id | c1 | c2 | info |          crt_time          
    ----+----+----+------+----------------------------
      1 |  1 |  2 | test | 2022-06-02 14:23:38.253289
      2 |  2 |  2 | test | 2022-06-02 14:23:38.253777
      3 |  3 |  2 | test | 2022-06-02 14:23:38.253785
      4 |  4 |  2 | test | 2022-06-02 14:23:38.253787
      5 |  5 |  2 | test | 2022-06-02 14:23:38.25379
      6 |  6 |  2 | test | 2022-06-02 14:23:38.253791
      7 |  7 |  2 | test | 2022-06-02 14:23:38.253793
      8 |  8 |  2 | test | 2022-06-02 14:23:38.253795
      9 |  9 |  2 | test | 2022-06-02 14:23:38.253809
     10 | 10 |  2 | test | 2022-06-02 14:23:38.25381
    (10 rows)
    

    PostgreSQL 13 测试

    • 这里我是在pg14中做的测试,pg13这个参数名叫enable_incrementalsort
    postgres=# show enable_incremental_sort;
     enable_incremental_sort 
    -------------------------
     on
    (1 row)
    
    postgres=# explain analyze select * from test order by id,c1 limit 10;
                                                                   QUERY PLAN                                                               
    ----------------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=0.46..1.16 rows=10 width=25) (actual time=0.159..0.163 rows=10 loops=1)
       ->  Incremental Sort  (cost=0.46..70373.03 rows=1000000 width=25) (actual time=0.157..0.159 rows=10 loops=1)
             Sort Key: id, c1
             Presorted Key: id
             Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 25kB  Peak Memory: 25kB
             ->  Index Scan using i_test_id on test  (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.103..0.106 rows=11 loops=1)
     Planning Time: 0.427 ms
     Execution Time: 0.265 ms
    (8 rows)
    
    
    • 可以看到Incremental SortPresorted Key: id并且走了i_test_id索引,SQL耗时0.265ms

    关闭enable_incremental_sort

    postgres=# set enable_incremental_sort=off;
    SET
    postgres=# explain analyze select * from test order by id,c1 limit 10;
                                                             QUERY PLAN                                                          
    -----------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=38962.64..38962.67 rows=10 width=25) (actual time=272.945..272.953 rows=10 loops=1)
       ->  Sort  (cost=38962.64..41462.64 rows=1000000 width=25) (actual time=272.933..272.937 rows=10 loops=1)
             Sort Key: id, c1
             Sort Method: top-N heapsort  Memory: 25kB
             ->  Seq Scan on test  (cost=0.00..17353.00 rows=1000000 width=25) (actual time=0.028..118.098 rows=1000000 loops=1)
     Planning Time: 0.305 ms
     Execution Time: 273.023 ms
    (7 rows)
    
    • 关闭增量排序后SQL耗时273.023 ms,性能差了几个数量级

    PostgreSQL 12 测试

    • Abase 7.0基于PostgreSQL 12.3

    同样使用上面的建表语句,执行SQL如下

    postgres=#  explain analyze select * from test order by id,c1 limit 10;
                                                              QUERY PLAN                                                          
    ------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=38962.64..38962.67 rows=10 width=536) (actual time=288.847..288.851 rows=10 loops=1)
       ->  Sort  (cost=38962.64..41462.64 rows=1000000 width=536) (actual time=288.839..288.840 rows=10 loops=1)
             Sort Key: id, c1
             Sort Method: top-N heapsort  Memory: 25kB
             ->  Seq Scan on test  (cost=0.00..17353.00 rows=1000000 width=536) (actual time=0.078..173.460 rows=1000000 loops=1)
     Planning Time: 24.726 ms
     Execution Time: 289.135 ms
    (7 rows)
    
    

    PG 12中执行计划和PG 14关闭enable_incremental_sort参数一样,性能较低

    当然这只是一个简单的查询,如果包含where,以及连表等情况是否也可以使用 Incremental Sort

    带条件

    加上c1 > 100000,c1没有创建索引

    postgres=# explain analyze select * from test where c1 > 100000 order by id,c1 limit 10;
                                                                   QUERY PLAN                                                                
    -----------------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=0.47..1.23 rows=10 width=25) (actual time=49.470..49.476 rows=10 loops=1)
       ->  Incremental Sort  (cost=0.47..68345.40 rows=899386 width=25) (actual time=49.467..49.469 rows=10 loops=1)
             Sort Key: id, c1
             Presorted Key: id
             Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 25kB  Peak Memory: 25kB
             ->  Index Scan using i_test_id on test  (cost=0.42..27873.02 rows=899386 width=25) (actual time=49.383..49.387 rows=11 loops=1)
                   Filter: (c1 > 100000)
                   Rows Removed by Filter: 100000
     Planning Time: 0.879 ms
     Execution Time: 49.594 ms
    (10 rows)
    
    

    加上 id > 100000,id有索引

    postgres=# explain analyze select * from test where id > 100000 order by id,c1 limit 10;
                                                                  QUERY PLAN                                                               
    ---------------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=0.46..1.19 rows=10 width=25) (actual time=0.160..0.164 rows=10 loops=1)
       ->  Incremental Sort  (cost=0.46..65542.05 rows=899386 width=25) (actual time=0.148..0.150 rows=10 loops=1)
             Sort Key: id, c1
             Presorted Key: id
             Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 25kB  Peak Memory: 25kB
             ->  Index Scan using i_test_id on test  (cost=0.42..25069.68 rows=899386 width=25) (actual time=0.115..0.119 rows=11 loops=1)
                   Index Cond: (id > 100000)
     Planning Time: 0.408 ms
     Execution Time: 0.258 ms
    (9 rows)
    

    可以看到即使where条件没有索引,排序字段有索引也可以使用增量排序功能,而且效果也还不错。做了一个过滤操作 Filter: (c1 > 100000)

    PG 13 多字段排序

    • 根据id,c1,c2进行排序,一样可以走增量排序
    postgres=# explain analyze select * from test  order by id,c1,c2 limit 10;
                                                                   QUERY PLAN                                                               
    ----------------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=0.46..1.16 rows=10 width=25) (actual time=0.175..0.179 rows=10 loops=1)
       ->  Incremental Sort  (cost=0.46..70373.03 rows=1000000 width=25) (actual time=0.172..0.174 rows=10 loops=1)
             Sort Key: id, c1, c2
             Presorted Key: id
             Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 25kB  Peak Memory: 25kB
             ->  Index Scan using i_test_id on test  (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.126..0.130 rows=11 loops=1)
     Planning Time: 0.485 ms
     Execution Time: 0.237 ms
    (8 rows)
    
    

    PG 13 join

    • 复制一张test2
    postgres=# create table test2 as select * from test;
    SELECT 1000000
    
    postgres=# create index i_test2_id on test2(id);
    CREATE INDEX
    
    • join连表查询,并且排序字段test.id,test.c1
    postgres=# explain analyze select *from test join test2 on test.id = test2.id order by test.id,test.c1 limit 10;
                                                                       QUERY PLAN                                                                   
    ------------------------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=1.93..3.04 rows=10 width=50) (actual time=0.089..0.092 rows=10 loops=1)
       ->  Incremental Sort  (cost=1.93..110738.33 rows=1000000 width=50) (actual time=0.087..0.089 rows=10 loops=1)
             Sort Key: test.id, test.c1
             Presorted Key: test.id
             Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 26kB  Peak Memory: 26kB
             ->  Merge Join  (cost=1.85..65738.33 rows=1000000 width=50) (actual time=0.044..0.068 rows=11 loops=1)
                   Merge Cond: (test.id = test2.id)
                   ->  Index Scan using i_test_id on test  (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.022..0.036 rows=11 loops=1)
                   ->  Index Scan using i_test2_id on test2  (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.014..0.018 rows=11 loops=1)
     Planning Time: 0.599 ms
     Execution Time: 0.174 ms
    (11 rows)
    
    postgres=# set enable_incremental_sort=off ;
    SET
    postgres=# explain analyze select *from test join test2 on test.id = test2.id order by test.id,test.c1 limit 10;
                                                                          QUERY PLAN                                                                       
    -------------------------------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=87347.97..87347.99 rows=10 width=50) (actual time=1964.394..1964.407 rows=10 loops=1)
       ->  Sort  (cost=87347.97..89847.97 rows=1000000 width=50) (actual time=1964.391..1964.402 rows=10 loops=1)
             Sort Key: test.id, test.c1
             Sort Method: top-N heapsort  Memory: 26kB
             ->  Merge Join  (cost=1.85..65738.33 rows=1000000 width=50) (actual time=0.070..1690.949 rows=1000000 loops=1)
                   Merge Cond: (test.id = test2.id)
                   ->  Index Scan using i_test_id on test  (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.042..571.732 rows=1000000 loops=1)
                   ->  Index Scan using i_test2_id on test2  (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.017..585.722 rows=1000000 loops=1)
     Planning Time: 1.292 ms
     Execution Time: 1964.517 ms
    (10 rows)
    
    

    join后排序也可以走增量排序,使用增量排序耗时:0.174 ms,而关闭增量后耗时1964.517 ms

    • 如果join后排序的字段来自不同的表test.id,test2.c1
    postgres=# explain analyze select *from test join test2 on test.id = test2.id order by test.id,test2.c1 limit 10;
                                                                       QUERY PLAN                                                                   
    ------------------------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=1.93..3.04 rows=10 width=50) (actual time=0.151..0.155 rows=10 loops=1)
       ->  Incremental Sort  (cost=1.93..110738.33 rows=1000000 width=50) (actual time=0.149..0.151 rows=10 loops=1)
             Sort Key: test.id, test2.c1
             Presorted Key: test.id
             Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 26kB  Peak Memory: 26kB
             ->  Merge Join  (cost=1.85..65738.33 rows=1000000 width=50) (actual time=0.075..0.088 rows=11 loops=1)
                   Merge Cond: (test.id = test2.id)
                   ->  Index Scan using i_test_id on test  (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.040..0.044 rows=11 loops=1)
                   ->  Index Scan using i_test2_id on test2  (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.025..0.028 rows=11 loops=1)
     Planning Time: 0.778 ms
     Execution Time: 0.230 ms
    (11 rows)
    
    
    postgres=# set enable_incremental_sort=off ;
    SET
    postgres=# explain analyze select *from test join test2 on test.id = test2.id order by test.id,test2.c1 limit 10;
                                                                          QUERY PLAN                                                                       
    -------------------------------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=87347.97..87347.99 rows=10 width=50) (actual time=1493.513..1493.519 rows=10 loops=1)
       ->  Sort  (cost=87347.97..89847.97 rows=1000000 width=50) (actual time=1493.510..1493.513 rows=10 loops=1)
             Sort Key: test.id, test2.c1
             Sort Method: top-N heapsort  Memory: 26kB
             ->  Merge Join  (cost=1.85..65738.33 rows=1000000 width=50) (actual time=0.065..1228.403 rows=1000000 loops=1)
                   Merge Cond: (test.id = test2.id)
                   ->  Index Scan using i_test_id on test  (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.027..318.044 rows=1000000 loops=1)
                   ->  Index Scan using i_test2_id on test2  (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.027..390.231 rows=1000000 loops=1)
     Planning Time: 0.761 ms
     Execution Time: 1493.685 ms
    (10 rows)
    

    join后排序的字段来自不同的表test.id,test2.c1,也可以走增量排序,开启增量耗时:0.230,关闭后耗时:1493.685 ms

    来看看一个比较慢的SQL:

    • 这个SQL两表关联,而且使用了c2=2这一列全部为2,并且使用offset 100000
    postgres=# explain analyze select *from test join test2 on test.id = test2.id where test.c2 = 2 order by test.id,test2.c1 limit 10 offset 100000;
                                                                         QUERY PLAN                                                                      
    -----------------------------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=11325.58..11326.72 rows=10 width=50) (actual time=198.125..198.131 rows=10 loops=1)
       ->  Incremental Sort  (cost=2.02..113237.64 rows=1000000 width=50) (actual time=0.127..193.661 rows=100010 loops=1)
             Sort Key: test.id, test2.c1
             Presorted Key: test.id
             Full-sort Groups: 3126  Sort Method: quicksort  Average Memory: 29kB  Peak Memory: 29kB
             ->  Merge Join  (cost=1.94..68237.64 rows=1000000 width=50) (actual time=0.052..152.908 rows=100011 loops=1)
                   Merge Cond: (test.id = test2.id)
                   ->  Index Scan using i_test_id on test  (cost=0.42..27873.02 rows=1000000 width=25) (actual time=0.026..46.138 rows=100011 loops=1)
                         Filter: (c2 = 2)
                   ->  Index Scan using i_test2_id on test2  (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.020..51.088 rows=100011 loops=1)
     Planning Time: 0.707 ms
     Execution Time: 198.252 ms
    (12 rows)
    
    

    因为增量排序的缘故,查询还是很快

    • 如果我们关闭增量排序功能
    postgres=# explain analyze select *from test join test2 on test.id = test2.id where test.c2 = 2 order by test.id,test2.c1 limit 10 offset 100000;
                                                                          QUERY PLAN                                                                       
    -------------------------------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=156536.56..156536.59 rows=10 width=50) (actual time=2496.085..2496.093 rows=10 loops=1)
       ->  Sort  (cost=156286.56..158786.56 rows=1000000 width=50) (actual time=2469.643..2491.429 rows=100010 loops=1)
             Sort Key: test.id, test2.c1
             Sort Method: external merge  Disk: 72432kB
             ->  Merge Join  (cost=1.94..68237.64 rows=1000000 width=50) (actual time=0.082..1371.433 rows=1000000 loops=1)
                   Merge Cond: (test.id = test2.id)
                   ->  Index Scan using i_test_id on test  (cost=0.42..27873.02 rows=1000000 width=25) (actual time=0.040..433.114 rows=1000000 loops=1)
                         Filter: (c2 = 2)
                   ->  Index Scan using i_test2_id on test2  (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.033..401.784 rows=1000000 loops=1)
     Planning Time: 0.807 ms
     Execution Time: 2530.205 ms
    (11 rows)
    

    这个SQL耗时 2530.205 ms,和198.252 ms比增量排序提升还是很明显

    但是我们观察到上面的SQL中使用id进行关联,且用id排序的时候查询效率较高,如果排序的字段换成crt_time效果如何?

    postgres=# explain analyze select *from test join test2 on test.id = test2.id where test.c2 = 2 order by test.crt_time,test2.c1 limit 10 offset 100000;
                                                                          QUERY PLAN                                                                       
    -------------------------------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=156536.56..156536.59 rows=10 width=50) (actual time=2702.107..2702.133 rows=10 loops=1)
       ->  Sort  (cost=156286.56..158786.56 rows=1000000 width=50) (actual time=2667.324..2697.033 rows=100010 loops=1)
             Sort Key: test.crt_time, test2.c1
             Sort Method: external merge  Disk: 72432kB
             ->  Merge Join  (cost=1.94..68237.64 rows=1000000 width=50) (actual time=0.161..1524.794 rows=1000000 loops=1)
                   Merge Cond: (test.id = test2.id)
                   ->  Index Scan using i_test_id on test  (cost=0.42..27873.02 rows=1000000 width=25) (actual time=0.074..488.803 rows=1000000 loops=1)
                         Filter: (c2 = 2)
                   ->  Index Scan using i_test2_id on test2  (cost=0.42..25373.02 rows=1000000 width=25) (actual time=0.073..487.688 rows=1000000 loops=1)
     Planning Time: 1.835 ms
     Execution Time: 2746.486 ms
    (11 rows)
    
    

    当join关联的字段和order by的字段不一样时,虽然order by的字段有索引但也不能走,如果字段一致那么也能利用增量排序。

    使用test.crt_time排序和上面关闭增量排序执行计划一样

    总结

    • 增量排序对于单表多字段排序来说效率还是提升明显

    • join连表查询如果关联的键和排序键一样也能走增量排序,如果不一样则不能走增量排序

    参考资料:

    https://postgres.fun/20200721193000.html

    新版本调研 · 13 Beta 1 初体验

    https://mp.weixin.qq.com/s/mBIL2uzIHB7qVByBIVRmhg

  • 相关阅读:
    用Oracle实现ASH的数据透视图
    Oracle AWR 之 通过dbms_workload_repository.awr_report_text(html)函数在客户端生成AWR报告
    (转)CentOS 7 安装 Python3、pip3
    (转) Linux 内核运行参数修改——sysctl命令
    (转)oracle linux 7 安装oracle 12c
    (转)Oracle与DB2在数据库高可用技术上的相同与差异探讨
    (转)OpenStack —— 原理架构介绍(一、二)
    (转)ELK原理与介绍
    (转)Db2 备份恢复性能问题诊断与调优
    (转)IBM AIX系统安装
  • 原文地址:https://www.cnblogs.com/zhangfx01/p/16348068.html
Copyright © 2020-2023  润新知