• Understanding PostgreSQL work_mem


    The official documentation says:

     work_mem (integer)

    Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files. The value defaults to four megabytes (4MB). Note that for a complex query, several sort or hash operations might be running in parallel; each operation will be allowed to use as much memory as this value specifies before it starts to write data into temporary files. Also, several running sessions could be doing such operations concurrently. Therefore, the total memory used could be many times the value of work_mem; it is necessary to keep this fact in mind when choosing the value. Sort operations are used for ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash joins, hash-based aggregation, and hash-based processing of IN subqueries.

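    In other words, work_mem is a per-operation limit, not a per-server one. As a rough worst-case illustration (my own numbers, not from the original post): 100 concurrent sessions each running a query with 4 sort or hash nodes could use up to 100 × 4 × 4MB ≈ 1.6GB at the default setting. The parameter can be inspected and changed at several scopes; a minimal sketch, assuming superuser rights for the instance-wide change:

    -- Session scope: affects only the current connection.
    SHOW work_mem;
    SET work_mem = '64MB';

    -- Instance scope: written to postgresql.auto.conf, takes effect after a reload.
    -- ALTER SYSTEM SET work_mem = '64MB';
    -- SELECT pg_reload_conf();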

    Generate one million rows of test data

    [postgres@sht-sgmhadoopdn-04 ~]$ perl -e '@c=("a".."z","A".."Z",0..9); print join("",map{$c[rand@c]}10..20+rand(40))."\n" for 1..1000000' > /tmp/random_strings
    [postgres@sht-sgmhadoopdn-04 ~]$ ls -lh /tmp/random_strings 
    -rw-r--r-- 1 postgres dba 31M Nov 21 22:44 /tmp/random_strings

    Create the corresponding table and load the data

    edbstore=# CREATE TABLE test (id serial PRIMARY KEY, random_text text );
    CREATE TABLE
    edbstore=# \d test
                                Table "public.test"
       Column    |  Type   |                     Modifiers                     
    -------------+---------+---------------------------------------------------
     id          | integer | not null default nextval('test_id_seq'::regclass)
     random_text | text    | 
    Indexes:
        "test_pkey" PRIMARY KEY, btree (id)
    
    edbstore=# \d
                 List of relations
     Schema |    Name     |   Type   |  Owner   
    --------+-------------+----------+----------
     public | tb1         | table    | postgres
     public | test        | table    | postgres
     public | test_id_seq | sequence | postgres
    (3 rows)
    
    edbstore=# copy test (random_text) FROM '/tmp/random_strings';
    COPY 1000000
    edbstore=# select * from test limit 10;
     id |                   random_text                   
    ----+-------------------------------------------------
      1 | CKQyHTYH5VjeHRUC6YYLF8H5S
      2 | G22uBhFmrlA17wTUzf
      3 | ey6kX7I6etknzhEFCL
      4 | 8LB6navSS8VyoIeqbJBx9RqB3O4AI8GIFExnM7s
      5 | bvYt4dKGSiAun6yA5Q7owlKWJGEgD0nlxoBRZm8B
      6 | qk1RfhXHwo2PNpbI4
      7 | rnPterTw1a3Z3DoL8rhzlltUKb5
      8 | l2TrrbDsBkAa5V5ZBKFE59k4T7sDKA58yrS0mJNssl7CJnF
      9 | xM9HPgq6QMRsx1aOTqM0LPRQRYkQy50uV
     10 | viSJ4p1i3O0dY8tKei3x
    (10 rows)
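
    As an aside, similar throwaway data can also be generated directly in SQL, without the external perl step. A minimal sketch (md5() yields fixed-length 32-character hex strings rather than the mixed-length strings above, but it exercises the sort just as well):

    INSERT INTO test (random_text)
    SELECT md5(random()::text)
    FROM generate_series(1, 1000000);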

     Fetch a different number of rows each time and observe how the EXPLAIN ANALYZE plan changes

    edbstore=# show work_mem;
     work_mem 
    ----------
     1MB
    (1 row)
    
    edbstore=# EXPLAIN analyze SELECT * FROM test WHERE id <= 10 ORDER BY random_text ASC;
                                                           QUERY PLAN                                                       
    ------------------------------------------------------------------------------------------------------------------------
     Sort  (cost=8.73..8.75 rows=9 width=35) (actual time=0.188..0.202 rows=10 loops=1)
       Sort Key: random_text
       Sort Method: quicksort  Memory: 25kB
       ->  Index Scan using test_pkey on test  (cost=0.42..8.58 rows=9 width=35) (actual time=0.018..0.037 rows=10 loops=1)
             Index Cond: (id <= 10)
     Planning time: 1.435 ms
     Execution time: 0.294 ms
    (7 rows)
    
    edbstore=# EXPLAIN analyze SELECT * FROM test WHERE id <= 100 ORDER BY random_text ASC;
                                                             QUERY PLAN                                                         
    ----------------------------------------------------------------------------------------------------------------------------
     Sort  (cost=13.50..13.75 rows=100 width=35) (actual time=0.870..1.027 rows=100 loops=1)
       Sort Key: random_text
       Sort Method: quicksort  Memory: 34kB
       ->  Index Scan using test_pkey on test  (cost=0.42..10.18 rows=100 width=35) (actual time=0.022..0.218 rows=100 loops=1)
             Index Cond: (id <= 100)
     Planning time: 0.286 ms
     Execution time: 1.248 ms
    (7 rows)
    
    edbstore=# EXPLAIN analyze SELECT * FROM test WHERE id <= 1000 ORDER BY random_text ASC;
                                                              QUERY PLAN                                                          
    ------------------------------------------------------------------------------------------------------------------------------
     Sort  (cost=92.57..95.10 rows=1011 width=35) (actual time=8.846..10.251 rows=1000 loops=1)
       Sort Key: random_text
       Sort Method: quicksort  Memory: 112kB
       ->  Index Scan using test_pkey on test  (cost=0.42..42.12 rows=1011 width=35) (actual time=0.027..2.474 rows=1000 loops=1)
             Index Cond: (id <= 1000)
     Planning time: 0.286 ms
     Execution time: 11.584 ms
    (7 rows)
    
    edbstore=# EXPLAIN analyze SELECT * FROM test WHERE id <= 10000 ORDER BY random_text ASC;
                                                                QUERY PLAN                                                            
    ----------------------------------------------------------------------------------------------------------------------------------
     Sort  (cost=1049.39..1074.68 rows=10116 width=35) (actual time=144.963..160.943 rows=10000 loops=1)
       Sort Key: random_text
       Sort Method: external merge  Disk: 448kB
       ->  Index Scan using test_pkey on test  (cost=0.42..376.45 rows=10116 width=35) (actual time=0.063..22.225 rows=10000 loops=1)
             Index Cond: (id <= 10000)
     Planning time: 0.149 ms
     Execution time: 173.841 ms
    (7 rows)
    
    edbstore=# EXPLAIN analyze SELECT * FROM test WHERE id <= 100000 ORDER BY random_text ASC;
                                                                  QUERY PLAN                                                              
    --------------------------------------------------------------------------------------------------------------------------------------
     Sort  (cost=17477.39..17727.70 rows=100122 width=35) (actual time=1325.789..1706.516 rows=100000 loops=1)
       Sort Key: random_text
       Sort Method: external merge  Disk: 4440kB
       ->  Index Scan using test_pkey on test  (cost=0.42..3680.56 rows=100122 width=35) (actual time=0.088..214.490 rows=100000 loops=1)
             Index Cond: (id <= 100000)
     Planning time: 0.147 ms
     Execution time: 1822.008 ms
    (7 rows)
    
    edbstore=# EXPLAIN analyze SELECT * FROM test WHERE id <= 1000000 ORDER BY random_text ASC;
                                                           QUERY PLAN                                                       
    ------------------------------------------------------------------------------------------------------------------------
     Sort  (cost=202426.34..204926.34 rows=1000000 width=35) (actual time=8703.143..10160.421 rows=1000000 loops=1)
       Sort Key: random_text
       Sort Method: external merge  Disk: 44504kB
       ->  Seq Scan on test  (cost=0.00..20732.00 rows=1000000 width=35) (actual time=0.024..1021.491 rows=1000000 loops=1)
             Filter: (id <= 1000000)
     Planning time: 0.316 ms
     Execution time: 10577.464 ms
    (7 rows)

    rows    | Sort Method                    | Execution time
    --------+--------------------------------+----------------
    10      | quicksort  Memory: 25kB        | 0.294 ms
    100     | quicksort  Memory: 34kB        | 1.248 ms
    1000    | quicksort  Memory: 112kB       | 11.584 ms
    10000   | external merge  Disk: 448kB    | 173.841 ms
    100000  | external merge  Disk: 4440kB   | 1822.008 ms
    1000000 | external merge  Disk: 44504kB  | 10577.464 ms
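
    The Sort Method line in EXPLAIN ANALYZE is the per-query view of this spill; it can also be spotted server-wide. A hedged sketch using the standard statistics views (not part of the original experiment):

    -- Cumulative temporary-file usage per database since statistics were last reset.
    SELECT datname, temp_files, temp_bytes
      FROM pg_stat_database
     WHERE datname = current_database();

    -- Optionally log every temporary file above a threshold (requires a config reload):
    -- ALTER SYSTEM SET log_temp_files = '1MB';
    -- SELECT pg_reload_conf();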

    The table above shows that once the query has to sort more than roughly 10,000 rows, the sort method reported by EXPLAIN ANALYZE switches from an in-memory quicksort to an external merge on disk: the current work_mem of 1MB is no longer large enough to hold the sort (or hash table) in memory. So let's increase work_mem.

    edbstore=# set work_mem="500MB";
    SET
    edbstore=# EXPLAIN analyze SELECT * FROM test WHERE id <= 1000000 ORDER BY random_text ASC;
                                                          QUERY PLAN                                                       
    -----------------------------------------------------------------------------------------------------------------------
     Sort  (cost=120389.84..122889.84 rows=1000000 width=35) (actual time=6232.270..6884.121 rows=1000000 loops=1)
       Sort Key: random_text
       Sort Method: quicksort  Memory: 112847kB
       ->  Seq Scan on test  (cost=0.00..20732.00 rows=1000000 width=35) (actual time=0.015..659.035 rows=1000000 loops=1)
             Filter: (id <= 1000000)
     Planning time: 0.125 ms
     Execution time: 7302.621 ms
    (7 rows)
    rows    | Sort Method                    | Execution time
    --------+--------------------------------+----------------
    1000000 | quicksort  Memory: 112847kB    | 7302.621 ms

    The sort method has changed from an external merge on disk back to an in-memory quicksort.
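
    Keep the documentation's warning in mind, though: 500MB is granted to every sort or hash node of every concurrent query, so a value this large should usually be scoped narrowly rather than set instance-wide. A minimal sketch (the role name is hypothetical):

    -- Transaction scope: reverts automatically at COMMIT/ROLLBACK.
    BEGIN;
    SET LOCAL work_mem = '500MB';
    SELECT * FROM test WHERE id <= 1000000 ORDER BY random_text ASC;
    COMMIT;

    -- Role scope: only sessions of this (hypothetical) reporting role get the higher limit.
    -- ALTER ROLE report_user SET work_mem = '256MB';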

    https://www.depesz.com/2011/07/03/understanding-postgresql-conf-work_mem/

  • Original article: https://www.cnblogs.com/ilifeilong/p/9998641.html