postgresql的effective_cache_size

优化器假设可以用于单个查询的磁盘缓存的有效大小。这个因素会被用到使用索引的成本考虑中：值越大，使用索引扫描的可能性就越大；值越小，使用顺序扫描的可能性就越大。
设置该参数的时候，需要同时考虑到shared buffer和内核对磁盘缓存的使用，尽管有些数据会同时存在shared buffer和内核的磁盘缓存中。同时要考虑到在不同的表上并发查询的数量，因为他们也会使用到共享空间。
该参数不会影响分配给postgresql的共享内存，也不保留内核磁盘缓存。只是用于优化器的评估目的。系统也不会假设不同查询之间的数据保留在磁盘缓存上。默认是4GB。

指定值的时候，如果不指定unit，默认就是block。

#select name, setting, unit from pg_settings where name like 'effective_cache_size'; 
         name         | setting | unit 
----------------------+---------+------
 effective_cache_size | 524288  | 8kB

成本评估要考虑很多因素：i/o数量、操作调用次数、处理的元组的数量、选择性等等。但是i/o的成本是什么呢？很显然，如果数据已经在cache中或数据在磁盘上，代价显然是不同的。

参数effective_cache_size就是用来告诉优化器，系统可以提供多大的cache。这里的cache不仅仅是内存的cache，也考虑了文件系统cache、cpu的cache等。effective_cache_size是这些cache的总和。

postgres=# create table t_random as select id,random() as r from generate_series(1,1000000) as id order by random();
SELECT 1000000
postgres=# create table t_ordered as select id,random() as r from generate_series(1,1000000) AS id;
SELECT 1000000
postgres=# create index idx_random on t_random(id);
CREATE INDEX
postgres=# create index idx_ordered on t_ordered(id);
CREATE INDEX
postgres=# vacuum analyze t_random;
VACUUM
postgres=# vacuum analyze t_ordered;
VACUUM
postgres=#

两个表都包含相同的数据，一个表是有序的，一个是无序的。

将effective_cache_size设置一个较小的值。优化器会认为系统的内存不是很多：

postgres=# set effective_cache_size to '1 MB';
SET
postgres=# show effective_cache_size;
 effective_cache_size 
----------------------
 1MB
(1 row)

postgres=# set enable_bitmapscan to on;
SET
postgres=# explain SELECT * FROM t_random WHERE id < 1000;
                                 QUERY PLAN                                 
----------------------------------------------------------------------------
 Bitmap Heap Scan on t_random  (cost=19.71..2453.44 rows=940 width=12)
   Recheck Cond: (id < 1000)
   ->  Bitmap Index Scan on idx_random  (cost=0.00..19.48 rows=940 width=0)
         Index Cond: (id < 1000)
(4 rows)

postgres=# set enable_bitmapscan to off;
SET
postgres=# explain SELECT * FROM t_random WHERE id < 1000;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Index Scan using idx_random on t_random  (cost=0.42..3732.86 rows=940 width=12)
   Index Cond: (id < 1000)
(2 rows)

postgres=#

通常pg会走bitmap索引扫描，但是这里我们想看看索引扫描会发生什么。所以关闭了bitmap索引扫描。

postgres=# SET effective_cache_size TO '1000 GB';
SET
postgres=# explain SELECT * FROM t_random WHERE id < 1000;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Index Scan using idx_random on t_random  (cost=0.42..3488.86 rows=940 width=12)
   Index Cond: (id < 1000)
(2 rows)

postgres=#

可以看到，索引扫描的成本降低了。

我们必须把成本看作是“相对的”。绝对的数字并不重要——重要的是一个计划与其他计划相比有多贵。
如果顺序扫描的成本保持不变，而索引扫描的价格相对于顺序扫描下降了，PostgreSQL会更倾向于索引。这正是effective_cache_size的核心内容:在有大量RAM的情况下，更有可能进行使用索引扫描。

当谈及如何配置postgres.conf文件中的effective_cache_size的设置的时候，往往没有意识到并不会有什么神奇的效果。

postgres=# set effective_cache_size to '1 MB';
SET
postgres=# explain SELECT * FROM t_ordered WHERE id < 1000;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Index Scan using idx_ordered on t_ordered  (cost=0.42..38.85 rows=996 width=12)
   Index Cond: (id < 1000)
(2 rows)

postgres=# SET effective_cache_size TO '1000 GB';
SET
postgres=# explain SELECT * FROM t_ordered WHERE id < 1000;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Index Scan using idx_ordered on t_ordered  (cost=0.42..38.85 rows=996 width=12)
   Index Cond: (id < 1000)
(2 rows)

postgres=#

优化器使用的表统计信息包含关于物理“相关性”的信息。如果相关性是1，即所有数据是有序的在磁盘上。effective_cache_size并不会改变什么。

如果只有一个列，同样也不会有什么效果：

postgres=# ALTER TABLE t_random DROP COLUMN r;
ALTER TABLE
postgres=# SET effective_cache_size TO '1 MB';
SET
postgres=# explain SELECT * FROM t_random WHERE id < 1000;
                                    QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Index Only Scan using idx_random on t_random  (cost=0.42..28.88 rows=940 width=4)
   Index Cond: (id < 1000)
(2 rows)

postgres=# SET effective_cache_size TO '1000 GB';
SET
postgres=# explain SELECT * FROM t_random WHERE id < 1000;
                                    QUERY PLAN                                     
-----------------------------------------------------------------------------------
 Index Only Scan using idx_random on t_random  (cost=0.42..28.88 rows=940 width=4)
   Index Cond: (id < 1000)
(2 rows)

postgres=#

调优建议：

effective_cache_size = RAM * 0.7

如果是pg专用服务器，也可以考虑设置为RAM*0.8。

相关阅读:
【USACO】clocks 遇到各种问题最后还是参考别人的思路
 【USACO】calfflac
【USACO】namenum
【USACO】beads
最大流问题
 模运算性质摘自百度百科
 学历是铜牌，能力是银牌，人脉是金牌，思维是王牌
 U盘版Windows 10已经在亚马逊Amazon開始接受预订啦
 hadoop实验：求气象数据的最低温度
 Appium Android Bootstrap源代码分析之启动执行
原文地址：https://www.cnblogs.com/abclife/p/14565691.html