创建统计信息
预估行数
pg
中对于单列选择性的估算比较准确,而对于多列的情况则会出现不准确的情况,因为pg
默认使用独立属性,直接以多个字段选择性相乘的方法计算多个字段条件的选择性
创建统计信息功能是在pg10
版本引入的功能,我们在优化SQL
的时候,一个最重要的点就是统计信息是否准确,统计信息不准会导致优化器预估的行数不准确,进而影响扫描方法和连接方式。
--创建表
postgres=# create table test2(n_id int,id1 int,id2 int);
CREATE TABLE
postgres=# insert into test2 select i,i/1000,i/10000 from generate_series(1,1000000) s(i);
INSERT 0 1000000
postgres=# analyze test2;
ANALYZE
postgres=# \x
Expanded display is on.
postgres=# select * from pg_stats where tablename = 'test2' and attname = 'id1';
-[ RECORD 1 ]----------+-------------------------------------------------------------------------------------
schemaname | public
tablename | test2
attname | id1
inherited | f
null_frac | 0
avg_width | 4
n_distinct | 1000
most_common_vals | {381,649,852,142,269,415,496,537,714,80,177,303,526,870,924}
most_common_freqs | {0.0016,0.0016,0.0015666666,0.0015333333,0.0015333333...}
histogram_bounds | {0,10,19,29,39,49,59,69,78,89,99,109,119,128,139,...}
correlation | 1
most_common_elems |
most_common_elem_freqs |
elem_count_histogram |
--查看执行计划
postgres=# explain (analyse,buffers) select * from test2 where id1 = 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Seq Scan on test2 (cost=0.00..17906.00 rows=992 width=12) (actual time=0.144..109.573 rows=1000 loops=1)
Filter: (id1 = 1)
Rows Removed by Filter: 999000
Buffers: shared hit=5406
Planning Time: 0.118 ms
Execution Time: 109.697 ms
(6 rows)
预估的行数为992
和实际返回1000
,预估和实际非常接近
- 如果在
id1
和id2
上都过滤数据时,会怎么样?
postgres=# explain (analyse,buffers) select * from test2 where id1 = 1 and id2= 0;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Seq Scan on test2 (cost=0.00..20406.00 rows=10 width=12) (actual time=0.153..138.057 rows=1000 loops=1)
Filter: ((id1 = 1) AND (id2 = 0))
Rows Removed by Filter: 999000
Buffers: shared hit=5406
Planning Time: 0.267 ms
Execution Time: 138.184 ms
(6 rows)
预估为10
但是实际为1000
,相差100
倍了,为什么会发生这种情况呢?
第一列的选择性大约是0.001
(1/1000),第二列的选择性是0.01
(1/100)。为了计算被这2个 "独立 "条件过滤的行数,planner
将它们的选择性相乘。所以,我们得到
选择性 = 0. 001 * 0. 01 = 0. 00001
当这个乘以我们在表中的行数即1000000
时,我们得到10
。这就是planner
估计的10
的由来。但是,这几列不是独立的,我们怎么告诉planner
呢?
函数依赖
pg10
开始支持用户自定义统计信息,这样我们便可以针对这种多列的情况创建自定义多个字段的统计信息,目前支持多列相关性和多列唯一值两种统计。
创建函数依赖
回到我们之前的估算问题,问题是col2
的值其实不过是col1 / 10
。在数据库术语中,我们会说col2
在功能上依赖于col1
。这意味着col1
的值足以决定col2
的值,不存在两行col1
的值相同而col2
的值不同的情况。因此,col2
上的第2
个过滤器实际上并没有删除任何行!但是,planner
捕捉到了足够的统计数据。但是,规划者捕捉到了足够的统计数据来知道这一点。
--创建统计信息
postgres=# create statistics s1(dependencies) on id1,id2 from test2;
CREATE STATISTICS
--analyze
postgres=# analyze test2;
ANALYZE
--查看执行计划
postgres=# explain (analyse,buffers) select * from test2 where id1 = 1 and id2 = 0;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Seq Scan on test2 (cost=0.00..20406.00 rows=997 width=12) (actual time=0.159..124.450 rows=1000 l oops=1)
Filter: ((id1 = 1) AND (id2 = 0))
Rows Removed by Filter: 999000
Buffers: shared hit=5406
Planning Time: 0.364 ms
Execution Time: 124.592 ms
(6 rows)
--查看统计信息
postgres=# SELECT stxname,stxkeys,extdat.stxddependencies FROM pg_statistic_ext ext join pg_statistic_ext_data extdat on ext.oid = extdat.stxoid;
stxname | stxkeys | stxddependencies
---------+---------+----------------------
s1 | 2 3 | {"2 => 3": 1.000000}
(1 row)
--stxkeys中的2 3表示表的第二列和第三列
--从这一点来看,我们可以看到Postgres意识到col1完全决定了col2,因此有一个系数为1来捕捉这些信息。现在,所有对这两列进行过滤的查询都会有更好的估计。
postgres=# select statistics_name,attnames,dependencies from pg_stats_ext;
statistics_name | attnames | dependencies
-----------------+-----------+----------------------
s1 | {id1,id2} | {"2 => 3": 1.000000}
(1 row)
简单来说就是id1
相同的两行数据,那么id2
也一定相同,(比如前面插入的数据id2=0
的有10000
条数据,那么id1
的值分别是(1-9)分别有1000
条数据,任意两行id1
相等的数据,id2
一定也是一样的)
如果没有函数依赖性统计,规划器会认为两个WHERE
条件是独立的, 并且会将它们的选择性乘以一起,以致得到太小的行数估计。 通过这样的统计,规划器认识到WHERE
条件是多余的,并且不会低估行数。
当前只有在考虑简单等值条件(将列与常量值比较)时,函数依赖才适用。不会使用它们来改进比较两个列或者比较列和表达式的等值条件的估计,也不会用它们来改进范围子句、LIKE
或者任何其他类型的条件。
ndistinct 统计
单列统计信息存储每一列中可区分值的数量。在组合多个列(例如GROUP BY a,b
)时,如果规划器只有单列统计数据,则对可区分值数量的估计常常会错误,导致选择不好的计划
--对test2表进行group by id1,id2操作
postgres=# explain (analyse,buffers) select id1,id2,count(*) from test2 group by id1,id2;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=22906.00..23906.00 rows=100000 width=16) (actual time=473.444..474.544 rows=1001 loops=1)
Group Key: id1, id2
Buffers: shared hit=5406
-> Seq Scan on test2 (cost=0.00..15406.00 rows=1000000 width=8) (actual time=0.022..178.253 rows=1000000 loops=1)
Buffers: shared hit=5406
Planning Time: 1.202 ms
Execution Time: 479.178 ms
(7 rows)
planner
预估的行数是100000
,而实际行数只有1001
--创建ndistinct
postgres=# create statistics s2(ndistinct) on id1,id2 from test2;
CREATE STATISTICS
postgres=# analyze test2;
ANALYZE
--可以看到planner预估值更加准确了,预估1000,实际返回10001
postgres=# explain (analyse,buffers) select id1,id2,count(*) from test2 group by id1,id2;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=22906.00..22916.00 rows=1000 width=16) (actual time=442.839..443.160 rows=1001 loops=1)
Group Key: id1, id2
Buffers: shared hit=5406
-> Seq Scan on test2 (cost=0.00..15406.00 rows=1000000 width=8) (actual time=0.029..147.498 rows=1000000 loops=1)
Buffers: shared hit=5406
Planning Time: 0.364 ms
Execution Time: 444.362 ms
(7 rows)
--统计信息记录了ndistinct值:为1000
postgres=# sELECT stxname,stxkeys,extdat.stxdndistinct FROM pg_statistic_ext ext join pg_statistic_ext_data extdat on ext.oid = extdat.stxoid where stxname = 's2';
stxname | stxkeys | stxdndistinct
---------+---------+----------------
s2 | 2 3 | {"2, 3": 1000}
(1 row)
常见的情况有月,季,年的列。省,市,区等需要联合group by的情况
建议只对实际用于分组的列组合以及分组数错误估计导致了糟糕计划的列组合创建ndistinct
统计信息对象。否则,ANALYZE
循环只会被浪费。
高频值-mvc
MCV(most common values)
如果没有函数依赖性统计,规划器会认为两个WHERE
条件是独立的, 并且会将它们的选择性乘以一起,以致得到太小的行数估计。 通过这样的统计,规划器认识到WHERE
条件是多余的,并且不会低估行数。
创建表t2
与两个完全相关的列(包含相同的数据),并且在这些列上创建一个MCV
列表:
--创建表
CREATE TABLE t2 (a int,b int);
--插入数据
INSERT INTO t2 SELECT mod(i,100), mod(i,100)FROM generate_series(1,1000000) s(i);
analyze t2;
--查看执行计划预估97,实际10000
postgres=# EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 70) AND (b = 70);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Seq Scan on t2 (cost=0.00..19425.00 rows=97 width=8) (actual time=0.038..123.438 rows=10000 loops=1)
Filter: ((a = 70) AND (b = 70))
Rows Removed by Filter: 990000
Planning Time: 0.150 ms
Execution Time: 124.647 ms
(5 rows)
--创建mcv
CREATE STATISTICS s3 (mcv) ON a, b FROM t2;
ANALYZE t2;
-- valid combination (found in MCV),a=70 and b=70预估11267,实际10000,相差不大
postgres=# EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 70) AND (b = 70);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Seq Scan on t2 (cost=0.00..19425.00 rows=11267 width=8) (actual time=0.069..181.738 rows=10000 loops=1)
Filter: ((a = 70) AND (b = 70))
Rows Removed by Filter: 990000
Planning Time: 1.120 ms
Execution Time: 182.452 ms
(5 rows)
-- invalid combination (not found in MCV)
postgres=# EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 70) AND (b = 80);
QUERY PLAN
--------------------------------------------------------------------------------------------------
Seq Scan on t2 (cost=0.00..19425.00 rows=1 width=8) (actual time=125.878..125.879 rows=0 loops=1)
Filter: ((a = 70) AND (b = 80))
Rows Removed by Filter: 1000000
Planning Time: 0.207 ms
Execution Time: 125.945 ms
(5 rows)
----可以看到{70,70}在高频值中,{70,80}不在高频值中
postgres=# SELECT m.* FROM pg_statistic_ext join pg_statistic_ext_data on (oid = stxoid)
postgres-# , pg_mcv_list_items(stxdmcv) m WHERE stxname = 's3';
index | values | nulls | frequency | base_frequency
-------+---------+-------+----------------------+------------------------
0 | {70,70} | {f,f} | 0.011266666666666666 | 0.00012693777777777776
1 | {78,78} | {f,f} | 0.0111 | 0.00012321
2 | {32,32} | {f,f} | 0.011066666666666667 | 0.00012247111111111112
3 | {13,13} | {f,f} | 0.011033333333333332 | 0.00012173444444444442
4 | {82,82} | {f,f} | 0.011 | 0.00012099999999999999
....
--当 WHERE (a = 70) AND (b = 70)的时候rows=11267是如何计算的呢
rows= 1000000 * 0.011266666666666666
= 11267
a
和b
的组合中实际频率(在样本中)约为1%
。 组合的基本频率(根据简单的每列频率计算)仅为0.01%
,导致两个数量级的低估。
--计算WHERE (a = 1) AND (b = 2)的预估值
--为方便计算将a和b的statisics设置为10
alter table t2 alter column a SET STATISTICS 10;
alter table t2 alter column b SET STATISTICS 10;
analyze t2;
postgres=# SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM pg_stats
postgres-# WHERE tablename='t2' AND attname in('a','b');
null_frac | n_distinct | most_common_vals | most_common_freqs
-----------+------------+------------------+-------------------
0 | 100 | {7,85} | {0.015,0.015}
0 | 100 | {7,85} | {0.015,0.015}
(2 rows)
postgres=# EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 2);
QUERY PLAN
--------------------------------------------------------------------------------------------------
Seq Scan on t2 (cost=0.00..19425.75 rows=98 width=8) (actual time=137.876..137.876 rows=0 loops=1)
Filter: ((a = 1) AND (b = 2))
Rows Removed by Filter: 1000000
Planning Time: 0.452 ms
Execution Time: 137.924 ms
(5 rows)
--a和b的选择率都是一样的
selectivity = (1 - sum(mvf))/(num_distinct - num_mcv)
postgres=# select (1-(0.014999999664723873+0.014999999664723873))/(100-2);
?column?
------------------------
0.00989795919051583933
(1 row)
--rows,预估返回98行
rows= reltuple*selectivity(a=1) * selectivity(b = 2)
postgres=# select 1000000*0.00989795919051583933*0.00989795919051583933;
?column?
---------------------------------------------
97.9695961371169693741399756143748489000000
(1 row)
--其他a=70 and b=80没有在高频中,预估也是98
postgres=# EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a =70) AND (b =80);
QUERY PLAN
---------------------------------------------------------------------------------------------------
Seq Scan on t2 (cost=0.00..19425.75 rows=98 width=8) (actual time=121.889..121.889 rows=0 loops=1)
Filter: ((a = 70) AND (b = 80))
Rows Removed by Filter: 1000000
Planning Time: 0.160 ms
Execution Time: 121.952 ms
(5 rows)
建议仅在实际在条件中一起使用的列的组合上创建MCV
统计对象,对于这些组合,错误估计组数会导致糟糕的执行计划。 否则,只会浪费ANALYZE
和规划时间。
default_statistics_target
提升该限制可能会让规划器做出更准确的估计(特别是对那些有不规则数据分布的列), 其代价是在pg_statistic
中消耗了更多空间,并且需要略微多一些的时间来计算估计数值。
注意:该值的取值范围是0-1000,其中值越低采样比例就越低,分析结果的准确性也就越低,但是ANALYZE命令执行的速度却更快。如果将该值设置为-1,那么该字段的采样比率将恢复到系统当前默认的采样值,我们可以通过下面的命令获取当前系统的缺省采样值。
postgres=# show default_statistics_target;
default_statistics_target
---------------------------
100
(1 row)
从上面的结果可以看出,该数据库的缺省采样值为100(10%)。
- 可以修改表字段也可以修改索引
ALTER TABLE [ IF EXISTS ] [ ONLY ] name [ * ]
action [, ... ]
ALTER [ COLUMN ] column_name SET STATISTICS integer
--创建test表插入100w数据
postgres=# CREATE TABLE test AS (SELECT random() x, random() y FROM generate_series(1,1000000));
SELECT 1000000
postgres=# ANALYZE test;
ANALYZE
--创建索引
postgres=# create index i_test_idx on test((x+y));
CREATE INDEX
postgres=# analyze test;
ANALYZE
--查看x+y的执行计划
postgres=# explain analyze select * from test where x+y <0.01;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test (cost=7.68..673.21 rows=652 width=16) (actual time=0.036..0.283 rows=60 loops=1)
Recheck Cond: ((x + y) < '0.01'::double precision)
Heap Blocks: exact=60
-> Bitmap Index Scan on i_test_idx (cost=0.00..7.51 rows=652 width=0) (actual time=0.017..0.017 rows=60 loops=1)
Index Cond: ((x + y) < '0.01'::double precision)
Planning Time: 0.569 ms
Execution Time: 0.342 ms
(7 rows)
--修改statistics
postgres=# ALTER INDEX i_test_idx ALTER COLUMN expr SET STATISTICS 3000;
ALTER INDEX
postgres=# analyze test;
ANALYZE
--再次查看执行计划发现预估的值和实际的值比较接近了
postgres=# EXPLAIN ANALYZE SELECT * FROM test WHERE x + y < 0.01;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Index Scan using i_test_idx on test (cost=0.42..135.64 rows=121 width=16) (actual time=0.011..0.277 rows=60 loops=1)
Index Cond: ((x + y) < '0.01'::double precision)
Planning Time: 0.515 ms
Execution Time: 0.342 ms
(4 rows)
--修改为10000
postgres=# ALTER INDEX i_test_idx ALTER COLUMN expr SET STATISTICS 10000;
ALTER INDEX
postgres=# analyze test;
ANALYZE
postgres=# EXPLAIN ANALYZE SELECT * FROM test WHERE x + y < 0.01;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Index Scan using i_test_idx on test (cost=0.42..80.87 rows=71 width=16) (actual time=0.010..0.217
rows=60 loops=1)
Index Cond: ((x + y) < '0.01'::double precision)
Planning Time: 0.784 ms
Execution Time: 0.283 ms
(4 rows)
使用alter
修改statistics
为3000
,这个数字设置了直方图中使用了多少个桶以及存储了多少个最常见的值,
这样取样的粒度也就越细,可以查看pg_stats
中histogram_bounds
记录了更多的值。
- 查看修改的值
postgres=# select cla.relname,att.attname,att.attstattarget from pg_attribute att join pg_class cla on att.attrelid=cla.oid where cla.relname = 'i_test_idx';
relname | attname | attstattarget
------------+---------+---------------
i_test_idx | expr | 3000
(1 row)
--或者
postgres=# \d+ i_test_idx
Index "public.i_test_idx"
Column | Type | Key? | Definition | Storage | Stats target
--------+------------------+------+------------+---------+--------------
expr | double precision | yes | (x + y) | plain | 3000
btree, for table "public.test"
代价异常
今天人问为什么index
时代价显示比最终的代价小,有时候执行计划节点的某一个子节点的cost
比总的cost
大是正常情况
一般来说常见的有两种情况
- 1、在有
limit 1
的情况下,实际的行数只有1并且运行时间远低于开销估计所建议的时间。这并非预估错误
postgres=# explain analyze select * from test3 where n_id < 1000 limit 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Limit (cost=0.29..0.32 rows=1 width=4) (actual time=0.022..0.022 rows=1 loops=1)
-> Index Only Scan using i_test3_id on test3 (cost=0.29..25.13 rows=939 width=4) (actual time=
0.019..0.019 rows=1 loops=1)
Index Cond: (n_id < 1000)
Heap Fetches: 1
Planning Time: 0.297 ms
Execution Time: 0.092 ms
(6 rows)
postgres=# explain analyze select * from test3 where n_id < 1000 ;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Index Only Scan using i_test3_id on test3 (cost=0.29..25.13 rows=939 width=4) (actual time=0.020..19.543 rows=999 loops=1)
Index Cond: (n_id < 1000)
Heap Fetches: 999
Planning Time: 0.718 ms
Execution Time: 19.707 ms
(5 rows)
-
2、归并连接也有这样的情况,如果一个归并连接用尽了一个输入并且其中的最后一个键值小于另一个输入中的下一个键值,它将停止读取另一个输入。在这种情况下不过会有更多的匹配,因此不需要第二个输入的剩余部分。这会导致不读取另一个子节点的所有内容
-
Index Scan using i_aj_all_bh_ysaj
预估的代价是47W
,而最终总的预估为39W
GroupAggregate (cost=391236.41..391243.61 rows=188 width=45) (actual time=5839.527..5861.034 rows=184 loops=1)
Group Key: test_1.c_jbfy
-> Sort (cost=391236.41..391238.18 rows=710 width=38) (actual time=5839.324..5847.166 rows=105340 loops=1)
Sort Key: test_1.c_jbfy
Sort Method: quicksort Memory: 11302kB
-> Merge Join (cost=342460.08..391202.78 rows=710 width=38) (actual time=4280.410..5688.354 rows=105340 loops=1)
Merge Cond: ((test.c_bh_ysaj)::text = (test_1.c_bh)::text)
-> Index Scan using i_aj_all_bh_ysaj on test (cost=0.43..470402.81 rows=127589 width=32) (actual time=0.012..1054.034 rows=162022 loops=1)
Index Cond: (c_bh_ysaj IS NOT NULL)
Filter: ((c_ah IS NOT NULL) AND (d_jarq >= to_date('20190101'::text, 'yyyymmdd'::text)) AND (d_jarq <= to_date('20200101'::text, 'yyyymmdd'::text))))
Rows Removed by Filter: 254334
-> Sort (cost=342459.60..343014.47 rows=221949 width=38) (actual time=4279.757..4342.135 rows=357118 loops=1)
Sort Key: test_1.c_bh
Sort Method: quicksort Memory: 40188kB
.......
--另外由于实现的限制,BitmapAnd和BitmapOr节点总时报告实际行数为0
总结
1、当pg中的列之间有相关性的时候,统计信息很可能不准确,这个时候使用函数依赖能解决统计不准的问题
2、针对有group by a,b
这种的可以创建ndistinct
来改善执行计划
3、创建高频值可以让统计信息更加准确,default_statistics_target
设置桶的数量可以让统计信息更加准确,以此来提高查询效率
5、再有limit
和merge join
的情况下代价是不一样的
参考资料:https://www.citusdata.com/blog/2018/03/06/postgres-planner-and-its-usage-of-statistics
http://mysql.taobao.org/monthly/2016/05/09/
https://blog.fearcat.in/a?ID=00001-3cf74023-0519-423d-aefe-64eee59bfdbe
https://blog.csdn.net/weixin_39540651/article/details/103928235