运维笔记--postgresql占用CPU问题定位
场景描述:
业务系统访问变慢,登陆服务器查看系统负载并不高,然后查看占用CPU较高的进程,发现是连接数据库的几个进程占用系统资源较多。
处理方式:
查找出占用系统内存&CPU排名前10的进程:[或者用top命令查看] ---这里需要注意,如果用了容器,需要进入容器内部查看相应的进程。
ps aux|head -1;ps aux|grep -v PID|sort -rn -k +3|head
切换到postgres用户,执行psql,进入数据库终端:指定上述命令找到的系统进程号
SELECT procpid, START, now() - START AS lap, current_query FROM ( SELECT backendid, pg_stat_get_backend_pid (S.backendid) AS procpid, pg_stat_get_backend_activity_start (S.backendid) AS START,pg_stat_get_backend_activity (S.backendid) AS current_query FROM (SELECT pg_stat_get_backend_idset () AS backendid) AS S) AS S WHERE current_query <> '<IDLE>' and procpid=15874 ORDER BY lap DESC;
定位到SQL,确认该SQL完成的业务查询功能,查看执行计划,增加索引or 修改代码。
SELECT "******_edoc_queue".id FROM "******_edoc_queue" WHERE (("*******_edoc_queue"."edoc_id" = '521300000004TCS60515001FV-960157.pdf') AND ("*****_edoc_queue"."active" = true)) ORDER BY "*****_edoc_queue"."id"
查询该条SQL的执行计划:(Postgresql使用explain analyze + sql语法的格式)
postgres=# c ***你的实际模式schema You are now connected to database "stbg" as user "postgres". stbg=# explain analyze SELECT "cus_center_new_edoc_queue".id FROM "cus_center_new_edoc_queue" WHERE (("cus_center_new_edoc_queue"."edoc_id" = '521300000008TCS60417066FV-960101.pdf') AND ("cus_center_new_edoc_queue"."active" = true)) ORDER BY "cus_center_new_edoc_queue"."id"; ---得到如下执行计划: stbg= BY "cus_center_new_edoc_queue"."id";ter_new_edoc_queue"."active" = true)) ORDER QUERY PLAN ----------------------------------------------------------------------------- Sort (cost=21044.85..21044.85 rows=1 width=4) (actual time=109.905..109.905 ro ws=0 loops=1) Sort Key: id Sort Method: quicksort Memory: 25kB -> Seq Scan on cus_center_new_edoc_queue (cost=0.00..21044.84 rows=1 width=4) (actual time=109.880..109.880 rows=0 loops=1) Filter: (active AND ((edoc_id)::text = '521300000008TCS60417066FV-960101.pdf'::text)) Rows Removed by Filter: 583348 Planning time: 0.468 ms Execution time: 109.952 ms (8 rows) ----可以看出执行查询时间:109.952 ms