• Greenplum big data platform: troubleshooting failed segments


    01. Segment

      Check 1:

      On the master node, check for failed segments (gpstate -e).

      Under normal conditions the report looks like this:

    20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-Starting gpstate with args: -e
    20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
    20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
    20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-Obtaining Segment details from master...
    20190711:16:08:57:024059 gpstate:greenplum01:gpadmin-[INFO]:-Gathering data from segments...
    ..
    20190711:16:08:59:024059 gpstate:greenplum01:gpadmin-[INFO]:-----------------------------------------------------
    20190711:16:08:59:024059 gpstate:greenplum01:gpadmin-[INFO]:-Segment Mirroring Status Report
    20190711:16:08:59:024059 gpstate:greenplum01:gpadmin-[INFO]:-----------------------------------------------------
    20190711:16:08:59:024059 gpstate:greenplum01:gpadmin-[INFO]:-All segments are running normally
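      The healthy-state check above can be scripted. This is a minimal sketch, assuming the gpstate -e report contains the line "All segments are running normally" only when nothing has failed, as in the output above. The function name all_segments_up is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads a `gpstate -e` report on stdin and succeeds
# only if it contains the "all normal" line shown in the output above.
all_segments_up() {
    grep -q 'All segments are running normally'
}

# Usage on the master as gpadmin (not executed here):
#   gpstate -e | all_segments_up || echo "segment problem detected, read the full gpstate -e report"
```

      The helper only matches text, so it should be treated as a first alarm; the full report is still needed to see which segment failed.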

      Check 2: list segments marked down (status='d'). An empty result means no segment is down.

    psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';"
    [gpadmin@greenplum01 ~]$ psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';"
     dbid | content | role | preferred_role | mode | status | port | hostname | address | replication_port
    ------+---------+------+----------------+------+--------+------+----------+---------+------------------
    (0 rows)
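
      When the query does return rows, it helps to print only the fields needed to locate the failed instance. A sketch, assuming psql is run with -At (unaligned, tuples only, default '|' separator) and that the columns come back in the order shown in the header above. The function name list_down_segments is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads rows produced by
#   psql -At -c "SELECT * FROM gp_segment_configuration WHERE status='d';"
# on stdin. Column order as in the header above:
#   dbid|content|role|preferred_role|mode|status|port|hostname|address|...
# Prints one summary line per down segment; prints nothing for 0 rows.
list_down_segments() {
    awk -F'|' 'NF >= 9 {
        printf "content=%s role=%s host=%s port=%s\n", $2, $3, $8, $7
    }'
}

# Usage on the master as gpadmin (not executed here):
#   psql -At -c "SELECT * FROM gp_segment_configuration WHERE status='d';" | list_down_segments
```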

      Check 3: check the status of the mirror segments.

    gpstate -m
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-Starting gpstate with args: -m
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44'
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3.23 (Greenplum Database 5.16.0 build commit:23cec7df0406d69d6552a4bbb77035dba4d7dd44) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Jan 16 2019 02:32:15'
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-Obtaining Segment details from master...
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:--------------------------------------------------------------
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:--Current GPDB mirror list and status
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:--Type = Group
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:--------------------------------------------------------------
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   Mirror        Datadir  Port    Status    Data Status
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data/mirror/gpseg0  43000   Passive   Synchronized
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data/mirror/gpseg1  43001   Passive   Synchronized
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data2/mirror/gpseg2  43002   Passive   Synchronized
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum03   /greenplum/data2/mirror/gpseg3  43003   Passive   Synchronized
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data/mirror/gpseg4  43000   Passive   Synchronized
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data/mirror/gpseg5  43001   Passive   Synchronized
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data2/mirror/gpseg6  43002   Passive   Synchronized
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:-   greenplum02   /greenplum/data2/mirror/gpseg7  43003   Passive   Synchronized
    20190711:16:12:19:024199 gpstate:greenplum01:gpadmin-[INFO]:--------------------------------------------------------------
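      In the report above every mirror row ends in "Synchronized". A sketch that prints only the rows that do not, assuming the row layout shown above (two log-prefix fields, then Mirror, Datadir, Port, Status, Data Status). The function name unsynced_mirrors is hypothetical, and the status strings a degraded mirror reports are not assumed here.

```shell
#!/bin/sh
# Hypothetical helper: reads a `gpstate -m` report on stdin and prints the
# host, data directory and status columns of every mirror row whose last
# word is not "Synchronized". Rows are taken from between the
# "Mirror  Datadir ..." header and the closing dashed line.
unsynced_mirrors() {
    awk '
        /Mirror +Datadir/        { in_table = 1; next }
        in_table && /INFO\]:-+$/ { in_table = 0 }
        in_table && $NF != "Synchronized" {
            # $3 = host, $4 = datadir, $5 = port, $6.. = Status and Data Status
            status = $6
            for (i = 7; i <= NF; i++) status = status " " $i
            print $3, $4, status
        }
    '
}

# Usage on the master as gpadmin (not executed here):
#   gpstate -m | unsynced_mirrors
```

      No output means every listed mirror reported "Synchronized".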

      Check 4: log inspection

    gplogfilter -t
    [gpadmin@greenplum01 ~]$ gplogfilter -t
    requested timestamp range from beginning of data to end of data
    ----------  /greenplum/data/master/gpseg-1/pg_log/startup.log ----------
           in:      21 lines,      21 log entries; timestamps from 2019-07-11 11:30:36.331409 to 2019-07-11 11:31:29.331627
        match:       0 lines
          out:       0 lines,       0 log entries
    ----------  /greenplum/data/master/gpseg-1/pg_log/gpdb-2019-07-11_113036.csv ----------
    2019-07-11 11:31:22.514551 CST|||p10925|th2011645824||||0|||seg-1|||||FATAL: |57P01|terminating connection due to administrator command|||||||0||postgres.c|3670|
           in:      88 lines,      88 log entries; timestamps from 2019-07-11 11:30:36.469747 to 2019-07-11 11:31:22.514551
        match:       1 lines,       1 log entries; timestamps from 2019-07-11 11:31:22.514551 to 2019-07-11 11:31:22.514551
          out:       1 lines,       1 log entries; timestamps from 2019-07-11 11:31:22.514551 to 2019-07-11 11:31:22.514551
    ----------  /greenplum/data/master/gpseg-1/pg_log/gpdb-2019-07-11_113124.csv ----------
           in:      63 lines,      63 log entries; timestamps from 2019-07-11 11:31:24.020944 to 2019-07-11 11:31:25.144209
        match:       0 lines
          out:       0 lines,       0 log entries
    ----------  /greenplum/data/master/gpseg-1/pg_log/gpdb-2019-07-11_113129.csv ----------
    2019-07-11 13:53:29.393443 CST|gpadmin|gpdb|p21035|th280524672|[local]||2019-07-11 13:53:29 CST|0|con20||seg-1||||sx1|FATAL: |3D000|database "gpdb" does not exist|||||||0||postinit.c|790|
    2019-07-11 14:02:15.111734 CST|kingle|gpdb|p21208|th280524672|[local]||2019-07-11 14:02:15 CST|0|con22||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "[local]", user "kingle", database "gpdb", SSL off|||||||0||auth.c|623|
    2019-07-11 14:02:39.905762 CST|gpadmin|gpdb|p21274|th280524672|[local]||2019-07-11 14:02:39 CST|0|con23||seg-1||||sx1|FATAL: |3D000|database "gpdb" does not exist|||||||0||postinit.c|790|
    2019-07-11 14:03:15.951249 CST|kingle|gpdb|p21283|th280524672|[local]||2019-07-11 14:03:15 CST|0|con25||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "[local]", user "kingle", database "gpdb", SSL off|||||||0||auth.c|623|
    2019-07-11 14:03:26.389797 CST|kingle|postgres|p21289|th280524672|[local]||2019-07-11 14:03:26 CST|0|con26||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "[local]", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
    2019-07-11 14:06:12.037982 CST|kingle|postgres|p21541|th280524672|192.168.0.221|2702|2019-07-11 14:06:12 CST|0|con27||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
    2019-07-11 14:07:01.948006 CST|kingle|postgres|p21561|th280524672|192.168.0.221|2720|2019-07-11 14:07:01 CST|0|con28||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
    2019-07-11 14:07:13.876319 CST|kingle|postgres|p21564|th280524672|192.168.0.221|2722|2019-07-11 14:07:13 CST|0|con29||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
    2019-07-11 14:08:18.729975 CST|gpadmin|gpdb|p21582|th280524672|[local]||2019-07-11 14:08:18 CST|0|con30||seg-1||||sx1|FATAL: |3D000|database "gpdb" does not exist|||||||0||postinit.c|790|
    2019-07-11 14:08:50.351436 CST|gpadmin|gpdb|p21609|th280524672|[local]||2019-07-11 14:08:50 CST|0|con33||seg-1||||sx1|FATAL: |3D000|database "gpdb" does not exist|||||||0||postinit.c|790|
    2019-07-11 14:09:05.416505 CST|gpadmin|postgres|p21614|th280524672|[local]||2019-07-11 14:09:05 CST|0|con35|cmd1|seg-1||dx11||sx1|ERROR: |42P04|database "gpdb" already exists||||||CREATE DATABASE gpdb;
    |0||dbcommands.c|901|
    2019-07-11 14:09:48.636153 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd1|seg-1||dx12||sx1|ERROR: |42601|syntax error at or near ";"||||||grant
    ;|8||scan.l|982|
    2019-07-11 14:10:17.089067 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd2|seg-1||dx13||sx1|ERROR: |42601|syntax error at or near ";"||||||grant
    ;|8||scan.l|982|
    2019-07-11 14:10:32.484569 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd5|seg-1||dx15||sx1|ERROR: |3D000|database "demo" does not exist||||||GRANT all on database demo to kingle
    ;|0||dbcommands.c|2519|
    2019-07-11 14:11:10.314802 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd9|seg-1||dx18||sx1|ERROR: |3F000|schema "gpdb" does not exist||||||GRANT USAGE  on SCHEMA  gpdb to kingle
    ;|0||aclchk.c|598|
    2019-07-11 14:13:19.213757 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd21|seg-1||dx25||sx1|ERROR: |42P07|relation "test001" already exists||||||create table test001(id int,name varchar(128));|0||heap.c|1546|
    2019-07-11 14:13:43.227208 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd25|seg-1||dx29||sx1|ERROR: |42601|syntax error at or near ","||||||create table test005(id int primary,name varchar(128));|36||scan.l|982|
    2019-07-11 14:14:05.356883 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd39|seg-1||dx36||sx1|ERROR: |42P01|relation "test2" does not exist||||||select * from test2;|15||namespace.c|286|
    2019-07-11 14:14:12.261512 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd40|seg-1||dx37||sx1|ERROR: |42P01|relation "test2" does not exist||||||select * from test2;|15||namespace.c|286|
    2019-07-11 14:14:25.038044 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd41|seg-1||dx38||sx1|ERROR: |42P01|relation "test2" does not exist||||||select * from test2;|15||namespace.c|286|
    2019-07-11 14:14:48.737385 CST|gpadmin|gpdb|p21625|th280524672|[local]||2019-07-11 14:09:27 CST|0|con37|cmd42|seg-1||dx39||sx1|ERROR: |42P01|relation "test1" does not exist||||||select * from test1 x,test2 y where x.id=y.id;|15||namespace.c|286|
    2019-07-11 14:47:15.035344 CST|kingle|postgres|p22272|th280524672|192.168.0.221|3476|2019-07-11 14:47:15 CST|0|con38||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
    2019-07-11 14:52:35.122438 CST|kingle|postgres|p22360|th280524672|192.168.0.221|3558|2019-07-11 14:52:35 CST|0|con39||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
    2019-07-11 14:52:41.158396 CST|kingle|postgres|p22378|th280524672|[local]||2019-07-11 14:52:41 CST|0|con40||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "[local]", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
    2019-07-11 14:52:51.572521 CST|kingle|postgres|p22380|th280524672|192.168.0.221|3576|2019-07-11 14:52:51 CST|0|con41||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
    2019-07-11 14:53:06.302376 CST|kingle|postgres|p22383|th280524672|192.168.0.221|3578|2019-07-11 14:53:06 CST|0|con42||seg-1||||sx1|FATAL: |28000|no pg_hba.conf entry for host "192.168.0.221", user "kingle", database "postgres", SSL off|||||||0||auth.c|623|
    2019-07-11 15:20:56.537899 CST|kingle|postgres|p22922|th280524672|192.168.0.221|4066|2019-07-11 15:20:40 CST|0|con46|cmd1|seg-1||dx41||sx1|ERROR: |42P01|relation "test0001" does not exist||||||select * from test0001
    ;|15||namespace.c|286|
    2019-07-11 15:30:44.055204 CST|kingle|gpdb|p23075|th280524672|192.168.0.221|4212|2019-07-11 15:30:24 CST|0|con47|cmd3|seg-1||dx46||sx1|ERROR: |42501|permission denied for relation test001||||||select * from test001;|0||aclchk.c|1870|
    2019-07-11 15:34:19.475082 CST|kingle|gpdb|p23075|th280524672|192.168.0.221|4212|2019-07-11 15:30:24 CST|0|con47|cmd6|seg-1||dx48||sx1|ERROR: |42501|permission denied for relation test001||||||GRANT all on TABLE test001 to kingle;|0||aclchk.c|1870|
    2019-07-11 15:35:33.309475 CST|kingle|postgres|p23222|th280524672|192.168.0.221|4310|2019-07-11 15:35:21 CST|0|con49|cmd1|seg-1||dx50||sx1|ERROR: |42P01|relation "test001" does not exist||||||select * from test001
    ;|15||namespace.c|286|
    2019-07-11 15:45:58.918525 CST|gpadmin|gpdb|p23517|th280524672|[local]||2019-07-11 15:45:37 CST|0|con56|cmd1|seg-1||dx56||sx1|ERROR: |42P01|relation "schema" does not exist||||||grant all on schema to kingle
    ;|0||namespace.c|286|
    2019-07-11 16:03:46.770910 CST|gpadmin|gpdb|p23944|th280524672|[local]||2019-07-11 16:02:16 CST|0|con57|cmd16|seg-1||dx60||sx1|ERROR: |42P01|relation "table_name" does not exist||||||SELECT gp_segment_id, count(*)
       FROM table_name GROUP BY gp_segment_id;|41||namespace.c|286|
    2019-07-11 16:03:48.903080 CST|gpadmin|gpdb|p23944|th280524672|[local]||2019-07-11 16:02:16 CST|0|con57|cmd17|seg-1||dx61||sx1|ERROR: |42P01|relation "table_name" does not exist||||||SELECT gp_segment_id, count(*)
       FROM table_name GROUP BY gp_segment_id;|41||namespace.c|286|
    2019-07-11 16:11:22.854459 CST|gpadmin|gpdb|p24178|th280524672|[local]||2019-07-11 16:11:11 CST|0|con60|cmd1|seg-1||dx72||sx1|ERROR: |42601|syntax error at or near "psql"||||||psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';"
    q
    ;|1||scan.l|982|
    2019-07-11 16:11:33.673982 CST|gpadmin|gpdb|p24178|th280524672|[local]||2019-07-11 16:11:11 CST|0|con60|cmd2|seg-1||dx73||sx1|ERROR: |42601|syntax error at or near "psql"||||||psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';

    q

    "
    ;|1||scan.l|982|
    2019-07-11 16:11:54.103690 CST|gpadmin|gpdb|p24178|th280524672|[local]||2019-07-11 16:11:11 CST|0|con60|cmd3|seg-1||dx74||sx1|ERROR: |42601|syntax error at or near "psql"||||||psql -c "SELECT * FROM gp_segment_configuration WHERE status='d';"
    '
    ';|1||scan.l|982|
           in:     460 lines,     460 log entries; timestamps from 2019-07-11 11:31:29.463761 to 2019-07-11 16:12:19.107822
        match:      36 lines,      36 log entries; timestamps from 2019-07-11 13:53:29.393443 to 2019-07-11 16:11:54.103690
          out:      36 lines,      36 log entries; timestamps from 2019-07-11 13:53:29.393443 to 2019-07-11 16:11:54.103690
    ----------  /greenplum/data/master/gpseg-1/pg_log/gp_era ----------
           in:       3 lines,       1 log entries; no timestamps found
        match:       0 lines
          out:       0 lines,       0 log entries
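      With dozens of matching entries, a count per severity and SQLSTATE shows quickly which problem dominates. A sketch, assuming the '|'-separated layout of the entries above, where the severity is the 17th field and the SQLSTATE the 18th. The function name count_trouble is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `gplogfilter -t` output on stdin and counts
# entries per severity and SQLSTATE. Continuation lines of multi-line
# entries and gplogfilter's own "in:/match:/out:" summary lines have no
# severity in field 17, so they are skipped.
count_trouble() {
    awk -F'|' '$17 ~ /^(WARNING|ERROR|FATAL|PANIC): *$/ {
        sev = $17
        sub(/: *$/, "", sev)          # "FATAL: " -> "FATAL"
        n[sev " " $18]++
    }
    END { for (k in n) print n[k], k }' | LC_ALL=C sort -k2,3
}

# Usage on the master as gpadmin (not executed here):
#   gplogfilter -t | count_trouble
```

      Each output line has the form "count severity sqlstate". In the log above, most FATAL entries carry SQLSTATE 28000 (no pg_hba.conf entry for user kingle), which points at client authentication rather than at a segment failure.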

      Use gplogfilter to check the master's log files for messages at the WARNING, ERROR, FATAL, or PANIC log level.

      Use gpssh to check for WARNING, ERROR, FATAL, or PANIC messages on each segment instance.

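      Enter the command on a single line. A line break after source inside the quotes makes the remote shell run source with no file name, which is the error the session below ends with. The log path must also match the actual primary data directories of the cluster.

    gpssh -f seg_hosts -e 'source /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t /data1/primary/*/pg_log/gpdb*.log' > seglog.out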
    [gpadmin@greenplum01 conf]$ gpssh -f seg_hosts_file -e 'source
    > /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t
    > gpssh -f seg_hosts_file -e 'source
    /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t ^C
    [gpadmin@greenplum01 conf]$ gpssh -f seg_hosts -e 'source
    > /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t
    > /data1/primary/*/pg_log/gpdb*.log' > seglog.out
    [gpadmin@greenplum01 conf]$ more seglog.out
    [greenplum02] > /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t
    [greenplum02] > /data1/primary/*/pg_log/gpdb*.log"; source
    [greenplum02] source
    [greenplum02] /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t
    [greenplum02] /data1/primary/*/pg_log/gpdb*.log
    [greenplum02] -bash: source: filename argument required
    [greenplum02] source: usage: source filename [arguments]
    [greenplum03] > /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t
    [greenplum03] > /data1/primary/*/pg_log/gpdb*.log"; source
    [greenplum03] source
    [greenplum03] /usr/local/greenplum-db/greenplum_path.sh ; gplogfilter -t
    [greenplum03] /data1/primary/*/pg_log/gpdb*.log
    [greenplum03] -bash: source: filename argument required
    [greenplum03] source: usage: source filename [arguments]
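      When this check is scripted, building the remote command in a helper keeps it on one line. A sketch under stated assumptions: the function names remote_trouble_cmd and collect_segment_trouble are hypothetical, and the log glob is a parameter because it has to match the cluster's real primary data directories and log file names (the master logs shown earlier are named gpdb-*.csv).

```shell
#!/bin/sh
# Hypothetical helper: prints the remote command as ONE line, so that
# `source` and its file name can never be separated by a line break.
# $1 = glob of segment log files; it is left unexpanded here and is
#      expanded by the shell on each remote host.
remote_trouble_cmd() {
    printf '%s' "source /usr/local/greenplum-db/greenplum_path.sh; gplogfilter -t $1"
}

# Hypothetical wrapper around the gpssh call from the text above.
# $1 = gpssh host file, $2 = log glob, $3 = local output file.
collect_segment_trouble() {
    gpssh -f "$1" -e "$(remote_trouble_cmd "$2")" > "$3"
}

# Usage on the master as gpadmin (not executed here):
#   collect_segment_trouble seg_hosts '/data1/primary/*/pg_log/gpdb*.log' seglog.out
```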

      

  • Original article: https://www.cnblogs.com/kingle-study/p/11170831.html