• pg-xl 的基本方式安装与使用


    os: centos 7.4
    pgxl:pg.version ‘10.3 (Postgres-XL 10alpha2)

    pgxl 是一款非常实用的横向扩展的开源软件,继承了很多pgxc的功能,在replication 和sharding 方面有着非常棒的用处。
    pgxl 不严格的说是 pgxc的升级加强版。是对官方 postgresql 的版本的修改提升,为大牛点赞。

    Global Transaction Monitor (GTM)
    全局事务管理器,确保群集范围内的事务一致性。 GTM负责发放事务ID和快照作为其多版本并发控制的一部分。
    集群可选地配置一个备用GTM,以改进可用性。此外,可以在协调器间配置代理GTM, 可用于改善可扩展性,减少GTM的通信量。

    GTM Standby
    GTM的备节点,在pgxc,pgxl中,GTM控制所有的全局事务分配,如果出现问题,就会导致整个集群不可用,为了增加可用性,增加该备用节点。当GTM出现问题时,GTM Standby可以升级为GTM,保证集群正常工作。

    GTM-Proxy
    GTM需要与所有的Coordinators通信重点内容,为了降低压力,可以在每个Coordinator机器上部署一个GTM-Proxy。

    Coordinator
    协调员管理用户会话,并与GTM和数据节点进行交互。协调员解析,并计划查询,并给语句中的每一个组件发送下一个序列化的全局性计划。
    为节省机器,通常此服务和数据节点部署在一起。

    Data Node
    数据节点是数据实际存储的地方。数据的分布可以由DBA来配置。为了提高可用性,可以配置数据节点的热备以便进行故障转移准备。

    总结:
    gtm是负责ACID的,保证分布式数据库全局事务一致性。得益于此,就算数据节点是分布的,但是你在主节点操作增删改查事务时,就如同只操作一个数据库一样简单。
    Coordinator是调度的,将操作指令发送到各个数据节点。
    datanodes是数据节点,分布式存储数据。

    规划如下:
    node1 192.168.56.101 gtm,gtm-proxy,coordinator1,coordinator2

    node2 192.168.56.102 datanode
    node3 192.168.56.103 datanode

    在网上了解是都是把 coordinator 和 datanode放在一起,本次实验没有放在一起,运行结果完全ok。

    下载

    https://www.postgres-xl.org/download/

    https://git.postgresql.org/gitweb/?p=postgres-xl.git;a=summary
    git://git.postgresql.org/git/postgres-xl.git

    安装

    node1 需要安装依赖包

    # yum install -y bison flex perl-ExtUtils-Embed readline-devel zlib-devel pam-devel libxml2-devel libxslt-devel openldap-devel python-devel gcc gcc-c++ openssl-devel cmake openjade docbook-style-dsssl uuid uuid-devel

    关闭防火墙,selinux,其余linux性能设置请参考postgresql的设置

    # systemctl stop firewalld.service 
    # systemctl disable firewalld.service 
    # vim /etc/selinux/config
    disabled

    node1 节点上创建用户

    # groupadd postgres
    # useradd postgres -g postgres 
    # passwd postgres
    
    # mkdir -p /usr/pgxl-10
    # chown -R postgres:postgres /usr/pgxl-10
    
    # mkdir -p /var/lib/pgxl
    # mkdir -p /var/lib/pgxl/coordinator1
    # mkdir -p /var/lib/pgxl/coordinator2
    # mkdir -p /var/lib/pgxl/gtm
    # mkdir -p /var/lib/pgxl/gtm_standby
    # mkdir -p /var/lib/pgxl/gtm_proxy1
    # mkdir -p /var/lib/pgxl/gtm_proxy2
    # chown -R postgres:postgres /var/lib/pgxl

    node1 节点 postgres 用户的环境变量

    # su - postgres
    $ vi ~/.bash_profile
    export PGUSER=postgres
    export PGHOME=/usr/pgxl-10
    
    export LD_LIBRARY_PATH=$PGHOME/lib
    export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/lib:/usr/lib:/usr/local/lib
    export PATH=$PGHOME/bin:$PATH
    
    export TEMP=/tmp
    export TMPDIR=/tmp
    

    node1 上编译安装,注意不需要安装社区版

    $ cd /tmp
    $ git clone git://git.postgresql.org/git/postgres-xl.git
    $ cd postgres-xl
    $ git branch -r
      origin/HEAD -> origin/master
      origin/XL9_5_STABLE
      origin/XL_10_STABLE
      origin/master
      origin/xl_dbt3_expt
      origin/xl_doc_update
      origin/xl_test
    $ git checkout XL_10_STABLE
    Branch XL_10_STABLE set up to track remote branch XL_10_STABLE from origin.
    Switched to a new branch 'XL_10_STABLE'
    $ git status
    # On branch XL_10_STABLE
    nothing to commit, working directory clean  
    $ ./configure --prefix=/usr/pgxl-10 --with-perl --with-python --with-openssl --with-pam --with-ldap --with-libxml --with-libxslt
    $ make world
    $ make install-world
    
    $ cd contrib
    $ make 
    $ make install
    

    node1、node2、node3配置ssh相互免密登录

    过程略

    node1、node2、node3同步下时间

    # ntpdate asia.pool.ntp.org

    node1节点上 gtm

    $ which initgtm
    /usr/pgxl-10/bin/initgtm

    初始化 gtm

    $ initgtm -Z gtm -D /var/lib/pgxl/gtm
    
    The files belonging to this GTM system will be owned by user "postgres".
    This user must also own the server process.
    
    
    fixing permissions on existing directory /var/lib/pgxl/gtm ... ok
    creating configuration files ... ok
    creating control file ... ok
    
    Success.
    You can now start the GTM server using:
    
        gtm -D /var/lib/pgxl/gtm
    or
        gtm_ctl -Z gtm -D /var/lib/pgxl/gtm -l logfile start
    

    gtm.conf 的详细内容

    $ vi /var/lib/pgxl/gtm/gtm.conf
    
    nodename = 'node1_gtm'
    listen_addresses = '*'
    port = 6666
    startup = ACT
    log_file = 'gtm.log'
    log_min_messages = NOTICE
    

    node1节点上 gtm_proxy

    由于node1上计划安装两个 coordinator,所以也需要安装两个 gtm_proxy

    $ which initgtm
    /usr/pgxl-10/bin/initgtm

    初始化第一个 gtm_proxy

    $ initgtm -Z gtm_proxy -D /var/lib/pgxl/gtm_proxy1
    The files belonging to this GTM system will be owned by user "postgres".
    This user must also own the server process.
    
    
    fixing permissions on existing directory /var/lib/pgxl/gtm_proxy1 ... ok
    creating configuration files ... ok
    
    Success.
    You can now start the GTM proxy server using:
    
        gtm_proxy -D /var/lib/pgxl/gtm_proxy1
    or
        gtm_ctl -Z gtm_proxy -D /var/lib/pgxl/gtm_proxy1 -l logfile start

    第一个 gtm_proxy 参数文件

    $ vi /var/lib/pgxl/gtm_proxy1/gtm_proxy.conf
    
    nodename = 'node1_gtm_proxy1'
    listen_addresses = '*'
    port = 6668
    gtm_host = 'node1'
    gtm_port = 6666
    log_file = 'gtm_proxy.log'
    log_min_messages = NOTICE
    

    初始化第二个 gtm_proxy

    $ initgtm -Z gtm_proxy -D /var/lib/pgxl/gtm_proxy2
    The files belonging to this GTM system will be owned by user "postgres".
    This user must also own the server process.
    
    
    fixing permissions on existing directory /var/lib/pgxl/gtm_proxy2 ... ok
    creating configuration files ... ok
    
    Success.
    You can now start the GTM proxy server using:
    
        gtm_proxy -D /var/lib/pgxl/gtm_proxy2
    or
        gtm_ctl -Z gtm_proxy -D /var/lib/pgxl/gtm_proxy2 -l logfile start
    

    第二个 gtm_proxy 参数文件

    $ vi /var/lib/pgxl/gtm_proxy2/gtm_proxy.conf
    
    nodename = 'node1_gtm_proxy2'
    listen_addresses = '*'
    port = 6669
    gtm_host = 'node1'
    gtm_port = 6666
    log_file = 'gtm_proxy.log'
    log_min_messages = NOTICE
    

    node1节点上 coordinator

    两个 coordinator

    初始化第一个 coordinator

    $ initdb -D /var/lib/pgxl/coordinator1 --nodename coordinator1 -E UTF8 --locale=C -U postgres -W
    
    Success.
    
    Success. You can now start the database server of the Postgres-XL coordinator using:
    
        pg_ctl -D /var/lib/pgxl/coordinator1 -l logfile start -Z coordinator
    
    or
     You can now start the database server of the Postgres-XL datanode using:
    
        pg_ctl -D /var/lib/pgxl/coordinator1 -l logfile start -Z datanode
    

    第一个 coordinator 参数文件

    $ vi /var/lib/pgxl/coordinator1/pg_hba.conf
    
    host    all             all             192.168.56.101/32         trust
    host    all             all             192.168.56.102/32         trust
    host    all             all             192.168.56.103/32         trust
    
    $ vi /var/lib/pgxl/coordinator1/postgresql.conf
    
    listen_addresses = '*'
    port = 5432
    max_connections = 100
    
    pooler_port = 6670
    max_pool_size = 100
    
    gtm_host = 'node1'
    gtm_port = 6668
    pgxc_node_name = 'coordinator1' 
    

    初始化第二个 coordinator

    $ initdb -D /var/lib/pgxl/coordinator2 --nodename coordinator2 -E UTF8 --locale=C -U postgres -W
    
    Success.
    
    Success. You can now start the database server of the Postgres-XL coordinator using:
    
        pg_ctl -D /var/lib/pgxl/coordinator2 -l logfile start -Z coordinator
    
    or
     You can now start the database server of the Postgres-XL datanode using:
    
        pg_ctl -D /var/lib/pgxl/coordinator2 -l logfile start -Z datanode
    

    第二个 coordinator 参数文件

    $ vi /var/lib/pgxl/coordinator2/pg_hba.conf
    
    host    all             all             192.168.56.101/32         trust
    host    all             all             192.168.56.102/32         trust
    host    all             all             192.168.56.103/32         trust
    
    $ vi /var/lib/pgxl/coordinator2/postgresql.conf
    
    listen_addresses = '*'
    port = 5433
    max_connections = 100
    
    pooler_port = 6671
    max_pool_size = 100
    
    gtm_host = 'node1'
    gtm_port = 6669
    pgxc_node_name = 'coordinator2' 

    node2节点上 datanode

    安装 pgxl,可参考node1节点上的安装过程

    $ initdb -D /var/lib/pgxl/data --nodename datanode1 -E UTF8 --locale=C -U postgres -W
    Success. You can now start the database server of the Postgres-XL coordinator using:
    
        pg_ctl -D /var/lib/pgxl/data -l logfile start -Z coordinator
    
    or
     You can now start the database server of the Postgres-XL datanode using:
    
        pg_ctl -D /var/lib/pgxl/data -l logfile start -Z datanode
    
    $ vi /var/lib/pgxl/data/pg_hba.conf
    
    host    all             all             192.168.56.101/32         trust
    host    all             all             192.168.56.102/32         trust
    host    all             all             192.168.56.103/32         trust
    
    
    $ vi /var/lib/pgxl/data/postgresql.conf
    
    listen_addresses = '*'
    port = 5432
    max_connections = 100
    
    pooler_port = 6667
    max_pool_size = 100
    
    gtm_host = 'node1'
    gtm_port = 6668
    pgxc_node_name = 'datanode1' 
    

    node3节点上 datanode

    安装 pgxl,可参考node1节点上的安装过程

    $ initdb -D /var/lib/pgxl/data --nodename datanode2 -E UTF8 --locale=C -U postgres -W
    Success. You can now start the database server of the Postgres-XL coordinator using:
    
        pg_ctl -D /var/lib/pgxl/data -l logfile start -Z coordinator
    
    or
     You can now start the database server of the Postgres-XL datanode using:
    
        pg_ctl -D /var/lib/pgxl/data -l logfile start -Z datanode
    
    $ vi /var/lib/pgxl/data/pg_hba.conf
    
    host    all             all             192.168.56.101/32         trust
    host    all             all             192.168.56.102/32         trust
    host    all             all             192.168.56.103/32         trust
    
    $ vi /var/lib/pgxl/data/postgresql.conf    
    
    listen_addresses = '*'
    port = 5432
    max_connections = 100
    
    pooler_port = 6667
    max_pool_size = 100
    
    gtm_host = 'node1'
    gtm_port = 6669
    pgxc_node_name = 'datanode2' 
    

    pg-xl集群启动

    启动顺序是 GTM=>GTM Standby=>GTM-Proxy=>Datanodes=>Coordinators

    node1 节点启动 gtm

    $ gtm_ctl -Z gtm -D /var/lib/pgxl/gtm start

    对应的关闭

    $ gtm_ctl -Z gtm -D /var/lib/pgxl/gtm stop

    node1 节点启动 gtm-proxy

    $ gtm_ctl -Z gtm_proxy -D /var/lib/pgxl/gtm_proxy1 start
    $ gtm_ctl -Z gtm_proxy -D /var/lib/pgxl/gtm_proxy2 start

    对应的关闭

    $ gtm_ctl -Z gtm_proxy -D /var/lib/pgxl/gtm_proxy1 stop
    $ gtm_ctl -Z gtm_proxy -D /var/lib/pgxl/gtm_proxy2 stop

    node2 节点启动 datanode

    $ pg_ctl -D /var/lib/pgxl/data start -Z datanode

    对应的关闭

    $ pg_ctl stop -m fast -D /var/lib/pgxl/data -Z datanode

    node3 节点启动 datanode

    $ pg_ctl -D /var/lib/pgxl/data start -Z datanode

    对应的关闭

    $ pg_ctl stop -m fast -D /var/lib/pgxl/data -Z datanode

    node1 节点启动 coordinator

    $ pg_ctl -D /var/lib/pgxl/coordinator1 start -Z coordinator
    $ pg_ctl -D /var/lib/pgxl/coordinator2 start -Z coordinator

    对应的关闭

    $ pg_ctl stop -m fast -D /var/lib/pgxl/coordinator1 -Z coordinator
    $ pg_ctl stop -m fast -D /var/lib/pgxl/coordinator2 -Z coordinator

    pg-xl配置

    node1 节点上登录 coordinator1

    $ psql -p 5432
    psql (PGXL 10alpha2, based on PG 10.3 (Postgres-XL 10alpha2))
    Type "help" for help.
    
    postgres=# 
    postgres=# select * from pgxc_node;
     node_name    | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |  node_id   
    --------------+-----------+-----------+-----------+----------------+------------------+------------
     coordinator1 | C         |      5432 | localhost | f              | f                | 1148549230
    (1 row)
    
    postgres=# alter node coordinator1 with (type=coordinator,host='node1', port=5432);
    postgres=# create node coordinator2 with (type=coordinator,host='node1', port=5433);
    
    postgres=# create node datanode1 with (type=datanode, host='node2',port=5432,primary,preferred);
    postgres=# create node datanode2 with (type=datanode, host='node3',port=5432);
    
    postgres=# select pgxc_pool_reload();
    postgres=# select * from pgxc_node;
      node_name   | node_type | node_port | node_host | nodeis_primary | nodeis_preferred |   node_id   
    --------------+-----------+-----------+-----------+----------------+------------------+-------------
     coordinator1 | C         |      5432 | node1     | f              | f                |  1148549230
     coordinator2 | C         |      5433 | node1     | f              | f                | -2089598990
     datanode1    | D         |      5432 | node2     | t              | t                |   888802358
     datanode2    | D         |      5432 | node3     | f              | f                |  -905831925
    (4 rows)
    

    依次操作 node1 节点的 coordinator2,node2 节点的 datanode,node3 节点的datanode

    验证

    只能通过node1 节点的 coordinator1、coordinator1 去操作。node2,node3节点的数据都是只读的。

    目前的coordinator配置如下,也可以往前翻看看记录。
    node1 coordinator1 5432
    node1 coordinator2 5433

    $ psql -p 5432
    psql (PGXL 10alpha2, based on PG 10.3 (Postgres-XL 10alpha2))
    Type "help" for help.
    
    postgres=#
    postgres=# create table tmp_t0(c0 varchar(100),c1 varchar(100));
    peiybdb=# insert into tmp_t0(c0,c1) SELECT id::varchar,md5(id::varchar) FROM generate_series(1,10000) as id;
    INSERT 0 10000
    peiybdb=# d+ tmp_t0
                                              Table "public.tmp_t0"
     Column |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description 
    --------+------------------------+-----------+----------+---------+----------+--------------+-------------
     c0     | character varying(100) |           |          |         | extended |              | 
     c1     | character varying(100) |           |          |         | extended |              | 
    Distribute By: HASH(c0)
    Location Nodes: ALL DATANODES
    

    到 node2、node3的数据节点查看对应的数据

    node2 查询结果

    peiybdb=# select count(1) from tmp_t0;
     count 
    -------
      5081
    (1 row)

    node3 查询结果

    peiybdb=# select count(1) from tmp_t0;
     count 
    -------
      4919
    (1 row)
    

    如果在node2,node3上插入数据,会收到报错提示。

    peiybdb=# insert into tmp_t0(c0,c1) SELECT id::varchar,md5(id::varchar) FROM generate_series(1,10000) as id;
    ERROR:  cannot execute INSERT in a read-only transaction

    下面是简单的建表分析

    CREATE TABLE table_name(...)
    DISTRIBUTE BY 
    HASH(col)|MODULO(col)|ROUNDROBIN|REPLICATION
    TO NODE(nodename1,nodename2...)

    可以看到,如果DISTRIBUTE BY 后面有如下选项:
    REPLICATION,则是复制模式,其余则是分片模式,
    HASH 指的是按照指定列的哈希值分布数据,
    MODULO 指的是按照指定列的取摩运算分布数据,
    ROUNDROBIN 指的是按照轮询的方式分布数据

    TO NODE指定了数据分布的节点范围,如果没有指定则默认所有数据节点参与数据分布。如果没有指定分布模式,即使用普通的CREATE TABLE语句,PGXL会默认采用分片模式将数据分布到所有数据节点。

    参考:
    https://www.postgres-xl.org/
    https://www.postgres-xl.org/overview/
    https://www.postgres-xl.org/download/
    https://www.postgres-xl.org/documentation/

    https://git.postgresql.org/gitweb/?p=postgres-xl.git;a=summary

    https://www.2ndquadrant.com/en/resources/postgres-xl/

    https://www.postgresql.org/download

  • 相关阅读:
    MapReduce原理
    用redis构建分布式锁
    Python中类的特殊变量
    Python之元类
    python之WSGI与Guincorn
    一种消息和任务队列——beanstalkd
    LRU 算法
    extern、static、restrict、volatile 关键字
    bigtable原理
    Go的微服务库kite
  • 原文地址:https://www.cnblogs.com/ctypyb2002/p/9792937.html
Copyright © 2020-2023  润新知