• MPP Part 1: Greenplum Cluster Installation


    Installing and Initializing a Greenplum Database System...

    1 Installation Overview

    1.1 Environment

    Name               Version                     Download URL
    Virtual machine    Oracle VirtualBox 4.3.10    http://www.virtualbox.org
    Operating system   CentOS 6.7 64bit            https://www.centos.org
    Greenplum          5.0.0-alpha.5               https://network.pivotal.io/products/pivotal-gpdb
    File system        ext4

    1.2 Cluster Layout

    Role                Count   Hostname                     IP
    Greenplum Master    1       gp-master                    192.168.56.10
    Greenplum Standby
    Greenplum Segment   3       gp-sdw1, gp-sdw2, gp-sdw3    192.168.56.12, 192.168.56.14, 192.168.56.16

    2 Preparation

    2.1 Linux User

    Create the Greenplum administrator user on every node.

    groupadd -g 530 gpadmin
    useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin
    chown -R gpadmin:gpadmin /home/gpadmin
    echo "gpadmin" | passwd --stdin gpadmin
    

    2.2 Hostname and hosts Configuration

    Settings shared by all nodes are configured on one node first and then copied to the other nodes in section 2.6.

    vi /etc/hosts
    
    192.168.56.10 gp-master
    192.168.56.12 gp-sdw1
    192.168.56.14 gp-sdw2
    192.168.56.16 gp-sdw3
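
    A missing or mistyped entry in /etc/hosts only shows up later, when gpssh cannot reach a node. As a minimal sketch, the helper below reports cluster hostnames that have no entry in a hosts file. The name missing_hosts is hypothetical and not part of Greenplum; the file format assumed is the standard "IP name [alias...]" layout.

```shell
# Print every hostname from the arguments that has no entry in the given hosts file.
# Usage: missing_hosts HOSTS_FILE HOST...
missing_hosts() {
    local hosts_file host
    hosts_file=$1
    shift
    for host in "$@"; do
        # Skip comment lines, stop at an inline comment, and compare whole fields
        # after the IP column so that gp-sdw1 does not match gp-sdw10.
        if ! awk -v h="$host" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) { if ($i ~ /^#/) break; if ($i == h) found = 1 } }
            END { exit found ? 0 : 1 }' "$hosts_file"; then
            echo "$host"
        fi
    done
}
```

    Running missing_hosts /etc/hosts gp-master gp-sdw1 gp-sdw2 gp-sdw3 should print nothing once all four entries are in place.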
    

    Set the hostname on each host accordingly by editing the HOSTNAME entry:

    vi /etc/sysconfig/network
    

    2.3 Firewall

    Disable SELinux and the firewall:

    vi /etc/selinux/config
    
    SELINUX=disabled
    
    service iptables stop
    chkconfig iptables off
    

    Check the firewall status with service iptables status.

    2.4 System Resource Configuration

    vi /etc/sysctl.conf
    
    kernel.shmmni = 4096
    kernel.shmall = 4000000000
    kernel.sem = 250 512000 100 2048
    kernel.sysrq = 1
    kernel.core_uses_pid = 1
    kernel.msgmnb = 65536
    kernel.msgmax = 65536
    kernel.msgmni = 2048
    net.ipv4.tcp_syncookies = 1
    net.ipv4.ip_forward = 0
    net.ipv4.tcp_tw_recycle = 1
    net.ipv4.tcp_max_syn_backlog = 4096
    net.ipv4.conf.default.arp_filter = 1
    net.ipv4.ip_local_port_range = 1025 65535
    net.core.netdev_max_backlog = 10000
    net.core.rmem_max = 2097152
    net.core.wmem_max = 2097152
    #vm.overcommit_memory = 2     ### Left disabled in the test environment, otherwise Oracle fails to start ### value is 1
    

    Apply the settings:

    sysctl -p
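
    A misspelled key in /etc/sysctl.conf is easy to overlook in the output of sysctl -p. As a rough sketch, the helper below lists keys that have no matching file under /proc/sys. The name unknown_sysctl_keys is hypothetical; it assumes the usual mapping of dots in a key to directory separators and does not handle interface names that themselves contain dots.

```shell
# Print every key in a sysctl-style file that has no matching entry under ROOT.
# Usage: unknown_sysctl_keys FILE [ROOT]   (ROOT defaults to /proc/sys)
unknown_sysctl_keys() {
    local conf_file root key path
    conf_file=$1
    root=${2:-/proc/sys}
    # Drop comments and blank lines, keep the text left of '=', trim whitespace.
    sed -e 's/#.*//' -e '/^[[:space:]]*$/d' -e 's/=.*//' \
        -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$conf_file" |
    while IFS= read -r key; do
        # kernel.shmmni maps to ROOT/kernel/shmmni
        path=$root/$(printf '%s' "$key" | tr . /)
        [ -e "$path" ] || echo "$key"
    done
}
```

    Calling unknown_sysctl_keys /etc/sysctl.conf on the node itself checks the file against the running kernel; an empty result means every key was found.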
    

    Configure the process limit:

    vi /etc/security/limits.d/90-nproc.conf
    
    *          soft    nproc     131072
    root       soft    nproc     unlimited
    

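    2.5 Temporarily Enable sudo for gpadmin

    Installing Greenplum on the cluster nodes later involves creating directories and files, so enable sudo for gpadmin temporarily here and revoke it once the installation succeeds.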

    visudo
    
    gpadmin    ALL=(ALL)       ALL
    gpadmin    ALL=(ALL)       NOPASSWD:ALL
    

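    2.6 Copy the Configuration Files to All Nodes

    # Repeat the copy for every segment host, not only gp-sdw1.
    for host in gp-sdw1 gp-sdw2 gp-sdw3; do
        scp /etc/hosts "$host":/etc
        scp /etc/sysctl.conf "$host":/etc
        scp /etc/security/limits.d/90-nproc.conf "$host":/etc/security/limits.d
        scp /etc/selinux/config "$host":/etc/selinux
    done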
    

    Reboot the operating system.

    3 Install Greenplum DB

    3.1 Install Greenplum DB on the Master Node

    Install on the master node first, with the installation path set to /opt/greenplum/greenplum-db-5.0.0-alpha.5:

    cd /tmp
    unzip greenplum-db-5.0.0-alpha.5-rhel6-x86_64.zip
    /tmp/greenplum-db-5.0.0-alpha.5-rhel6-x86_64.bin 
    
    ********************************************************************************
    Do you accept the Pivotal Database license agreement? [yes|no]
    ********************************************************************************
    
    yes
    
    ********************************************************************************
    Provide the installation path for Greenplum Database or press ENTER to 
    accept the default installation path: /usr/local/greenplum-db-5.0.0-alpha.5
    ********************************************************************************
    
    /opt/greenplum/greenplum-db-5.0.0-alpha.5
    
    ********************************************************************************
    Install Greenplum Database into /opt/greenplum/greenplum-db-5.0.0-alpha.5? [yes|no]
    ********************************************************************************
    
    yes
    
    ********************************************************************************
    /opt/greenplum/greenplum-db-5.0.0-alpha.5 does not exist.
    Create /opt/greenplum/greenplum-db-5.0.0-alpha.5 ? [yes|no]
    (Selecting no will exit the installer)
    ********************************************************************************
    
    yes
    
    Extracting product to /opt/greenplum/greenplum-db-5.0.0-alpha.5
    
    ********************************************************************************
    Installation complete.
    Greenplum Database is installed in /opt/greenplum/greenplum-db-5.0.0-alpha.5
    
    Pivotal Greenplum documentation is available
    for download at http://gpdb.docs.pivotal.io
    ********************************************************************************
    

    By default the installer creates a symbolic link (greenplum-db) that points to greenplum-db-5.0.0-alpha.5:

    ls -ltr /opt/greenplum/
    total 8
    lrwxrwxrwx  1 gpadmin gpadmin   28 May 30 12:14 greenplum-db -> ./greenplum-db-5.0.0-alpha.5
    drwxr-xr-x 11 gpadmin gpadmin 4096 May 30 12:19 greenplum-db-5.0.0-alpha.5
    

    Change the owner of the directory to gpadmin:

    chown -R gpadmin:gpadmin /opt/greenplum/
    chown -R gpadmin:gpadmin /opt/greenplum/greenplum-db
    

    3.2 Configure the Cluster Hosts on the Master Node

    su - gpadmin
    mkdir -p /opt/greenplum/greenplum-db/conf
    vi /opt/greenplum/greenplum-db/conf/hostlist
    
    gp-master
    gp-sdw1
    gp-sdw2
    gp-sdw3
    

    Create a seg_hosts file that contains the hostnames of all segment hosts:

    vi /opt/greenplum/greenplum-db/conf/seg_hosts
    
    gp-sdw1
    gp-sdw2
    gp-sdw3
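
    Since seg_hosts is simply hostlist without the master, it can also be generated instead of typed by hand. A small sketch, assuming one hostname per line; segment_hosts is a hypothetical helper name:

```shell
# Print every line of HOSTLIST except the one that is exactly MASTER.
# Usage: segment_hosts HOSTLIST MASTER
segment_hosts() {
    # -v inverts the match, -x matches whole lines only, -F treats MASTER as a fixed string.
    grep -vxF -- "$2" "$1"
}
```

    For this cluster the call would be segment_hosts /opt/greenplum/greenplum-db/conf/hostlist gp-master, with the output redirected to the seg_hosts file.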
    

    3.3 Configure Passwordless SSH

    su - gpadmin
    source /opt/greenplum/greenplum-db/greenplum_path.sh   # without this, gpssh-exkeys fails with: Error: unable to import module: No module named gppylib.commands
    /opt/greenplum/greenplum-db/bin/gpssh-exkeys -f /opt/greenplum/greenplum-db/conf/hostlist
    
    [STEP 1 of 5] create local ID and authorize on local host
      ... /home/gpadmin/.ssh/id_rsa file exists ... key generation skipped
    
    [STEP 2 of 5] keyscan all hosts and update known_hosts file
    
    [STEP 3 of 5] authorize current user on remote hosts
      ... send to gp-sdw1
      ... send to gp-sdw2
      ... send to gp-sdw3
    
    [STEP 4 of 5] determine common authentication file content
    
    [STEP 5 of 5] copy authentication files to all remote hosts
      ... finished key exchange with gp-sdw1
      ... finished key exchange with gp-sdw2
      ... finished key exchange with gp-sdw3
    
    [INFO] completed successfully
    

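    Test with ssh gp-sdw1; the login should succeed without a password prompt.

    3.4 Install Greenplum DB on the Segment Nodes

    From the master node, remotely create the directory required on the segment nodes and change its owner to gpadmin: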
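
    Checking every host by hand gets tedious as the cluster grows. The sketch below loops over a host file and prints the hosts that still ask for a password. hosts_needing_password is a hypothetical helper; it relies only on the standard OpenSSH options BatchMode and ConnectTimeout, and the five-second timeout is an arbitrary choice.

```shell
# Print each host from HOST_FILE that does not accept a key-based login.
# Usage: hosts_needing_password HOST_FILE
hosts_needing_password() {
    local host
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        # BatchMode=yes makes ssh fail instead of prompting for a password;
        # -n keeps ssh from swallowing the rest of the host list on stdin.
        if ! ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "$host"
        fi
    done < "$1"
}
```

    Run as gpadmin against /opt/greenplum/greenplum-db/conf/hostlist, an empty result suggests the key exchange reached every node.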

    su - gpadmin
    source /opt/greenplum/greenplum-db/greenplum_path.sh 
    /opt/greenplum/greenplum-db/bin/gpssh -f /opt/greenplum/greenplum-db/conf/seg_hosts -e -v "sudo mkdir -p /opt/greenplum && sudo chown gpadmin:gpadmin -R /opt/greenplum"
    
    [INFO] login gp-master
    [INFO] login gp-sdw1
    [INFO] login gp-sdw2
    [INFO] login gp-sdw3
    [  gp-sdw1] sudo mkdir -p /opt/greenplum && sudo chown gpadmin:gpadmin -R /opt/greenplum
    [  gp-sdw2] sudo mkdir -p /opt/greenplum && sudo chown gpadmin:gpadmin -R /opt/greenplum
    [  gp-sdw3] sudo mkdir -p /opt/greenplum && sudo chown gpadmin:gpadmin -R /opt/greenplum
    [INFO] completed successfully
    
    [Cleanup...]
    

    Copy the Greenplum DB files installed on the master node to all segment nodes:

    su - gpadmin
    source /opt/greenplum/greenplum-db/greenplum_path.sh 
    /opt/greenplum/greenplum-db/bin/gpseginstall -f /opt/greenplum/greenplum-db/conf/hostlist -u gpadmin -p gpadmin
    
    20170530:12:26:50:004409 gpseginstall:gp-master:gpadmin-[INFO]:-Installation Info:
    link_name greenplum-db
    binary_path /opt/greenplum/greenplum-db-5.0.0-alpha.5
    binary_dir_location /opt/greenplum
    binary_dir_name greenplum-db-5.0.0-alpha.5
    20170530:12:26:50:004409 gpseginstall:gp-master:gpadmin-[INFO]:-check cluster password access
    20170530:12:26:50:004409 gpseginstall:gp-master:gpadmin-[INFO]:-de-duplicate hostnames
    20170530:12:26:50:004409 gpseginstall:gp-master:gpadmin-[INFO]:-master hostname: gp-master
    20170530:12:26:51:004409 gpseginstall:gp-master:gpadmin-[INFO]:-rm -f /opt/greenplum/greenplum-db-5.0.0-alpha.5.tar; rm -f /opt/greenplum/greenplum-db-5.0.0-alpha.5.tar.gz
    20170530:12:26:51:004409 gpseginstall:gp-master:gpadmin-[INFO]:-cd /opt/greenplum; tar cf greenplum-db-5.0.0-alpha.5.tar greenplum-db-5.0.0-alpha.5
    20170530:12:26:54:004409 gpseginstall:gp-master:gpadmin-[INFO]:-gzip /opt/greenplum/greenplum-db-5.0.0-alpha.5.tar
    20170530:12:27:22:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: mkdir -p /opt/greenplum
    20170530:12:27:23:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: rm -rf /opt/greenplum/greenplum-db-5.0.0-alpha.5
    20170530:12:27:24:004409 gpseginstall:gp-master:gpadmin-[INFO]:-scp software to remote location
    20170530:12:27:40:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: gzip -f -d /opt/greenplum/greenplum-db-5.0.0-alpha.5.tar.gz
    20170530:12:28:00:004409 gpseginstall:gp-master:gpadmin-[INFO]:-md5 check on remote location
    20170530:12:28:05:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: cd /opt/greenplum; tar xf greenplum-db-5.0.0-alpha.5.tar
    20170530:12:28:38:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: rm -f /opt/greenplum/greenplum-db-5.0.0-alpha.5.tar
    20170530:12:28:39:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: cd /opt/greenplum; rm -f greenplum-db; ln -fs greenplum-db-5.0.0-alpha.5 greenplum-db
    20170530:12:28:40:004409 gpseginstall:gp-master:gpadmin-[INFO]:-rm -f /opt/greenplum/greenplum-db-5.0.0-alpha.5.tar.gz
    20170530:12:28:41:004409 gpseginstall:gp-master:gpadmin-[INFO]:-version string on master: gpssh version 5.0.0 alpha.5 build commit:2e87c5aa435c779b2f3837fa8c7273876497f6ba
    20170530:12:28:41:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: . /opt/greenplum/greenplum-db/./greenplum_path.sh; /opt/greenplum/greenplum-db/./bin/gpssh --version
    20170530:12:28:48:004409 gpseginstall:gp-master:gpadmin-[INFO]:-remote command: . /opt/greenplum/greenplum-db-5.0.0-alpha.5/greenplum_path.sh; /opt/greenplum/greenplum-db-5.0.0-alpha.5/bin/gpssh --version
    20170530:12:28:49:004409 gpseginstall:gp-master:gpadmin-[INFO]:-SUCCESS -- Requested commands completed
    

    Check the installation and directory layout on every node:

    su - gpadmin
    source /opt/greenplum/greenplum-db/greenplum_path.sh 
    /opt/greenplum/greenplum-db/bin/gpssh -f /opt/greenplum/greenplum-db/conf/hostlist -e ls -l $GPHOME
    [gp-master] ls -l /opt/greenplum/greenplum-db/.
    [gp-master] total 40
    [gp-master] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:29 bin
    [gp-master] drwxrwxr-x 2 gpadmin gpadmin 4096 May 30 12:21 conf
    [gp-master] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:20 docs
    [gp-master] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:20 etc
    [gp-master] drwxr-xr-x 3 gpadmin gpadmin 4096 May 20 02:20 ext
    [gp-master] -rw-r--r-- 1 gpadmin gpadmin  745 May 30 12:14 greenplum_path.sh
    [gp-master] drwxr-xr-x 6 gpadmin gpadmin 4096 May 20 02:20 include
    [gp-master] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:20 lib
    [gp-master] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:27 sbin
    [gp-master] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:16 share
    [  gp-sdw2] ls -l /opt/greenplum/greenplum-db/.
    [  gp-sdw2] total 40
    [  gp-sdw2] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:29 bin
    [  gp-sdw2] drwxrwxr-x 2 gpadmin gpadmin 4096 May 30 12:21 conf
    [  gp-sdw2] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:20 docs
    [  gp-sdw2] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:20 etc
    [  gp-sdw2] drwxr-xr-x 3 gpadmin gpadmin 4096 May 20 02:20 ext
    [  gp-sdw2] -rw-r--r-- 1 gpadmin gpadmin  745 May 30 12:14 greenplum_path.sh
    [  gp-sdw2] drwxr-xr-x 6 gpadmin gpadmin 4096 May 20 02:20 include
    [  gp-sdw2] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:20 lib
    [  gp-sdw2] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:27 sbin
    [  gp-sdw2] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:16 share
    [  gp-sdw1] ls -l /opt/greenplum/greenplum-db/.
    [  gp-sdw1] total 40
    [  gp-sdw1] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:29 bin
    [  gp-sdw1] drwxrwxr-x 2 gpadmin gpadmin 4096 May 30 12:21 conf
    [  gp-sdw1] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:20 docs
    [  gp-sdw1] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:20 etc
    [  gp-sdw1] drwxr-xr-x 3 gpadmin gpadmin 4096 May 20 02:20 ext
    [  gp-sdw1] -rw-r--r-- 1 gpadmin gpadmin  745 May 30 12:14 greenplum_path.sh
    [  gp-sdw1] drwxr-xr-x 6 gpadmin gpadmin 4096 May 20 02:20 include
    [  gp-sdw1] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:20 lib
    [  gp-sdw1] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:27 sbin
    [  gp-sdw1] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:16 share
    [  gp-sdw3] ls -l /opt/greenplum/greenplum-db/.
    [  gp-sdw3] total 40
    [  gp-sdw3] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:29 bin
    [  gp-sdw3] drwxrwxr-x 2 gpadmin gpadmin 4096 May 30 12:21 conf
    [  gp-sdw3] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:20 docs
    [  gp-sdw3] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:20 etc
    [  gp-sdw3] drwxr-xr-x 3 gpadmin gpadmin 4096 May 20 02:20 ext
    [  gp-sdw3] -rw-r--r-- 1 gpadmin gpadmin  745 May 30 12:14 greenplum_path.sh
    [  gp-sdw3] drwxr-xr-x 6 gpadmin gpadmin 4096 May 20 02:20 include
    [  gp-sdw3] drwxr-xr-x 7 gpadmin gpadmin 4096 May 20 02:20 lib
    [  gp-sdw3] drwxr-xr-x 2 gpadmin gpadmin 4096 May 20 02:27 sbin
    [  gp-sdw3] drwxr-xr-x 4 gpadmin gpadmin 4096 May 20 02:16 share
    

    Create the data storage directory:

    su - gpadmin
    source /opt/greenplum/greenplum-db/greenplum_path.sh
    /opt/greenplum/greenplum-db/bin/gpssh -f /opt/greenplum/greenplum-db/conf/hostlist -e 'mkdir -p /opt/greenplum/data'
    

    Create the master data storage area on the master:

    su - gpadmin
    source /opt/greenplum/greenplum-db/greenplum_path.sh
    /opt/greenplum/greenplum-db/bin/gpssh -h gp-master -e 'mkdir -p /opt/greenplum/data/master'
    

    Create the data storage areas on the segment nodes:

    su - gpadmin
    source /opt/greenplum/greenplum-db/greenplum_path.sh
    /opt/greenplum/greenplum-db/bin/gpssh -f /opt/greenplum/greenplum-db/conf/seg_hosts -e 'mkdir -p /opt/greenplum/data/primary && mkdir -p /opt/greenplum/data/mirror'
    

    3.5 Environment Variables

    gpssh -f /opt/greenplum/greenplum-db/conf/hostlist -e -v "cat >> /home/gpadmin/.bash_profile <<EOF
    
    source /opt/greenplum/greenplum-db/greenplum_path.sh
    export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1   # assumes the template default SEG_PREFIX=gpseg
    export GPPORT=5432
    export PGDATABASE=gp_sydb
    EOF"
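
    After logging in again as gpadmin, the new variables should be visible in the shell. The sketch below is one way to confirm that. missing_env_vars is a hypothetical helper, and the variable names passed to it are the ones written to .bash_profile above.

```shell
# Print the name of every variable in the argument list that is unset or empty.
# Usage: missing_env_vars NAME...
missing_env_vars() {
    local name value
    for name in "$@"; do
        # Indirect lookup: expands to the value of the variable whose name is in $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
}
```

    Calling missing_env_vars GPHOME MASTER_DATA_DIRECTORY GPPORT PGDATABASE should print nothing in a correctly configured session; GPHOME is the variable that greenplum_path.sh sets.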
    

    3.6 NTP Configuration

    Enable NTP on the master node, then configure and enable NTP on the segment nodes:

    echo "server gp-master prefer" >>/etc/ntp.conf
    /opt/greenplum/greenplum-db/bin/gpssh -f /opt/greenplum/greenplum-db/conf/hostlist -v -e 'sudo ntpd'
    /opt/greenplum/greenplum-db/bin/gpssh -f /opt/greenplum/greenplum-db/conf/hostlist -v -e 'sudo /etc/init.d/ntpd start && sudo chkconfig --level 35 ntpd on'
    

    4 Initialize Greenplum DB

    4.1 Pre-initialization Checks

    Check the hostname configuration:

    su gpadmin
    source /opt/greenplum/greenplum-db/greenplum_path.sh
    gpssh -f /opt/greenplum/greenplum-db/conf/hostlist -e hostname
    
    [  gp-sdw3] hostname
    [  gp-sdw3] gp-sdw3
    [  gp-sdw1] hostname
    [  gp-sdw1] gp-sdw1
    [gp-master] hostname
    [gp-master] gp-master
    [  gp-sdw2] hostname
    [  gp-sdw2] gp-sdw2
    

    Check file I/O and network throughput between the nodes:

    gpcheckperf -h gp-sdw1 -h gp-sdw2 -d /tmp -r d -D -v
    gpcheckperf -f /opt/greenplum/greenplum-db/conf/hostlist -d /tmp -r d -D -v
    $ gpcheckperf -f /opt/greenplum/greenplum-db/conf/hostlist -r N -d /tmp
    /opt/greenplum/greenplum-db/./bin/gpcheckperf -f /opt/greenplum/greenplum-db/conf/hostlist -r N -d /tmp
    
    -------------------
    --  NETPERF TEST
    -------------------
    
    ====================
    ==  RESULT
    ====================
    Netperf bisection bandwidth test
    gp-master -> gp-sdw1 = 72.220000
    gp-sdw2 -> gp-sdw3 = 21.470000
    gp-sdw1 -> gp-master = 43.510000
    gp-sdw3 -> gp-sdw2 = 44.200000
    
    Summary:
    sum = 181.40 MB/sec
    min = 21.47 MB/sec
    max = 72.22 MB/sec
    avg = 45.35 MB/sec
    median = 44.20 MB/sec
    
    [Warning] connection between gp-sdw2 and gp-sdw3 is no good
    [Warning] connection between gp-sdw1 and gp-master is no good
    [Warning] connection between gp-sdw3 and gp-sdw2 is no good
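
    The per-link figures are easier to act on when filtered against a minimum bandwidth. Here is a sketch with awk, assuming the "host -> host = value" line format shown in the result above. slow_links and the threshold passed to it are illustrative and not part of gpcheckperf.

```shell
# Read gpcheckperf network results on stdin and print the links slower than MIN_MB (MB/sec).
# Usage: slow_links MIN_MB < gpcheckperf_output
slow_links() {
    # Result lines have five fields, for example: gp-master -> gp-sdw1 = 72.220000
    # Summary and warning lines have a different field count and are skipped.
    awk -v min="$1" '
        NF == 5 && $2 == "->" && $4 == "=" && ($5 + 0) < (min + 0) {
            print $1, $2, $3, $4, $5
        }'
}
```

    With a threshold of 40, the run above would flag only the gp-sdw2 -> gp-sdw3 link.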
    

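    4.2 Initialization

    The Greenplum initialization configuration templates are located in /opt/greenplum/greenplum-db/docs/cli_help/gpconfigs. gpinitsystem_config is the template for initializing Greenplum, and the mirror segment settings in it are all commented out. Make a copy and edit it: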

    cd /opt/greenplum/greenplum-db/docs/cli_help/gpconfigs
    cp gpinitsystem_config initgp_config
    vi initgp_config        
    
    declare -a DATA_DIRECTORY=(/opt/greenplum/data/primary /opt/greenplum/data/primary /opt/greenplum/data/primary)
    MASTER_HOSTNAME=gp-master
    MASTER_DIRECTORY=/opt/greenplum/data/master
    declare -a MIRROR_DATA_DIRECTORY=(/opt/greenplum/data/mirror /opt/greenplum/data/mirror /opt/greenplum/data/mirror)
    DATABASE_NAME=gp_sydb
    MACHINE_LIST_FILE=/opt/greenplum/greenplum-db/conf/seg_hosts
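
    gpinitsystem expects MIRROR_DATA_DIRECTORY to list as many locations as DATA_DIRECTORY, and a mismatch is easy to introduce when editing the arrays by hand. As a minimal sketch, the helper below counts the entries of one array in the config file so the two can be compared. count_dirs is a hypothetical name, and the parsing assumes each array is declared on a single line as in the excerpt above.

```shell
# Print the number of entries in the first uncommented "declare -a VAR=( ... )" line of FILE.
# Usage: count_dirs VAR FILE
count_dirs() {
    # The leading ^ anchor skips lines that are commented out with '#'.
    sed -n "s/^[[:space:]]*declare -a $1=(\(.*\)).*/\1/p" "$2" |
        head -n 1 | wc -w | tr -d ' '
}
```

    For the configuration above, count_dirs DATA_DIRECTORY initgp_config and count_dirs MIRROR_DATA_DIRECTORY initgp_config should print the same number.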
    

    Run the initialization:

    gpinitsystem -c initgp_config -S
    

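    If initialization fails, delete the data directories and run the initialization again.

    5 Follow-up Operations

    5.1 Stop and Start the Cluster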
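
    The cleanup itself is not spelled out here. As a cautious sketch, the helper below only prints gpssh commands that would empty the master, primary and mirror directories created in section 3.4, so they can be reviewed before anything is deleted. print_cleanup_commands is a hypothetical name, and the directory layout is assumed to match the one used in this walkthrough.

```shell
# Print (without running) commands that would empty the data directories from section 3.4.
# Usage: print_cleanup_commands MASTER_HOST SEG_HOSTS_FILE DATA_ROOT
print_cleanup_commands() {
    local master seg_hosts data_root
    master=$1
    seg_hosts=$2
    data_root=$3
    # Abort on an empty data root so the rm path can never collapse to "/master/*".
    : "${data_root:?data root must not be empty}"
    echo "gpssh -h $master -e 'rm -rf $data_root/master/*'"
    echo "gpssh -f $seg_hosts -e 'rm -rf $data_root/primary/* $data_root/mirror/*'"
}
```

    For this cluster the arguments would be gp-master, /opt/greenplum/greenplum-db/conf/seg_hosts and /opt/greenplum/data.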

    gpstop -a
    gpstart -a
    

    5.2 Log In to the Database

    $ psql -d postgres
    
    postgres=# \l # list the databases
                     List of databases
       Name    |  Owner  | Encoding |  Access privileges  
    -----------+---------+----------+---------------------
     gp_sydb   | gpadmin | UTF8     | 
     postgres  | gpadmin | UTF8     | 
     template0 | gpadmin | UTF8     | =c/gpadmin          
                                    : gpadmin=CTc/gpadmin
     template1 | gpadmin | UTF8     | =c/gpadmin          
                                    : gpadmin=CTc/gpadmin
    (4 rows)
    
    postgres=# \dt # list the tables in the database
    

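    5.3 Cluster Status

    gpstate -e # show the mirror status
    gpstate -f # show the standby master status
    gpstate -s # show the status of the whole Greenplum cluster
    gpstate -i # show the Greenplum version
    gpstate --help # help text, lists further gpstate options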
    
  • Source: https://www.cnblogs.com/lanston/p/installating_and_initializing_Greenplumdb.html