• Kubernetes Container Cluster Deployment (Binary Method), Part 1


    1. Overview of Kubernetes cluster deployment methods

    There are three common ways to set up a Kubernetes cluster environment:

    1. Minikube installation

    Minikube is a tool that quickly runs a single-node Kubernetes locally, aimed at users who want to try Kubernetes or use it for day-to-day development. It is only suitable for learning and test deployments and cannot be used in production.

    2. Kubeadm installation

    kubeadm is the official Kubernetes tool for quickly installing and initializing a cluster that follows best practices. It provides kubeadm init and kubeadm join for rapid cluster bootstrap. For a long time kubeadm was in alpha/beta status and was not recommended for production, but studying this deployment method is a good way to understand the design and ideas behind the officially recommended Kubernetes best practices.

    The goal of kubeadm is to provide a minimal viable cluster that passes the Kubernetes conformance tests, so it does not install any non-essential addons beyond that. By default kubeadm does not install a network solution either, so after installing with kubeadm you still have to install a network plugin yourself. For these reasons kubeadm is not used for the production deployment in this document.

    3. Binary package installation (the recommended way for production)

    Download the official release binaries and deploy each component by hand to form a Kubernetes cluster. This approach matches the standard for enterprise production Kubernetes environments and can be used for production deployments.

    2. Deployment environment and architecture overview

    1. Cluster node information and service versions

    Kubernetes 1.19.7 is used, and all nodes run CentOS 7.6. The Kubernetes packages and images required by this document can be downloaded in advance on a server with unrestricted internet access and then synced to the deployment machines. Details are as follows:

    IP address      Hostname       Role
    172.31.46.28    k8s-master01   master node 1, etcd node 1, DNS node
    172.31.46.63    k8s-master02   master node 2, etcd node 2
    172.31.46.67    k8s-master03   master node 3, etcd node 3
    172.31.46.26    k8s-node01     worker node 1
    172.31.46.38    k8s-node02     worker node 2
    172.31.46.15    k8s-node03     worker node 3
    172.31.46.22    k8s-ha01       nginx node 1, harbor node 1
    172.31.46.3     k8s-ha02       nginx node 2, harbor node 2

    3. Environment initialization

    1. Deploy the DNS service (bind9)

    #The following operations are performed on the DNS node k8s-master01
    [root@k8s-master01 ~]# yum install bind bind-utils -y
    #Modify and validate the configuration file.
    [root@k8s-master01 ~]# vim /etc/named.conf
    listen-on port 53 { 172.31.46.28; };
    allow-query { any; };
    forwarders { 10.255.255.88; };    #upstream DNS address (gateway or public DNS)
    recursion yes;
    dnssec-enable no;
    dnssec-validation no;
    [root@k8s-master01 ~]# named-checkconf    #check that the configuration file syntax is valid
    #Add the custom zones to the zone configuration
    [root@k8s-master01 ~]# cat >>/etc/named.rfc1912.zones <<'EOF'
    #Add the custom host domain
    zone "host.com" IN { type master; file "host.com.zone"; allow-update { 172.31.46.28; }; };
    #Add the custom business domain (optional, add according to your own business needs)
    zone "zq.com" IN { type master; file "zq.com.zone"; allow-update { 172.31.46.28; }; };
    EOF
    #host.com and zq.com are both custom domains; host.com is normally used as the host domain
    #zq.com is the business domain; several business domains can be configured for different businesses
    #Create the zone file for the custom domain host.com. When new cluster nodes are added later, just add their host records to this file.
    [root@k8s-master01 ~]# cat >/var/named/host.com.zone <<'EOF'
    $ORIGIN host.com.
    $TTL 600    ; 10 minutes
    @       IN SOA  dns.host.com. dnsadmin.host.com. (
                    2020041601 ; serial
                    10800      ; refresh (3 hours)
                    900        ; retry (15 minutes)
                    604800     ; expire (1 week)
                    86400      ; minimum (1 day)
                    )
                NS   dns.host.com.
    $TTL 60 ; 1 minute
    dns                   A   172.31.46.28
    k8s-master01          A   172.31.46.28
    k8s-master02          A   172.31.46.63
    k8s-master03          A   172.31.46.67
    k8s-node01            A   172.31.46.26
    k8s-node02            A   172.31.46.38
    k8s-node03            A   172.31.46.15
    k8s-ha01              A   172.31.46.22
    k8s-ha02              A   172.31.46.3

    EOF
    #Create the zone file for the custom domain zq.com
    [root@k8s-master01 ~]# cat >/var/named/zq.com.zone <<'EOF'
    $ORIGIN zq.com.
    $TTL 600    ; 10 minutes
    @       IN SOA  dns.zq.com. dnsadmin.zq.com. (
                    2020041601 ; serial
                    10800      ; refresh (3 hours)
                    900        ; retry (15 minutes)
                    604800     ; expire (1 week)
                    86400      ; minimum (1 day)
                    )
                NS   dns.zq.com.
    $TTL 60 ; 1 minute
    dns                A    172.31.46.28
    
    EOF
    The host.com domain is used for communication between hosts, so all hosts are added to it up front.
    The zq.com domain is used for business name resolution later on, so no hosts need to be added yet.
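    Optionally, both zone files can be syntax-checked with named-checkzone (installed with the bind packages above) before the service is started:
    [root@k8s-master01 ~]# named-checkzone host.com /var/named/host.com.zone    #should report the serial and end with "OK"
    [root@k8s-master01 ~]# named-checkzone zq.com /var/named/zq.com.zone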
    #Check the configuration again and start the dns service

    [root@k8s-master01 ~]# named-checkconf
    [root@k8s-master01 ~]# systemctl start named
    [root@k8s-master01 ~]# ss -lntup|grep 53

    #Verify the result

    [root@k8s-master01 ~]# dig -t A k8s-master02.host.com @172.31.46.28 +short
    172.31.46.63
    [root@k8s-master01 ~]# dig -t A k8s-master03.host.com @172.31.46.28 +short
    172.31.46.67
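    The business domain can be spot-checked the same way; dns.zq.com is the only record defined in it so far:
    [root@k8s-master01 ~]# dig -t A dns.zq.com @172.31.46.28 +short
    172.31.46.28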

    #The following operations must be performed on all cluster nodes

    #Modify the network configuration of the cluster nodes so that they use our DNS server.

    [root@k8s-master01 ~]# sed -i 's#^DNS.*#DNS1=172.31.46.28#g' /etc/sysconfig/network-scripts/ifcfg-eth0

    [root@k8s-master01 ~]# systemctl restart network

    [root@k8s-master01 ~]# sed -i '/^nameserver.*/i search host.com' /etc/resolv.conf

    #Check whether the DNS configuration took effect.

    [root@k8s-master01 ~]# cat /etc/resolv.conf
    ; generated by /usr/sbin/dhclient-script
    search host.com
    nameserver 172.31.46.28
    [root@k8s-master01 ~]# ping k8s-node01
    PING k8s-node01.host.com (172.31.46.26) 56(84) bytes of data.
    64 bytes from node3 (172.31.46.26): icmp_seq=1 ttl=64 time=0.813 ms

    [root@k8s-master01 ~]# ping k8s-ha02
    PING k8s-ha02.host.com (172.31.46.3) 56(84) bytes of data.
    64 bytes from 172.31.46.3 (172.31.46.3): icmp_seq=1 ttl=64 time=0.403 ms

    
    

    2. Basic environment configuration


    #The following operations must be performed on all cluster nodes
    #1. Set the hostname

    [root@k8s-master01 ~]# hostnamectl set-hostname k8s-master01
    #2. Add the docker account
    [root@k8s-master01 ~]# useradd -m docker
    #3. Update the PATH variable
    Add the binaries directory to the PATH environment variable (adjust this directory to your actual layout):

    [root@k8s-master01 ~]# echo 'PATH=/opt/k8s/bin:$PATH' >>/root/.bashrc
    [root@k8s-master01 ~]# source /root/.bashrc
    #4. Install dependency packages (the epel repository must be configured beforehand)
    [root@k8s-master01 ~]# yum install -y conntrack ntpdate ntp ipvsadm ipset jq iptables curl sysstat libseccomp wget lsof telnet
    #5. Stop unneeded services
    [root@k8s-master01 ~]# systemctl stop postfix && systemctl disable postfix
    #6. Disable the firewall and selinux, flush firewall rules, and set the default forward policy
    [root@k8s-master01 ~]# systemctl stop firewalld
    [root@k8s-master01 ~]# iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat
    [root@k8s-master01 ~]# iptables -P FORWARD ACCEPT
    [root@k8s-master01 ~]# firewall-cmd --state
    not running
    [root@k8s-master01 ~]# setenforce 0
    setenforce: SELinux is disabled
    [root@k8s-master01 ~]# sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
    #7. Disable the swap partition
    If swap is enabled, kubelet fails to start (this can be ignored by setting --fail-swap-on to false), so swap has to be turned off on every node machine.
    Here swap is simply disabled on all nodes, and the corresponding entry in /etc/fstab is commented out so the swap partition is not mounted again at boot:
    [root@k8s-master01 ~]# swapoff -a
    [root@k8s-master01 ~]# sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
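    A quick way to confirm that swap is really off now and will stay off after a reboot (a minimal check):
    [root@k8s-master01 ~]# free -h | grep -i swap       #should show 0B total and 0B used
    [root@k8s-master01 ~]# grep swap /etc/fstab         #the swap entry should now be commented out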
    #8. Disable dnsmasq
    If dnsmasq is running (for example in a GUI environment), it sets the system DNS server to 127.0.0.1, which prevents docker containers from resolving domain names, so it must be stopped (a stock CentOS 7 system may not have this service installed at all).
    [root@k8s-master01 ~]# systemctl stop dnsmasq
    [root@k8s-master01 ~]# systemctl disable dnsmasq
    #9. Load kernel modules
    [root@k8s-master01 ~]# modprobe ip_vs_rr
    [root@k8s-master01 ~]# modprobe br_netfilter
    #10. Tune kernel parameters (comments are kept on their own lines so that sysctl does not reject the values)
    [root@k8s-master01 ~]# cat > kubernetes.conf <<EOF
    net.bridge.bridge-nf-call-iptables=1
    net.bridge.bridge-nf-call-ip6tables=1
    net.ipv4.ip_forward=1
    # tcp_tw_recycle conflicts with the NAT used by kubernetes and must stay disabled, otherwise services become unreachable
    # (the parameter was removed in kernel 4.12+, so this line can be dropped after the kernel upgrade below)
    net.ipv4.tcp_tw_recycle=0
    # never swap; the system may only use swap when it is about to run out of memory (OOM)
    vm.swappiness=0
    # do not check whether enough physical memory is available
    vm.overcommit_memory=1
    # do not panic on OOM, let the OOM killer handle it
    vm.panic_on_oom=0
    fs.inotify.max_user_instances=8192
    fs.inotify.max_user_watches=1048576
    fs.file-max=52706963
    fs.nr_open=52706963
    # disable the unused IPv6 stack to avoid triggering a docker bug
    net.ipv6.conf.all.disable_ipv6=1
    net.netfilter.nf_conntrack_max=2310720
    EOF

    [root@k8s-master01 ~]# cp kubernetes.conf /etc/sysctl.d/kubernetes.conf
    [root@k8s-master01 ~]# sysctl -p /etc/sysctl.d/kubernetes.conf
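    The values that matter most for kubernetes networking can be read back to confirm they took effect (a minimal check; it assumes br_netfilter was loaded in step 9):
    [root@k8s-master01 ~]# sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward
    net.bridge.bridge-nf-call-iptables = 1
    net.ipv4.ip_forward = 1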

    #11. Adjust the system clock

    [root@k8s-master01 ~]# timedatectl set-timezone Asia/Shanghai

    [root@k8s-master01 ~]# timedatectl set-local-rtc 0

    [root@k8s-master01 ~]# systemctl restart rsyslog
    [root@k8s-master01 ~]# systemctl restart crond

    #12. Configure rsyslogd and systemd journald.

    systemd's journald is the default logging tool on CentOS 7; it records the logs of the whole system, the kernel and every service unit. Compared with rsyslog, journald has the following characteristics:
    -> it can log to memory or to the file system (by default it logs to memory, under /run/log/journal);
    -> it can cap the disk space it uses and guarantee a minimum of free disk space;
    -> it can limit the size of log files and how long they are retained;
    -> by default journald also forwards logs to rsyslog, so logs are written twice: /var/log/messages fills up with irrelevant entries, which makes later inspection harder and also costs some performance.
    [root@k8s-master01 ~]# mkdir /var/log/journal           #directory for persistent log storage
    [root@k8s-master01 ~]# mkdir /etc/systemd/journald.conf.d
    [root@k8s-master01 ~]# cat > /etc/systemd/journald.conf.d/99-prophet.conf <<EOF
    [Journal]
    # persist logs to disk
    Storage=persistent
         
    # compress historical logs
    Compress=yes
         
    SyncIntervalSec=5m
    RateLimitInterval=30s
    RateLimitBurst=1000
         
    # cap total disk usage at 10G
    SystemMaxUse=10G
         
    # cap a single journal file at 200M
    SystemMaxFileSize=200M
         
    # keep logs for 2 weeks
    MaxRetentionSec=2week
         
    # do not forward logs to syslog
    ForwardToSyslog=no
    EOF
         
    [root@k8s-master01 ~]# systemctl restart systemd-journald
    #13. Create the k8s-related directories
    [root@k8s-master01 ~]# mkdir -p /opt/k8s/{bin,work} /etc/{kubernetes,etcd}/cert
    #14. Upgrade the kernel
    The 3.10.x kernel that ships with CentOS 7.x has bugs that make Docker and Kubernetes unstable, for example:
    -> recent docker versions (1.13 and later) enable the kernel memory accounting feature that the 3.10 kernel only supports experimentally (and it cannot be turned off); under pressure, such as frequently starting and stopping containers, this leads to cgroup memory leaks;
    -> a network device reference count leak causes errors like: "kernel:unregister_netdevice: waiting for eth0 to become free. Usage count = 1";
         
    Possible solutions:
    -> upgrade the kernel to 4.4.x or newer;
    -> or build the kernel manually with the CONFIG_MEMCG_KMEM feature disabled;
    -> or install Docker 18.09.1 or later, which fixes the issue. However, kubelet also sets kmem (it vendors runc), so kubelet would have to be rebuilt with GOFLAGS="-tags=nokmem".

    The kernel upgrade steps are as follows.
    Enable the ELRepo repository on CentOS 7:
    [root@k8s-master01 ~]# rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
    Note that ELRepo offers two kernel flavours: kernel-lt (long-term support) and kernel-ml (mainline). The long-term support version (kernel-lt) is used here because it is more stable.
    [root@k8s-master01 ~]# yum --enablerepo=elrepo-kernel install -y kernel-lt
    #Make the new kernel the default boot entry
    [root@k8s-master01 ~]# grub2-set-default 0
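    Before rebooting it is worth confirming that menu entry 0 really is the new kernel (a quick check, assuming the standard grub2 layout of CentOS 7):
    [root@k8s-master01 ~]# awk -F\' '$1=="menuentry " {print i++ " : " $2}' /etc/grub2.cfg
    [root@k8s-master01 ~]# grub2-editenv list     #should show saved_entry=0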
    #Reboot and verify.
    [root@k8s-master01 ~]# init 6
    [root@k8s-master01 ~]# uname -r
    5.4.90-1.el7.elrepo.x86_64
     
    15. Disable NUMA
    [root@k8s-master01 ~]# cp /etc/default/grub{,.bak}
    [root@k8s-master01 ~]# vim /etc/default/grub   
    .........
    GRUB_CMDLINE_LINUX="...... numa=off"      # i.e. append "numa=off" to this line
    #Regenerate the grub2 configuration file:

    [root@k8s-master01 ~]# cp /boot/grub2/grub.cfg{,.bak}
    [root@k8s-master01 ~]# grub2-mkconfig -o /boot/grub2/grub.cfg
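    After the next reboot the kernel command line can be checked to confirm the flag was picked up (a minimal check):
    [root@k8s-master01 ~]# grep -o numa=off /proc/cmdline
    numa=off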

    16. Environment variable script (important)

    [root@k8s-master01 ~]# cat >/opt/k8s/bin/environment.sh <<'EOF'

    #!/usr/bin/bash
    # encryption key used to generate the EncryptionConfig
    export ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64)
    # array of all cluster node IPs (master, node and etcd nodes)
    export NODE_ALL_IPS=(172.31.46.28 172.31.46.63 172.31.46.67 172.31.46.26 172.31.46.38 172.31.46.15)
    # hostnames corresponding to all cluster node IPs
    export NODE_ALL_NAMES=(k8s-master01 k8s-master02 k8s-master03 k8s-node01 k8s-node02 k8s-node03)
    # array of all master node IPs
    export NODE_MASTER_IPS=(172.31.46.28 172.31.46.63 172.31.46.67)
    # hostnames corresponding to the master node IPs
    export NODE_MASTER_NAMES=(k8s-master01 k8s-master02 k8s-master03)
    # array of all worker node IPs
    export NODE_NODE_IPS=(172.31.46.26 172.31.46.38 172.31.46.15)
    # hostnames corresponding to the worker node IPs
    export NODE_NODE_NAMES=(k8s-node01 k8s-node02 k8s-node03)
    # array of all etcd node IPs
    export NODE_ETCD_IPS=(172.31.46.28 172.31.46.63 172.31.46.67)
    # hostnames corresponding to the etcd node IPs (the etcd nodes share the three master machines here)
    export NODE_ETCD_NAMES=(k8s-etcd01 k8s-etcd02 k8s-etcd03)
    # etcd cluster client endpoints
    export ETCD_ENDPOINTS="https://172.31.46.28:2379,https://172.31.46.63:2379,https://172.31.46.67:2379"
    # IPs and ports used for etcd peer communication
    export ETCD_NODES="k8s-etcd01=https://172.31.46.28:2380,k8s-etcd02=https://172.31.46.63:2380,k8s-etcd03=https://172.31.46.67:2380"
    # reverse-proxy address and port for kube-apiserver (i.e. the VIP of the nginx proxy layer)
    export KUBE_APISERVER="https://172.31.46.47:8443"
    # network interface used for inter-node communication
    export IFACE="eth0"
    # etcd data directory
    export ETCD_DATA_DIR="/data/k8s/etcd/data"
    # etcd WAL (write-ahead log) directory; ideally an SSD partition, or at least a different partition from ETCD_DATA_DIR
    export ETCD_WAL_DIR="/data/k8s/etcd/wal"
    # data directory for the k8s components
    export K8S_DIR="/data/k8s/k8s"
    # docker data directory
    export DOCKER_DIR="/data/k8s/docker"
    ## the parameters below normally do not need to be changed
    # token used for TLS Bootstrapping; can be generated with: head -c 16 /dev/urandom | od -An -t x | tr -d ' '
    BOOTSTRAP_TOKEN="29bba288771c469c997962680f179953"
    # preferably use currently unused network ranges for the service and Pod networks
    # service network: unroutable before deployment, routable inside the cluster afterwards (guaranteed by kube-proxy)
    SERVICE_CIDR="10.254.0.0/16"
    # Pod network, a /16 range is recommended: unroutable before deployment, routable inside the cluster afterwards (guaranteed by flanneld)
    CLUSTER_CIDR="172.30.0.0/16"
    # service port range (NodePort Range)
    export NODE_PORT_RANGE="30000-32767"
    # etcd prefix for the flanneld network configuration
    export FLANNEL_ETCD_PREFIX="/kubernetes/network"
    # kubernetes service IP (usually the first IP in SERVICE_CIDR)
    export CLUSTER_KUBERNETES_SVC_IP="10.254.0.1"
    # cluster DNS service IP (pre-allocated from SERVICE_CIDR)
    export CLUSTER_DNS_SVC_IP="10.254.0.2"
    # cluster DNS domain (without a trailing dot)
    export CLUSTER_DNS_DOMAIN="cluster.local"
    # add the binaries directory /opt/k8s/bin to PATH
    export PATH=/opt/k8s/bin:$PATH
    EOF


    #The following operations are performed on node k8s-master01
    #Many steps in this document are executed on k8s-master01, which then distributes files and runs commands remotely on the other machines, so this node needs an ssh trust relationship with all other nodes.
    [root@k8s-master01 ~]# ssh-keygen -t rsa
    [root@k8s-master01 ~]# cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
    [root@k8s-master01 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub -p22 root@k8s-master01
    [root@k8s-master01 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub -p22 root@k8s-master02
    [root@k8s-master01 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub -p22 root@k8s-master03
    [root@k8s-master01 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub -p22 root@k8s-node01
    [root@k8s-master01 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub -p22 root@k8s-node02
    [root@k8s-master01 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub -p22 root@k8s-node03
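    Passwordless login can then be verified in one pass from k8s-master01 (a quick check using the hostname array defined in environment.sh above):
    [root@k8s-master01 ~]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 ~]# for node_all_name in ${NODE_ALL_NAMES[@]}; do ssh root@${node_all_name} "hostname"; done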

    3. Create the CA certificate and keys needed by the cluster

    For security, the Kubernetes components use x509 certificates to encrypt and authenticate their communication. The CA (Certificate Authority) is a self-signed root certificate used to sign all other certificates created later. CloudFlare's PKI toolset cfssl is used here to create all certificates. The commands below are all executed on k8s-master01, which then distributes files and runs commands remotely.

    1) Install the cfssl toolset
    [root@k8s-master01 ~]# mkdir -p /opt/k8s/work && cd /opt/k8s/work
    [root@k8s-master01 work]# wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
    [root@k8s-master01 work]# mv cfssl_linux-amd64 /opt/k8s/bin/cfssl
       
    [root@k8s-master01 work]# wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
    [root@k8s-master01 work]# mv cfssljson_linux-amd64 /opt/k8s/bin/cfssljson
       
    [root@k8s-master01 work]# wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
    [root@k8s-master01 work]# mv cfssl-certinfo_linux-amd64 /opt/k8s/bin/cfssl-certinfo
       
    [root@k8s-master01 work]# chmod +x /opt/k8s/bin/*
    [root@k8s-master01 work]# export PATH=/opt/k8s/bin:$PATH
       
    2) Create the root certificate (CA)
    The CA certificate is shared by all nodes in the cluster; only one CA certificate is needed, and every certificate created afterwards is signed by it.
    2.1) Create the configuration file
    The CA configuration file defines the usage profiles of the root certificate and their parameters (usages, expiry, server auth, client auth, encryption, etc.); a specific profile is referenced later when signing other certificates.
    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# cat > ca-config.json <<EOF
    {
      "signing": {
        "default": {
          "expiry": "87600h"
        },
        "profiles": {
          "kubernetes": {
            "usages": [
                "signing",
                "key encipherment",
                "server auth",
                "client auth"
            ],
            "expiry": "87600h"
          }
        }
      }
    }
    EOF
       
    Configuration notes:
    signing: the certificate can be used to sign other certificates, so the generated ca.pem contains CA=TRUE;
    server auth: a client may use this certificate to verify the certificate presented by a server;
    client auth: a server may use this certificate to verify the certificate presented by a client;
       
    2.2) Create the certificate signing request file
    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# cat > ca-csr.json <<EOF
    {
      "CN": "kubernetes",
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "Hefei",
          "L": "Hefei",
          "O": "k8s",
          "OU": "4Paradigm"
        }
      ]
    }
    EOF
       
    Configuration notes:
    CN: Common Name. kube-apiserver extracts this field from the certificate as the requesting User Name; browsers use it to verify whether a site is legitimate;
    O: Organization. kube-apiserver extracts this field from the certificate as the Group the requesting user belongs to;
    kube-apiserver uses the extracted User and Group as the identity for RBAC authorization;
       
    2.3) Generate the CA certificate and private key
    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# cfssl gencert -initca ca-csr.json | cfssljson -bare ca
    #ca-key.pem is the private key, ca.pem is the CA certificate (i.e. the root certificate), and ca.csr is the certificate signing request (used for cross-signing or re-signing)
    [root@k8s-master01 work]# ls ca*
    ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem

    3) Distribute the certificate files
    Copy the generated CA certificate, key and configuration file to the /etc/kubernetes/cert directory on all nodes:
    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
      do
        echo ">>> ${node_all_ip}"
        ssh root@${node_all_ip} "mkdir -p /etc/kubernetes/cert"
        scp ca*.pem ca-config.json root@${node_all_ip}:/etc/kubernetes/cert
      done
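    The generated root certificate can be inspected at any time with the cfssl-certinfo tool installed earlier, for example to confirm the subject, the validity period and that it is marked as a CA (optional check):
    [root@k8s-master01 work]# cfssl-certinfo -cert ca.pem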

    4. Deploy the kubectl command-line tool

    kubectl is the command-line management tool for the Kubernetes cluster. By default kubectl reads the kube-apiserver address and authentication information from ~/.kube/config; if that file is missing, kubectl commands fail. kubectl only needs to be set up once: the generated kubeconfig file is generic and can be copied to any node that should run kubectl and renamed to ~/.kube/config. Here kubectl is deployed only on the three master nodes and nowhere else, so later kubectl administration is possible only from the master nodes.

    [root@k8s-master01 work]# wget https://storage.googleapis.com/kubernetes-release/release/v1.19.6/kubernetes-server-linux-amd64.tar.gz
    [root@k8s-master01 work]# tar -zxvf kubernetes-server-linux-amd64.tar.gz
    Distribute it to every node that will use kubectl; here it is only distributed to the three master nodes
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_master_ip in ${NODE_MASTER_IPS[@]}
    do
      echo ">>> ${node_master_ip}"
      scp kubernetes/server/bin/kubectl root@${node_master_ip}:/opt/k8s/bin/
      ssh root@${node_master_ip} "chmod +x /opt/k8s/bin/*"
    done
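    Since /opt/k8s/bin is already on PATH, the copied binary can be sanity-checked on the masters before going further (a minimal check; the kubeconfig it needs is generated in the next step):
    [root@k8s-master01 work]# kubectl version --client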
    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# cat > admin-csr.json <<EOF
    {
      "CN": "admin",
      "hosts": [],
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "Hefei",
          "L": "Hefei",
          "O": "system:masters",
          "OU": "4Paradigm"
        }
      ]
    }
    EOF
    Configuration notes:
    O is system:masters: when kube-apiserver receives a request with this certificate it sets the request's Group to system:masters;
    the predefined ClusterRoleBinding cluster-admin binds the Group system:masters to the Role cluster-admin, which grants access to all APIs;
    this certificate is only used by kubectl as a client certificate, so the hosts field is empty;
    Generate the certificate and private key:
    [root@k8s-master01 work]# cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes admin-csr.json | cfssljson -bare admin
    [root@k8s-master01 work]# ls admin*
    admin.csr  admin-csr.json  admin-key.pem  admin.pem

    3. Create the kubeconfig file

    The kubeconfig file is kubectl's configuration file and contains everything needed to access the apiserver: the apiserver address, the CA certificate and the client's own certificate;
    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
      
    Set the cluster parameters
    [root@k8s-master01 work]# kubectl config set-cluster kubernetes \
      --certificate-authority=/opt/k8s/work/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=kubectl.kubeconfig
      
    Set the client authentication parameters
    [root@k8s-master01 work]# kubectl config set-credentials admin \
      --client-certificate=/opt/k8s/work/admin.pem \
      --client-key=/opt/k8s/work/admin-key.pem \
      --embed-certs=true \
      --kubeconfig=kubectl.kubeconfig
      
    Set the context parameters
    [root@k8s-master01 work]# kubectl config set-context kubernetes \
      --cluster=kubernetes \
      --user=admin \
      --kubeconfig=kubectl.kubeconfig
      
    Set the default context
    [root@k8s-master01 work]# kubectl config use-context kubernetes --kubeconfig=kubectl.kubeconfig
      
    Configuration notes:
    --certificate-authority: the root certificate used to verify the kube-apiserver certificate;
    --client-certificate, --client-key: the admin certificate and private key just generated, used when connecting to kube-apiserver;
    --embed-certs=true: embed the contents of ca.pem and admin.pem into the generated kubectl.kubeconfig file (without it, only the certificate file paths are written,
    and the certificate files would have to be copied separately whenever the kubeconfig is copied to another machine, which is inconvenient)
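    The resulting file can be reviewed before it is distributed; in this view kubectl redacts the embedded certificate and key data (optional check):
    [root@k8s-master01 work]# kubectl config view --kubeconfig=kubectl.kubeconfig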

    4. Distribute the kubeconfig file and save it as ~/.kube/config

    Distribute it to every node that runs kubectl commands, i.e. to the three master nodes

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_master_ip in ${NODE_MASTER_IPS[@]}
    do
      echo ">>> ${node_master_ip}"
      ssh root@${node_master_ip} "mkdir -p ~/.kube"
      scp kubectl.kubeconfig root@${node_master_ip}:~/.kube/config
    done

    5. Deploy the etcd cluster

    etcd is a Raft-based distributed key-value store developed by CoreOS, commonly used for service discovery, shared configuration and coordination (leader election, distributed locks, and so on). Kubernetes stores all of its runtime data in etcd. Because etcd is the storage layer, a single-node setup is not recommended; like zookeeper it relies on leader election, so an odd number of members (3, 5, 7) is usually deployed. The cluster keeps working as long as more than half of the members are alive, otherwise it may become unusable. The commands below are all executed on k8s-master01, which then distributes files and runs commands remotely.

    1. Download and distribute the etcd binaries

    [root@k8s-master01 ~]# cd /opt/k8s/work
    [root@k8s-master01 work]# wget https://github.com/coreos/etcd/releases/download/v3.3.13/etcd-v3.3.13-linux-amd64.tar.gz
    [root@k8s-master01 work]# tar -xvf etcd-v3.3.13-linux-amd64.tar.gz
    Distribute the binaries to all etcd cluster nodes:
    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_etcd_ip in ${NODE_ETCD_IPS[@]}
      do
        echo ">>> ${node_etcd_ip}"
        scp etcd-v3.3.13-linux-amd64/etcd* root@${node_etcd_ip}:/opt/k8s/bin
        ssh root@${node_etcd_ip} "chmod +x /opt/k8s/bin/*"
      done

    2. Create the etcd certificate and private key

    Create the certificate signing request:
    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# cat > etcd-csr.json <<EOF
    {
      "CN": "etcd",
      "hosts": [
        "127.0.0.1",
        "172.31.46.28",
        "172.31.46.63",
        "172.31.46.67"
      ],
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "Hefei",
          "L": "Hefei",
          "O": "k8s",
          "OU": "4Paradigm"
        }
      ]
    }
    EOF
       
    Configuration notes:
    the hosts field lists the etcd node IPs or domain names authorized to use this certificate; all three etcd cluster node IPs must be included;
       
    Generate the certificate and private key
    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# cfssl gencert -ca=/opt/k8s/work/ca.pem \
        -ca-key=/opt/k8s/work/ca-key.pem \
        -config=/opt/k8s/work/ca-config.json \
        -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
       
    [root@k8s-master01 work]# ls etcd*pem
    etcd-key.pem  etcd.pem
       
    Distribute the generated certificate and private key to each etcd node
    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_etcd_ip in ${NODE_ETCD_IPS[@]}
      do
        echo ">>> ${node_etcd_ip}"
        ssh root@${node_etcd_ip} "mkdir -p /etc/etcd/cert"
        scp etcd*.pem root@${node_etcd_ip}:/etc/etcd/cert/
      done

    3. Create the etcd systemd unit template file

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# cat > etcd.service.template <<EOF
    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    Documentation=https://github.com/coreos
       
    [Service]
    Type=notify
    WorkingDirectory=${ETCD_DATA_DIR}
    ExecStart=/opt/k8s/bin/etcd \
      --data-dir=${ETCD_DATA_DIR} \
      --wal-dir=${ETCD_WAL_DIR} \
      --name=##NODE_ETCD_NAME## \
      --cert-file=/etc/etcd/cert/etcd.pem \
      --key-file=/etc/etcd/cert/etcd-key.pem \
      --trusted-ca-file=/etc/kubernetes/cert/ca.pem \
      --peer-cert-file=/etc/etcd/cert/etcd.pem \
      --peer-key-file=/etc/etcd/cert/etcd-key.pem \
      --peer-trusted-ca-file=/etc/kubernetes/cert/ca.pem \
      --peer-client-cert-auth \
      --client-cert-auth \
      --listen-peer-urls=https://##NODE_ETCD_IP##:2380 \
      --initial-advertise-peer-urls=https://##NODE_ETCD_IP##:2380 \
      --listen-client-urls=https://##NODE_ETCD_IP##:2379,http://127.0.0.1:2379 \
      --advertise-client-urls=https://##NODE_ETCD_IP##:2379 \
      --initial-cluster-token=etcd-cluster-0 \
      --initial-cluster=${ETCD_NODES} \
      --initial-cluster-state=new \
      --auto-compaction-mode=periodic \
      --auto-compaction-retention=1 \
      --max-request-bytes=33554432 \
      --quota-backend-bytes=6442450944 \
      --heartbeat-interval=250 \
      --election-timeout=2000
    Restart=on-failure
    RestartSec=5
    LimitNOFILE=65536
       
    [Install]
    WantedBy=multi-user.target
    EOF
       
    Configuration notes:
    WorkingDirectory, --data-dir: the working directory and data directory are set to ${ETCD_DATA_DIR}; this directory must be created before the service is started;
    --wal-dir: the WAL directory; for better performance this is usually an SSD, or at least a different disk partition from --data-dir;
    --name: the node name; when --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list;
    --cert-file, --key-file: the certificate and private key etcd uses when talking to clients;
    --trusted-ca-file: the CA certificate that signed the client certificates, used to verify them;
    --peer-cert-file, --peer-key-file: the certificate and private key etcd uses when talking to its peers;
    --peer-trusted-ca-file: the CA certificate that signed the peer certificates, used to verify them;

    4. Create and distribute the etcd systemd unit file for each etcd node

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for (( i=0; i < 3; i++ ))
      do
        sed -e "s/##NODE_ETCD_NAME##/${NODE_ETCD_NAMES[i]}/" -e "s/##NODE_ETCD_IP##/${NODE_ETCD_IPS[i]}/" etcd.service.template > etcd-${NODE_ETCD_IPS[i]}.service
      done
    
    Configuration notes:
    NODE_ETCD_NAMES and NODE_ETCD_IPS are bash arrays of the same length, holding the etcd cluster node names and their corresponding IPs;
    
    [root@k8s-master01 work]# ls *.service
    etcd-172.31.46.28.service  etcd-172.31.46.63.service  etcd-172.31.46.67.service
    It is worth opening one of the generated etcd unit files by hand to confirm that --name and the IPs were substituted correctly
    [root@k8s-master01 work]# cat etcd-172.31.46.63.service 
    ......
    --peer-trusted-ca-file=/etc/kubernetes/cert/ca.pem 
      --peer-client-cert-auth 
      --client-cert-auth 
      --listen-peer-urls=https://172.31.46.63:2380 
      --initial-advertise-peer-urls=https://172.31.46.63:2380 
      --listen-client-urls=https://172.31.46.63:2379,http://127.0.0.1:2379 
      --advertise-client-urls=https://172.31.46.63:2379 
      --initial-cluster-token=etcd-cluster-0 
      --initial-cluster=k8s-etcd01=https://172.31.46.28:2380,k8s-etcd02=https://172.31.46.63:2380,k8s-etcd03=https://172.31.46.67:2380 
      --initial-cluster-state=new 
    ......
    Distribute the generated systemd unit files:
    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_etcd_ip in ${NODE_ETCD_IPS[@]}
      do
        echo ">>> ${node_etcd_ip}"
        scp etcd-${node_etcd_ip}.service root@${node_etcd_ip}:/etc/systemd/system/etcd.service
      done
       
    Configuration note: the file is renamed to etcd.service on the target node;

    5. Start the etcd service

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_etcd_ip in ${NODE_ETCD_IPS[@]}
      do
        echo ">>> ${node_etcd_ip}"
        ssh root@${node_etcd_ip} "mkdir -p ${ETCD_DATA_DIR} ${ETCD_WAL_DIR}"
        ssh root@${node_etcd_ip} "systemctl daemon-reload && systemctl enable etcd && systemctl restart etcd "
      done
       
    Configuration notes:
    the etcd data and working directories must be created first;
    on first start the etcd process waits for the other members to join the cluster, so systemctl start etcd hanging for a while is normal;

    6. Check that etcd started and verify the service status

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_etcd_ip in ${NODE_ETCD_IPS[@]}
      do
        echo ">>> ${node_etcd_ip}"
        ssh root@${node_etcd_ip} "systemctl status etcd|grep Active"
      done
    Expected output:
    >>> 172.31.46.28
       Active: active (running) since Wed 2021-01-20 16:27:53 CST; 1h 25min ago
    >>> 172.31.46.63
       Active: active (running) since Wed 2021-01-20 16:27:53 CST; 1h 25min ago
    >>> 172.31.46.67
       Active: active (running) since Wed 2021-01-20 16:27:53 CST; 1h 25min ago
    Make sure the status is active (running) on every node; otherwise check the logs to find the cause (for example with "journalctl -u etcd")

    Verify the service health

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_etcd_ip in ${NODE_ETCD_IPS[@]}
      do
        echo ">>> ${node_etcd_ip}"
        ssh root@${node_etcd_ip} "ETCDCTL_API=3 /opt/k8s/bin/etcdctl \
        --endpoints=https://${node_etcd_ip}:2379 \
        --cacert=/etc/kubernetes/cert/ca.pem \
        --cert=/etc/etcd/cert/etcd.pem \
        --key=/etc/etcd/cert/etcd-key.pem endpoint health"
      done

    Expected output:

    >>> 172.31.46.28
    https://172.31.46.28:2379 is healthy: successfully committed proposal: took = 1.332226ms
    >>> 172.31.46.63
    https://172.31.46.63:2379 is healthy: successfully committed proposal: took = 1.732246ms
    >>> 172.31.46.67
    https://172.31.46.67:2379 is healthy: successfully committed proposal: took = 1.512986ms

    The cluster is working normally when every endpoint reports healthy.

    7. Find the current etcd cluster leader

    Run the following command on any one of the three etcd nodes
    [root@k8s-master02 ~]# source /opt/k8s/bin/environment.sh
    [root@k8s-master02 ~]# ETCDCTL_API=3 /opt/k8s/bin/etcdctl \
    >  -w table --cacert=/etc/kubernetes/cert/ca.pem \
    >   --cert=/etc/etcd/cert/etcd.pem \
    >   --key=/etc/etcd/cert/etcd-key.pem \
    >   --endpoints=${ETCD_ENDPOINTS} endpoint status
    
    Expected output:
    +---------------------------+------------------+---------+---------+-----------+-----------+------------+
    |         ENDPOINT          |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
    +---------------------------+------------------+---------+---------+-----------+-----------+------------+
    | https://172.31.46.28:2379 | e8b4f8d15e0ed9ba |  3.3.13 |   20 kB |      true |         2 |          9 |
    | https://172.31.46.63:2379 | fc277334df3f12d1 |  3.3.13 |   20 kB |     false |         2 |          9 |
    | https://172.31.46.67:2379 | b6e7cab93bd2c0b2 |  3.3.13 |   20 kB |     false |         2 |          9 |
    +---------------------------+------------------+---------+---------+-----------+-----------+------------+
    As the table above shows, the current leader node is 172.31.46.28
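    Cluster membership can be cross-checked with the same certificates (a minimal example; the member IDs will differ from environment to environment):
    [root@k8s-master02 ~]# ETCDCTL_API=3 /opt/k8s/bin/etcdctl \
    >   -w table --cacert=/etc/kubernetes/cert/ca.pem \
    >   --cert=/etc/etcd/cert/etcd.pem \
    >   --key=/etc/etcd/cert/etcd-key.pem \
    >   --endpoints=${ETCD_ENDPOINTS} member list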

    6. Deploy the Flannel container network

    Kubernetes requires that all cluster nodes (masters and workers here) can reach each other over the Pod network. Flannel uses vxlan to build an overlay Pod network between the nodes, using UDP port 8472 (this port must be opened, e.g. in public clouds such as AWS). When flanneld starts for the first time it reads the configured Pod network from etcd, allocates an unused subnet for its own node, and creates the flannel.1 network interface (the name may differ, e.g. flannel1). Flannel writes the subnet assigned to its node into the /run/flannel/docker file; docker later uses the environment variables in that file to configure the docker0 bridge, so that all Pod containers on the node get IPs from that subnet. The commands below are all executed on k8s-master01, which then distributes files and runs commands remotely.

    1. Download and distribute the flanneld binaries

    #Download the release tarball from the flannel releases page (https://github.com/coreos/flannel/releases):
    [root@k8s-master01 work]# cd /opt/k8s/work/
    [root@k8s-master01 work]# mkdir flannel
    [root@k8s-master01 work]# wget https://github.com/coreos/flannel/releases/download/v0.12.0/flannel-v0.12.0-linux-amd64.tar.gz
    [root@k8s-master01 work]# tar -zxvf flannel-v0.12.0-linux-amd64.tar.gz -C flannel
    #Distribute the binaries to all cluster nodes:
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh 
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
    do
         echo ">>> ${node_all_ip}"
         scp flannel/{flanneld,mk-docker-opts.sh} root@${node_all_ip}:/opt/k8s/bin/
         ssh root@${node_all_ip} "chmod +x /opt/k8s/bin/*"
    done

    2. Create the flannel certificate and private key

    #flanneld reads and writes subnet allocation data in the etcd cluster, and etcd has mutual x509 certificate authentication enabled, so a certificate and private key must be generated for flanneld.
    Create the certificate signing request:
    [root@k8s-master01 work]# cd /opt/k8s/work/
    [root@k8s-master01 work]# cat > flanneld-csr.json <<EOF
    {
       "CN": "flanneld",
       "hosts": [],
       "key": {
         "algo": "rsa",
         "size": 2048
       },
       "names": [
         {
           "C": "CN",
           "ST": "Hefei",
           "L": "Hefei",
           "O": "k8s",
           "OU": "4Paradigm"
         }
       ]
     }
    EOF
    #this certificate is only used by flanneld as a client certificate, so the hosts field is empty;
    #Generate the certificate and private key:
    [root@k8s-master01 work]# cfssl gencert -ca=/opt/k8s/work/ca.pem \
     -ca-key=/opt/k8s/work/ca-key.pem \
     -config=/opt/k8s/work/ca-config.json \
     -profile=kubernetes flanneld-csr.json | cfssljson -bare flanneld
    #Distribute the generated certificate and private key to all nodes (masters and workers):
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
      do
        echo ">>> ${node_all_ip}"
        ssh root@${node_all_ip} "mkdir -p /etc/flanneld/cert"
        scp flanneld*.pem root@${node_all_ip}:/etc/flanneld/cert
      done

    3. Write the cluster Pod network configuration into etcd (note: this step is only executed once)

    [root@k8s-master01 ~]# cd /opt/k8s/work/
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh 
    [root@k8s-master01 work]# etcdctl \
    >   --endpoints=${ETCD_ENDPOINTS} \
    >   --ca-file=/opt/k8s/work/ca.pem \
    >   --cert-file=/opt/k8s/work/flanneld.pem \
    >   --key-file=/opt/k8s/work/flanneld-key.pem \
    >   mk ${FLANNEL_ETCD_PREFIX}/config '{"Network":"'${CLUSTER_CIDR}'", "SubnetLen": 21, "Backend": {"Type": "vxlan"}}'
    #the Pod network ${CLUSTER_CIDR} written here (e.g. /16) must have a shorter prefix than SubnetLen (i.e. be larger than the per-node subnets), and must match the --cluster-cidr parameter of kube-controller-manager;

    4. Create the flanneld systemd unit file

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# cat > flanneld.service << EOF
    [Unit]
    Description=Flanneld overlay address etcd agent
    After=network.target
    After=network-online.target
    Wants=network-online.target
    After=etcd.service
    Before=docker.service
     
    [Service]
    Type=notify
    ExecStart=/opt/k8s/bin/flanneld \
      -etcd-cafile=/etc/kubernetes/cert/ca.pem \
      -etcd-certfile=/etc/flanneld/cert/flanneld.pem \
      -etcd-keyfile=/etc/flanneld/cert/flanneld-key.pem \
      -etcd-endpoints=${ETCD_ENDPOINTS} \
      -etcd-prefix=${FLANNEL_ETCD_PREFIX} \
      -iface=${IFACE} \
      -ip-masq
    ExecStartPost=/opt/k8s/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
    Restart=always
    RestartSec=5
    StartLimitInterval=0
     
    [Install]
    WantedBy=multi-user.target
    RequiredBy=docker.service
    EOF
    Configuration notes:
    the mk-docker-opts.sh script writes the Pod subnet assigned to flanneld into the /run/flannel/docker file; docker later uses the environment variables in this file to configure the docker0 bridge;
    flanneld communicates with the other nodes over the interface that holds the system default route; on nodes with several interfaces (e.g. an internal and a public one), the -iface parameter selects the interface to use;
    flanneld needs root privileges at runtime;
    -ip-masq: flanneld sets up SNAT rules for traffic leaving the Pod network and at the same time sets the --ip-masq variable passed to Docker (in /run/flannel/docker) to false,
    so Docker no longer creates its own SNAT rules. Docker's SNAT rule (when its --ip-masq is true) is rather blunt: it SNATs every request from local Pods to any non-docker0 interface,
    so requests to Pods on other nodes appear to come from the flannel.1 interface IP and the destination Pod never sees the real source Pod IP. The SNAT rule created by flanneld is gentler: it only SNATs traffic that leaves the Pod network.
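    For reference, a docker systemd unit typically consumes /run/flannel/docker as sketched below; the actual docker deployment is done later, so treat this only as an illustration of the mechanism, not as the final unit file:
    [Service]
    # sketch: EnvironmentFile pulls in DOCKER_NETWORK_OPTIONS written by mk-docker-opts.sh,
    # so dockerd creates docker0 inside the Pod subnet that flanneld allocated to this node
    EnvironmentFile=-/run/flannel/docker
    ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS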

    5. Distribute the flanneld systemd unit file to all nodes

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
      do
        echo ">>> ${node_all_ip}"
        scp flanneld.service root@${node_all_ip}:/etc/systemd/system/
      done
     

    6. Start the flanneld service

    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
      do
        echo ">>> ${node_all_ip}"
        ssh root@${node_all_ip} "systemctl daemon-reload && systemctl enable flanneld && systemctl restart flanneld"
      done

    7. Check the startup result

    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
      do
        echo ">>> ${node_all_ip}"
        ssh root@${node_all_ip} "systemctl status flanneld|grep Active"
      done
     
    Make sure the status is active (running); otherwise check the logs with "journalctl -u flanneld" to find the cause

    8. Check the Pod subnets assigned to each flanneld

    #View the cluster Pod network (/16):
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --ca-file=/etc/kubernetes/cert/ca.pem \
      --cert-file=/etc/flanneld/cert/flanneld.pem \
      --key-file=/etc/flanneld/cert/flanneld-key.pem \
      get ${FLANNEL_ETCD_PREFIX}/config
     
    Expected output: {"Network":"172.30.0.0/16", "SubnetLen": 21, "Backend": {"Type": "vxlan"}}
     
    View the list of allocated Pod subnets (/21):
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --ca-file=/etc/kubernetes/cert/ca.pem \
      --cert-file=/etc/flanneld/cert/flanneld.pem \
      --key-file=/etc/flanneld/cert/flanneld-key.pem \
      ls ${FLANNEL_ETCD_PREFIX}/subnets
    Expected output:
    /kubernetes/network/subnets/172.30.240.0-21
    /kubernetes/network/subnets/172.30.80.0-21
    /kubernetes/network/subnets/172.30.24.0-21
    /kubernetes/network/subnets/172.30.96.0-21
    /kubernetes/network/subnets/172.30.232.0-21
    /kubernetes/network/subnets/172.30.184.0-21
    #View the node IP and flannel interface address behind a given Pod subnet:
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --ca-file=/etc/kubernetes/cert/ca.pem \
      --cert-file=/etc/flanneld/cert/flanneld.pem \
      --key-file=/etc/flanneld/cert/flanneld-key.pem \
      get ${FLANNEL_ETCD_PREFIX}/subnets/172.30.24.0-21
    Expected output: {"PublicIP":"172.31.46.67","BackendType":"vxlan","BackendData":{"VtepMAC":"ea:b1:38:7e:42:59"}}
    Notes:
    the subnet 172.30.24.0/21 is assigned to node k8s-master03 (172.31.46.67);
    VtepMAC is the MAC address of the flannel.1 interface on k8s-master03;
     

    9. Check the flannel network on a node (k8s-master01, for example)

    [root@k8s-master01 work]# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
        link/ether fa:16:3e:53:f5:90 brd ff:ff:ff:ff:ff:ff
        inet 172.31.46.28/24 brd 172.31.46.255 scope global dynamic eth0
           valid_lft 41472sec preferred_lft 41472sec
    3: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default 
        link/ether 72:f3:d8:ca:cf:5b brd ff:ff:ff:ff:ff:ff
        inet 172.30.240.0/32 scope global flannel.1
           valid_lft forever preferred_lft forever
    Note: the flannel.1 interface address is the first IP (.0) of the assigned Pod subnet, and it is a /32 address
    #The routes below show that requests to the Pod subnets of other nodes are all forwarded to the flannel.1 interface;
    [root@k8s-master01 work]# ip route show |grep flannel.1
    172.30.24.0/21 via 172.30.24.0 dev flannel.1 onlink 
    172.30.80.0/21 via 172.30.80.0 dev flannel.1 onlink 
    172.30.96.0/21 via 172.30.96.0 dev flannel.1 onlink 
    172.30.184.0/21 via 172.30.184.0 dev flannel.1 onlink 
    172.30.232.0/21 via 172.30.232.0 dev flannel.1 onlink
    flanneld uses the subnet records in etcd, such as ${FLANNEL_ETCD_PREFIX}/subnets/172.30.232.0-21, to decide which node's interconnect IP a request should be forwarded to;

    10. Verify that the nodes can reach each other over the Pod network

    #After flannel is deployed on every node, check that the flannel interface was created (the name may be flannel0, flannel.0, flannel.1, etc.):
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
      do
        echo ">>> ${node_all_ip}"
        ssh ${node_all_ip} "/usr/sbin/ip addr show flannel.1|grep -w inet"
      done
    Expected output:
    >>> 172.31.46.28
        inet 172.30.240.0/32 scope global flannel.1
    >>> 172.31.46.63
        inet 172.30.80.0/32 scope global flannel.1
    >>> 172.31.46.67
        inet 172.30.24.0/32 scope global flannel.1
    >>> 172.31.46.26
        inet 172.30.96.0/32 scope global flannel.1
    >>> 172.31.46.38
        inet 172.30.232.0/32 scope global flannel.1
    >>> 172.31.46.15
        inet 172.30.184.0/32 scope global flannel.1
    Ping every flannel interface IP from every node and make sure they are all reachable:
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
      do
        echo ">>> ${node_all_ip}"
        ssh ${node_all_ip} "ping -c 1 172.30.240.0"
        ssh ${node_all_ip} "ping -c 1 172.30.80.0"
        ssh ${node_all_ip} "ping -c 1 172.30.24.0"
        ssh ${node_all_ip} "ping -c 1 172.30.96.0"
        ssh ${node_all_ip} "ping -c 1 172.30.232.0"
        ssh ${node_all_ip} "ping -c 1 172.30.184.0"
    done

    7. nginx layer-4 proxy environment

    nginx's layer-4 transparent proxying is used here to give the K8S nodes (masters and workers) highly available access to kube-apiserver. The control-plane components kube-controller-manager and kube-scheduler run as multiple instances (3), so as long as one instance is up they remain highly available. An nginx+keepalived setup exposes a single VIP backed by the multiple apiserver instances, with nginx doing health checks and load balancing. kubelet, kube-proxy, controller-manager and scheduler access kube-apiserver through the VIP, which makes kube-apiserver itself highly available.

    1. Install and configure nginx

    The following steps are carried out in the same way on both 172.31.46.22 and 172.31.46.3, i.e. the k8s-ha01 and k8s-ha02 nodes planned above.

    1. Download and build nginx
    [root@k8s-ha01 ~]# yum -y install gcc pcre-devel zlib-devel openssl-devel wget lsof
    [root@k8s-ha01 ~]# cd /opt/k8s/work
    [root@k8s-ha01 work]# wget http://nginx.org/download/nginx-1.15.3.tar.gz
    [root@k8s-ha01 work]# tar -xzvf nginx-1.15.3.tar.gz
    [root@k8s-ha01 work]# cd nginx-1.15.3
    [root@k8s-ha01 nginx-1.15.3]# mkdir nginx-prefix
    [root@k8s-ha01 nginx-1.15.3]# ./configure --with-stream --without-http --prefix=$(pwd)/nginx-prefix --without-http_uwsgi_module --without-http_scgi_module --without-http_fastcgi_module
     
    Configuration notes:
    --with-stream: enable layer-4 transparent forwarding (TCP proxy);
    --without-xxx: disable all other features so the resulting dynamically linked binary has minimal dependencies;
     
    Expected output:
    Configuration summary
      + PCRE library is not used
      + OpenSSL library is not used
      + zlib library is not used
    
      nginx path prefix: "/opt/k8s/work/nginx-1.15.3/nginx-prefix"
      nginx binary file: "/opt/k8s/work/nginx-1.15.3/nginx-prefix/sbin/nginx"
      nginx modules path: "/opt/k8s/work/nginx-1.15.3/nginx-prefix/modules"
      nginx configuration prefix: "/opt/k8s/work/nginx-1.15.3/nginx-prefix/conf"
      nginx configuration file: "/opt/k8s/work/nginx-1.15.3/nginx-prefix/conf/nginx.conf"
      nginx pid file: "/opt/k8s/work/nginx-1.15.3/nginx-prefix/logs/nginx.pid"
      nginx error log file: "/opt/k8s/work/nginx-1.15.3/nginx-prefix/logs/error.log"
      nginx http access log file: "/opt/k8s/work/nginx-1.15.3/nginx-prefix/logs/access.log"
      nginx http client request body temporary files: "client_body_temp"
      nginx http proxy temporary files: "proxy_temp"
    Build and install:
    [root@k8s-ha01 nginx-1.15.3]# make && make install
    2. Verify the built nginx
    [root@k8s-ha01 nginx-1.15.3]# ./nginx-prefix/sbin/nginx -v
    nginx version: nginx/1.15.3
    Check which libraries nginx is dynamically linked against:
    [root@k8s-ha01 nginx-1.15.3]# ldd ./nginx-prefix/sbin/nginx
        linux-vdso.so.1 =>  (0x00007ffeab3f9000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fbe69175000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fbe68f59000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fbe68b8c000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fbe69379000)
    Since only layer-4 transparent forwarding was enabled, nginx depends only on core OS libraries such as libc and has no other library dependencies (libz, libssl, etc.), which makes it easy to deploy on any OS version;
    3. Install and deploy nginx
    [root@k8s-ha01 nginx-1.15.3]# mkdir -p /opt/k8s/kube-nginx/{conf,logs,sbin}
    [root@k8s-ha01 nginx-1.15.3]# cp /opt/k8s/work/nginx-1.15.3/nginx-prefix/sbin/nginx /opt/k8s/kube-nginx/sbin/kube-nginx
    [root@k8s-ha01 nginx-1.15.3]# chmod a+x /opt/k8s/kube-nginx/sbin/*
    Configure nginx and enable layer-4 transparent forwarding:
    [root@k8s-ha01 nginx-1.15.3]# vim /opt/k8s/kube-nginx/conf/kube-nginx.conf
    worker_processes 2;
     
    events {
        worker_connections  65525;
    }
     
    stream {
        upstream backend {
            hash $remote_addr consistent;
            server 172.31.46.28:6443        max_fails=3 fail_timeout=30s;
            server 172.31.46.63:6443        max_fails=3 fail_timeout=30s;
            server 172.31.46.67:6443        max_fails=3 fail_timeout=30s;
        }
     
        server {
            listen 8443;
            proxy_connect_timeout 1s;
            proxy_pass backend;
        }
    }
    [root@k8s-ha01 nginx-1.15.3]# ulimit -n 65525
    [root@k8s-ha01 nginx-1.15.3]# vim /etc/security/limits.conf # append the following four lines at the bottom of the file
    * soft nofile 65525
    * hard nofile 65525
    * soft nproc 65525
    * hard nproc 65525
    4. Create the systemd unit file and start the service
    [root@k8s-ha01 nginx-1.15.3]# vim /etc/systemd/system/kube-nginx.service
    
    [Unit]
    Description=kube-apiserver nginx proxy
    After=network.target
    After=network-online.target
    Wants=network-online.target
     
    [Service]
    Type=forking
    ExecStartPre=/opt/k8s/kube-nginx/sbin/kube-nginx -c /opt/k8s/kube-nginx/conf/kube-nginx.conf -p /opt/k8s/kube-nginx -t
    ExecStart=/opt/k8s/kube-nginx/sbin/kube-nginx -c /opt/k8s/kube-nginx/conf/kube-nginx.conf -p /opt/k8s/kube-nginx
    ExecReload=/opt/k8s/kube-nginx/sbin/kube-nginx -c /opt/k8s/kube-nginx/conf/kube-nginx.conf -p /opt/k8s/kube-nginx -s reload
    PrivateTmp=true
    Restart=always
    RestartSec=5
    StartLimitInterval=0
    LimitNOFILE=65536
     
    [Install]
    WantedBy=multi-user.target
    [root@k8s-ha01 nginx-1.15.3]# systemctl daemon-reload && systemctl enable kube-nginx && systemctl restart kube-nginx
    [root@k8s-ha01 nginx-1.15.3]# lsof -i:8443
    COMMAND     PID   USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
    kube-ngin 32008   root    5u  IPv4 16980744      0t0  TCP *:pcsync-https (LISTEN)
    kube-ngin 32009 nobody    5u  IPv4 16980744      0t0  TCP *:pcsync-https (LISTEN)
    kube-ngin 32010 nobody    5u  IPv4 16980744      0t0  TCP *:pcsync-https (LISTEN)
    Test connectivity to the 8443 proxy port
    [root@k8s-ha01 nginx-1.15.3]# telnet 172.31.46.47 8443
    Trying 172.31.46.47...
    telnet: connect to address 172.31.46.47: No route to host
    This is expected at this point: the VIP 172.31.46.47 has not been brought up yet (keepalived is configured in the next section), and the three backend kube-apiserver instances on port 6443 have not been deployed yet either.
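    Once keepalived (next section) has brought the VIP up, the same test should at least reach the nginx listener; whether traffic makes it all the way to the apiservers can only be verified after they are deployed:
    [root@k8s-ha01 nginx-1.15.3]# telnet 172.31.46.47 8443    #should then report "Connected to 172.31.46.47"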

    2. Install and configure keepalived

    The following steps are carried out on both 172.31.46.22 and 172.31.46.3, i.e. the k8s-ha01 and k8s-ha02 nodes planned above.

    1. Build and install keepalived (identical on both nodes)
    [root@k8s-ha01 ~]# cd /opt/k8s/work/
    [root@k8s-ha01 work]# wget https://www.keepalived.org/software/keepalived-2.0.16.tar.gz
    [root@k8s-ha01 work]# tar -zvxf keepalived-2.0.16.tar.gz
    [root@k8s-ha01 work]# cd keepalived-2.0.16
    [root@k8s-ha01 keepalived-2.0.16]# ./configure
    [root@k8s-ha01 keepalived-2.0.16]# make && make install
    [root@k8s-ha01 keepalived-2.0.16]# cp keepalived/etc/init.d/keepalived /etc/rc.d/init.d/
    [root@k8s-ha01 keepalived-2.0.16]# cp /usr/local/etc/sysconfig/keepalived /etc/sysconfig/
    [root@k8s-ha01 keepalived-2.0.16]# mkdir /etc/keepalived
    [root@k8s-ha01 keepalived-2.0.16]# cp /usr/local/etc/keepalived/keepalived.conf /etc/keepalived/
    [root@k8s-ha01 keepalived-2.0.16]# cp /usr/local/sbin/keepalived /usr/sbin/
    [root@k8s-ha01 keepalived-2.0.16]# echo "/etc/init.d/keepalived start" >> /etc/rc.local
    2. Configure keepalived
    keepalived configuration on 172.31.46.22 (the k8s-ha01 node):
    [root@k8s-ha01 ~]# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
    [root@k8s-ha01 ~]# >/etc/keepalived/keepalived.conf
    [root@k8s-ha01 ~]# vim /etc/keepalived/keepalived.conf
    ! Configuration File for keepalived    
       
    global_defs {
    notification_email {    
    ops@wangshibo.cn 
    tech@wangshibo.cn
    }
       
    notification_email_from ops@wangshibo.cn 
    smtp_server 127.0.0.1     
    smtp_connect_timeout 30   
    router_id master-node    
    }
       
    vrrp_script chk_http_port {     
        script "/opt/chk_nginx.sh" 
        interval 2                  
        weight -5                  
        fall 2              
        rise 1                 
    }
       
    vrrp_instance VI_1 {   
        state MASTER   
        interface eth0
        mcast_src_ip 172.31.46.22
        virtual_router_id 51        
        priority 101               
        advert_int 1                
        authentication {           
            auth_type PASS         
            auth_pass 1111         
        }
        virtual_ipaddress {       
            172.31.46.47
        }
      
    track_script {                     
       chk_http_port                   
    }
    }
    keepalived configuration on the other node, 172.31.46.3:
    [root@k8s-ha02 ~]# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
    [root@k8s-ha02 ~]# >/etc/keepalived/keepalived.conf
    [root@k8s-ha02 ~]# vim /etc/keepalived/keepalived.conf
    ! Configuration File for keepalived    
       
    global_defs {
    notification_email {    
    ops@wangshibo.cn 
    tech@wangshibo.cn
    }
       
    notification_email_from ops@wangshibo.cn 
    smtp_server 127.0.0.1     
    smtp_connect_timeout 30   
    router_id slave-node    
    }
       
    vrrp_script chk_http_port {     
        script "/opt/chk_nginx.sh" 
        interval 2                  
        weight -5                  
        fall 2              
        rise 1                 
    }
       
    vrrp_instance VI_1 {   
        state MASTER   
        interface eth0
        mcast_src_ip 172.31.46.3
        virtual_router_id 51        
        priority 99              
        advert_int 1                
        authentication {           
            auth_type PASS         
            auth_pass 1111         
        }
        virtual_ipaddress {       
        172.31.46.47
        }
      
    track_script {                     
       chk_http_port                   
    }
    }
    3. Configure the nginx monitoring script on both nodes (it is referenced from keepalived.conf above)
    [root@k8s-ha01 ~]# vim /opt/chk_nginx.sh
    #!/bin/bash
    # count running kube-nginx processes
    counter=$(ps -ef|grep -w kube-nginx|grep -v grep|wc -l)
    if [ "${counter}" = "0" ]; then
        # kube-nginx is not running: try to start it once
        systemctl start kube-nginx
        sleep 2
        counter=$(ps -ef|grep kube-nginx|grep -v grep|wc -l)
        if [ "${counter}" = "0" ]; then
            # still not running: stop keepalived so the VIP fails over to the other node
            /etc/init.d/keepalived stop
        fi
    fi
     
    [root@k8s-ha01 ~]# chmod 755 /opt/chk_nginx.sh
    4. Start the keepalived service on both nodes
    [root@k8s-ha01 ~]# /etc/init.d/keepalived start
    Starting keepalived (via systemctl):                       [  OK  ]
    [root@k8s-ha01 ~]# ps -ef|grep keepalived
    root      2283     1  0 11:42 ?        00:00:00 /usr/local/sbin/keepalived -D
    root      2284  2283  0 11:42 ?        00:00:00 /usr/local/sbin/keepalived -D
    root      2348 31756  0 11:42 pts/0    00:00:00 grep --color=auto keepalived
    Check the VIP. Initially the VIP sits on the master node by default
    [root@k8s-ha01 ~]# ip addr
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
        link/ether fa:16:3e:47:7d:8a brd ff:ff:ff:ff:ff:ff
        inet 172.31.46.22/24 brd 172.31.46.255 scope global dynamic eth0
           valid_lft 40761sec preferred_lft 40761sec
        inet 172.31.46.47/32 scope global eth0
           valid_lft forever preferred_lft forever
    5. Test VIP failover

    [root@k8s-ha01 ~]# /etc/init.d/keepalived stop
    Stopping keepalived (via systemctl): [ OK ]
    [root@k8s-ha01 ~]# ps -ef |grep keepalived
    root 22843 22670 0 14:15 pts/0 00:00:00 grep --color=auto keepalived
    [root@k8s-ha01 ~]# ip addr
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether fa:16:3e:47:7d:8a brd ff:ff:ff:ff:ff:ff
    inet 172.31.46.22/24 brd 172.31.46.255 scope global dynamic eth0
    valid_lft 31611sec preferred_lft 31611sec

    [root@k8s-ha02 ~]# ip addr
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether fa:16:3e:c6:a1:51 brd ff:ff:ff:ff:ff:ff
    inet 172.31.46.3/24 brd 172.31.46.255 scope global dynamic eth0
    valid_lft 29997sec preferred_lft 29997sec
    inet 172.31.46.47/32 scope global eth0
    valid_lft forever preferred_lft forever

    Test findings:
    When the keepalived service on the master node dies, the VIP automatically fails over to the slave node.
    When the keepalived service on the master node recovers, it takes the VIP back from the slave node (this is determined by the priority values in the keepalived configuration).
    And if kube-nginx dies on either node, keepalived runs the nginx monitoring script to restart it; if the restart fails, keepalived itself is stopped so that the VIP moves to the other node.
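    If this fail-back (the master taking the VIP again as soon as it recovers) is not wanted, keepalived also supports a non-preemptive mode; a sketch of the change, not part of the configuration used above:
    vrrp_instance VI_1 {
        state BACKUP        # nopreempt requires both nodes to start as BACKUP
        nopreempt           # whichever node currently holds the VIP keeps it until it fails
        ......
    }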
  • Original article: https://www.cnblogs.com/qingbaizhinian/p/14290333.html