• Linux Ops & Architecture: Deploying a Kubernetes Cluster with TLS Mutual Authentication


    I. Kubernetes authentication and authorization

          Almost every operation on a Kubernetes cluster goes through the kube-apiserver component, which exposes an HTTP RESTful API to clients inside and outside the cluster. Note that authentication and authorization apply only to the HTTPS form of the API: if a client connects to kube-apiserver over plain HTTP, no authentication or authorization is performed at all. A common setup is therefore HTTP for communication between components inside the cluster and HTTPS for access from outside, which improves security without adding too much complexity.

          Kubernetes offers several authentication mechanisms. For communication within the cluster you can use mutual TLS (HTTPS) authentication, or one-way TLS combined with token- or username/password-based authentication. k8s is usually deployed on an internal network with private IP addresses, and public CAs only sign certificates for domain names, so we build our own CA here.
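
    As a quick illustration of what mutual TLS means here: the client verifies the apiserver against the CA certificate, and the apiserver in turn verifies the client's certificate. A minimal sketch, assuming the admin certificate created later in this article and the kube-nginx proxy address used below:

    curl --cacert /etc/kubernetes/cert/ca.pem \
      --cert /opt/k8s/work/admin.pem \
      --key /opt/k8s/work/admin-key.pem \
      https://127.0.0.1:8443/version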

    1. k8s cluster component versions and environment

    Kubernetes 1.12.3
    Docker 18.09.0-ce
    Etcd 3.3.10
    Flanneld 0.10.0
    Add-ons:
    Coredns
    Dashboard
    Heapster (influxdb、grafana)
    Metrics-Server
    EFK (elasticsearch、fluentd、kibana)
    Image registries:
    docker registry
    harbor

    2. Cluster architecture diagram (figure not reproduced here)

    3. System environment

    [root@k8s-master ~]# cat /etc/redhat-release 
    CentOS Linux release 7.2.1511 (Core) 
    [root@k8s-master ~]# uname -r
    3.10.0-327.el7.x86_64
    [root@k8s-master ~]# systemctl status firewalld.service 
    ● firewalld.service - firewalld - dynamic firewall daemon
       Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
       Active: inactive (dead)
    [root@k8s-master ~]# getenforce 
    Disabled

    4. Server plan

    Role                          Hostname     IP
    Master, etcd, registry        K8s-node-1   10.0.0.206
    Node1 (kube-proxy, kubelet)   K8s-node-2   10.0.0.207
    Node2 (kube-proxy, kubelet)   K8s-node-3   10.0.0.208

    5. Unified hosts resolution

    cat >> /etc/hosts <<EOF
    10.0.0.206  K8s-node-1
    10.0.0.207  K8s-node-2
    10.0.0.208  K8s-node-3
    EOF

    6. Passwordless SSH login to the other nodes

    ssh-keygen -t rsa
    ssh-copy-id root@K8s-node-1
    ssh-copy-id root@K8s-node-2
    ssh-copy-id root@K8s-node-3

    7. Environment variable setup

    echo 'PATH=/opt/k8s/bin:$PATH' >>/root/.bashrc

    8. Install dependency packages

    yum install -y epel-release
    yum install -y conntrack ipvsadm ipset jq iptables curl sysstat libseccomp
    /usr/sbin/modprobe ip_vs

    9. Kernel tuning

    cat > kubernetes.conf <<EOF
    net.bridge.bridge-nf-call-iptables=1
    net.bridge.bridge-nf-call-ip6tables=1
    net.ipv4.ip_forward=1
    net.ipv4.tcp_tw_recycle=0
    vm.swappiness=0 # avoid swap; use it only when the system would otherwise OOM
    vm.overcommit_memory=1 # do not check whether enough physical memory is available
    vm.panic_on_oom=0 # do not panic on OOM; let the OOM killer handle it
    fs.inotify.max_user_instances=8192
    fs.inotify.max_user_watches=1048576
    fs.file-max=52706963
    fs.nr_open=52706963
    net.ipv6.conf.all.disable_ipv6=1
    net.netfilter.nf_conntrack_max=2310720
    EOF
    cp kubernetes.conf  /etc/sysctl.d/kubernetes.conf
    sysctl -p /etc/sysctl.d/kubernetes.conf

    10. Time synchronization

    yum -y install ntp ntpdate
    [root@k8s-master ~]# crontab -l
    #sync time
    5 * * * * /usr/sbin/ntpdate cn.pool.ntp.org >/dev/null 2>&1

    11. Configure rsyslogd and systemd journald

    systemd's journald is the default logging tool on CentOS 7 and records logs for the whole system, the kernel, and every service unit. Compared with rsyslog, journald has the following advantages:
    it can log to memory or to the file system (by default it logs to memory, under /run/log/journal);
    it can cap the disk space used and guarantee a minimum of free disk space;
    it can limit log file size and retention time.
    By default journald also forwards logs to rsyslog, so each log entry is written more than once: /var/log/messages fills with irrelevant entries, which makes later inspection inconvenient and also hurts performance.

    mkdir /var/log/journal # directory for persistent logs
    mkdir /etc/systemd/journald.conf.d
    cat > /etc/systemd/journald.conf.d/99-prophet.conf <<EOF
    [Journal]
    # persist logs to disk
    Storage=persistent
    
    # compress historical logs
    Compress=yes
    
    SyncIntervalSec=5m
    RateLimitInterval=30s
    RateLimitBurst=1000
    
    # cap total disk usage at 10G
    SystemMaxUse=10G
    
    # cap each log file at 200M
    SystemMaxFileSize=200M
    
    # keep logs for 2 weeks
    MaxRetentionSec=2week
    
    # do not forward logs to syslog
    ForwardToSyslog=no
    EOF
    systemctl restart systemd-journald

    12. Create the required directories

    mkdir -p  /opt/k8s/{bin,work} /etc/kubernetes/cert /etc/etcd/cert 

    13. Distribute the cluster environment-variable script

    #!/usr/bin/bash
    
    # encryption key used to generate the EncryptionConfig
    export ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64)
    
    # array of cluster node IPs
    export NODE_IPS=(10.0.0.206 10.0.0.207 10.0.0.208)
    
    # array of hostnames corresponding to the IPs above
    export NODE_NAMES=(K8s-node-1 K8s-node-2 K8s-node-3)
    
    # etcd cluster client endpoints
    export ETCD_ENDPOINTS="https://10.0.0.206:2379,https://10.0.0.207:2379,https://10.0.0.208:2379"
    
    # IPs and ports used for etcd peer (inter-cluster) communication
    export ETCD_NODES="K8s-node-1=https://10.0.0.206:2380,K8s-node-2=https://10.0.0.207:2380,K8s-node-3=https://10.0.0.208:2380"
    
    # address and port of the kube-apiserver reverse proxy (kube-nginx)
    export KUBE_APISERVER="https://127.0.0.1:8443"
    
    # network interface used for inter-node communication
    export IFACE="eth0"
    
    # etcd data directory
    export ETCD_DATA_DIR="/data/k8s/etcd/data"
    
    # etcd WAL directory; ideally an SSD partition, or at least a different partition from ETCD_DATA_DIR
    export ETCD_WAL_DIR="/data/k8s/etcd/wal"
    
    # data directory for the k8s components
    export K8S_DIR="/data/k8s/k8s"
    
    # docker data directory
    export DOCKER_DIR="/data/k8s/docker"
    
    ## The parameters below usually do not need to be changed
    
    # token used for TLS bootstrapping; can be generated with: head -c 16 /dev/urandom | od -An -t x | tr -d ' '
    BOOTSTRAP_TOKEN="41f7e4ba8b7be874fcff18bf5cf41a7c"
    
    # Prefer currently unused network ranges for the service and Pod networks
    
    # service network: unroutable before deployment, routable inside the cluster afterwards (guaranteed by kube-proxy)
    SERVICE_CIDR="10.254.0.0/16"
    
    # Pod network: a /16 is recommended; unroutable before deployment, routable inside the cluster afterwards (guaranteed by flanneld)
    CLUSTER_CIDR="172.30.0.0/16"
    
    # service (NodePort) port range
    export NODE_PORT_RANGE="30000-32767"
    
    # etcd prefix for the flanneld network configuration
    export FLANNEL_ETCD_PREFIX="/kubernetes/network"
    
    # kubernetes service IP (usually the first IP of SERVICE_CIDR)
    export CLUSTER_KUBERNETES_SVC_IP="10.254.0.1"
    
    # cluster DNS service IP (pre-allocated from SERVICE_CIDR)
    export CLUSTER_DNS_SVC_IP="10.254.0.2"
    
    # cluster DNS domain (without a trailing dot)
    export CLUSTER_DNS_DOMAIN="cluster.local"
    
    # add the binary directory /opt/k8s/bin to PATH
    export PATH=/opt/k8s/bin:$PATH

    Save the script above as environment.sh and copy it to /opt/k8s/bin on every node:

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp /opt/k8s/bin/environment.sh root@${node_ip}:/opt/k8s/bin/
        ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
      done

     At this point the base environment is ready. Unless otherwise noted, all following operations are performed on K8s-node-1 and the results are then distributed to the other nodes.

    II. Create the CA certificate and key

    Install CFSSL

    mkdir -p /opt/k8s/cert && cd /opt/k8s
    wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
    mv cfssl_linux-amd64 /opt/k8s/bin/cfssl
    
    wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
    mv cfssljson_linux-amd64 /opt/k8s/bin/cfssljson
    
    wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
    mv cfssl-certinfo_linux-amd64 /opt/k8s/bin/cfssl-certinfo
    
    chmod +x /opt/k8s/bin/*
    export PATH=/opt/k8s/bin:$PATH

    ① Create the root CA

    The CA certificate is shared by all nodes in the cluster. Only one CA needs to be created; every certificate created afterwards is signed by it.

    Create the configuration file

    cd /opt/k8s/work
    cat > ca-config.json <<EOF
    {
      "signing": {
        "default": {
          "expiry": "87600h"
        },
        "profiles": {
          "kubernetes": {
            "usages": [
                "signing",
                "key encipherment",
                "server auth",
                "client auth"
            ],
            "expiry": "87600h"
          }
        }
      }
    }
    EOF

    signing: the certificate can be used to sign other certificates (the generated ca.pem has CA=TRUE);
    server auth: clients may use the certificate to verify certificates presented by servers;
    client auth: servers may use the certificate to verify certificates presented by clients.

    Create the certificate signing request (CSR) file

    cd /opt/k8s/work
    cat > ca-csr.json <<EOF
    {
      "CN": "kubernetes",
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "BeiJing",
          "L": "BeiJing",
          "O": "k8s",
          "OU": "4Paradigm"
        }
      ]
    }
    EOF

    ② Generate the CA certificate and private key

    cd /opt/k8s/work
    cfssl gencert -initca ca-csr.json | cfssljson -bare ca
    ls ca*
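
    To sanity-check the result, the generated CA can be inspected with the cfssl-certinfo tool installed earlier, for example:

    cfssl-certinfo -cert ca.pem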

    ③ Distribute the certificate files

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh # import the NODE_IPS variable
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "mkdir -p /etc/kubernetes/cert"
        scp ca*.pem ca-config.json root@${node_ip}:/etc/kubernetes/cert
      done

    III. Deploy kubectl

    1. Download and distribute the kubectl binary

    ① Download and unpack

    cd /opt/k8s/work
    wget https://dl.k8s.io/v1.12.3/kubernetes-client-linux-amd64.tar.gz
    tar -xzvf kubernetes-client-linux-amd64.tar.gz

    ② Distribute to every node that uses kubectl

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp kubernetes/client/bin/kubectl root@${node_ip}:/opt/k8s/bin/
        ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
      done

    2. Create the admin certificate and private key

    kubectl communicates with the apiserver's secure (HTTPS) port, and the apiserver authenticates and authorizes the certificate it presents.
    As the cluster management tool, kubectl must be granted the highest privileges, so an admin certificate with full rights is created here (its O field, system:masters, is bound to the cluster-admin role by a predefined ClusterRoleBinding).

    ① Create the certificate signing request

    cd /opt/k8s/work
    cat > admin-csr.json <<EOF
    {
      "CN": "admin",
      "hosts": [],
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "BeiJing",
          "L": "BeiJing",
          "O": "system:masters",
          "OU": "4Paradigm"
        }
      ]
    }
    EOF

    ② Generate the certificate and private key

    cd /opt/k8s/work
    cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes admin-csr.json | cfssljson -bare admin
    ls admin*

    3. Create the kubeconfig file

    ① kubeconfig is the configuration file for kubectl

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    
    # set cluster parameters
    kubectl config set-cluster kubernetes \
      --certificate-authority=/opt/k8s/work/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=kubectl.kubeconfig
    
    # set client authentication parameters
    kubectl config set-credentials admin \
      --client-certificate=/opt/k8s/work/admin.pem \
      --client-key=/opt/k8s/work/admin-key.pem \
      --embed-certs=true \
      --kubeconfig=kubectl.kubeconfig
    
    # set context parameters
    kubectl config set-context kubernetes \
      --cluster=kubernetes \
      --user=admin \
      --kubeconfig=kubectl.kubeconfig
      
    # set the default context
    kubectl config use-context kubernetes --kubeconfig=kubectl.kubeconfig

    --certificate-authority: the root certificate used to verify the kube-apiserver certificate;
    --client-certificate, --client-key: the admin certificate and key just generated, used when connecting to kube-apiserver;
    --embed-certs=true: embed the contents of ca.pem and admin.pem into the generated kubectl.kubeconfig (without it, only the certificate file paths are written).
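
    To confirm the file was assembled correctly you can, for example, dump it (certificate data is redacted in the output):

    kubectl config view --kubeconfig=kubectl.kubeconfig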

    Distribute the kubeconfig file

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "mkdir -p ~/.kube"
        scp kubectl.kubeconfig root@${node_ip}:~/.kube/config
      done
    • saved as ~/.kube/config for each user

    IV. Deploy a highly available etcd cluster

     1. Download and distribute the etcd binaries

    ① Download and unpack

    cd /opt/k8s/work
    wget https://github.com/coreos/etcd/releases/download/v3.3.10/etcd-v3.3.10-linux-amd64.tar.gz
    tar -xvf etcd-v3.3.10-linux-amd64.tar.gz

    ② Distribute to every cluster node

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp etcd-v3.3.10-linux-amd64/etcd* root@${node_ip}:/opt/k8s/bin
        ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
      done

    2. Create the etcd certificate and private key

    ① Create the certificate signing request

    cd /opt/k8s/work
    cat > etcd-csr.json <<EOF
    {
      "CN": "etcd",
      "hosts": [
        "127.0.0.1",
        "10.0.0.206",
        "10.0.0.207",
        "10.0.0.208"
      ],
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "BeiJing",
          "L": "BeiJing",
          "O": "k8s",
          "OU": "4Paradigm"
        }
      ]
    }
    EOF

    ② Generate the certificate and private key

    cd /opt/k8s/work
    cfssl gencert -ca=/opt/k8s/work/ca.pem \
        -ca-key=/opt/k8s/work/ca-key.pem \
        -config=/opt/k8s/work/ca-config.json \
        -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
    ls etcd*pem

    ③ Distribute the generated certificate and key to each etcd node

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "mkdir -p /etc/etcd/cert"
        scp etcd*.pem root@${node_ip}:/etc/etcd/cert/
      done

    3. Create the etcd systemd unit template

    ① Create the template file

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    cat > etcd.service.template <<EOF
    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    Documentation=https://github.com/coreos
    
    [Service]
    Type=notify
    WorkingDirectory=${ETCD_DATA_DIR}
    ExecStart=/opt/k8s/bin/etcd \
      --data-dir=${ETCD_DATA_DIR} \
      --wal-dir=${ETCD_WAL_DIR} \
      --name=##NODE_NAME## \
      --cert-file=/etc/etcd/cert/etcd.pem \
      --key-file=/etc/etcd/cert/etcd-key.pem \
      --trusted-ca-file=/etc/kubernetes/cert/ca.pem \
      --peer-cert-file=/etc/etcd/cert/etcd.pem \
      --peer-key-file=/etc/etcd/cert/etcd-key.pem \
      --peer-trusted-ca-file=/etc/kubernetes/cert/ca.pem \
      --peer-client-cert-auth \
      --client-cert-auth \
      --listen-peer-urls=https://##NODE_IP##:2380 \
      --initial-advertise-peer-urls=https://##NODE_IP##:2380 \
      --listen-client-urls=https://##NODE_IP##:2379,http://127.0.0.1:2379 \
      --advertise-client-urls=https://##NODE_IP##:2379 \
      --initial-cluster-token=etcd-cluster-0 \
      --initial-cluster=${ETCD_NODES} \
      --initial-cluster-state=new \
      --auto-compaction-mode=periodic \
      --auto-compaction-retention=1 \
      --max-request-bytes=33554432 \
      --quota-backend-bytes=6442450944 \
      --heartbeat-interval=250 \
      --election-timeout=2000
    Restart=on-failure
    RestartSec=5
    LimitNOFILE=65536
    
    [Install]
    WantedBy=multi-user.target
    EOF
    • WorkingDirectory, --data-dir: set the working directory and data directory to ${ETCD_DATA_DIR}; this directory must be created before starting the service;
    • --wal-dir: the WAL directory; for performance, usually an SSD or a different disk from --data-dir;
    • --name: the node name; when --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list;
    • --cert-file, --key-file: certificate and key etcd uses when communicating with clients;
    • --trusted-ca-file: the CA certificate that signed the client certificates, used to verify them;
    • --peer-cert-file, --peer-key-file: certificate and key etcd uses for peer communication;
    • --peer-trusted-ca-file: the CA certificate that signed the peer certificates, used to verify them.

    ② Substitute the template variables and create a systemd unit file for each node

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for (( i=0; i < 3; i++ ))
      do
        sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" etcd.service.template > etcd-${NODE_IPS[i]}.service 
      done
    ls *.service

    ③ Distribute to the nodes

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp etcd-${node_ip}.service root@${node_ip}:/etc/systemd/system/etcd.service
      done

    4. Start the etcd service

    ① Start the etcd cluster service

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "mkdir -p ${ETCD_DATA_DIR} ${ETCD_WAL_DIR}"
        ssh root@${node_ip} "systemctl daemon-reload && systemctl enable etcd && systemctl restart etcd " &
      done
    • The etcd data and working directories must be created first;
    • on first start each etcd process waits for the other nodes to join the cluster, so systemctl start etcd appearing to hang for a while is normal.

    ② Check the startup status

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "systemctl status etcd|grep Active"
      done

    ③ Inspect the etcd logs (if anything failed)

    journalctl -u etcd

    ④ Verify the cluster

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
    ETCDCTL_API=3 /opt/k8s/bin/etcdctl \
        --endpoints=https://${node_ip}:2379 \
        --cacert=/opt/k8s/work/ca.pem \
        --cert=/etc/etcd/cert/etcd.pem \
        --key=/etc/etcd/cert/etcd-key.pem endpoint health
      done

    Output:

    >>> 10.0.0.206
    https://10.0.0.206:2379 is healthy: successfully committed proposal: took = 4.622709ms
    >>> 10.0.0.207
    https://10.0.0.207:2379 is healthy: successfully committed proposal: took = 3.621197ms
    >>> 10.0.0.208
    https://10.0.0.208:2379 is healthy: successfully committed proposal: took = 3.186656ms

    View the current leader

    source /opt/k8s/bin/environment.sh
    ETCDCTL_API=3 /opt/k8s/bin/etcdctl \
      -w table --cacert=/opt/k8s/work/ca.pem \
      --cert=/etc/etcd/cert/etcd.pem \
      --key=/etc/etcd/cert/etcd-key.pem \
      --endpoints=${ETCD_ENDPOINTS} endpoint status

    Output:

    +-------------------------+------------------+---------+---------+-----------+-----------+------------+
    | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
    +-------------------------+------------------+---------+---------+-----------+-----------+------------+
    | https://10.0.0.206:2379 | 23d31ba59ca79fa0 | 3.3.10 | 20 kB | false | 2 | 8 |
    | https://10.0.0.207:2379 | 2323451019f6428d | 3.3.10 | 20 kB | false | 2 | 8 |
    | https://10.0.0.208:2379 | 7a42012c95def99e | 3.3.10 | 20 kB | true | 2 | 8 |
    +-------------------------+------------------+---------+---------+-----------+-----------+------------+

    V. Deploy the flannel network

    1. Download and distribute the flanneld files

    ① Download and unpack

    cd /opt/k8s/work
    mkdir flannel
    wget https://github.com/coreos/flannel/releases/download/v0.10.0/flannel-v0.10.0-linux-amd64.tar.gz
    tar -xzvf flannel-v0.10.0-linux-amd64.tar.gz -C flannel

    ② Distribute flanneld to every cluster node

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp flannel/{flanneld,mk-docker-opts.sh} root@${node_ip}:/opt/k8s/bin/
        ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
      done

    2. Create the flanneld certificate and private key

    ① Create the certificate signing request

    cd /opt/k8s/work
    cat > flanneld-csr.json <<EOF
    {
      "CN": "flanneld",
      "hosts": [],
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "BeiJing",
          "L": "BeiJing",
          "O": "k8s",
          "OU": "4Paradigm"
        }
      ]
    }
    EOF

    ② Generate the certificate and private key

    cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes flanneld-csr.json | cfssljson -bare flanneld
    ls flanneld*pem

    ③ Distribute the certificate and key to the cluster nodes

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "mkdir -p /etc/flanneld/cert"
        scp flanneld*.pem root@${node_ip}:/etc/flanneld/cert
      done

    ④ Write the cluster Pod network configuration into etcd

    Note: this step only needs to be executed once.

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --ca-file=/opt/k8s/work/ca.pem \
      --cert-file=/opt/k8s/work/flanneld.pem \
      --key-file=/opt/k8s/work/flanneld-key.pem \
      set ${FLANNEL_ETCD_PREFIX}/config '{"Network":"'${CLUSTER_CIDR}'", "SubnetLen": 21, "Backend": {"Type": "vxlan"}}'
    • flanneld v0.10.0 does not support etcd v3, so the configuration key and network data are written with the etcd v2 API;
    • the Pod network written here (${CLUSTER_CIDR}, e.g. a /16) must have a shorter prefix than SubnetLen, and it must match the kube-controller-manager --cluster-cidr value.

    3. Create the flanneld systemd unit file

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    cat > flanneld.service << EOF
    [Unit]
    Description=Flanneld overlay address etcd agent
    After=network.target
    After=network-online.target
    Wants=network-online.target
    After=etcd.service
    Before=docker.service
    
    [Service]
    Type=notify
    ExecStart=/opt/k8s/bin/flanneld \
      -etcd-cafile=/etc/kubernetes/cert/ca.pem \
      -etcd-certfile=/etc/flanneld/cert/flanneld.pem \
      -etcd-keyfile=/etc/flanneld/cert/flanneld-key.pem \
      -etcd-endpoints=${ETCD_ENDPOINTS} \
      -etcd-prefix=${FLANNEL_ETCD_PREFIX} \
      -iface=${IFACE} \
      -ip-masq
    ExecStartPost=/opt/k8s/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
    Restart=always
    RestartSec=5
    StartLimitInterval=0
    
    [Install]
    WantedBy=multi-user.target
    RequiredBy=docker.service
    EOF

    ① Distribute the flanneld systemd unit file to the cluster nodes

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp flanneld.service root@${node_ip}:/etc/systemd/system/
      done

    ② Start the flanneld service

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "systemctl daemon-reload && systemctl enable flanneld && systemctl restart flanneld"
      done

    ③ Check the service status

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "systemctl status flanneld|grep Active"
      done

    ④ If there are errors, inspect the logs

    journalctl -u flanneld

    ⑤ Check the Pod network configuration stored for flanneld

    source /opt/k8s/bin/environment.sh
    etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --ca-file=/etc/kubernetes/cert/ca.pem \
      --cert-file=/etc/flanneld/cert/flanneld.pem \
      --key-file=/etc/flanneld/cert/flanneld-key.pem \
      get ${FLANNEL_ETCD_PREFIX}/config
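
    This only prints the cluster-wide configuration. To also see the subnet leased to each node, you can list the subnets key (a sketch using the same etcd v2 flags as above):

    etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --ca-file=/etc/kubernetes/cert/ca.pem \
      --cert-file=/etc/flanneld/cert/flanneld.pem \
      --key-file=/etc/flanneld/cert/flanneld-key.pem \
      ls ${FLANNEL_ETCD_PREFIX}/subnets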

    ⑥ Verify that the nodes can reach each other over the Pod network

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh ${node_ip} "/usr/sbin/ip addr show flannel.1|grep -w inet"
      done

    Output:

    >>> 10.0.0.206
        inet 172.30.168.0/32 scope global flannel.1
    >>> 10.0.0.207
        inet 172.30.160.0/32 scope global flannel.1
    >>> 10.0.0.208
        inet 172.30.112.0/32 scope global flannel.1

    VI. kube-apiserver high availability: the nginx proxy

    1. Download and compile nginx

    cd /opt/k8s/work
    wget http://nginx.org/download/nginx-1.15.3.tar.gz
    tar -xzvf nginx-1.15.3.tar.gz
    
    cd /opt/k8s/work/nginx-1.15.3
    mkdir nginx-prefix
    ./configure --with-stream --without-http --prefix=$(pwd)/nginx-prefix --without-http_uwsgi_module --without-http_scgi_module --without-http_fastcgi_module
    cd /opt/k8s/work/nginx-1.15.3
    make && make install
    • --with-stream: enable layer-4 transparent forwarding (TCP proxy);
    • --without-xxx: disable all other features, so the resulting dynamically linked binary has minimal dependencies.
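
    To confirm which modules were compiled in, you can print the configure arguments of the resulting binary, for example:

    /opt/k8s/work/nginx-1.15.3/nginx-prefix/sbin/nginx -V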

    2. Install and deploy nginx

    ① Create the directory structure

    mkdir -p /opt/k8s/kube-nginx/{conf,logs,sbin}

    ② Copy the files

    cp /opt/k8s/work/nginx-1.15.3/nginx-prefix/sbin/nginx  /opt/k8s/kube-nginx/sbin/kube-nginx
    chmod a+x /opt/k8s/kube-nginx/sbin/*

    ③ Configure nginx forwarding

    cat > /opt/k8s/kube-nginx/conf/kube-nginx.conf <<'EOF'
    worker_processes 1;
    
    events {
        worker_connections  1024;
    }
    
    stream {
        upstream backend {
            hash $remote_addr consistent;
            server 10.0.0.206:6443          max_fails=3 fail_timeout=30s;
            server 10.0.0.207:6443          max_fails=3 fail_timeout=30s;
            server 10.0.0.208:6443          max_fails=3 fail_timeout=30s;
        }
    
        server {
            listen 127.0.0.1:8443;
            proxy_connect_timeout 1s;
            proxy_pass backend;
        }
    }
    EOF

    ④ Create the systemd unit file and start the service

    cat > /etc/systemd/system/kube-nginx.service <<EOF
    [Unit]
    Description=kube-apiserver nginx proxy
    After=network.target
    After=network-online.target
    Wants=network-online.target
    
    [Service]
    Type=forking
    ExecStartPre=/opt/k8s/kube-nginx/sbin/kube-nginx -c /opt/k8s/kube-nginx/conf/kube-nginx.conf -p /opt/k8s/kube-nginx -t
    ExecStart=/opt/k8s/kube-nginx/sbin/kube-nginx -c /opt/k8s/kube-nginx/conf/kube-nginx.conf -p /opt/k8s/kube-nginx
    ExecReload=/opt/k8s/kube-nginx/sbin/kube-nginx -c /opt/k8s/kube-nginx/conf/kube-nginx.conf -p /opt/k8s/kube-nginx -s reload
    PrivateTmp=true
    Restart=always
    RestartSec=5
    StartLimitInterval=0
    LimitNOFILE=65536
    
    [Install]
    WantedBy=multi-user.target
    EOF

    Start the kube-nginx service

    systemctl daemon-reload && systemctl enable kube-nginx && systemctl restart kube-nginx

    ⑤ Check that kube-nginx is running

    systemctl status kube-nginx |grep 'Active:'

    ⑥ If there are errors, inspect the logs

    journalctl -u kube-nginx

    VII. Deploy the master nodes

    The kubernetes master nodes run the following components:

    • kube-apiserver
    • kube-scheduler
    • kube-controller-manager
    • kube-nginx

    1. Download the binaries

    cd /opt/k8s/work
    wget https://dl.k8s.io/v1.12.3/kubernetes-server-linux-amd64.tar.gz
    tar -xzvf kubernetes-server-linux-amd64.tar.gz
    cd kubernetes
    tar -xzvf  kubernetes-src.tar.gz

    Distribute the files to the cluster nodes

    cd /opt/k8s/work/kubernetes
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp server/bin/* root@${node_ip}:/opt/k8s/bin/
        ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
      done

    2. Deploy a highly available kube-apiserver cluster

    ① Create the kubernetes certificate and private key

    Create the certificate signing request

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    cat > kubernetes-csr.json <<EOF
    {
      "CN": "kubernetes",
      "hosts": [
        "127.0.0.1",
        "10.0.0.206",
        "10.0.0.207",
        "10.0.0.208",
        "${CLUSTER_KUBERNETES_SVC_IP}",
        "kubernetes",
        "kubernetes.default",
        "kubernetes.default.svc",
        "kubernetes.default.svc.cluster",
        "kubernetes.default.svc.cluster.local"
      ],
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "BeiJing",
          "L": "BeiJing",
          "O": "k8s",
          "OU": "4Paradigm"
        }
      ]
    }
    EOF
    • The hosts field lists the IPs and domain names authorized to use the certificate; here that includes the VIP, the apiserver node IPs, and the kubernetes service IP and domain names;

    • a domain name must not end with . (e.g. kubernetes.default.svc.cluster.local. is invalid), otherwise parsing fails with: x509: cannot parse dnsName "kubernetes.default.svc.cluster.local.";

    • if you use a domain other than cluster.local, e.g. opsnull.com, change the last two entries in the list to kubernetes.default.svc.opsnull and kubernetes.default.svc.opsnull.com;

    • the kubernetes service IP is created automatically by the apiserver, usually the first IP of the network given by --service-cluster-ip-range; it can be retrieved later with the command below:
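
    For example, once the apiserver is up:

    kubectl get svc kubernetes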

    Generate the certificate and private key

    cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes
    ls kubernetes*pem

    Distribute the certificate and key files to the master nodes

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "mkdir -p /etc/kubernetes/cert"
        scp kubernetes*.pem root@${node_ip}:/etc/kubernetes/cert/
      done

    Create the encryption configuration file

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    cat > encryption-config.yaml <<EOF
    kind: EncryptionConfig
    apiVersion: v1
    resources:
      - resources:
          - secrets
        providers:
          - aescbc:
              keys:
                - name: key1
                  secret: ${ENCRYPTION_KEY}
          - identity: {}
    EOF

    Distribute the encryption config file to /etc/kubernetes on the cluster nodes

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp encryption-config.yaml root@${node_ip}:/etc/kubernetes/
      done
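
    Once kube-apiserver is running with this file, newly created Secrets are stored AES-CBC encrypted in etcd. As a rough check (a sketch assuming environment.sh has been sourced, the etcd certificates created above, and a hypothetical Secret named test-secret in the default namespace), the value read directly from etcd should start with the k8s:enc:aescbc:v1: prefix rather than plaintext:

    ETCDCTL_API=3 etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --cacert=/etc/kubernetes/cert/ca.pem \
      --cert=/etc/etcd/cert/etcd.pem \
      --key=/etc/etcd/cert/etcd-key.pem \
      get /registry/secrets/default/test-secret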

    ② Create the kube-apiserver systemd unit template

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    cat > kube-apiserver.service.template <<EOF
    [Unit]
    Description=Kubernetes API Server
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    After=network.target
    
    [Service]
    WorkingDirectory=${K8S_DIR}/kube-apiserver
    ExecStart=/opt/k8s/bin/kube-apiserver \
      --enable-admission-plugins=Initializers,NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \
      --anonymous-auth=false \
      --experimental-encryption-provider-config=/etc/kubernetes/encryption-config.yaml \
      --advertise-address=##NODE_IP## \
      --bind-address=##NODE_IP## \
      --insecure-port=0 \
      --authorization-mode=Node,RBAC \
      --runtime-config=api/all \
      --enable-bootstrap-token-auth \
      --service-cluster-ip-range=${SERVICE_CIDR} \
      --service-node-port-range=${NODE_PORT_RANGE} \
      --tls-cert-file=/etc/kubernetes/cert/kubernetes.pem \
      --tls-private-key-file=/etc/kubernetes/cert/kubernetes-key.pem \
      --client-ca-file=/etc/kubernetes/cert/ca.pem \
      --kubelet-certificate-authority=/etc/kubernetes/cert/ca.pem \
      --kubelet-client-certificate=/etc/kubernetes/cert/kubernetes.pem \
      --kubelet-client-key=/etc/kubernetes/cert/kubernetes-key.pem \
      --kubelet-https=true \
      --service-account-key-file=/etc/kubernetes/cert/ca.pem \
      --etcd-cafile=/etc/kubernetes/cert/ca.pem \
      --etcd-certfile=/etc/kubernetes/cert/kubernetes.pem \
      --etcd-keyfile=/etc/kubernetes/cert/kubernetes-key.pem \
      --etcd-servers=${ETCD_ENDPOINTS} \
      --enable-swagger-ui=true \
      --allow-privileged=true \
      --max-mutating-requests-inflight=2000 \
      --max-requests-inflight=4000 \
      --apiserver-count=3 \
      --audit-log-maxage=30 \
      --audit-log-maxbackup=3 \
      --audit-log-maxsize=100 \
      --audit-log-path=${K8S_DIR}/kube-apiserver/audit.log \
      --event-ttl=168h \
      --logtostderr=true \
      --v=2
    Restart=on-failure
    RestartSec=5
    Type=notify
    LimitNOFILE=65536
    
    [Install]
    WantedBy=multi-user.target
    EOF

    ③ Create and distribute the kube-apiserver systemd unit files

    Create a systemd unit file for each node

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for (( i=0; i < 3; i++ ))
      do
        sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" kube-apiserver.service.template > kube-apiserver-${NODE_IPS[i]}.service 
      done
    ls kube-apiserver*.service

    Distribute the generated systemd unit files

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp kube-apiserver-${node_ip}.service root@${node_ip}:/etc/systemd/system/kube-apiserver.service
      done

    ④ Start the kube-apiserver service

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kube-apiserver"
        ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-apiserver && systemctl restart kube-apiserver"
      done

    Check that kube-apiserver is running

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "systemctl status kube-apiserver |grep 'Active:'"
      done

    If startup fails, inspect the logs

    journalctl -u kube-apiserver

    ⑤ Check cluster information

    [root@k8s-node-1 work]# kubectl cluster-info
    Kubernetes master is running at https://127.0.0.1:8443
    
    To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
    [root@k8s-node-1 work]# kubectl get all --all-namespaces
    NAMESPACE   NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
    default     service/kubernetes   ClusterIP   10.254.0.1   <none>        443/TCP   3m37s
    [root@k8s-node-1 work]# kubectl get componentstatuses
    NAME                 STATUS      MESSAGE                                                                                     ERROR
    scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused   
    controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused   
    etcd-2               Healthy     {"health":"true"}                                                                           
    etcd-1               Healthy     {"health":"true"}                                                                           
    etcd-0               Healthy     {"health":"true"}

    (scheduler and controller-manager report Unhealthy here simply because they have not been deployed yet)

    ⑥ Check the ports kube-apiserver listens on

    [root@k8s-node-1 ~]# netstat -lnpt|grep kube
    tcp        0      0 10.0.0.206:6443         0.0.0.0:*               LISTEN      26023/kube-apiserve

    ⑦ Grant the kubernetes certificate access to the kubelet API

    When commands such as kubectl exec, run, and logs are executed, the apiserver forwards the request to the kubelet. The RBAC rule defined here authorizes the apiserver (which presents the kubernetes certificate, CN=kubernetes) to call the kubelet API.

    kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes

    3. Deploy a highly available kube-controller-manager cluster

    ① Create the kube-controller-manager certificate and private key

    Create the certificate signing request

    cd /opt/k8s/work
    cat > kube-controller-manager-csr.json <<EOF
    {
        "CN": "system:kube-controller-manager",
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "hosts": [
          "127.0.0.1",
          "10.0.0.206",
          "10.0.0.207",
          "10.0.0.208"
        ],
        "names": [
          {
            "C": "CN",
            "ST": "BeiJing",
            "L": "BeiJing",
            "O": "system:kube-controller-manager",
            "OU": "4Paradigm"
          }
        ]
    }
    EOF

    Generate the certificate and private key

    cd /opt/k8s/work
    cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
    ls kube-controller-manager*pem

    Distribute the certificate and key to the cluster nodes:

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp kube-controller-manager*.pem root@${node_ip}:/etc/kubernetes/cert/
      done

    ② Create and distribute the kubeconfig file

    The kubeconfig file contains everything needed to access the apiserver: the apiserver address, the CA certificate, and the component's own certificate.

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    kubectl config set-cluster kubernetes \
      --certificate-authority=/opt/k8s/work/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=kube-controller-manager.kubeconfig
    
    kubectl config set-credentials system:kube-controller-manager \
      --client-certificate=kube-controller-manager.pem \
      --client-key=kube-controller-manager-key.pem \
      --embed-certs=true \
      --kubeconfig=kube-controller-manager.kubeconfig
    
    kubectl config set-context system:kube-controller-manager \
      --cluster=kubernetes \
      --user=system:kube-controller-manager \
      --kubeconfig=kube-controller-manager.kubeconfig
    
    kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig

    Distribute the kubeconfig to the cluster nodes

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp kube-controller-manager.kubeconfig root@${node_ip}:/etc/kubernetes/
      done

    ③ Create and distribute the kube-controller-manager systemd unit file

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    cat > kube-controller-manager.service <<EOF
    [Unit]
    Description=Kubernetes Controller Manager
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    
    [Service]
    WorkingDirectory=${K8S_DIR}/kube-controller-manager
    ExecStart=/opt/k8s/bin/kube-controller-manager \
      --port=0 \
      --secure-port=10252 \
      --bind-address=127.0.0.1 \
      --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \
      --authentication-kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \
      --authorization-kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \
      --service-cluster-ip-range=${SERVICE_CIDR} \
      --cluster-name=kubernetes \
      --cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem \
      --cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem \
      --experimental-cluster-signing-duration=8760h \
      --root-ca-file=/etc/kubernetes/cert/ca.pem \
      --service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem \
      --leader-elect=true \
      --controllers=*,bootstrapsigner,tokencleaner \
      --horizontal-pod-autoscaler-use-rest-clients=true \
      --horizontal-pod-autoscaler-sync-period=10s \
      --tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem \
      --tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem \
      --use-service-account-credentials=true \
      --kube-api-qps=1000 \
      --kube-api-burst=2000 \
      --logtostderr=true \
      --v=2
    Restart=on-failure
    RestartSec=5
    
    [Install]
    WantedBy=multi-user.target
    EOF

    Distribute the systemd unit file to all cluster nodes

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp kube-controller-manager.service root@${node_ip}:/etc/systemd/system/
      done

    ④ Grant kube-controller-manager the required permissions

    kubectl create clusterrolebinding controller-manager:system:auth-delegator --user system:kube-controller-manager --clusterrole system:auth-delegator

    ⑤ Start the kube-controller-manager service

    Start the service

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kube-controller-manager"
        ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-controller-manager && systemctl restart kube-controller-manager"
      done

    Check the service status

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "systemctl status kube-controller-manager|grep Active"
      done

    Make sure the state is active (running); otherwise inspect the logs to find the cause:

    journalctl -u kube-controller-manager

    ⑥ Test kube-controller-manager high availability

    Stop the kube-controller-manager service on one or two nodes and watch the logs on the other nodes to see whether one of them acquires the leader lease.

    systemctl stop kube-controller-manager.service

    View the current leader

    [root@k8s-node-1 work]# kubectl get endpoints kube-controller-manager --namespace=kube-system  -o yaml
    apiVersion: v1
    kind: Endpoints
    metadata:
      annotations:
        control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"k8s-node-1_e1878012-7c9f-11e9-904c-000c290e828c","leaseDurationSeconds":15,"acquireTime":"2019-05-22T14:43:10Z","renewTime":"2019-05-22T14:44:39Z","leaderTransitions":3}'
      creationTimestamp: 2019-05-22T14:23:04Z
      name: kube-controller-manager
      namespace: kube-system
      resourceVersion: "1706"
      selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager
      uid: 1ce1e9ba-7c9d-11e9-b7a4-000c290e828c

    4. Deploy a highly available kube-scheduler cluster

    ① Create the kube-scheduler certificate and private key

    Create the certificate signing request

    cd /opt/k8s/work
    cat > kube-scheduler-csr.json <<EOF
    {
        "CN": "system:kube-scheduler",
        "hosts": [
          "127.0.0.1",
          "10.0.0.206",
          "10.0.0.207",
          "10.0.0.208"
        ],
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "names": [
          {
            "C": "CN",
            "ST": "BeiJing",
            "L": "BeiJing",
            "O": "system:kube-scheduler",
            "OU": "4Paradigm"
          }
        ]
    }
    EOF

    Generate the certificate and private key:

    cd /opt/k8s/work
    cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler
    ls kube-scheduler*pem

    ② Create and distribute the kubeconfig file

    Create the kubeconfig

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    kubectl config set-cluster kubernetes \
      --certificate-authority=/opt/k8s/work/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=kube-scheduler.kubeconfig
    
    kubectl config set-credentials system:kube-scheduler \
      --client-certificate=kube-scheduler.pem \
      --client-key=kube-scheduler-key.pem \
      --embed-certs=true \
      --kubeconfig=kube-scheduler.kubeconfig
    
    kubectl config set-context system:kube-scheduler \
      --cluster=kubernetes \
      --user=system:kube-scheduler \
      --kubeconfig=kube-scheduler.kubeconfig
    
    kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig

    Distribute the kubeconfig to all nodes:

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp kube-scheduler.kubeconfig root@${node_ip}:/etc/kubernetes/
      done

    ③ Create the kube-scheduler configuration file

    cd /opt/k8s/work
    cat <<EOF | sudo tee kube-scheduler.yaml
    apiVersion: componentconfig/v1alpha1
    kind: KubeSchedulerConfiguration
    clientConnection:
      kubeconfig: "/etc/kubernetes/kube-scheduler.kubeconfig"
    leaderElection:
      leaderElect: true
    EOF

    Distribute the kube-scheduler configuration file to all nodes

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp kube-scheduler.yaml root@${node_ip}:/etc/kubernetes/
      done

    ④ Create and distribute the kube-scheduler systemd unit file

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    cat > kube-scheduler.service <<EOF
    [Unit]
    Description=Kubernetes Scheduler
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    
    [Service]
    WorkingDirectory=${K8S_DIR}/kube-scheduler
    ExecStart=/opt/k8s/bin/kube-scheduler \
      --config=/etc/kubernetes/kube-scheduler.yaml \
      --address=127.0.0.1 \
      --kube-api-qps=100 \
      --logtostderr=true \
      --v=2
    Restart=always
    RestartSec=5
    StartLimitInterval=0
    
    [Install]
    WantedBy=multi-user.target
    EOF

    Distribute the systemd unit file to all nodes

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp kube-scheduler.service root@${node_ip}:/etc/systemd/system/
      done

    ⑤ Start the kube-scheduler service

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kube-scheduler"
        ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-scheduler && systemctl restart kube-scheduler"
      done

    ⑥ Check the service status

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "systemctl status kube-scheduler|grep Active"
      done

    Make sure the state is active (running); otherwise inspect the logs:

    journalctl -u kube-scheduler

    ⑦ Test kube-scheduler high availability

    Stop the kube-scheduler service on any one node and check (via the systemd logs) whether another node acquires the leader lease.

    systemctl stop kube-scheduler.service

    View the current leader

    kubectl get endpoints kube-scheduler --namespace=kube-system  -o yaml

    VIII. Deploy the node components

    The kubernetes nodes run the following components:

    • docker
    • kubelet     --- deployed below
    • kube-proxy
    • flanneld    --- already deployed above
    • kube-nginx

    1. Deploy the docker component

     Download the latest release from https://download.docker.com/linux/static/stable/x86_64/

    ① Download and distribute the docker binaries

    Download and unpack

    cd /opt/k8s/work
    wget https://download.docker.com/linux/static/stable/x86_64/docker-18.09.0.tgz
    tar -xvf docker-18.09.0.tgz

    Distribute the binaries to all nodes:

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp docker/*  root@${node_ip}:/opt/k8s/bin/
        ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
      done

    ② Create and distribute the systemd unit file

    cd /opt/k8s/work
    cat > docker.service <<"EOF"
    [Unit]
    Description=Docker Application Container Engine
    Documentation=http://docs.docker.io
    
    [Service]
    WorkingDirectory=##DOCKER_DIR##
    Environment="PATH=/opt/k8s/bin:/bin:/sbin:/usr/bin:/usr/sbin"
    EnvironmentFile=-/run/flannel/docker
    ExecStart=/opt/k8s/bin/dockerd $DOCKER_NETWORK_OPTIONS
    ExecReload=/bin/kill -s HUP $MAINPID
    Restart=on-failure
    RestartSec=5
    LimitNOFILE=infinity
    LimitNPROC=infinity
    LimitCORE=infinity
    Delegate=yes
    KillMode=process
    
    [Install]
    WantedBy=multi-user.target
    EOF
    • EOF is quoted so that bash does not expand variables inside the document, such as $DOCKER_NETWORK_OPTIONS;

    • dockerd invokes other docker binaries at runtime, such as docker-proxy, so the directory containing them must be on PATH;

    • flanneld writes its network configuration to /run/flannel/docker at startup; before starting, dockerd reads the DOCKER_NETWORK_OPTIONS variable from that file and uses it to configure the docker0 bridge network;

    • if several EnvironmentFile options are given, /run/flannel/docker must come last (so that docker0 uses the bip parameter generated by flanneld);

    • docker must run as the root user;

    • since version 1.13, docker may set the default policy of the iptables FORWARD chain to DROP, which breaks pinging Pod IPs on other nodes; in that case set the policy back to ACCEPT manually:

    iptables -P FORWARD ACCEPT

    Distribute the systemd unit file to all nodes, as sketched below.
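
    A sketch following the same scp pattern used throughout this article; note that the ##DOCKER_DIR## placeholder in the unit file must be substituted with the real path first:

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    # substitute the working-directory placeholder with the real docker data directory
    sed -i -e "s|##DOCKER_DIR##|${DOCKER_DIR}|" docker.service
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        scp docker.service root@${node_ip}:/etc/systemd/system/
      done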

    ③ Create and distribute the docker configuration file

    Use domestic registry mirrors to speed up image pulls and raise the concurrent download limit (restart dockerd for the changes to take effect).

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    cat > docker-daemon.json <<EOF
    {
        "registry-mirrors": ["https://hub-mirror.c.163.com", "https://docker.mirrors.ustc.edu.cn"],
        "insecure-registries": ["docker02:35000"],
        "max-concurrent-downloads": 20,
        "live-restore": true,
        "max-concurrent-uploads": 10,
        "debug": true,
        "data-root": "${DOCKER_DIR}/data",
        "exec-root": "${DOCKER_DIR}/exec",
        "log-opts": {
          "max-size": "100m",
          "max-file": "5"
        }
    }
    EOF

    Distribute the docker configuration file to all nodes

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "mkdir -p  /etc/docker/ ${DOCKER_DIR}/{data,exec}"
        scp docker-daemon.json root@${node_ip}:/etc/docker/daemon.json
      done

    ④ Start the docker service

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "systemctl stop firewalld && systemctl disable firewalld"
        ssh root@${node_ip} "/usr/sbin/iptables -F && /usr/sbin/iptables -X && /usr/sbin/iptables -F -t nat && /usr/sbin/iptables -X -t nat"
        ssh root@${node_ip} "/usr/sbin/iptables -P FORWARD ACCEPT"
        ssh root@${node_ip} "systemctl daemon-reload && systemctl enable docker && systemctl restart docker"
        ssh root@${node_ip} "sysctl -p /etc/sysctl.d/kubernetes.conf"
      done

    ⑤ Check the service status

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "systemctl status docker|grep Active"
      done

    Make sure the state is active (running); otherwise inspect the logs

    journalctl -u docker

    ⑥ Check the docker0 bridge

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "/usr/sbin/ip addr show flannel.1 && /usr/sbin/ip addr show docker0"
      done

    Confirm on each worker node that the docker0 bridge and the flannel.1 interface are in the same network (e.g. below, 172.30.168.0/32 lies within 172.30.168.1/21):

    4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN 
        link/ether 96:f0:62:fb:38:4b brd ff:ff:ff:ff:ff:ff
        inet 172.30.168.0/32 scope global flannel.1
           valid_lft forever preferred_lft forever
    5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN 
        link/ether 02:42:df:b8:e8:8d brd ff:ff:ff:ff:ff:ff
        inet 172.30.168.1/21 brd 172.30.175.255 scope global docker0
           valid_lft forever preferred_lft forever

    2. Deploy the kubelet component

    kubelet runs on every node. It receives requests from kube-apiserver, manages Pod containers, and executes interactive commands such as exec, run, and logs.

    ① Create the kubelet bootstrap kubeconfig files

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_name in ${NODE_NAMES[@]}
      do
        echo ">>> ${node_name}"
    
        # create a token
        export BOOTSTRAP_TOKEN=$(kubeadm token create \
          --description kubelet-bootstrap-token \
          --groups system:bootstrappers:${node_name} \
          --kubeconfig ~/.kube/config)
    
        # set cluster parameters
        kubectl config set-cluster kubernetes \
          --certificate-authority=/etc/kubernetes/cert/ca.pem \
          --embed-certs=true \
          --server=${KUBE_APISERVER} \
          --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig
    
        # set client authentication parameters
        kubectl config set-credentials kubelet-bootstrap \
          --token=${BOOTSTRAP_TOKEN} \
          --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig
    
        # set context parameters
        kubectl config set-context default \
          --cluster=kubernetes \
          --user=kubelet-bootstrap \
          --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig
    
        # set the default context
        kubectl config use-context default --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig
      done
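
    The generated bootstrap tokens can be listed, for example, with:

    kubeadm token list --kubeconfig ~/.kube/config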

    Distribute the bootstrap kubeconfig files to all nodes

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_name in ${NODE_NAMES[@]}
      do
        echo ">>> ${node_name}"
        scp kubelet-bootstrap-${node_name}.kubeconfig root@${node_name}:/etc/kubernetes/kubelet-bootstrap.kubeconfig
      done

    ② Create and distribute the kubelet configuration file

    Create the kubelet configuration template

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    cat <<EOF | tee kubelet-config.yaml.template
    kind: KubeletConfiguration
    apiVersion: kubelet.config.k8s.io/v1beta1
    authentication:
      anonymous:
        enabled: false
      webhook:
        enabled: true
      x509:
        clientCAFile: "/etc/kubernetes/cert/ca.pem"
    authorization:
      mode: Webhook
    clusterDomain: "${CLUSTER_DNS_DOMAIN}"
    clusterDNS:
      - "${CLUSTER_DNS_SVC_IP}"
    podCIDR: "${CLUSTER_CIDR}"
    maxPods: 220
    serializeImagePulls: false
    hairpinMode: promiscuous-bridge
    cgroupDriver: cgroupfs
    runtimeRequestTimeout: "15m"
    rotateCertificates: true
    serverTLSBootstrap: true
    readOnlyPort: 0
    port: 10250
    address: "##NODE_IP##"
    EOF

    Create and distribute a kubelet configuration file for each node

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do 
        echo ">>> ${node_ip}"
        sed -e "s/##NODE_IP##/${node_ip}/" kubelet-config.yaml.template > kubelet-config-${node_ip}.yaml.template
        scp kubelet-config-${node_ip}.yaml.template root@${node_ip}:/etc/kubernetes/kubelet-config.yaml
      done

    ③ Create and distribute the kubelet systemd unit file

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    cat > kubelet.service.template <<EOF
    [Unit]
    Description=Kubernetes Kubelet
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    After=docker.service
    Requires=docker.service
    
    [Service]
    WorkingDirectory=${K8S_DIR}/kubelet
    ExecStart=/opt/k8s/bin/kubelet \
      --root-dir=${K8S_DIR}/kubelet \
      --bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \
      --cert-dir=/etc/kubernetes/cert \
      --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
      --config=/etc/kubernetes/kubelet-config.yaml \
      --hostname-override=##NODE_NAME## \
      --pod-infra-container-image=registry.cn-beijing.aliyuncs.com/k8s_images/pause-amd64:3.1 \
      --allow-privileged=true \
      --event-qps=0 \
      --kube-api-qps=1000 \
      --kube-api-burst=2000 \
      --registry-qps=0 \
      --image-pull-progress-deadline=30m \
      --logtostderr=true \
      --v=2
    Restart=always
    RestartSec=5
    StartLimitInterval=0
    
    [Install]
    WantedBy=multi-user.target
    EOF

    Create and distribute a kubelet systemd unit file for each node

    cd /opt/k8s/work
    source /opt/k8s/bin/environment.sh
    for node_name in ${NODE_NAMES[@]}
      do 
        echo ">>> ${node_name}"
        sed -e "s/##NODE_NAME##/${node_name}/" kubelet.service.template > kubelet-${node_name}.service
        scp kubelet-${node_name}.service root@${node_name}:/etc/systemd/system/kubelet.service
      done

    ④ Start the kubelet service

    source /opt/k8s/bin/environment.sh
    for node_ip in ${NODE_IPS[@]}
      do
        echo ">>> ${node_ip}"
        ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kubelet"
        ssh root@${node_ip} "/usr/sbin/swapoff -a"
        ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kubelet && systemctl restart kubelet"
      done
    • The working directory must be created first;
    • swap must be turned off, otherwise kubelet fails to start.

    ⑤ Automatically approve CSR requests

    Create three ClusterRoleBindings, used respectively to automatically approve client certificates, renew client certificates, and renew server certificates:

    cd /opt/k8s/work
    cat > csr-crb.yaml <<EOF
     # Approve all CSRs for the group "system:bootstrappers"
     kind: ClusterRoleBinding
     apiVersion: rbac.authorization.k8s.io/v1
     metadata:
       name: auto-approve-csrs-for-group
     subjects:
     - kind: Group
       name: system:bootstrappers
       apiGroup: rbac.authorization.k8s.io
     roleRef:
       kind: ClusterRole
       name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
       apiGroup: rbac.authorization.k8s.io
    ---
     # To let a node of the group "system:nodes" renew its own credentials
     kind: ClusterRoleBinding
     apiVersion: rbac.authorization.k8s.io/v1
     metadata:
       name: node-client-cert-renewal
     subjects:
     - kind: Group
       name: system:nodes
       apiGroup: rbac.authorization.k8s.io
     roleRef:
       kind: ClusterRole
       name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
       apiGroup: rbac.authorization.k8s.io
    ---
    # A ClusterRole which instructs the CSR approver to approve a node requesting a
    # serving cert matching its client cert.
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: approve-node-server-renewal-csr
    rules:
    - apiGroups: ["certificates.k8s.io"]
      resources: ["certificatesigningrequests/selfnodeserver"]
      verbs: ["create"]
    ---
     # To let a node of the group "system:nodes" renew its own server credentials
     kind: ClusterRoleBinding
     apiVersion: rbac.authorization.k8s.io/v1
     metadata:
       name: node-server-cert-renewal
     subjects:
     - kind: Group
       name: system:nodes
       apiGroup: rbac.authorization.k8s.io
     roleRef:
       kind: ClusterRole
       name: approve-node-server-renewal-csr
       apiGroup: rbac.authorization.k8s.io
    EOF
    • auto-approve-csrs-for-group: automatically approve a node's first CSR; note that the Group of the first CSR is system:bootstrappers;
    • node-client-cert-renewal: automatically approve renewal CSRs for a node's expiring client certificate; the Group of automatically generated certificates is system:nodes;
    • node-server-cert-renewal: automatically approve renewal CSRs for a node's expiring server certificate; the Group is system:nodes.

    Apply the configuration:

    kubectl apply -f csr-crb.yaml

    Check the kubelet CSR status

    kubectl get csr
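
    Because serverTLSBootstrap is enabled in the kubelet configuration, the kubelet's server-certificate CSR may remain Pending until approved. A pending CSR can be approved manually (csr-xxxxx below is a hypothetical name taken from the kubectl get csr output), after which the nodes should register:

    kubectl certificate approve csr-xxxxx
    kubectl get nodes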