• Deploying a highly available Kubernetes cluster on CentOS 7.8 with kubeadm


    Original author: Zhangguanzhang

    Original post: http://zhangguanzhang.github.io/2019/11/24/kubeadm-base-use/

    Part 1: Basic system configuration

    We assume the system is fully updated and was installed from a minimal install.

    1. Keep time in sync

    yum install chrony -y && systemctl enable chronyd && systemctl restart chronyd
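
    As a quick sanity check (my addition, not part of the original steps), confirm that chrony actually has reachable time sources and is tracking them:

    chronyc sources -v
    chronyc tracking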

    2. Disable swap
    swapoff -a && sysctl -w vm.swappiness=0
    sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab

    3. Stop the firewall and disable SELinux
    systemctl stop firewalld && systemctl disable firewalld
    setenforce 0
    sed -ri '/^[^#]*SELINUX=/s#=.+$#=disabled#' /etc/selinux/config

    4. Disable NetworkManager. If your IPs are not managed by NetworkManager, it is better to turn it off and use the network service instead; we use network here.
    systemctl disable NetworkManager && systemctl stop NetworkManager
    systemctl restart network

    5. Install the EPEL repository and replace it with the Aliyun EPEL mirror
    yum install epel-release wget -y
    wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo

    6. Install dependencies
    yum install -y \
        curl \
        git \
        conntrack-tools \
        psmisc \
        nfs-utils \
        jq \
        socat \
        bash-completion \
        ipset \
        ipvsadm \
        conntrack \
        libseccomp \
        net-tools \
        crontabs \
        sysstat \
        unzip \
        iftop \
        nload \
        strace \
        bind-utils \
        tcpdump \
        telnet \
        lsof \
        htop
     

    Part 2: Kernel modules kube-proxy needs loaded at boot for IPVS mode

    Per convention we load them with systemd-modules-load rather than putting modprobe lines in /etc/rc.local.

    vim /etc/modules-load.d/ipvs.conf
    
    ip_vs
    ip_vs_rr
    ip_vs_wrr
    ip_vs_sh
    nf_conntrack
    br_netfilter

    systemctl daemon-reload && systemctl enable --now systemd-modules-load.service

    Confirm the modules are loaded:

    [root@k8s-m1 ~]# lsmod | grep ip_v
    ip_vs_sh               12688  0 
    ip_vs_wrr              12697  0 
    ip_vs_rr               12600  0 
    ip_vs                 145497  6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
    nf_conntrack          139264  1 ip_vs
    libcrc32c              12644  3 xfs,ip_vs,nf_conntrack

    Part 3: Set kernel parameters

    Set the parameters below in /etc/sysctl.d/k8s.conf on every machine. IPv6 support is still not great, so IPv6 is disabled here as well.

    cat <<EOF > /etc/sysctl.d/k8s.conf
    net.ipv6.conf.all.disable_ipv6 = 1
    net.ipv6.conf.default.disable_ipv6 = 1
    net.ipv6.conf.lo.disable_ipv6 = 1
    net.ipv4.neigh.default.gc_stale_time = 120
    net.ipv4.conf.all.rp_filter = 0
    net.ipv4.conf.default.rp_filter = 0
    net.ipv4.conf.default.arp_announce = 2
    net.ipv4.conf.lo.arp_announce = 2
    net.ipv4.conf.all.arp_announce = 2
    net.ipv4.ip_forward = 1
    net.ipv4.tcp_max_tw_buckets = 5000
    net.ipv4.tcp_syncookies = 1
    net.ipv4.tcp_max_syn_backlog = 1024
    net.ipv4.tcp_synack_retries = 2
    # pass bridged traffic to the iptables chains
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    net.bridge.bridge-nf-call-arptables = 1
    net.netfilter.nf_conntrack_max = 2310720
    fs.inotify.max_user_watches=89100
    fs.may_detach_mounts = 1
    fs.file-max = 52706963
    fs.nr_open = 52706963
    vm.overcommit_memory=1
    vm.panic_on_oom=0
    EOF

    If kube-proxy runs in IPVS mode, set the TCP keepalive parameters below to avoid connection timeouts:

    cat <<EOF >> /etc/sysctl.d/k8s.conf
    # https://github.com/moby/moby/issues/31208 
    # ipvsadm -l --timeout
    # fixes long-lived connection timeouts in ipvs mode; any value below 900 works
    net.ipv4.tcp_keepalive_time = 600
    net.ipv4.tcp_keepalive_intvl = 30
    net.ipv4.tcp_keepalive_probes = 10
    EOF
    sysctl --system
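
    A quick check (my addition, not in the original) that the bridge and forwarding settings took effect:

    sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward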

    Tune journal logging to avoid duplicated log collection and wasted resources, raise the default open-file limit for services started by systemd, and turn off reverse DNS lookups for ssh.

    # the next two lines do not exist on apt-based systems; running them there is harmless
    sed -ri 's/^$ModLoad imjournal/#&/' /etc/rsyslog.conf
    sed -ri 's/^$IMJournalStateFile/#&/' /etc/rsyslog.conf
    
    sed -ri 's/^#(DefaultLimitCORE)=/\1=100000/' /etc/systemd/system.conf
    sed -ri 's/^#(DefaultLimitNOFILE)=/\1=100000/' /etc/systemd/system.conf
    
    sed -ri 's/^#(UseDNS )yes/\1no/' /etc/ssh/sshd_config

    Maximum open files; per convention this goes in a drop-in config file:

    cat>/etc/security/limits.d/kubernetes.conf<<EOF
    *       soft    nproc   131072
    *       hard    nproc   131072
    *       soft    nofile  131072
    *       hard    nofile  131072
    root    soft    nproc   131072
    root    hard    nproc   131072
    root    soft    nofile  131072
    root    hard    nofile  131072
    EOF

    Docker's official kernel check script recommends (RHEL7/CentOS7: User namespaces disabled; add 'user_namespace.enable=1' to boot command line). On yum-based systems, enable it with:

    grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
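
    A small check you can run after the next reboot (my addition, not in the original) to confirm the flag made it onto the kernel command line:

    grep -o 'user_namespace.enable=1' /proc/cmdline || echo 'not enabled yet (reboot required)'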

    Part 4: Install Docker

    Check whether the kernel and its modules are suitable for running docker (Linux only). The script may fail to download because of the firewall; drop the redirection first to see whether the URL is reachable.

    curl -s https://raw.githubusercontent.com/docker/docker/master/contrib/check-config.sh > check-config.sh
    bash ./check-config.sh

    The storage driver should be overlay2 these days (do not use devicemapper, it has far too many pitfalls); in the script output, pay particular attention to whether any of the overlay2 entries are not shown in green.

    We use the year-versioned docker-ce. Say we want to install Kubernetes v1.18.5: go to https://github.com/kubernetes/kubernetes/tree/master/CHANGELOG, open CHANGELOG-1.18.md for that release, and search for "The list of validated docker versions remain" to find the Docker versions validated upstream. Your Docker version does not strictly have to be on that list; 19.03 has been tested and works (19.03+ also fixes a runc performance bug). Here we install Docker with the official install script (it supports CentOS and Ubuntu).

    export VERSION=19.03
    curl -fsSL "https://get.docker.com/" | bash -s -- --mirror Aliyun

    On every machine, configure the registry mirrors and set docker's cgroup driver to systemd, which is the upstream recommendation; see https://kubernetes.io/docs/setup/cri/

    mkdir -p /etc/docker/
    cat>/etc/docker/daemon.json<<EOF
    {
      "exec-opts": ["native.cgroupdriver=systemd"],
      "bip": "169.254.123.1/24",
      "oom-score-adjust": -1000,
      "registry-mirrors": [
          "https://fz5yth0r.mirror.aliyuncs.com",
          "https://dockerhub.mirrors.nwafu.edu.cn/",
          "https://mirror.ccs.tencentyun.com",
          "https://docker.mirrors.ustc.edu.cn/",
          "https://reg-mirror.qiniu.com",
          "http://hub-mirror.c.163.com/",
          "https://registry.docker-cn.com"
      ],
      "storage-driver": "overlay2",
      "storage-opts": [
        "overlay2.override_kernel_check=true"
      ],
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "100m",
        "max-file": "3"
      }
    }
    EOF

    Never enable Live Restore Enabled: in some corner cases, containers stuck in a Dead state can only be fixed by restarting the docker daemon, and with live restore turned on the only way out is rebooting the machine.

    Copy the bash completion script

    cp /usr/share/bash-completion/completions/docker /etc/bash_completion.d/

    Start docker and check that its info output looks sane

    systemctl enable --now docker
    docker info
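
    A quick way (my addition, not from the original) to confirm that the daemon.json settings were picked up:

    docker info 2>/dev/null | grep -iE 'storage driver|cgroup driver'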

    Part 5: Deploy kube-nginx

    Here we use nginx as a local proxy. Because the local proxy runs on every machine, we do not need an SLB and we are not affected by cloud VPCs that cannot use a VIP; every machine simply runs its own nginx.
    Configure hosts on every machine:

    [root@k8s-m1 src]# cat /etc/hosts
    127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
    127.0.0.1 apiserver.k8s.local
    192.168.50.101 apiserver01.k8s.local
    192.168.50.102 apiserver02.k8s.local
    192.168.50.103 apiserver03.k8s.local
    192.168.50.101 k8s-m1
    192.168.50.102 k8s-m2
    192.168.50.103 k8s-m3
    192.168.50.104 k8s-node1
    192.168.50.105 k8s-node2
    192.168.50.106 k8s-node3
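
    To push the same hosts entries to every machine, a loop like the one below works (a sketch only, my addition; it assumes root ssh access and the node IPs listed above):

    for ip in 192.168.50.{101..106}; do
        scp /etc/hosts root@$ip:/etc/hosts
    done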

    Generate the nginx configuration on every machine. The three apiserver hosts entries above are optional; you can put IPs instead of the domain names in the config below, but then changing an IP means reloading nginx. Unlike the original author, I compile nginx by hand here.

    mkdir -p /etc/kubernetes
    [root@k8s-m1 src]# cat /etc/kubernetes/nginx.conf 
    user nginx nginx;
    worker_processes auto;
    events {
        worker_connections  20240;
        use epoll;
    }
    error_log /var/log/kube_nginx_error.log info;
    
    stream {
        upstream kube-servers {
            hash  consistent;
            server apiserver01.k8s.local:6443 weight=5 max_fails=1 fail_timeout=3s;
            server apiserver02.k8s.local:6443 weight=5 max_fails=1 fail_timeout=3s;
            server apiserver03.k8s.local:6443 weight=5 max_fails=1 fail_timeout=3s;
        }
    
        server {
            listen 8443 reuseport;
            proxy_connect_timeout 3s;
            # use a generous timeout
            proxy_timeout 3000s;
            proxy_pass kube-servers;
        }
    }

    Because the local proxy runs on every machine, there is no need for an SLB and no problem with VPCs where a VIP cannot be used. Build and install kube-nginx on all machines:

    yum install gcc gcc-c++ -y
    groupadd nginx
    useradd -r -g nginx nginx
    wget http://nginx.org/download/nginx-1.16.1.tar.gz -P /usr/local/src/
    cd /usr/local/src/
    tar zxvf nginx-1.16.1.tar.gz
    cd nginx-1.16.1/
    ./configure --with-stream --without-http --prefix=/usr/local/kube-nginx --without-http_uwsgi_module --without-http_scgi_module --without-http_fastcgi_module
    make && make install
    
    # systemd unit for kube-nginx
    [root@k8s-m1 src]# cat /usr/lib/systemd/system/kube-nginx.service 
    [Unit]
    Description=kube-apiserver nginx proxy
    After=network.target
    After=network-online.target
    Wants=network-online.target
    
    [Service]
    Type=forking
    ExecStartPre=/usr/local/kube-nginx/sbin/nginx -c /etc/kubernetes/nginx.conf -p /usr/local/kube-nginx -t
    ExecStart=/usr/local/kube-nginx/sbin/nginx -c /etc/kubernetes/nginx.conf -p /usr/local/kube-nginx
    ExecReload=/usr/local/kube-nginx/sbin/nginx -c /etc/kubernetes/nginx.conf -p /usr/local/kube-nginx -s reload
    PrivateTmp=true
    Restart=always
    RestartSec=5
    StartLimitInterval=0
    LimitNOFILE=65536
    
    [Install]
    WantedBy=multi-user.target
    
    systemctl daemon-reload && systemctl enable kube-nginx && systemctl restart kube-nginx
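
    To confirm the proxy is up (my addition), check that something is listening on 8443:

    ss -ltnp | grep 8443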

    Part 6: Deploy with kubeadm

    1. Configure the Aliyun Kubernetes repository

    cat <<EOF > /etc/yum.repos.d/kubernetes.repo
    [kubernetes]
    name=Kubernetes
    baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
    EOF

    2. Master nodes

    A Kubernetes node is essentially kubelet plus a CRI (usually docker). kubectl is just a client that reads a kubeconfig and talks to kube-apiserver to operate the cluster, and kubeadm does the deployment. Masters therefore need all three; worker nodes generally do not need kubectl.

    Install the packages

    yum install -y \
        kubeadm-1.18.5 \
        kubectl-1.18.5 \
        kubelet-1.18.5 \
        --disableexcludes=kubernetes && \
        systemctl enable kubelet

    Packages on the worker nodes

    yum install -y \
        kubeadm-1.18.5 \
        kubelet-1.18.5 \
        --disableexcludes=kubernetes && \
        systemctl enable kubelet

    Configure the cluster (on the first master)

    Print the default init configuration

    kubeadm config print init-defaults > initconfig.yaml
    
    # the default init cluster parameters look like this
    
    apiVersion: kubeadm.k8s.io/v1beta2
    bootstrapTokens:
    - groups:
      - system:bootstrappers:kubeadm:default-node-token
      token: abcdef.0123456789abcdef
      ttl: 24h0m0s
      usages:
      - signing
      - authentication
    kind: InitConfiguration
    localAPIEndpoint:
      advertiseAddress: 1.2.3.4
      bindPort: 6443
    nodeRegistration:
      criSocket: /var/run/dockershim.sock
      name: k8s-m1
      taints:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
    ---
    apiServer:
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta2
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controllerManager: {}
    dns:
      type: CoreDNS
    etcd:
      local:
        dataDir: /var/lib/etcd
    imageRepository: k8s.gcr.io
    kind: ClusterConfiguration
    kubernetesVersion: v1.16.0
    networking:
      dnsDomain: cluster.local
      serviceSubnet: 10.96.0.0/12
    scheduler: {}

    We keep only the ClusterConfiguration section and then modify it; the v1beta2 docs below are the reference. Older releases may use v1beta1, where some fields differ from the new API, so check godoc yourself.
    https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#hdr-Basics
    https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2
    https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#pkg-constants
    https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#ClusterConfiguration
    Change the IPs and similar values to match your own environment; if you do not know how to compute CIDRs, do not touch them. controlPlaneEndpoint should be a domain name (without internal DNS, hosts entries on every machine also work), an SLB, or a VIP; for the reasoning and caveats see https://zhangguanzhang.github.io/2019/03/11/k8s-ha/ where HA is explained in detail. The final yaml is below.

    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterConfiguration
    imageRepository: registry.aliyuncs.com/k8sxio
    kubernetesVersion: v1.18.5 # if the image list shows the wrong version, put the correct version number here
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    networking: #https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#Networking
      dnsDomain: cluster.local
      serviceSubnet: 10.96.0.0/12
      podSubnet: 10.244.0.0/16
    controlPlaneEndpoint: apiserver.k8s.local:8443 # with a single master, use the master's IP or omit this
    apiServer: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#APIServer
      timeoutForControlPlane: 4m0s
      extraArgs:
        authorization-mode: "Node,RBAC"
        enable-admission-plugins: "NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeClaimResize,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota,Priority,PodPreset"
        runtime-config: api/all=true,settings.k8s.io/v1alpha1=true
        storage-backend: etcd3
        etcd-servers: https://192.168.50.101:2379,https://192.168.50.102:2379,https://192.168.50.103:2379
      certSANs:
      - 10.96.0.1 # first IP of the service CIDR
      - 127.0.0.1 # with multiple masters, lets you fall back to localhost for debugging if the load balancer breaks
      - localhost
      - apiserver.k8s.local # load balancer domain or VIP
      - 192.168.50.101
      - 192.168.50.102
      - 192.168.50.103
      - apiserver01.k8s.local
      - apiserver02.k8s.local
      - apiserver03.k8s.local
      - master
      - kubernetes
      - kubernetes.default 
      - kubernetes.default.svc 
      - kubernetes.default.svc.cluster.local
      extraVolumes:
      - hostPath: /etc/localtime
        mountPath: /etc/localtime
        name: localtime
        readOnly: true
    controllerManager: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#ControlPlaneComponent
      extraArgs:
        bind-address: "0.0.0.0"
        experimental-cluster-signing-duration: 867000h
      extraVolumes:
      - hostPath: /etc/localtime
        mountPath: /etc/localtime
        name: localtime
        readOnly: true
    scheduler: 
      extraArgs:
        bind-address: "0.0.0.0"
      extraVolumes:
      - hostPath: /etc/localtime
        mountPath: /etc/localtime
        name: localtime
        readOnly: true
    dns: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#DNS
      type: CoreDNS # or kube-dns
      imageRepository: coredns # azk8s.cn is gone; use the official coredns image from Docker Hub
      imageTag: 1.6.7  # the Aliyun registry currently only has 1.6.7; see Docker Hub for the latest
    etcd: # https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#Etcd
      local:
        imageRepository: quay.io/coreos
        imageTag: v3.4.7
        dataDir: /var/lib/etcd
        serverCertSANs: # localhost, 127.0.0.1 and ::1 are included by default for both server and peer certs, no need to list them
        - master
        - 192.168.50.101
        - 192.168.50.102
        - 192.168.50.103
        - etcd01.k8s.local
        - etcd02.k8s.local
        - etcd03.k8s.local
        peerCertSANs:
        - master
        - 192.168.50.101
        - 192.168.50.102
        - 192.168.50.103
        - etcd01.k8s.local
        - etcd02.k8s.local
        - etcd03.k8s.local
        extraArgs: # no extraVolumes here for now
          auto-compaction-retention: "1h"
          max-request-bytes: "33554432"
          quota-backend-bytes: "8589934592"
          enable-v2: "false" # disable etcd v2 api
      # external: # configure like this when using an external etcd https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#Etcd
        # endpoints:
        # - "https://172.19.0.2:2379"
        # - "https://172.19.0.3:2379"
        # - "https://172.19.0.4:2379"
        # caFile: "/etc/kubernetes/pki/etcd/ca.crt"
        # certFile: "/etc/kubernetes/pki/etcd/etcd.crt"
        # keyFile: "/etc/kubernetes/pki/etcd/etcd.key"
    ---
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    kind: KubeProxyConfiguration # https://godoc.org/k8s.io/kube-proxy/config/v1alpha1#KubeProxyConfiguration
    mode: ipvs # or iptables
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: "rr" # scheduling algorithm
      syncPeriod: 15s
    iptables:
      masqueradeAll: true
      masqueradeBit: 14
      minSyncPeriod: 0s
      syncPeriod: 30s
    ---
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration # https://godoc.org/k8s.io/kubelet/config/v1beta1#KubeletConfiguration
    cgroupDriver: systemd
    failSwapOn: true # set to false if swap is enabled

    Validate the file and ignore warnings; a real mistake raises an error, otherwise the dry run ends with output containing a "kubeadm join xxx" command.

    kubeadm init --config initconfig.yaml --dry-run

    Check that the listed images are correct; if the version number is wrong, write your version into kubernetesVersion in the yaml.

    kubeadm config images list --config initconfig.yaml

    Pre-pull the images

    kubeadm config images pull --config initconfig.yaml # output below
    [config/images] Pulled gcr.azk8s.cn/google_containers/kube-apiserver:v1.18.5
    [config/images] Pulled gcr.azk8s.cn/google_containers/kube-controller-manager:v1.18.5
    [config/images] Pulled gcr.azk8s.cn/google_containers/kube-scheduler:v1.18.5
    [config/images] Pulled gcr.azk8s.cn/google_containers/kube-proxy:v1.18.5
    [config/images] Pulled gcr.azk8s.cn/google_containers/pause:3.1
    [config/images] Pulled quay.azk8s.cn/coreos/etcd:v3.4.7
    [config/images] Pulled coredns/coredns:1.6.3

    Part 7: kubeadm init

    Run init only on the first master.

    # --upload-certs uploads the relevant certificates to etcd so we do not have to distribute them by hand
    # note: since v1.15 this is a regular flag; on older versions use --experimental-upload-certs instead
    
    kubeadm init --config initconfig.yaml --upload-certs

    If init times out, check whether kubelet failed to start; for debugging see https://github.com/zhangguanzhang/Kubernetes-ansible/wiki/systemctl-running-debug

    Save the token that init prints, then copy the kubeconfig for kubectl (the default path is ~/.kube/config):

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
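
    At this point kubectl should work against the new control plane; a quick check (my addition):

    kubectl cluster-info
    kubectl get node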

    The init yaml is also stored in a configmap in the cluster, so we can inspect it at any time; it is used when other nodes and masters join.

    kubectl -n kube-system get cm kubeadm-config -o yaml

    If you run a single master and do not plan to add worker nodes, remove the taint from the master; skip this when doing the multi-master steps below.

    kubectl taint nodes --all node-role.kubernetes.io/master-

    Set up RBAC for the health endpoint

    kube-apiserver's healthz routes require authorization; we open them up for monitoring or for SLB health checks. The yaml file is at https://github.com/zhangguanzhang/Kubernetes-ansible-base/blob/roles/master/files/healthz-rbac.yml

    kubectl apply -f https://raw.githubusercontent.com/zhangguanzhang/Kubernetes-ansible-base/roles/master/files/healthz-rbac.yml

    Configure the control-plane components on the other masters

    Manual copy (only needed on older versions that cannot upload certificates; skip this if kubeadm init above was run with the upload-certs option)

    Copy the CA certificates from the first master to the other masters. Because scp asks for a password interactively, we install sshpass; "zhangguanzhang" is the root password here.

    yum install sshpass -y
    alias ssh='sshpass -p zhangguanzhang ssh -o StrictHostKeyChecking=no'
    alias scp='sshpass -p zhangguanzhang scp -o StrictHostKeyChecking=no'

    Copy the CA certificates to the other master nodes

    for node in 172.19.0.3 172.19.0.4;do
        ssh $node 'mkdir -p /etc/kubernetes/pki/etcd'
        scp -r /etc/kubernetes/pki/ca.* $node:/etc/kubernetes/pki/
        scp -r /etc/kubernetes/pki/sa.* $node:/etc/kubernetes/pki/
        scp -r /etc/kubernetes/pki/front-proxy-ca.* $node:/etc/kubernetes/pki/
        scp -r /etc/kubernetes/pki/etcd/ca.* $node:/etc/kubernetes/pki/etcd/
    done

    Join the other masters:

    kubeadm join apiserver.k8s.local:8443 --token vo6qyo.4cm47w561q9p830v \
        --discovery-token-ca-cert-hash sha256:46e177c317037a4815c6deaab8089da4340663efeeead40810d4f53239256671 \
        --control-plane --certificate-key ba869da2d611e5afba5f9959a5f18891c20fb56d90592225765c0b965e3d8783

    If you forgot the token, list it with kubeadm token list or create a new one with kubeadm token create.
    The sha256 hash can be obtained with:

    openssl x509 -pubkey -in \
        /etc/kubernetes/pki/ca.crt | \
        openssl rsa -pubin -outform der 2>/dev/null | \
        openssl dgst -sha256 -hex | sed 's/^.* //'
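
    Alternatively (my addition, not in the original), kubeadm can print a ready-made worker join command, token and hash included:

    kubeadm token create --print-join-command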

    Set up kubectl's completion script

    kubectl completion bash > /etc/bash_completion.d/kubectl

    Configure etcdctl on all masters

    Copy etcdctl out of the container

    docker cp `docker ps -a | awk '/k8s_etcd/{print $1}'`:/usr/local/bin/etcdctl /usr/local/bin/etcdctl

    Since roughly 1.13, Kubernetes talks to etcd over the v3 API by default, so configure etcdctl parameters accordingly:

    cat >/etc/profile.d/etcd.sh<<'EOF'
    ETCD_CERET_DIR=/etc/kubernetes/pki/etcd/
    ETCD_CA_FILE=ca.crt
    ETCD_KEY_FILE=healthcheck-client.key
    ETCD_CERT_FILE=healthcheck-client.crt
    ETCD_EP=https://192.168.50.101:2379,https://192.168.50.102:2379,https://192.168.50.103:2379
    
    alias etcd_v2="etcdctl --cert-file ${ETCD_CERET_DIR}/${ETCD_CERT_FILE} \
                  --key-file ${ETCD_CERET_DIR}/${ETCD_KEY_FILE} \
                  --ca-file ${ETCD_CERET_DIR}/${ETCD_CA_FILE} \
                  --endpoints $ETCD_EP"
    
    alias etcd_v3="ETCDCTL_API=3 \
        etcdctl \
       --cert ${ETCD_CERET_DIR}/${ETCD_CERT_FILE} \
       --key ${ETCD_CERET_DIR}/${ETCD_KEY_FILE} \
       --cacert ${ETCD_CERET_DIR}/${ETCD_CA_FILE} \
        --endpoints $ETCD_EP"
    EOF

    Re-ssh, or load the environment by hand: . /etc/profile.d/etcd.sh

    [root@k8s-m1 ~]# etcd_v3 endpoint status --write-out=table
    +-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
    |          ENDPOINT           |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
    +-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
    | https://192.168.50.101:2379 | 9fdaf6a25119065e |   3.4.7 |  3.1 MB |     false |      false |         5 |     305511 |             305511 |        |
    | https://192.168.50.102:2379 | a3d9d41cf6d05e08 |   3.4.7 |  3.1 MB |      true |      false |         5 |     305511 |             305511 |        |
    | https://192.168.50.103:2379 | 3b34476e501895d4 |   3.4.7 |  3.0 MB |     false |      false |         5 |     305511 |             305511 |        |
    +-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
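
    The etcd_v3 alias also makes health checks easy (my addition, not part of the original):

    etcd_v3 endpoint health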

    Set up an etcd backup script

    mkdir -p /opt/etcd
    cat>/opt/etcd/etcd_cron.sh<<'EOF'
    #!/bin/bash
    set -e
    
    export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
    
    :  ${bak_dir:=/root/} # default backup directory; change it to an existing directory if you like
    :  ${cert_dir:=/etc/kubernetes/pki/etcd/}
    :  ${endpoints:=https://192.168.50.101:2379,https://192.168.50.102:2379,https://192.168.50.103:2379}
    
    bak_prefix='etcd-'
    cmd_suffix='date +%Y-%m-%d-%H:%M'
    bak_suffix='.db'
    
    # assign the normalized command-line options to the positional parameters ($1,$2,...)
    temp=`getopt -n $0 -o c:d: -u -- "$@"`
    
    [ $? != 0 ] && {
        echo '
    Examples:
      # just save once
      bash $0 /tmp/etcd.db
      # save in contab and  keep 5
      bash $0 -c 5
        '
        exit 1
        }
    set -- $temp
    
    
    # -c how many backup copies to keep
    # -d directory to store the backups in
    while true;do
        case "$1" in
            -c)
                [ -z "$bak_count" ] && bak_count=$2
                printf -v null %d "$bak_count" &>/dev/null || 
                    { echo 'the value of the -c must be number';exit 1; }
                shift 2
                ;;
            -d)
                [ ! -d "$2" ] && mkdir -p $2
                bak_dir=$2
                shift 2
                ;;
             *)
                [[ -z "$1" || "$1" == '--' ]] && { shift;break; }
                echo "Internal error!"
                exit 1
                ;;
        esac
    done
    
    
    function etcd_v2(){
    
        etcdctl --cert-file $cert_dir/healthcheck-client.crt \
                --key-file  $cert_dir/healthcheck-client.key \
                --ca-file   $cert_dir/ca.crt \
            --endpoints $endpoints $@
    }
    
    function etcd_v3(){
    
        ETCDCTL_API=3 etcdctl \
           --cert $cert_dir/healthcheck-client.crt \
           --key  $cert_dir/healthcheck-client.key \
           --cacert $cert_dir/ca.crt \
           --endpoints $endpoints $@
    }
    
    etcd::cron::save(){
        cd $bak_dir/
        etcd_v3 snapshot save  $bak_prefix$($cmd_suffix)$bak_suffix
        rm_files=`ls -t $bak_prefix*$bak_suffix | tail -n +$[bak_count+1]`
        if [ -n "$rm_files" ];then
            rm -f $rm_files
        fi
    }
    
    main(){
        [ -n "$bak_count" ] && etcd::cron::save || etcd_v3 snapshot save $@
    }
    
    main $@
    EOF

    Add the following to crontab -e to keep four backup copies automatically:

    bash /opt/etcd/etcd_cron.sh  -c 4 -d /opt/etcd/ &>/dev/null
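
    Restoring from one of these snapshots roughly looks like the following (a sketch only, my addition; the snapshot filename and data directory are placeholders, and the etcd static pod must be stopped first, e.g. by moving its manifest out of /etc/kubernetes/manifests):

    ETCDCTL_API=3 etcdctl snapshot restore /opt/etcd/etcd-<date>.db \
        --data-dir /var/lib/etcd-restored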

    node

    Do the same as before:

    • apply the system configuration
    • set the hostname
    • install docker-ce
    • set up hosts and nginx
    • configure the repo and install kubeadm and kubelet

    Joining a worker is just like joining a master: prepare the environment and docker ahead of time, then run join without --control-plane. With only one master, the address you join is the controlPlaneEndpoint value.

    kubeadm join apiserver.k8s.local:8443 --token vo6qyo.4cm47w561q9p830v \
        --discovery-token-ca-cert-hash sha256:46e177c317037a4815c6deaab8089da4340663efeeead40810d4f53239256671

    [root@k8s-m1 ~]# kubectl get node
    NAME        STATUS   ROLES    AGE    VERSION
    k8s-m1      Ready    master   23h    v1.18.5
    k8s-m2      Ready    master   23h    v1.18.5
    k8s-m3      Ready    master   23h    v1.18.5
    k8s-node1   Ready    node     23h    v1.18.5
    k8s-node2   Ready    node     121m   v1.18.5
    k8s-node3   Ready    node     82m    v1.18.5

    Addons (everything from here to the end runs on any single master)

    The pod network is not in place yet, so coredns cannot get an IP and stays Pending. I deploy flannel here; if you understand BGP you can use calico instead.
    The yaml comes from flannel's official GitHub: https://github.com/coreos/flannel/tree/master/Documentation

    Changes to make

    • If you use PSP on a cluster older than 1.16, policy/v1beta1 has to be changed to extensions/v1beta1; no change is needed here.

    apiVersion: policy/v1beta1
    kind: PodSecurityPolicy

    • Change the rbac apiVersion to v1 as below, do not keep using v1beta1; use this command:

    sed -ri '/apiVersion: rbac/s#v1.+#v1#' kube-flannel.yml

    • The official yaml ships DaemonSets for four architectures; delete everything except amd64, roughly line 227 to the end:

    sed -ri '227,$d' kube-flannel.yml

    • If you changed the pod CIDR, change it here as well. If all nodes share the same layer-2 network, you can switch vxlan to the faster host-gw backend; with vxlan, open UDP port 8472 in the security group.

    net-conf.json: |
      {
        "Network": "10.244.0.0/16",
        "Backend": {
          "Type": "vxlan"
        }
      }

    • Raise the limits; they must be larger than the requests.

    limits:
      cpu: "200m"
      memory: "100Mi"

    Deploy flannel

    (I did not actually hit this error myself, but for reference:)

    Since 1.15 a node's pod CIDR is an array rather than a single value; flannel 0.11 and earlier will fail with the error below. See:
    https://github.com/kubernetes/kubernetes/blob/v1.15.0/staging/src/k8s.io/api/core/v1/types.go#L3890-L3893
    https://github.com/kubernetes/kubernetes/blob/v1.18.2/staging/src/k8s.io/api/core/v1/types.go#L4206-L4216

    Error registering network: failed to acquire lease: node "xxx" pod cidr not assigned

    Patch it manually; remember to patch any nodes you add later as well.

    nodes=`kubectl get node --no-headers | awk '{print $1}'`
    for node in $nodes;do
        cidr=`kubectl get node "$node" -o jsonpath='{.spec.podCIDRs[0]}'`
        [ -z "$(kubectl get node $node -o jsonpath='{.spec.podCIDR}')" ] && {
            kubectl patch node "$node" -p '{"spec":{"podCIDR":"'"$cidr"'"}}' 
        }
    done

    The final kube-flannel.yml:

    [root@k8s-m1 ~]# cat kube-flannel.yml 
    ---
    apiVersion: policy/v1beta1
    kind: PodSecurityPolicy
    metadata:
      name: psp.flannel.unprivileged
      annotations:
        seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
        seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
        apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
        apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
    spec:
      privileged: false
      volumes:
        - configMap
        - secret
        - emptyDir
        - hostPath
      allowedHostPaths:
        - pathPrefix: "/etc/cni/net.d"
        - pathPrefix: "/etc/kube-flannel"
        - pathPrefix: "/run/flannel"
      readOnlyRootFilesystem: false
      # Users and groups
      runAsUser:
        rule: RunAsAny
      supplementalGroups:
        rule: RunAsAny
      fsGroup:
        rule: RunAsAny
      # Privilege Escalation
      allowPrivilegeEscalation: false
      defaultAllowPrivilegeEscalation: false
      # Capabilities
      allowedCapabilities: ['NET_ADMIN']
      defaultAddCapabilities: []
      requiredDropCapabilities: []
      # Host namespaces
      hostPID: false
      hostIPC: false
      hostNetwork: true
      hostPorts:
      - min: 0
        max: 65535
      # SELinux
      seLinux:
        # SELinux is unused in CaaSP
        rule: 'RunAsAny'
    ---
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: flannel
    rules:
      - apiGroups: ['extensions']
        resources: ['podsecuritypolicies']
        verbs: ['use']
        resourceNames: ['psp.flannel.unprivileged']
      - apiGroups:
          - ""
        resources:
          - pods
        verbs:
          - get
      - apiGroups:
          - ""
        resources:
          - nodes
        verbs:
          - list
          - watch
      - apiGroups:
          - ""
        resources:
          - nodes/status
        verbs:
          - patch
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: flannel
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: flannel
    subjects:
    - kind: ServiceAccount
      name: flannel
      namespace: kube-system
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: flannel
      namespace: kube-system
    ---
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: kube-flannel-cfg
      namespace: kube-system
      labels:
        tier: node
        app: flannel
    data:
      cni-conf.json: |
        {
          "name": "cbr0",
          "cniVersion": "0.3.1",
          "plugins": [
            {
              "type": "flannel",
              "delegate": {
                "hairpinMode": true,
                "isDefaultGateway": true
              }
            },
            {
              "type": "portmap",
              "capabilities": {
                "portMappings": true
              }
            }
          ]
        }
      net-conf.json: |
        {
          "Network": "10.244.0.0/16",
          "Backend": {
            "Type": "host-gw"
          }
        }
    ---
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: kube-flannel-ds-amd64
      namespace: kube-system
      labels:
        tier: node
        app: flannel
    spec:
      selector:
        matchLabels:
          app: flannel
      template:
        metadata:
          labels:
            tier: node
            app: flannel
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: kubernetes.io/os
                        operator: In
                        values:
                          - linux
                      - key: kubernetes.io/arch
                        operator: In
                        values:
                          - amd64
          hostNetwork: true
          tolerations:
          - operator: Exists
            effect: NoSchedule
          serviceAccountName: flannel
          initContainers:
          - name: install-cni
            image: quay.io/coreos/flannel:v0.12.0-amd64
            command:
            - cp
            args:
            - -f
            - /etc/kube-flannel/cni-conf.json
            - /etc/cni/net.d/10-flannel.conflist
            volumeMounts:
            - name: cni
              mountPath: /etc/cni/net.d
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
          containers:
          - name: kube-flannel
            image: quay.io/coreos/flannel:v0.12.0-amd64
            command:
            - /opt/bin/flanneld
            args:
            - --ip-masq
            - --kube-subnet-mgr
            resources:
              requests:
                cpu: "100m"
                memory: "50Mi"
              limits:
                cpu: "200m"
                memory: "100Mi"
            securityContext:
              privileged: false
              capabilities:
                add: ["NET_ADMIN"]
            env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            volumeMounts:
            - name: run
              mountPath: /run/flannel
            - name: flannel-cfg
              mountPath: /etc/kube-flannel/
          volumes:
            - name: run
              hostPath:
                path: /run/flannel
            - name: cni
              hostPath:
                path: /etc/cni/net.d
            - name: flannel-cfg
              configMap:
                name: kube-flannel-cfg

    host-gw mode is used here because of a kernel bug that affects vxlan UDP traffic; details at https://zhangguanzhang.github.io/2020/05/23/k8s-vxlan-63-timeout/

    kubectl apply -f kube-flannel.yml

    Verify that the cluster works

    kubectl -n kube-system get pod -o wide

    Once every pod in the kube-system namespace is Running, test the cluster:

    cat<<EOF | kubectl apply -f -
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
    spec:
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - image: nginx:alpine
            name: nginx
            ports:
            - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
    spec:
      selector:
        app: nginx
      ports:
        - protocol: TCP
          port: 80
          targetPort: 80
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: busybox
      namespace: default
    spec:
      containers:
      - name: busybox
        image: zhangguanzhang/centos
        command:
          - sleep
          - "3600"
        imagePullPolicy: IfNotPresent
      restartPolicy: Always
    EOF

    Wait for the pods to reach Running.

    Verify cluster DNS

    $ kubectl exec -ti busybox -- nslookup kubernetes
    Server:    10.96.0.10
    Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
    
    Name:      kubernetes
    Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local
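
    You can also check (my addition) that the nginx Service got endpoints and is reachable from inside the cluster (the second command assumes the busybox image ships curl):

    kubectl get svc,ep nginx
    kubectl exec -ti busybox -- curl -s http://nginx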

    For more detail on the kubeadm process and its options, see the original author's other posts.

    Adding a new node

    1. Initialize CentOS 7

    Initialization script

    #!/bin/bash
    
    #---- keep time in sync ----
    echo "configuring time sync"
    yum install chrony -y
    mv /etc/chrony.conf /etc/chrony.conf.bak
    cat>/etc/chrony.conf<<EOF
    server ntp.aliyun.com iburst
    stratumweight 0
    driftfile /var/lib/chrony/drift
    rtcsync
    makestep 10 3
    bindcmdaddress 127.0.0.1
    bindcmdaddress ::1
    keyfile /etc/chrony.keys
    commandkey 1
    generatecommandkey
    logchange 0.5
    logdir /var/log/chrony
    EOF
    /usr/bin/systemctl enable chronyd
    /usr/bin/systemctl restart chronyd
    
    #--- disable swap ---
    echo "disabling swap"
    swapoff -a && sysctl -w vm.swappiness=0
    sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab
    
    #--- stop the firewall and disable selinux ---
    echo "disabling firewalld and selinux"
    systemctl stop firewalld
    systemctl disable firewalld
    setenforce 0
    sed -ri '/^[^#]*SELINUX=/s#=.+$#=disabled#' /etc/selinux/config
    
    #--- disable NetworkManager ---
    echo "disabling NetworkManager"
    systemctl disable NetworkManager
    systemctl stop NetworkManager
    
    #--- install the EPEL repo and replace it with the Aliyun EPEL mirror ---
    yum install epel-release wget -y
    wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
    
    #--- install dependencies ---
    echo "installing dependencies"
    yum install -y \
        curl \
        git \
        conntrack-tools \
        psmisc \
        nfs-utils \
        jq \
        socat \
        bash-completion \
        ipset \
        ipvsadm \
        conntrack \
        libseccomp \
        net-tools \
        crontabs \
        sysstat \
        unzip \
        iftop \
        nload \
        strace \
        bind-utils \
        tcpdump \
        telnet \
        lsof \
        htop
    #--- modules to load at boot for ipvs mode ---
    echo "configuring kernel modules for ipvs"
    cat>/etc/modules-load.d/ipvs.conf<<EOF
    ip_vs
    ip_vs_rr
    ip_vs_wrr
    ip_vs_sh
    nf_conntrack
    br_netfilter
    EOF
    systemctl daemon-reload
    systemctl enable --now systemd-modules-load.service
    
    #--- set kernel parameters ---
    cat <<EOF > /etc/sysctl.d/k8s.conf
    net.ipv6.conf.all.disable_ipv6 = 1
    net.ipv6.conf.default.disable_ipv6 = 1
    net.ipv6.conf.lo.disable_ipv6 = 1
    net.ipv4.neigh.default.gc_stale_time = 120
    net.ipv4.conf.all.rp_filter = 0
    net.ipv4.conf.default.rp_filter = 0
    net.ipv4.conf.default.arp_announce = 2
    net.ipv4.conf.lo.arp_announce = 2
    net.ipv4.conf.all.arp_announce = 2
    net.ipv4.ip_forward = 1
    net.ipv4.tcp_max_tw_buckets = 5000
    net.ipv4.tcp_syncookies = 1
    net.ipv4.tcp_max_syn_backlog = 1024
    net.ipv4.tcp_synack_retries = 2
    # pass bridged traffic to the iptables chains
    net.bridge.bridge-nf-call-ip6tables = 1
    net.bridge.bridge-nf-call-iptables = 1
    net.bridge.bridge-nf-call-arptables = 1
    net.netfilter.nf_conntrack_max = 2310720
    fs.inotify.max_user_watches=89100
    fs.may_detach_mounts = 1
    fs.file-max = 52706963
    fs.nr_open = 52706963
    vm.overcommit_memory=1
    vm.panic_on_oom=0
    # https://github.com/moby/moby/issues/31208 
    # ipvsadm -l --timeout
    # fixes long-lived connection timeouts in ipvs mode; any value below 900 works
    net.ipv4.tcp_keepalive_time = 600
    net.ipv4.tcp_keepalive_intvl = 30
    net.ipv4.tcp_keepalive_probes = 10
    EOF
    sysctl --system
    
    #--- tune journal logging ---
    sed -ri 's/^$ModLoad imjournal/#&/' /etc/rsyslog.conf
    sed -ri 's/^$IMJournalStateFile/#&/' /etc/rsyslog.conf
    sed -ri 's/^#(DefaultLimitCORE)=/\1=100000/' /etc/systemd/system.conf
    sed -ri 's/^#(DefaultLimitNOFILE)=/\1=100000/' /etc/systemd/system.conf
    sed -ri 's/^#(UseDNS )yes/\1no/' /etc/ssh/sshd_config
    
    #--- raise the maximum number of open files ---
    cat>/etc/security/limits.d/kubernetes.conf<<EOF
    *       soft    nproc   131072
    *       hard    nproc   131072
    *       soft    nofile  131072
    *       hard    nofile  131072
    root    soft    nproc   131072
    root    hard    nproc   131072
    root    soft    nofile  131072
    root    hard    nofile  131072
    EOF
    
    #--- set user_namespace.enable=1 ---
    grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"

    2. Build and install nginx (the /etc/kubernetes/nginx.conf and kube-nginx.service files from Part 5 must already be in place)

    yum install gcc gcc-c++ -y
    tar zxvf nginx-1.16.1.tar.gz
    cd nginx-1.16.1/
    ./configure --with-stream --without-http --prefix=/usr/local/kube-nginx --without-http_uwsgi_module --without-http_scgi_module --without-http_fastcgi_module
    make && make install
    groupadd nginx
    useradd -r -g nginx nginx
    systemctl daemon-reload && systemctl enable kube-nginx && systemctl restart kube-nginx

    3. Regenerate the token

    kubeadm token create
    openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
    kubeadm join apiserver.k8s.local:8443 --token 8ceduc.cy0r23j2hpsw80ff     --discovery-token-ca-cert-hash sha256:46e177c317037a4815c6deaab8089da4340663efeeead40810d4f53239256671

    error execution phase preflight: couldn't validate the identity of the API Server: could not find a JWS signature in the cluster-info ConfigMap for token ID "vo6qyo"

    When a join fails with the error above, the old token has expired and you need to regenerate it as shown.
