The kubernetes worker nodes run the following components:
- docker
- kubelet
- kube-proxy
- flanneld
- kube-nginx
kube-nginx is deployed so that the kube-apiserver cluster can be reached reliably: kubelet and kube-proxy access kube-apiserver through the local nginx instance (listening on 127.0.0.1), which provides high availability for kube-apiserver.
For kube-nginx deployment, see the section on installing and configuring the nginx proxy in https://www.cnblogs.com/deny/p/12260717.html
For the CA certificates, see: https://www.cnblogs.com/deny/p/12259778.html
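For orientation, kube-nginx is just a TCP (stream) proxy in front of the three kube-apiservers. Below is a minimal sketch of what its configuration looks like, assuming the apiservers listen on port 6443 of the three nodes and a hypothetical install prefix of /opt/k8s/kube-nginx; see the linked article for the real deployment:
cat > /opt/k8s/kube-nginx/conf/kube-nginx.conf <<EOF
worker_processes 1;

events {
    worker_connections 1024;
}

stream {
    upstream backend {
        hash \$remote_addr consistent;
        server 192.168.1.201:6443 max_fails=3 fail_timeout=30s;
        server 192.168.1.202:6443 max_fails=3 fail_timeout=30s;
        server 192.168.1.203:6443 max_fails=3 fail_timeout=30s;
    }

    server {
        # this is the 127.0.0.1:8443 address that KUBE_APISERVER points at
        listen 127.0.0.1:8443;
        proxy_connect_timeout 1s;
        proxy_pass backend;
    }
}
EOF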
Node information:
+ zhangjun-k8s01: 192.168.1.201
+ zhangjun-k8s02: 192.168.1.202
+ zhangjun-k8s03: 192.168.1.203
The required variables are stored in /opt/k8s/bin/environment.sh:
#!/usr/bin/bash

# Encryption key used to generate the EncryptionConfig
export ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64)

# IP array of the cluster machines
export NODE_IPS=(192.168.1.201 192.168.1.202 192.168.1.203)

# Hostname array corresponding to the cluster IPs
export NODE_NAMES=(zhangjun-k8s01 zhangjun-k8s02 zhangjun-k8s03)

# etcd cluster service address list
export ETCD_ENDPOINTS="https://192.168.1.201:2379,https://192.168.1.202:2379,https://192.168.1.203:2379"

# IPs and ports for communication between etcd cluster members
export ETCD_NODES="zhangjun-k8s01=https://192.168.1.201:2380,zhangjun-k8s02=https://192.168.1.202:2380,zhangjun-k8s03=https://192.168.1.203:2380"

# Address and port of the kube-apiserver reverse proxy (kube-nginx)
export KUBE_APISERVER="https://127.0.0.1:8443"

# Name of the network interface used for inter-node communication
export IFACE="ens33"

# etcd data directory
export ETCD_DATA_DIR="/data/k8s/etcd/data"

# etcd WAL directory; an SSD partition, or at least a different partition than ETCD_DATA_DIR, is recommended
export ETCD_WAL_DIR="/data/k8s/etcd/wal"

# Data directory for the k8s components
export K8S_DIR="/data/k8s/k8s"

# docker data directory
export DOCKER_DIR="/data/k8s/docker"

## The parameters below usually do not need to be changed

# Token used for TLS Bootstrapping; can be generated with: head -c 16 /dev/urandom | od -An -t x | tr -d ' '
BOOTSTRAP_TOKEN="41f7e4ba8b7be874fcff18bf5cf41a7c"

# Preferably use currently unused CIDRs for the service and Pod networks
# Service CIDR: not routable before deployment; routable inside the cluster afterwards (guaranteed by kube-proxy)
SERVICE_CIDR="10.254.0.0/16"

# Pod CIDR (a /16 is recommended): not routable before deployment; routable inside the cluster afterwards (guaranteed by flanneld)
CLUSTER_CIDR="172.30.0.0/16"

# Service port range (NodePort Range)
export NODE_PORT_RANGE="30000-32767"

# flanneld network configuration prefix
export FLANNEL_ETCD_PREFIX="/kubernetes/network"

# kubernetes service IP (usually the first IP in SERVICE_CIDR)
export CLUSTER_KUBERNETES_SVC_IP="10.254.0.1"

# Cluster DNS service IP (pre-allocated from SERVICE_CIDR)
export CLUSTER_DNS_SVC_IP="10.254.0.2"

# Cluster DNS domain (without a trailing dot)
export CLUSTER_DNS_DOMAIN="cluster.local"

# Add the binary directory /opt/k8s/bin to PATH
export PATH=/opt/k8s/bin:$PATH
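A quick sanity check (a sketch) that the file loads and the name/IP arrays line up:
source /opt/k8s/bin/environment.sh
for (( i=0; i < ${#NODE_IPS[@]}; i++ ))
do
  echo "${NODE_NAMES[i]} -> ${NODE_IPS[i]}"
done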
Note: unless otherwise specified, all operations in this document are executed **on the zhangjun-k8s01 node**, which then distributes files and runs commands on the other nodes remotely.
Install dependency packages:
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "yum install -y epel-release"
ssh root@${node_ip} "yum install -y conntrack ipvsadm ntp ntpdate ipset jq iptables curl sysstat libseccomp && modprobe ip_vs "
done
I. Install and configure flanneld
For flannel cluster deployment, see: https://www.cnblogs.com/deny/p/12260072.html
II. Install and configure kube-nginx
For kube-nginx deployment, see the section on installing and configuring the nginx proxy in https://www.cnblogs.com/deny/p/12260717.html
III. Deploy the docker component
docker runs and manages containers; kubelet interacts with it through the Container Runtime Interface (CRI).
1. Install docker
1) Download and distribute the docker binaries
Download the latest release package from the [docker download page](https://download.docker.com/linux/static/stable/x86_64/):
cd /opt/k8s/work
wget https://download.docker.com/linux/static/stable/x86_64/docker-19.03.5.tgz
tar -xvf docker-19.03.5.tgz
2) Distribute the binaries to all worker nodes
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  scp docker/* root@${node_ip}:/opt/k8s/bin/
  ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
done
2. Create and distribute the systemd unit file
1) Create the template file
cd /opt/k8s/work
cat > docker.service <<"EOF"
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.io

[Service]
WorkingDirectory=##DOCKER_DIR##
Environment="PATH=/opt/k8s/bin:/bin:/sbin:/usr/bin:/usr/sbin"
EnvironmentFile=-/run/flannel/docker
ExecStart=/opt/k8s/bin/dockerd $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
Restart=on-failure
RestartSec=5
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target
EOF
- The EOF delimiter is quoted, so bash does not substitute variables inside the document, such as `$DOCKER_NETWORK_OPTIONS` (systemd is responsible for substituting these environment variables) — see the short demo after this list;
- At runtime dockerd invokes other docker binaries, such as docker-proxy, so the directory containing the docker commands must be added to the PATH environment variable;
- On startup, flanneld writes its network configuration to `/run/flannel/docker`; before dockerd starts, it reads the environment variable `DOCKER_NETWORK_OPTIONS` from that file and uses it to configure the docker0 bridge subnet;
- If multiple `EnvironmentFile` options are specified, `/run/flannel/docker` must come last (to ensure docker0 uses the bip parameter generated by flanneld);
- docker must run as the root user;
- Starting with version 1.13, docker may set the default policy of the **iptables FORWARD chain to DROP**, which makes pings to Pod IPs on other nodes fail. In that case, manually reset the policy to `ACCEPT` with `iptables -P FORWARD ACCEPT`, and write `/sbin/iptables -P FORWARD ACCEPT` into `/etc/rc.local` so that a node reboot does not **reset the FORWARD chain default policy back to DROP**.
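A small demo of the heredoc-quoting point above (a sketch; run it in any bash shell):
# quoted delimiter: bash leaves the variable for systemd to expand later
cat <<"EOF"
$DOCKER_NETWORK_OPTIONS
EOF
# prints the literal text: $DOCKER_NETWORK_OPTIONS

# unquoted delimiter: bash expands the (unset) variable to an empty line
cat <<EOF
$DOCKER_NETWORK_OPTIONS
EOF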
2) Distribute the systemd unit file to all worker machines
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
sed -i -e "s|##DOCKER_DIR##|${DOCKER_DIR}|" docker.service
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  scp docker.service root@${node_ip}:/etc/systemd/system/
done
3. Configure and distribute the docker configuration file
1) Configure docker-daemon.json
Use domestic registry mirrors to speed up image pulls, and raise the download concurrency (dockerd must be restarted for this to take effect):
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > docker-daemon.json <<EOF
{
  "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn","https://hub-mirror.c.163.com"],
  "insecure-registries": ["docker02:35000"],
  "max-concurrent-downloads": 20,
  "live-restore": true,
  "max-concurrent-uploads": 10,
  "debug": true,
  "data-root": "${DOCKER_DIR}/data",
  "exec-root": "${DOCKER_DIR}/exec",
  "log-opts": {
    "max-size": "100m",
    "max-file": "5"
  }
}
EOF
2) Distribute the docker configuration file to all worker nodes
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  ssh root@${node_ip} "mkdir -p /etc/docker/ ${DOCKER_DIR}/{data,exec}"
  scp docker-daemon.json root@${node_ip}:/etc/docker/daemon.json
done
4. Start the docker service
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  ssh root@${node_ip} "systemctl daemon-reload && systemctl enable docker && systemctl restart docker"
done
1) Check the service status
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  ssh root@${node_ip} "systemctl status docker|grep Active"
done
Make sure the status is `active (running)`; otherwise inspect the logs to find the cause: journalctl -u docker
2) Check the docker0 bridge
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  ssh root@${node_ip} "/usr/sbin/ip addr show flannel.1 && /usr/sbin/ip addr show docker0"
done
Confirm that on each worker node the docker0 bridge and the flannel.1 interface have IPs in the same subnet (for example, 172.30.128.0/32 lies within 172.30.128.1/21).
Note: if the services were installed in the wrong order or the machine environment is complicated, with the docker service installed before flanneld, the docker0 bridge and the flannel.1 interface on a worker node may end up in different subnets. In that case, stop the docker service, delete the docker0 bridge manually, and restart docker to fix it:
systemctl stop docker
ip link delete docker0
systemctl start docker
3) View docker status information
ps -elfH|grep docker
docker info
IV. Deploy the kubelet component
kubelet runs on every worker node: it receives requests from kube-apiserver, manages Pod containers, and executes interactive commands such as exec, run, and logs.
On startup, kubelet automatically registers node information with kube-apiserver, and its built-in cadvisor collects and monitors the node's resource usage.
For security, this deployment disables kubelet's insecure http port and authenticates and authorizes all requests, rejecting unauthorized access (such as requests from apiserver or heapster).
1. Download and distribute the kubelet binaries
1) Download the binary tarball from the [CHANGELOG page](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG.md) and unpack it
cd /opt/k8s/work
wget https://dl.k8s.io/v1.14.2/kubernetes-server-linux-amd64.tar.gz
tar -xzvf kubernetes-server-linux-amd64.tar.gz
cd kubernetes
tar -xzvf kubernetes-src.tar.gz
Other download links:
wget https://storage.googleapis.com/kubernetes-release/release/v1.14.2/kubernetes-node-linux-amd64.tar.gz
wget https://storage.googleapis.com/kubernetes-release/release/v1.14.2/kubernetes-server-linux-amd64.tar.gz
wget https://storage.googleapis.com/kubernetes-release/release/v1.14.2/kubernetes-client-linux-amd64.tar.gz
wget https://storage.googleapis.com/kubernetes-release/release/v1.14.2/kubernetes.tar.gz
2) Copy the kubelet binaries to all worker nodes
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
scp kubernetes/server/bin/{kubelet,kubectl,kubeadm,kube-proxy} root@${node_ip}:/opt/k8s/bin/
ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
done
2. Create the kubelet bootstrap kubeconfig files
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
do
  echo ">>> ${node_name}"

  # Create a token
  export BOOTSTRAP_TOKEN=$(kubeadm token create \
    --description kubelet-bootstrap-token \
    --groups system:bootstrappers:${node_name} \
    --kubeconfig ~/.kube/config)

  # Set cluster parameters
  kubectl config set-cluster kubernetes \
    --certificate-authority=/etc/kubernetes/cert/ca.pem \
    --embed-certs=true \
    --server=${KUBE_APISERVER} \
    --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig

  # Set client credentials
  kubectl config set-credentials kubelet-bootstrap \
    --token=${BOOTSTRAP_TOKEN} \
    --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig

  # Set context parameters
  kubectl config set-context default \
    --cluster=kubernetes \
    --user=kubelet-bootstrap \
    --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig

  # Set the default context
  kubectl config use-context default --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig
done
- Only a token is written into the kubeconfig; after bootstrapping finishes, kube-controller-manager creates the client and server certificates for the kubelet.
1) View the tokens kubeadm created for each node
$ kubeadm token list --kubeconfig ~/.kube/config
TOKEN                     TTL   EXPIRES                     USAGES                   DESCRIPTION               EXTRA GROUPS
3gzd53.ahl5unc2d09yjid9   23h   2019-05-27T11:29:57+08:00   authentication,signing   kubelet-bootstrap-token   system:bootstrappers:zhangjun-k8s02
82jfrm.um1mkjkr7w2c7ex9   23h   2019-05-27T11:29:56+08:00   authentication,signing   kubelet-bootstrap-token   system:bootstrappers:zhangjun-k8s01
b1f7np.lwnnzur3i8ymtkur   23h   2019-05-27T11:29:57+08:00   authentication,signing   kubelet-bootstrap-token   system:bootstrappers:zhangjun-k8s03
- Tokens are valid for 1 day; once expired they can no longer be used to bootstrap a kubelet and are cleaned up by kube-controller-manager's token cleaner (see the sketch after this list);
- When kube-apiserver receives a kubelet bootstrap token, it sets the request's user to `system:bootstrap:<Token ID>` and its group to `system:bootstrappers`; a ClusterRoleBinding is set up for this group later;
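If a token has expired, it can simply be recreated with the same command used above, and the bootstrap kubeconfig regenerated; for example, for zhangjun-k8s01:
kubeadm token create \
  --description kubelet-bootstrap-token \
  --groups system:bootstrappers:zhangjun-k8s01 \
  --kubeconfig ~/.kube/config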
2) View the Secret associated with each token
$ kubectl get secrets -n kube-system|grep bootstrap-token
bootstrap-token-3gzd53   bootstrap.kubernetes.io/token   7   33s
bootstrap-token-82jfrm   bootstrap.kubernetes.io/token   7   34s
bootstrap-token-b1f7np   bootstrap.kubernetes.io/token   7   33s
3) Distribute the bootstrap kubeconfig files to all worker nodes
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
do
  echo ">>> ${node_name}"
  scp kubelet-bootstrap-${node_name}.kubeconfig root@${node_name}:/etc/kubernetes/kubelet-bootstrap.kubeconfig
done
3. Create and distribute the kubelet configuration file
Starting with v1.10, some kubelet parameters must be set in a **configuration file**; `kubelet --help` warns:
DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag
Create the kubelet configuration file template (for the configurable options, see the [comments in the source code](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/config/types.go)).
1) Create the file
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > kubelet-config.yaml.template <<EOF
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: "##NODE_IP##"
staticPodPath: ""
syncFrequency: 1m
fileCheckFrequency: 20s
httpCheckFrequency: 20s
staticPodURL: ""
port: 10250
readOnlyPort: 0
rotateCertificates: true
serverTLSBootstrap: true
authentication:
  anonymous:
    enabled: false
  webhook:
    enabled: true
  x509:
    clientCAFile: "/etc/kubernetes/cert/ca.pem"
authorization:
  mode: Webhook
registryPullQPS: 0
registryBurst: 20
eventRecordQPS: 0
eventBurst: 20
enableDebuggingHandlers: true
enableContentionProfiling: true
healthzPort: 10248
healthzBindAddress: "##NODE_IP##"
clusterDomain: "${CLUSTER_DNS_DOMAIN}"
clusterDNS:
  - "${CLUSTER_DNS_SVC_IP}"
nodeStatusUpdateFrequency: 10s
nodeStatusReportFrequency: 1m
imageMinimumGCAge: 2m
imageGCHighThresholdPercent: 85
imageGCLowThresholdPercent: 80
volumeStatsAggPeriod: 1m
kubeletCgroups: ""
systemCgroups: ""
cgroupRoot: ""
cgroupsPerQOS: true
cgroupDriver: cgroupfs
runtimeRequestTimeout: 10m
hairpinMode: promiscuous-bridge
maxPods: 220
podCIDR: "${CLUSTER_CIDR}"
podPidsLimit: -1
resolvConf: /etc/resolv.conf
maxOpenFiles: 1000000
kubeAPIQPS: 1000
kubeAPIBurst: 2000
serializeImagePulls: false
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"
evictionSoft: {}
enableControllerAttachDetach: true
failSwapOn: true
containerLogMaxSize: 20Mi
containerLogMaxFiles: 10
systemReserved: {}
kubeReserved: {}
systemReservedCgroup: ""
kubeReservedCgroup: ""
enforceNodeAllocatable: ["pods"]
EOF
- address: the address the kubelet secure port (https, 10250) listens on; it must not be 127.0.0.1, otherwise kube-apiserver, heapster, etc. cannot call the kubelet API (a quick check follows this list);
- readOnlyPort=0: disables the read-only port (default 10255), equivalent to leaving it unset;
- authentication.anonymous.enabled: set to false; anonymous access to port 10250 is not allowed;
- authentication.x509.clientCAFile: specifies the CA certificate that signs client certificates, enabling HTTPS certificate authentication;
- authentication.webhook.enabled=true: enables HTTPS bearer token authentication;
- requests that pass neither x509 certificate nor webhook authentication (from kube-apiserver or other clients) are rejected with Unauthorized;
- authorization.mode=Webhook: kubelet uses the SubjectAccessReview API to ask kube-apiserver whether a given user or group is allowed to operate on a resource (RBAC);
- featureGates.RotateKubeletClientCertificate and featureGates.RotateKubeletServerCertificate: rotate certificates automatically; the certificate lifetime is determined by kube-controller-manager's --experimental-cluster-signing-duration parameter;
- kubelet must run as the root user;
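As a spot-check of the address settings above (a sketch; assumes kubelet is already running, see section 6): the healthz port 10248 is plain http and needs no credentials, unlike 10250:
curl http://192.168.1.201:10248/healthz
# expected output: ok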
2) Create and distribute a kubelet configuration file for each node
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  sed -e "s/##NODE_IP##/${node_ip}/" kubelet-config.yaml.template > kubelet-config-${node_ip}.yaml.template
  scp kubelet-config-${node_ip}.yaml.template root@${node_ip}:/etc/kubernetes/kubelet-config.yaml
done
4. Create and distribute the kubelet systemd unit file
1) Create the kubelet systemd unit file template
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > kubelet.service.template <<EOF
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=${K8S_DIR}/kubelet
ExecStart=/opt/k8s/bin/kubelet \
  --allow-privileged=true \
  --bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \
  --cert-dir=/etc/kubernetes/cert \
  --cni-conf-dir=/etc/cni/net.d \
  --container-runtime=docker \
  --container-runtime-endpoint=unix:///var/run/dockershim.sock \
  --root-dir=${K8S_DIR}/kubelet \
  --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
  --config=/etc/kubernetes/kubelet-config.yaml \
  --hostname-override=##NODE_NAME## \
  --pod-infra-container-image=registry.cn-beijing.aliyuncs.com/images_k8s/pause-amd64:3.1 \
  --image-pull-progress-deadline=15m \
  --volume-plugin-dir=${K8S_DIR}/kubelet/kubelet-plugins/volume/exec/ \
  --logtostderr=true \
  --v=2
Restart=always
RestartSec=5
StartLimitInterval=0

[Install]
WantedBy=multi-user.target
EOF
- If the `--hostname-override` option is set, `kube-proxy` must be given the same value, otherwise the Node will not be found;
- `--bootstrap-kubeconfig`: points to the bootstrap kubeconfig file; kubelet uses the username and token in this file to send the TLS Bootstrapping request to kube-apiserver;
- After K8S approves the kubelet's CSR, it stores the certificate and private key in the `--cert-dir` directory and writes the `--kubeconfig` file;
- `--pod-infra-container-image` avoids redhat's `pod-infrastructure:latest` image, which cannot reap zombie processes inside containers;
2) Create and distribute a kubelet systemd unit file for each node
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
do
  echo ">>> ${node_name}"
  sed -e "s/##NODE_NAME##/${node_name}/" kubelet.service.template > kubelet-${node_name}.service
  scp kubelet-${node_name}.service root@${node_name}:/etc/systemd/system/kubelet.service
done
5. Bootstrap Token Auth and granting permissions
On startup, kubelet checks whether the file specified by `--kubeconfig` exists; if it does not, kubelet uses the kubeconfig specified by `--bootstrap-kubeconfig` to send a certificate signing request (CSR) to kube-apiserver.
When kube-apiserver receives the CSR, it authenticates the embedded Token; on success it sets the request's user to `system:bootstrap:<Token ID>` and its group to `system:bootstrappers`. This process is called Bootstrap Token Auth.
By default, this user and group have no permission to create CSRs, so kubelet fails to start with errors like:
$ sudo journalctl -u kubelet -a |grep -A 2 'certificatesigningrequests'
May 26 12:13:41 zhangjun-k8s01 kubelet[128468]: I0526 12:13:41.798230  128468 certificate_manager.go:366] Rotating certificates
May 26 12:13:41 zhangjun-k8s01 kubelet[128468]: E0526 12:13:41.801997  128468 certificate_manager.go:385] Failed while requesting a signed certificate from the master: cannot create certificate signing request: certificatesigningrequests.certificates.k8s.io is forbidden: User "system:bootstrap:82jfrm" cannot create resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
May 26 12:13:42 zhangjun-k8s01 kubelet[128468]: E0526 12:13:42.044828  128468 kubelet.go:2244] node "zhangjun-k8s01" not found
May 26 12:13:42 zhangjun-k8s01 kubelet[128468]: E0526 12:13:42.078658  128468 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:442: Failed to list *v1.Service: Unauthorized
May 26 12:13:42 zhangjun-k8s01 kubelet[128468]: E0526 12:13:42.079873  128468 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Node: Unauthorized
May 26 12:13:42 zhangjun-k8s01 kubelet[128468]: E0526 12:13:42.082683  128468 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.CSIDriver: Unauthorized
May 26 12:13:42 zhangjun-k8s01 kubelet[128468]: E0526 12:13:42.084473  128468 reflector.go:126] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Unauthorized
May 26 12:13:42 zhangjun-k8s01 kubelet[128468]: E0526 12:13:42.088466  128468 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.RuntimeClass: Unauthorized
The fix is to create a clusterrolebinding that binds the group system:bootstrappers to the clusterrole system:node-bootstrapper:
$ kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --group=system:bootstrappers
6. Start the kubelet service
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kubelet/kubelet-plugins/volume/exec/"
  ssh root@${node_ip} "/usr/sbin/swapoff -a"
  ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kubelet && systemctl restart kubelet"
done
- The working directory must be created before starting the service;
- Swap must be turned off, otherwise kubelet fails to start (a sketch for disabling it permanently follows this list);
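`swapoff -a` only disables swap until the next reboot. To make it persistent, the swap entries in /etc/fstab can also be commented out on every node (a sketch, reusing the loop pattern from above):
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  # comment out any active swap entry in /etc/fstab
  ssh root@${node_ip} "sed -ri 's/^([^#].*\sswap\s.*)$/#\1/' /etc/fstab"
done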
$ journalctl -u kubelet |tail
8月 15 12:16:49 zhangjun-k8s01 kubelet[7807]: I0815 12:16:49.578598    7807 feature_gate.go:230] feature gates: &{map[RotateKubeletClientCertificate:true RotateKubeletServerCertificate:true]}
8月 15 12:16:49 zhangjun-k8s01 kubelet[7807]: I0815 12:16:49.578698    7807 feature_gate.go:230] feature gates: &{map[RotateKubeletClientCertificate:true RotateKubeletServerCertificate:true]}
8月 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.205871    7807 mount_linux.go:214] Detected OS with systemd
8月 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.205939    7807 server.go:408] Version: v1.11.2
8月 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.206013    7807 feature_gate.go:230] feature gates: &{map[RotateKubeletClientCertificate:true RotateKubeletServerCertificate:true]}
8月 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.206101    7807 feature_gate.go:230] feature gates: &{map[RotateKubeletServerCertificate:true RotateKubeletClientCertificate:true]}
8月 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.206217    7807 plugins.go:97] No cloud provider specified.
8月 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.206237    7807 server.go:524] No cloud provider specified: "" from the config file: ""
8月 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.206264    7807 bootstrap.go:56] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
8月 15 12:16:50 zhangjun-k8s01 kubelet[7807]: I0815 12:16:50.208628    7807 bootstrap.go:86] No valid private key and/or certificate found, reusing existing private key or creating a new one
After startup, kubelet sends a CSR to kube-apiserver using --bootstrap-kubeconfig; once the CSR is approved, kube-controller-manager creates the kubelet's TLS client certificate and private key and writes them into the file referenced by --kubeconfig.
Note: kube-controller-manager only creates certificates and private keys for TLS Bootstrap if it is configured with the `--cluster-signing-cert-file` and `--cluster-signing-key-file` parameters.
For the concepts behind TLS bootstrapping, see: https://www.cnblogs.com/deny/p/12268224.html
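A quick way to confirm those two flags are present (a sketch; assumes kube-controller-manager runs as a systemd service on the master):
systemctl cat kube-controller-manager | grep -E -- '--cluster-signing-(cert|key)-file'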
$ kubectl get csr
NAME        AGE   REQUESTOR                 CONDITION
csr-5f4vh   31s   system:bootstrap:82jfrm   Pending
csr-5rw7s   29s   system:bootstrap:b1f7np   Pending
csr-m29fm   31s   system:bootstrap:3gzd53   Pending

$ kubectl get nodes
No resources found.
- The CSRs of all three worker nodes are in Pending state;
1) Automatically approve CSR requests
Create three ClusterRoleBindings, used respectively to auto-approve client certificates, renew client certificates, and renew server certificates:
cd /opt/k8s/work
cat > csr-crb.yaml <<EOF
# Approve all CSRs for the group "system:bootstrappers"
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: auto-approve-csrs-for-group
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
  apiGroup: rbac.authorization.k8s.io
---
# To let a node of the group "system:nodes" renew its own credentials
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: node-client-cert-renewal
subjects:
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
  apiGroup: rbac.authorization.k8s.io
---
# A ClusterRole which instructs the CSR approver to approve a node requesting a
# serving cert matching its client cert.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: approve-node-server-renewal-csr
rules:
- apiGroups: ["certificates.k8s.io"]
  resources: ["certificatesigningrequests/selfnodeserver"]
  verbs: ["create"]
---
# To let a node of the group "system:nodes" renew its own server credentials
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: node-server-cert-renewal
subjects:
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: approve-node-server-renewal-csr
  apiGroup: rbac.authorization.k8s.io
EOF

kubectl apply -f csr-crb.yaml
- auto-approve-csrs-for-group: automatically approves a node's first CSR; note that the first CSR is requested with the group system:bootstrappers;
- node-client-cert-renewal: automatically approves renewal of a node's expiring client certificates; automatically generated certificates carry the group system:nodes;
- node-server-cert-renewal: automatically approves renewal of a node's expiring server certificates; automatically generated certificates carry the group system:nodes;
2) Check the kubelet status
After a while (1-10 minutes), the CSRs of all three nodes are automatically approved:
$ kubectl get csr
NAME        AGE     REQUESTOR                    CONDITION
csr-5f4vh   7m59s   system:bootstrap:82jfrm      Approved,Issued
csr-5r7j7   4m45s   system:node:zhangjun-k8s03   Pending
csr-5rw7s   7m57s   system:bootstrap:b1f7np      Approved,Issued
csr-9snww   6m37s   system:bootstrap:82jfrm      Approved,Issued
csr-c7z56   4m46s   system:node:zhangjun-k8s02   Pending
csr-j55lh   4m46s   system:node:zhangjun-k8s01   Pending
csr-m29fm   7m59s   system:bootstrap:3gzd53      Approved,Issued
csr-rc8w7   6m37s   system:bootstrap:3gzd53      Approved,Issued
csr-vd52r   6m36s   system:bootstrap:b1f7np      Approved,Issued
The Pending CSRs are for the kubelet server certificates and must be approved manually; see below.
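As a convenience, all Pending CSRs can also be approved in one shot (a sketch; review the list first on a production cluster):
kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve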
All nodes are Ready:
[root@zhangjun-k8s01 ~]# kubectl get nodes
NAME             STATUS   ROLES    AGE    VERSION
zhangjun-k8s01   Ready    <none>   2d1h   v1.14.2
zhangjun-k8s02   Ready    <none>   2d1h   v1.14.2
zhangjun-k8s03   Ready    <none>   2d1h   v1.14.2
kube-controller-manager generated a kubeconfig file and a key pair for each node:
[root@zhangjun-k8s01 ~]# ls -l /etc/kubernetes/kubelet.kubeconfig
-rw------- 1 root root 2310 2月   3 14:44 /etc/kubernetes/kubelet.kubeconfig
[root@zhangjun-k8s01 ~]# ls -l /etc/kubernetes/cert/|grep kubelet
-rw------- 1 root root 1281 2月   3 14:45 kubelet-client-2020-02-03-14-45-30.pem
lrwxrwxrwx 1 root root   59 2月   3 14:45 kubelet-client-current.pem -> /etc/kubernetes/cert/kubelet-client-2020-02-03-14-45-30.pem
-rw------- 1 root root 1330 2月   3 14:47 kubelet-server-2020-02-03-14-47-02.pem
lrwxrwxrwx 1 root root   59 2月   3 14:47 kubelet-server-current.pem -> /etc/kubernetes/cert/kubelet-server-2020-02-03-14-47-02.pem
- The kubelet server certificate is not generated automatically; it appears only after the manual approval below.
3) Manually approve the server cert CSRs
For [security reasons](https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/#kubelet-configuration), the CSR approving controllers do not automatically approve kubelet server certificate signing requests; they must be approved manually:
$ kubectl get csr
NAME        AGE     REQUESTOR                    CONDITION
csr-5f4vh   9m25s   system:bootstrap:82jfrm      Approved,Issued
csr-5r7j7   6m11s   system:node:zhangjun-k8s03   Pending
csr-5rw7s   9m23s   system:bootstrap:b1f7np      Approved,Issued
csr-9snww   8m3s    system:bootstrap:82jfrm      Approved,Issued
csr-c7z56   6m12s   system:node:zhangjun-k8s02   Pending
csr-j55lh   6m12s   system:node:zhangjun-k8s01   Pending
csr-m29fm   9m25s   system:bootstrap:3gzd53      Approved,Issued
csr-rc8w7   8m3s    system:bootstrap:3gzd53      Approved,Issued
csr-vd52r   8m2s    system:bootstrap:b1f7np      Approved,Issued

$ kubectl certificate approve csr-5r7j7
certificatesigningrequest.certificates.k8s.io/csr-5r7j7 approved

$ kubectl certificate approve csr-c7z56
certificatesigningrequest.certificates.k8s.io/csr-c7z56 approved

$ kubectl certificate approve csr-j55lh
certificatesigningrequest.certificates.k8s.io/csr-j55lh approved

$ ls -l /etc/kubernetes/cert/kubelet-*
-rw------- 1 root root 1281 May 26 12:19 /etc/kubernetes/cert/kubelet-client-2019-05-26-12-19-25.pem
lrwxrwxrwx 1 root root   59 May 26 12:19 /etc/kubernetes/cert/kubelet-client-current.pem -> /etc/kubernetes/cert/kubelet-client-2019-05-26-12-19-25.pem
-rw------- 1 root root 1326 May 26 12:26 /etc/kubernetes/cert/kubelet-server-2019-05-26-12-26-39.pem
lrwxrwxrwx 1 root root   59 May 26 12:26 /etc/kubernetes/cert/kubelet-server-current.pem -> /etc/kubernetes/cert/kubelet-server-2019-05-26-12-26-39.pem
7. The API endpoints exposed by kubelet
After startup, kubelet listens on several ports for requests from kube-apiserver and other clients:
[root@zhangjun-k8s01 ~]# netstat -lnpt|grep kubelet
tcp        0      0 127.0.0.1:46758       0.0.0.0:*   LISTEN   1505/kubelet
tcp        0      0 192.168.1.201:10248   0.0.0.0:*   LISTEN   1505/kubelet
tcp        0      0 192.168.1.201:10250   0.0.0.0:*   LISTEN   1505/kubelet
- 10248: the healthz http service;
- 10250: the https service; requests to this port must be authenticated and authorized (even requests to /healthz);
- the read-only port 10255 is not enabled;
- since K8S v1.10, the `--cadvisor-port` parameter (default port 4194) has been removed, so the cAdvisor UI & API are no longer exposed there.
For example, when you run `kubectl exec -it nginx-ds-5rmws -- sh`, kube-apiserver sends kubelet a request like:
POST /exec/default/nginx-ds-5rmws/my-nginx?command=sh&input=1&output=1&tty=1
Requests to kubelet's https port 10250 can access the following resources:
+ /pods, /runningpods
+ /metrics, /metrics/cadvisor, /metrics/probes
+ /spec
+ /stats, /stats/container
+ /logs
+ /run/, /exec/, /attach/, /portForward/, /containerLogs/
For details, see: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/server/server.go#L434:3
Because anonymous authentication is disabled and webhook authorization is enabled, every request to the https API on port 10250 must be authenticated and authorized.
The predefined ClusterRole system:kubelet-api-admin grants access to all kubelet APIs (the User of the kubernetes certificate used by kube-apiserver is granted this permission):
[root@zhangjun-k8s01 ~]# kubectl describe clusterrole system:kubelet-api-admin
Name:         system:kubelet-api-admin
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate: true
PolicyRule:
  Resources      Non-Resource URLs  Resource Names  Verbs
  ---------      -----------------  --------------  -----
  nodes/log      []                 []              [*]
  nodes/metrics  []                 []              [*]
  nodes/proxy    []                 []              [*]
  nodes/spec     []                 []              [*]
  nodes/stats    []                 []              [*]
  nodes          []                 []              [get list watch proxy]
8. kubelet API authentication and authorization
kubelet is configured with the following authentication parameters:
+ authentication.anonymous.enabled: set to false; anonymous access to port 10250 is not allowed;
+ authentication.x509.clientCAFile: specifies the CA certificate that signs client certificates, enabling HTTPS certificate authentication;
+ authentication.webhook.enabled=true: enables HTTPS bearer token authentication;
and with the following authorization parameter:
+ authorization.mode=Webhook: enables RBAC authorization;
When kubelet receives a request, it authenticates the client certificate against clientCAFile, or checks whether the bearer token is valid. If both fail, the request is rejected with Unauthorized:
$ curl -s --cacert /etc/kubernetes/cert/ca.pem https://192.168.1.201:10250/metrics
Unauthorized

$ curl -s --cacert /etc/kubernetes/cert/ca.pem -H "Authorization: Bearer 123456" https://192.168.1.201:10250/metrics
Unauthorized
After authentication succeeds, kubelet sends a SubjectAccessReview request to kube-apiserver to check whether the user or group behind the certificate or token has permission to operate on the resource (RBAC).
1) Certificate authentication and authorization
Using a certificate with insufficient privileges (kube-controller-manager's client certificate):
curl -s --cacert /etc/kubernetes/cert/ca.pem --cert /etc/kubernetes/cert/kube-controller-manager.pem --key /etc/kubernetes/cert/kube-controller-manager-key.pem https://192.168.1.201:10250/metrics
Forbidden (user=system:kube-controller-manager, verb=get, resource=nodes, subresource=metrics)
2) Use the admin certificate with the highest privileges, created when the kubectl command-line tool was deployed:
curl -s --cacert /etc/kubernetes/cert/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://192.168.1.201:10250/metrics|head
- The values of `--cacert`, `--cert`, and `--key` must be file paths; for a relative path such as `./admin.pem`, the `./` cannot be omitted, otherwise the request returns `401 Unauthorized`;
3) Bearer token authentication and authorization
Create a ServiceAccount and bind it to the ClusterRole system:kubelet-api-admin, so that it has permission to call the kubelet API:
kubectl create sa kubelet-api-test
kubectl create clusterrolebinding kubelet-api-test --clusterrole=system:kubelet-api-admin --serviceaccount=default:kubelet-api-test
SECRET=$(kubectl get secrets | grep kubelet-api-test | awk '{print $1}')
TOKEN=$(kubectl describe secret ${SECRET} | grep -E '^token' | awk '{print $2}')
echo ${TOKEN}
curl -s --cacert /etc/kubernetes/cert/ca.pem -H "Authorization: Bearer ${TOKEN}" https://192.168.1.201:10250/metrics|head
9. cadvisor and metrics
cadvisor is embedded in the kubelet binary; it collects resource usage statistics (CPU, memory, disk, network) for all containers on the node.
Visiting https://192.168.1.201:10250/metrics and https://192.168.1.201:10250/metrics/cadvisor in a browser returns the kubelet and cadvisor metrics respectively.
Note:
+ kubelet-config.yaml sets authentication.anonymous.enabled to false, so anonymous certificates cannot access the https service on port 10250;
+ to create and import the necessary certificates and then access port 10250 from a browser, see "accessing the kube-apiserver secure port from a browser": https://www.cnblogs.com/deny/p/12264757.html (or use the token-based sketch below);
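Alternatively, the bearer token created in section 8 above fetches the cadvisor metrics from the command line, without importing browser certificates (a sketch, reusing ${TOKEN} from that section):
curl -s --cacert /etc/kubernetes/cert/ca.pem -H "Authorization: Bearer ${TOKEN}" https://192.168.1.201:10250/metrics/cadvisor | head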
10. Get the kubelet configuration
Fetch each node's kubelet configuration from kube-apiserver, again using the admin certificate with the highest privileges created when deploying kubectl:
source /opt/k8s/bin/environment.sh
curl -sSL --cacert /etc/kubernetes/cert/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem ${KUBE_APISERVER}/api/v1/nodes/zhangjun-k8s01/proxy/configz | jq '.kubeletconfig|.kind="KubeletConfiguration"|.apiVersion="kubelet.config.k8s.io/v1beta1"'
kubelet authentication and authorization: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-authentication-authorization/
V. Deploy the kube-proxy component
1. Create the kube-proxy certificate
1) Create the certificate signing request
cd /opt/k8s/work
cat > kube-proxy-csr.json <<EOF
{
  "CN": "system:kube-proxy",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "4Paradigm"
    }
  ]
}
EOF
- CN: sets the certificate's User to `system:kube-proxy`;
- the predefined RoleBinding `system:node-proxier` binds the User `system:kube-proxy` to the Role `system:node-proxier`, which grants permission to call the Proxy-related APIs of `kube-apiserver` (the binding can be inspected with the command after this list);
- the certificate is only used by kube-proxy as a client certificate, so the hosts field is empty;
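A quick check of the predefined binding:
kubectl describe clusterrolebinding system:node-proxier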
2) Generate the certificate and private key
cd /opt/k8s/work
cfssl gencert -ca=/opt/k8s/work/ca.pem \
  -ca-key=/opt/k8s/work/ca-key.pem \
  -config=/opt/k8s/work/ca-config.json \
  -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
ls kube-proxy*
2. Create and distribute the kubeconfig file
1) Create the file
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/k8s/work/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=kube-proxy.kubeconfig
kubectl config set-credentials kube-proxy \
  --client-certificate=kube-proxy.pem \
  --client-key=kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.kubeconfig
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-proxy \
  --kubeconfig=kube-proxy.kubeconfig
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
- `--embed-certs=true`: embeds the contents of ca.pem and kube-proxy.pem into the generated kube-proxy.kubeconfig file (without it, only the certificate file paths are written);
2) Distribute the kubeconfig file
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
do
  echo ">>> ${node_name}"
  scp kube-proxy.kubeconfig root@${node_name}:/etc/kubernetes/
done
3. Create the kube-proxy configuration file
Starting with v1.10, **some kube-proxy parameters** can be set in a configuration file. You can generate this file with the `--write-config-to` option, or consult the [comments in the source code](https://github.com/kubernetes/kubernetes/blob/release-1.14/pkg/proxy/apis/config/types.go).
1) Create the kube-proxy config file template
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > kube-proxy-config.yaml.template <<EOF
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
  burst: 200
  kubeconfig: "/etc/kubernetes/kube-proxy.kubeconfig"
  qps: 100
bindAddress: ##NODE_IP##
healthzBindAddress: ##NODE_IP##:10256
metricsBindAddress: ##NODE_IP##:10249
enableProfiling: true
clusterCIDR: ${CLUSTER_CIDR}
hostnameOverride: ##NODE_NAME##
mode: "ipvs"
portRange: ""
iptables:
  masqueradeAll: false
ipvs:
  scheduler: rr
  excludeCIDRs: []
EOF
- `bindAddress`: the listen address;
- `clientConnection.kubeconfig`: the kubeconfig file used to connect to the apiserver;
- `clusterCIDR`: kube-proxy uses `--cluster-cidr` to tell cluster-internal traffic from external traffic; only when `--cluster-cidr` or `--masquerade-all` is specified does kube-proxy SNAT requests to Service IPs;
- `hostnameOverride`: must match the kubelet's value, otherwise kube-proxy cannot find the Node after startup and therefore creates no ipvs rules;
- `mode`: use the ipvs mode;
2) Create and distribute a kube-proxy configuration file for each node
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for (( i=0; i < 3; i++ ))
do
  echo ">>> ${NODE_NAMES[i]}"
  sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" kube-proxy-config.yaml.template > kube-proxy-config-${NODE_NAMES[i]}.yaml.template
  scp kube-proxy-config-${NODE_NAMES[i]}.yaml.template root@${NODE_NAMES[i]}:/etc/kubernetes/kube-proxy-config.yaml
done
4. Create and distribute the kube-proxy systemd unit file
1) Create the template file
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
cat > kube-proxy.service <<EOF
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
WorkingDirectory=${K8S_DIR}/kube-proxy
ExecStart=/opt/k8s/bin/kube-proxy \
  --config=/etc/kubernetes/kube-proxy-config.yaml \
  --logtostderr=true \
  --v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
2) Distribute the kube-proxy systemd unit file
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
do
  echo ">>> ${node_name}"
  scp kube-proxy.service root@${node_name}:/etc/systemd/system/
done
5. Start the kube-proxy service
cd /opt/k8s/work
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kube-proxy"
  ssh root@${node_ip} "modprobe ip_vs_rr"
  ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-proxy && systemctl restart kube-proxy"
done
- The working directory must be created before starting the service.
1) Check the startup result
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  ssh root@${node_ip} "systemctl status kube-proxy|grep Active"
done
Make sure the status is `active (running)`; otherwise inspect the logs to find the cause: journalctl -u kube-proxy
2) Check the listening ports
[root@zhangjun-k8s01 ~]# netstat -lnpt|grep kube-prox
tcp        0      0 192.168.1.201:10249   0.0.0.0:*   LISTEN   899/kube-proxy
tcp        0      0 192.168.1.201:10256   0.0.0.0:*   LISTEN   899/kube-proxy
tcp6       0      0 :::31424              :::*        LISTEN   899/kube-proxy
tcp6       0      0 :::31205              :::*        LISTEN   899/kube-proxy
tcp6       0      0 :::31789              :::*        LISTEN   899/kube-proxy
- 10249: the http prometheus metrics port;
- 10256: the http healthz port;
3) Check the ipvs routing rules
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  ssh root@${node_ip} "/usr/sbin/ipvsadm -ln"
done
- You can see that all https requests to the K8S SVC kubernetes are forwarded to port 6443 on the kube-apiserver nodes.
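As a cross-check (a sketch): the virtual server shown by ipvsadm should match the kubernetes service IP defined in environment.sh (CLUSTER_KUBERNETES_SVC_IP, 10.254.0.1):
kubectl get svc kubernetes
# CLUSTER-IP should be 10.254.0.1, backed by the kube-apiserver endpoints on port 6443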