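These notes summarize what I learned from the Kubernetes (k8s) tutorial videos by the Bilibili uploader "尚硅谷": https://www.bilibili.com/video/BV1w4411y7Go
k8s: building a k8s cluster with kubeadm on Alibaba Cloud servers
I am new to k8s myself, so I start with a fairly simple cluster of 1 master and 2 worker nodes. Three servers of course cannot show the full power of k8s, but as a beginner I want to try a simple cluster first and move on to more servers once I am more familiar with it.
Purchasing Alibaba Cloud servers
I bought three Alibaba Cloud servers, all of the pay-as-you-go type, which means no fees are charged after stopping them by the prescribed procedure (stopping and restarting does not affect the cluster that has already been set up).
Purchase link: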
https://ecs-buy.aliyun.com/wizard?spm=5176.ecssimplebuy.header.1.15fd36751sf2fA#/prepay/cn-shanghai
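If you have bought Alibaba Cloud servers before, you can also click "Create Instance" in the console to get there
Choosing the servers
- Select pay-as-you-go billing
- Choose a region close to you
- I chose the burstable instance type t6 with 2 vCPUs and 2 GB of memory (ecs.t6-c1m1.large) (trying it first; not sure whether this configuration is sufficient)
- Quantity: 3
- For the system image I chose 64-bit CentOS 8.0 (installing Docker requires CentOS 7.0 or later)
- 40 GB disk
In total the cost is ¥ 0.413 per hour
Network and security group
- Keep the default network
- For bandwidth billing choose "pay by traffic"; the peak can be anything (e.g. 40M), since you only pay for what you use
- Keep the defaults for everything else
System configuration
- For the login credential choose "custom password"; giving root the same password on all servers makes management easier
- Keep the defaults for everything else
Grouping configuration
- Keep the defaults
Confirm the configuration and create the order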
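Initial server setup (on all three servers)
To make management easier, rename the server instances to: k8s-master01-225 / k8s-node01-228 / k8s-node02-229
(225/228/229 are the last three digits of each private IP; the naming scheme is up to you)
Connect to the three servers with Xshell
Check that the three servers can ping each other over the private network. From here on everything uses the private network rather than the public one, because public traffic costs money.
Setting the hostnames
# On the k8s-master01-225 machine
hostnamectl set-hostname k8s-master01-225
# On the k8s-node01-228 machine
hostnamectl set-hostname k8s-node01-228
# On the k8s-node02-229 machine
hostnamectl set-hostname k8s-node02-229
Configuring the /etc/hosts file
A real cluster should use its own DNS server to map IPs to names. For simplicity, the hosts file is used here to associate IPs with hostnames. Add the same three lines to /etc/hosts on all three servers: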
172.19.199.225 k8s-master01-225
172.19.188.228 k8s-node01-228
172.19.188.229 k8s-node02-229
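A minimal sketch of adding these entries idempotently, so that running it twice does not create duplicate lines. The helper name add_host_entry and the temporary demo file are illustrative assumptions; on a real server the target would be /etc/hosts.

```shell
#!/bin/bash
# Append "IP NAME" to a hosts-style file only if that exact line is not there yet.
# add_host_entry is a hypothetical helper name, not part of any tool.
add_host_entry() {
  local file="$1" ip="$2" name="$3"
  grep -qxF "$ip $name" "$file" 2>/dev/null || echo "$ip $name" >> "$file"
}

# Demo on a temporary file; pass /etc/hosts instead when running as root.
hosts_file="$(mktemp)"
while read -r ip name; do
  add_host_entry "$hosts_file" "$ip" "$name"
done <<'EOF'
172.19.199.225 k8s-master01-225
172.19.188.228 k8s-node01-228
172.19.188.229 k8s-node02-229
EOF
cat "$hosts_file"
```

Because the helper checks for the exact line first, it is safe to broadcast to all three terminals at once.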
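Xshell has a powerful feature: one typed command can drive several terminals at once. Right-click in one terminal and choose "Send Key Input To All Sessions", so you do not have to run each command server by server. Just remember to turn this off again when a command should run on only one server.
Installing dependency packages
yum install -y conntrack ipvsadm ipset jq iptables curl sysstat libseccomp wget vim net-tools git
Disabling the firewall
systemctl stop firewalld && systemctl disable firewalld
Installing iptables and clearing its rules
yum -y install iptables-services && systemctl start iptables && systemctl enable iptables && iptables -F && service iptables save
Disabling the swap partition
If swap is left on, pod containers may end up running in swap (virtual memory), which hurts performance; by default the kubelet also refuses to start while swap is enabled
swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab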
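A sketch of the fstab edit as a small function that can be tried on a copy before touching the real file. It assumes GNU sed; the function name disable_swap_entries and the sample file contents are made up for illustration.

```shell
#!/bin/bash
# Comment out every active swap entry in an fstab-style file (GNU sed assumed).
# Lines that are already commented are skipped, so repeated runs are harmless.
disable_swap_entries() {
  local fstab="$1"
  sed -i -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}

# Demo on a made-up sample file rather than the real /etc/fstab.
sample_fstab="$(mktemp)"
cat > "$sample_fstab" <<'EOF'
/dev/mapper/cl-root / xfs defaults 0 0
/dev/mapper/cl-swap swap swap defaults 0 0
EOF
disable_swap_entries "$sample_fstab"
cat "$sample_fstab"
```

Skipping lines that already start with # is a small design choice: it keeps the edit from stacking extra # characters if the setup is replayed.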
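Disabling SELinux
setenforce 0 && sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
Tuning kernel parameters for K8s
Write the configuration file
cat > kubernetes.conf <<EOF
# Pass bridged IPv4/IPv6 traffic to iptables/ip6tables (needs the br_netfilter module)
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
# tcp_tw_recycle was removed in Linux 4.12; on newer kernels sysctl reports this key as missing
net.ipv4.tcp_tw_recycle=0
# Avoid using swap; only allow it when the system would otherwise run out of memory
vm.swappiness=0
# Do not check whether enough physical memory is available
vm.overcommit_memory=1
# Do not panic on OOM; let the OOM killer handle it
vm.panic_on_oom=0
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
# Disable the IPv6 protocol
net.ipv6.conf.all.disable_ipv6=1
net.netfilter.nf_conntrack_max=2310720
EOF
Applying the configuration file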
cp kubernetes.conf /etc/sysctl.d/kubernetes.conf
sysctl -p /etc/sysctl.d/kubernetes.conf
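If sysctl -p complains about some keys, it helps to see which ones the running kernel does not expose at all. A rough sketch follows; the function name missing_sysctl_keys is hypothetical, and the optional second argument exists only so the check can be tried against a fake directory tree instead of /proc/sys.

```shell
#!/bin/bash
# Print every key from a sysctl-style file that has no entry under the given
# /proc/sys-like root. Comment lines and blank lines are skipped.
missing_sysctl_keys() {
  local conf="$1" root="${2:-/proc/sys}" line key
  while IFS= read -r line; do
    case "$line" in
      ''|'#'*|';'*) continue ;;
    esac
    key="${line%%=*}"              # text before the first '='
    key="${key//[[:space:]]/}"     # drop stray whitespace
    # net.ipv4.ip_forward maps to <root>/net/ipv4/ip_forward
    [ -e "$root/${key//.//}" ] || echo "$key"
  done < "$conf"
}

# Demo against a fake tree that only knows net.ipv4.ip_forward.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/net/ipv4"
touch "$demo_root/net/ipv4/ip_forward"
demo_conf="$(mktemp)"
printf '%s\n' '# demo file' 'net.ipv4.ip_forward=1' 'net.ipv4.tcp_tw_recycle=0' > "$demo_conf"
missing_sysctl_keys "$demo_conf" "$demo_root"
```

On a real node, calling missing_sysctl_keys /etc/sysctl.d/kubernetes.conf with no second argument checks against /proc/sys. Keys under net.bridge only appear after the br_netfilter module has been loaded.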
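Adjusting the system time zone (skip this if the time zone is already correct)
# Set the system time zone to Asia/Shanghai
timedatectl set-timezone Asia/Shanghai
# Write the current UTC time to the hardware clock
timedatectl set-local-rtc 0
# Restart the services that depend on the system time
systemctl restart rsyslog
systemctl restart crond
Stopping services the system does not need (if present)
systemctl stop postfix && systemctl disable postfix
Configuring the logging system
Use systemd journald as the logging system instead of rsyslogd
Create the log directories
# Directory for persistent log storage
mkdir -p /var/log/journal
mkdir -p /etc/systemd/journald.conf.d
Write the configuration file
cat > /etc/systemd/journald.conf.d/99-prophet.conf <<EOF
[Journal]
# Persist logs to disk
Storage=persistent
# Compress archived logs
Compress=yes
SyncIntervalSec=5m
RateLimitInterval=30s
RateLimitBurst=1000
# Use at most 10G of disk space
SystemMaxUse=10G
# Limit a single log file to 200M
SystemMaxFileSize=200M
# Keep logs for 2 weeks
MaxRetentionSec=2week
# Do not forward logs to syslog
ForwardToSyslog=no
EOF
Restart the logging system
systemctl restart systemd-journald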
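Prerequisites for running kube-proxy in ipvs mode
# Load the br_netfilter module
modprobe br_netfilter
# Write the module-loading script
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
# On newer kernels nf_conntrack_ipv4 has been merged into nf_conntrack; if this modprobe fails, use nf_conntrack here and in the grep below
modprobe -- nf_conntrack_ipv4
EOF
# Make it executable, run it, and check that the modules are loaded
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4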
Installing Docker
# Install dependencies
yum install -y yum-utils device-mapper-persistent-data lvm2
# Add the Aliyun mirror repository
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Install containerd.io
dnf install https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.2.6-3.3.el7.x86_64.rpm
# Install docker
yum update -y && yum install -y docker-ce
# Check the docker version (to confirm the installation worked)
docker --version
# Create the /etc/docker directory
mkdir -p /etc/docker
# Configure daemon.json
# For registry-mirrors: in the Alibaba Cloud console choose "Container Registry", then "Image Accelerator" in the sidebar, and look up your accelerator address
cat > /etc/docker/daemon.json <<EOF
{
  "registry-mirrors": ["https:xxxxx"],
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  }
}
EOF
# Create the directory
mkdir -p /etc/systemd/system/docker.service.d
# Restart docker
systemctl daemon-reload && systemctl restart docker && systemctl enable docker
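Installing kubeadm (master and worker setup)
Installing kubeadm (on all three servers)
# Configure the Aliyun repository (the two gpgkey URLs go on one line, separated by a space)
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Install kubelet, kubeadm and kubectl
yum install -y kubelet kubeadm kubectl
systemctl enable --now kubelet
Downloading the required images (on all three servers)
Normally you could run init right away; init also downloads the images of some required components. These images are pulled from k8s.gcr.io, which cannot be reached directly from mainland China, so they have to be fetched some other way beforehand. A good option is to download them from another registry.
# List the images that need to be downloaded
kubeadm config images list
# Output: these are all required K8s components, but because the registry is blocked they cannot be fetched directly with docker pull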
k8s.gcr.io/kube-apiserver:v1.18.6
k8s.gcr.io/kube-controller-manager:v1.18.6
k8s.gcr.io/kube-scheduler:v1.18.6
k8s.gcr.io/kube-proxy:v1.18.6
k8s.gcr.io/pause:3.2
k8s.gcr.io/etcd:3.4.3-0
k8s.gcr.io/coredns:1.6.7
# Pulling directly fails with a timeout
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-apiserver:v1.18.5: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
After some searching on Baidu, I found that the second method in this blog post worked for me, so I borrow it here
https://blog.csdn.net/weixin_43168190/article/details/107227626
The idea is to download the images from the gotok8s repository first and then re-tag them under the expected names. The blogger's script automates the whole process.
# Write the pull script
vim pull_k8s_images.sh
# Contents
#!/bin/bash
set -o errexit
set -o nounset
set -o pipefail
## Versions to download
KUBE_VERSION=v1.18.6
KUBE_PAUSE_VERSION=3.2
ETCD_VERSION=3.4.3-0
DNS_VERSION=1.6.7
## The original (blocked) registry
GCR_URL=k8s.gcr.io
## The repository to pull from; gotok8s can be left unchanged
DOCKERHUB_URL=gotok8s
## Image list
images=(
kube-proxy:${KUBE_VERSION}
kube-scheduler:${KUBE_VERSION}
kube-controller-manager:${KUBE_VERSION}
kube-apiserver:${KUBE_VERSION}
pause:${KUBE_PAUSE_VERSION}
etcd:${ETCD_VERSION}
coredns:${DNS_VERSION}
)
## Pull-and-rename loop: download, re-tag to produce the required image name, then delete the downloaded tag
for imageName in "${images[@]}" ; do
  docker pull "$DOCKERHUB_URL/$imageName"
  docker tag "$DOCKERHUB_URL/$imageName" "$GCR_URL/$imageName"
  docker rmi "$DOCKERHUB_URL/$imageName"
done
# Make it executable
chmod +x ./pull_k8s_images.sh
# Run the script
./pull_k8s_images.sh
# Check the result
[root@k8s-master01-225 ~]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
k8s.gcr.io/kube-proxy v1.18.6 c3d62d6fe412 2 weeks ago 117MB
k8s.gcr.io/kube-controller-manager v1.18.6 ffce5e64d915 2 weeks ago 162MB
k8s.gcr.io/kube-apiserver v1.18.6 56acd67ea15a 2 weeks ago 173MB
k8s.gcr.io/kube-scheduler v1.18.6 0e0972b2b5d1 2 weeks ago 95.3MB
k8s.gcr.io/pause 3.2 80d28bedfe5d 5 months ago 683kB
k8s.gcr.io/coredns 1.6.7 67da37a9a360 6 months ago 43.8MB
gotok8s/kube-controller-manager v1.17.0 5eb3b7486872 7 months ago 161MB
k8s.gcr.io/etcd 3.4.3-0 303ce5db0e90 9 months ago 288MB
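Instead of hard-coding the versions, the image names could be derived from the kubeadm config images list output shown earlier. The following is only a dry-run sketch: it prints the docker commands rather than running them, and the function name mirror_commands is made up. It assumes the mirror repository (gotok8s, as in the script) carries every image under the same name and tag.

```shell
#!/bin/bash
# Read k8s.gcr.io image names on stdin and print the docker commands that would
# fetch each one from a mirror repository and re-tag it. Nothing is executed.
mirror_commands() {
  local mirror="${1:-gotok8s}" image name
  while IFS= read -r image; do
    [ -n "$image" ] || continue
    name="${image#k8s.gcr.io/}"    # e.g. kube-proxy:v1.18.6
    echo "docker pull $mirror/$name"
    echo "docker tag $mirror/$name $image"
    echo "docker rmi $mirror/$name"
  done
}

# Demo with two names from the list; on a real node the input could be the
# output of: kubeadm config images list
printf '%s\n' k8s.gcr.io/kube-proxy:v1.18.6 k8s.gcr.io/pause:3.2 | mirror_commands
```

Once the printed commands look right, piping them into bash would run them.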
Initializing the master node (only the master server needs this)
Generate the init configuration file
kubeadm config print init-defaults > kubeadm-config.yaml
Edit the init configuration file
# Edit the file
vim kubeadm-config.yaml
# The changes are marked below
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.19.199.225 # 1. Change the IP address; the private IP is fine
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-master01-225
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.18.6 # 2. Change the version to match the one used earlier; kubeadm version also shows it
networking:
  dnsDomain: cluster.local
  podSubnet: "10.244.0.0/16" # 3. Add the pod subnet; use exactly this range
  serviceSubnet: 10.96.0.0/12
scheduler: {}
# 4. Add the settings below as they are
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates:
  SupportIPVSProxyMode: true
mode: ipvs
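Changes 1 to 3 can also be scripted, which avoids typos such as stray spaces. This is a sketch under assumptions: patch_kubeadm_config is a hypothetical helper, it relies on the key names appearing as in the generated defaults, and the appended kube-proxy section (change 4) is not handled.

```shell
#!/bin/bash
# Set advertiseAddress and kubernetesVersion in a kubeadm config file and add
# podSubnet after dnsDomain if it is missing. Arguments: file, IP, version.
patch_kubeadm_config() {
  local file="$1" ip="$2" version="$3" tmp
  tmp="$(mktemp)"
  sed -E \
    -e "s/^([[:space:]]*advertiseAddress:).*/\1 $ip/" \
    -e "s/^(kubernetesVersion:).*/\1 $version/" \
    "$file" |
  awk -v has_subnet="$(grep -c 'podSubnet:' "$file")" '
    { print }
    # Insert podSubnet right after dnsDomain unless the file already has one.
    /^ *dnsDomain:/ && has_subnet == 0 {
      print "  podSubnet: \"10.244.0.0/16\""
    }' > "$tmp"
  mv "$tmp" "$file"
}

# Demo on a cut-down sample file.
sample_cfg="$(mktemp)"
cat > "$sample_cfg" <<'EOF'
localAPIEndpoint:
  advertiseAddress: 1.2.3.4
kubernetesVersion: v1.18.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
EOF
patch_kubeadm_config "$sample_cfg" 172.19.199.225 v1.18.6
cat "$sample_cfg"
```

Working on a temporary copy and moving it into place at the end means a failed run does not leave a half-edited file behind.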
Running the init command
kubeadm init --config=kubeadm-config.yaml | tee kubeadm-init.log
# Output of a successful run
....
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.19.199.225:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:873f80617875dc39a23eced3464c7069689236d460b60692586e7898bf8a254a
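If init fails
Troubleshoot based on the error message. Most often the cause is a mistake in the configuration file kubeadm-config.yaml, such as a mismatched version number, an IP address that was not changed, or stray spaces.
After fixing it, running the init command again directly may still fail with errors such as ports already in use or files that already exist: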
[root@k8s-node01-228 ~]# kubeadm init --config=kubeadm-config.yaml | tee kubeadm-init.log
W0801 18:35:22.768809 44882 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.6
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING FileExisting-tc]: tc not found in system path
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-10259]: Port 10259 is in use
[ERROR Port-10257]: Port 10257 is in use
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[ERROR Port-10250]: Port 10250 is in use
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
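The likely cause is that an earlier init got partway through and then failed without rolling back. In that case, run kubeadm reset first to restore the state from before init: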
[root@k8s-node01-228 ~]# kubeadm reset
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
W0801 18:57:02.630170 52554 reset.go:99] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: Get https://172.19.188.226:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s: context deadline exceeded
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0801 18:57:07.534409 52554 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/etcd /var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
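After the reset, run the init command above again, and repeat until init succeeds.
After init succeeds
Look at the end of the output, or at the log file kubeadm-init.log; it says the following steps are required:
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
Check the current nodes; the status is NotReady: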
[root@k8s-master01-225 ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master01-225 NotReady master 40m v1.18.6
Deploying the flannel network (on the master server)
You can tidy up the current directory first
# Create directories for organizing the installation files
[root@k8s-master01-225 ~]# mkdir -p install-k8s/core
# Move the main files into that directory
[root@k8s-master01-225 ~]# mv kubeadm-init.log kubeadm-config.yaml install-k8s/core
# Create the flannel directory
[root@k8s-master01-225 ~]# cd install-k8s
[root@k8s-master01-225 install-k8s]# mkdir plugin
[root@k8s-master01-225 install-k8s]# cd plugin/
[root@k8s-master01-225 plugin]# mkdir flannel
[root@k8s-master01-225 plugin]# cd flannel/
# Download the kube-flannel.yml file
[root@k8s-master01-225 flannel]# wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# Output of the download command
--2020-08-01 19:23:44-- https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.108.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14366 (14K) [text/plain]
Saving to: ‘kube-flannel.yml’
kube-flannel.yml 100%[================================================>] 14.03K --.-KB/s in 0.05s
2020-08-01 19:23:44 (286 KB/s) - ‘kube-flannel.yml’ saved [14366/14366]
# Create flannel
[root@k8s-master01-225 flannel]# kubectl create -f kube-flannel.yml
# Output of the create command
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds-amd64 created
daemonset.apps/kube-flannel-ds-arm64 created
daemonset.apps/kube-flannel-ds-arm created
daemonset.apps/kube-flannel-ds-ppc64le created
daemonset.apps/kube-flannel-ds-s390x created
# List the pods: the flannel component is now running. By default, system components are installed in the kube-system namespace
[root@k8s-master01-225 flannel]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-66bff467f8-tlqdw 1/1 Running 0 18m
coredns-66bff467f8-zpg4q 1/1 Running 0 18m
etcd-k8s-master01-225 1/1 Running 0 18m
kube-apiserver-k8s-master01-225 1/1 Running 0 18m
kube-controller-manager-k8s-master01-225 1/1 Running 0 18m
kube-flannel-ds-amd64-5hpff 1/1 Running 0 32s
kube-proxy-xh6wh 1/1 Running 0 18m
kube-scheduler-k8s-master01-225 1/1 Running 0 18m
# Check the node again: the status has changed to Ready
[root@k8s-master01-225 flannel]# kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master01-225 Ready master 19m v1.18.6
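Joining the worker nodes to the master (run on the worker servers)
The output log of the init command on the master also contains the join command for worker nodes. Run it on both worker servers:
kubeadm join 172.19.199.225:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:23816230102e09bf09766f14896828f7b377d0b3aa44e619342cbdf47ccd37b5
After a short wait, the join succeeds with output like this: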
W0801 19:27:06.500319 12557 join.go:346] [preflight] WARNING: JoinControlPane.controlPlane settings will be ignored when control-plane flag is not set.
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING FileExisting-tc]: tc not found in system path
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
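The bootstrap token in the configuration above has a ttl of 24h, so the join command from the init log stops working after a day. On the master, kubeadm token create --print-join-command prints a fresh one. The sha256 value for --discovery-token-ca-cert-hash can also be recomputed from the cluster CA certificate; the sketch below wraps the usual openssl pipeline in a function whose name, ca_cert_hash, is made up here. It assumes an RSA CA key, which is what kubeadm generated in this setup.

```shell
#!/bin/bash
# Print the SHA-256 digest of the DER-encoded public key of a CA certificate,
# i.e. the value that goes after "sha256:" in the kubeadm join command.
# Defaults to the kubeadm CA path on a control-plane node.
ca_cert_hash() {
  local ca_cert="${1:-/etc/kubernetes/pki/ca.crt}"
  openssl x509 -pubkey -noout -in "$ca_cert" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //'
}

# Usage on the master (not run here):
#   echo "sha256:$(ca_cert_hash)"
```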
On the master server, check that the worker nodes are in the Ready state
[root@k8s-master01-225 flannel]# kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master01-225 Ready master 20m v1.18.6
k8s-node01-228 Ready <none> 34s v1.18.6
k8s-node02-229 Ready <none> 29s v1.18.6
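When scripting the setup, the same check can be done without reading the table by eye. A small sketch that parses the kubectl get node output from stdin; the function name all_nodes_ready is hypothetical, and it assumes the default column layout with STATUS in the second column.

```shell
#!/bin/bash
# Read `kubectl get node` output on stdin; succeed only if at least one node is
# listed and the STATUS column of every node is exactly "Ready".
all_nodes_ready() {
  awk 'NR > 1 { total++; if ($2 == "Ready") ready++ }
       END { exit !(total > 0 && ready == total) }'
}

# Demo with the output shown above; on the master the input would come from
#   kubectl get node | all_nodes_ready
all_nodes_ready <<'EOF' && echo "all nodes are Ready"
NAME               STATUS   ROLES    AGE   VERSION
k8s-master01-225   Ready    master   20m   v1.18.6
k8s-node01-228     Ready    <none>   34s   v1.18.6
k8s-node02-229     Ready    <none>   29s   v1.18.6
EOF
```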
However, running kubectl get node on a worker server reports an error:
(root@k8s-node02-229:~)# kubectl get node
The connection to the server localhost:8080 was refused - did you specify the right host or port?
After some searching, it turned out that following the steps from the successful-install log is enough
# Create the .kube directory on each worker node
(root@k8s-node02-229:~)# mkdir -p $HOME/.kube
# On the master, copy admin.conf to each worker node
scp /etc/kubernetes/admin.conf root@k8s-node01-228:$HOME/.kube/config
scp /etc/kubernetes/admin.conf root@k8s-node02-229:$HOME/.kube/config
# Fix the ownership
(root@k8s-node02-229:~)# chown $(id -u):$(id -g) $HOME/.kube/config
# Finally run the test; the error is gone
(root@k8s-node02-229:~)# kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master01-225 Ready master 37h v1.18.6
k8s-node01-228 Ready <none> 36h v1.18.6
k8s-node02-229 Ready <none> 36h v1.18.6
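Fixing pod IPs that cannot be pinged
With the cluster installed, start a pod
# Start a pod named nginx-offi; the container inside runs the Nginx image pulled from the official registry
(root@k8s-master01-225:~)# kubectl run nginx-offi --image=nginx
pod/nginx-offi created
# Show the pod details: the status is "Running", the IP is "10.244.1.7", and it runs on node "k8s-node01-228"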
(root@k8s-master01-225:~)# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-offi 1/1 Running 0 55s 10.244.1.7 k8s-node01-228 <none> <none>
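However, accessing this pod from the master k8s-master01-225 or from the other worker k8s-node02-229 fails, and pinging its IP 10.244.1.7 fails as well, even though flannel is already installed.
Searching suggested that the iptables rules are the problem. We cleared the iptables rules during the initial server setup, but, whether because of installing flannel or some other step, new rules have appeared in iptables
# Show the iptables rules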
(root@k8s-master01-225:~)# iptables -L -n
Chain INPUT (policy ACCEPT)
target prot opt source destination
KUBE-FIREWALL all -- 0.0.0.0/0 0.0.0.0/0
Chain FORWARD (policy ACCEPT)
target prot opt source destination
KUBE-FORWARD all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes forwarding rules */
DOCKER-USER all -- 0.0.0.0/0 0.0.0.0/0
DOCKER-ISOLATION-STAGE-1 all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 ctstate RELATED,ESTABLISHED
DOCKER all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
KUBE-FIREWALL all -- 0.0.0.0/0 0.0.0.0/0
Chain DOCKER (1 references)
target prot opt source destination
Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target prot opt source destination
DOCKER-ISOLATION-STAGE-2 all -- 0.0.0.0/0 0.0.0.0/0
RETURN all -- 0.0.0.0/0 0.0.0.0/0
Chain DOCKER-ISOLATION-STAGE-2 (1 references)
target prot opt source destination
DROP all -- 0.0.0.0/0 0.0.0.0/0
RETURN all -- 0.0.0.0/0 0.0.0.0/0
Chain DOCKER-USER (1 references)
target prot opt source destination
RETURN all -- 0.0.0.0/0 0.0.0.0/0
Chain KUBE-FIREWALL (2 references)
target prot opt source destination
DROP all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000
DROP all -- !127.0.0.0/8 127.0.0.0/8 /* block incoming localnet connections */ ! ctstate RELATED,ESTABLISHED,DNAT
Chain KUBE-KUBELET-CANARY (0 references)
target prot opt source destination
Chain KUBE-FORWARD (1 references)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes forwarding rules */ mark match 0x4000/0x4000
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED
# Warning: iptables-legacy tables present, use iptables-legacy to see them
We need to flush the iptables rules again
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat
Check iptables again
(root@k8s-master01-225:~)# iptables -L -n
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
KUBE-FORWARD all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes forwarding rules */
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Chain KUBE-FORWARD (1 references)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes forwarding rules */ mark match 0x4000/0x4000
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED
# Warning: iptables-legacy tables present, use iptables-legacy to see them
Pinging or accessing the pod now succeeds
(root@k8s-master01-225:~)# curl 10.244.1.7
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
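Installing the private registry Harbor
Harbor is an enterprise-class registry server for storing and distributing Docker images; it can be used to build an in-house Docker image registry.
Harbor extends Docker Registry with enterprise features, which has made it widely used. The added features include:
- Management UI
- Role-based access control
- AD/LDAP integration
- Audit logs, and more
Compared with the plain Docker registry, it makes enterprise-scale container images easier to manage, and transfers are very efficient when it is deployed on the internal network
Prerequisites
- Python 2.7 or later
- Docker engine 1.10 or later
- Docker Compose 1.6.0 or later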
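These minimum versions can be checked in a script rather than by eye. A minimal sketch, assuming sort -V from GNU coreutils is available; the function name version_ge is hypothetical, and the demo uses literal version strings instead of querying the installed tools.

```shell
#!/bin/bash
# Succeed when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
  [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Demo with the versions that show up later in this post.
version_ge 19.03.12 1.10 && echo "Docker engine is new enough"
version_ge 1.26.2 1.6.0 && echo "Docker Compose is new enough"
```

A plain string comparison would get 1.26.2 versus 1.6.0 wrong, which is why version sort is used.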
Installing Docker Compose
Official installation guide: https://docs.docker.com/compose/install/
Download the release binary to /usr/local/bin/docker-compose
sudo curl -L "https://github.com/docker/compose/releases/download/1.26.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
Make it executable
sudo chmod +x /usr/local/bin/docker-compose
Create a symlink
sudo ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
Check the installation
# docker-compose --version
docker-compose version 1.26.2, build eefe0d31
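Downloading Harbor
Official download page: https://github.com/vmware/harbor/releases
- Pick the latest release: v1.10.4
- Download the offline installer of 600+ MB (this simplifies the installation later): harbor-offline-installer-v1.10.4.tgz
wget https://github.com/goharbor/harbor/releases/download/v1.10.4/harbor-offline-installer-v1.10.4.tgz
Extract it to a directory of your choice; here it goes under /usr/local
tar xvf harbor-offline-installer-v1.10.4.tgz -C /usr/local/
# Rename it and create a symlink (recommended; a common way to make later upgrades easier)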
cd /usr/local/
(root@Aliyun-Alex:/usr/local)# mv harbor/ harbor-v1.10.4
(root@Aliyun-Alex:/usr/local)# ln -s /usr/local/harbor-v1.10.4/ /usr/local/harbor
(root@Aliyun-Alex:/usr/local)# cd harbor
(root@Aliyun-Alex:/usr/local/harbor)# ls
common.sh harbor.v1.10.4.tar.gz harbor.yml install.sh LICENSE prepare
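Editing the installation configuration file harbor.yml
# vim harbor.yml
# 1. Set the hostname; it can be an IP or a domain name and is used to reach the admin UI and the registry service
# Here I just use an arbitrary domain name, alex.gcx.com, and add an entry to the hosts file of my local Windows 10 machine that maps alex.gcx.com to the Alibaba Cloud public IP
# The hosts file effectively acts as DNS: when the domain name is entered in the browser, it is resolved to the corresponding IP address
hostname: alex.gcx.com
# 2. Harbor can be accessed over http or https. Earlier versions defaulted to http; the current one defaults to https
# http: as the upstream comment below says, if https is enabled, requests to the http port are redirected to the https port
# Change the original port 80 to a custom port, 8002 here, because port 80 is usually taken by Nginx. Check first whether the port is in use: netstat -anp |grep 8002
# http related config
http:
  # port for http, default is 80. If https enabled, this port will redirect to https port
  port: 8002
# https: if you do not want https, comment out the settings below. I tried both ways; the tedious part of https is that a certificate has to be created
# Once the certificate exists, configure it below; the steps for creating an https certificate are described later
# https related config
# https:
#   https port for harbor, default is 443
#   port: 443
#   The path of cert and key files for nginx
#   certificate: /data/cert/server.crt
#   private_key: /data/cert/server.key
# 3. (Optional) Login password of the admin user for the Harbor admin UI
harbor_admin_password: your_password
# 4. (Optional) Change the data volume directory and the log directory (in harbor.yml the location key sits under log: local:)
data_volume: /data/harbor
location: /data/harbor/logs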
Creating an https certificate (optional)
Create the key: use the openssl tool to generate an RSA private key
(root@Aliyun-Alex:~)# openssl genrsa -des3 -out server.key 2048
# Enter a passphrase of your choice twice
Generating RSA private key, 2048 bit long modulus (2 primes)
...........+++++
...........................+++++
e is 65537 (0x010001)
Enter pass phrase for server.key:
Verifying - Enter pass phrase for server.key:
(root@Aliyun-Alex:~)# ls
server.key
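Generate a CSR (certificate signing request). The values entered can be arbitrary, since this is just a made-up certificate. For a real certificate, the CSR is sent to a certificate authority (CA), which verifies the requester's identity and then issues a signed certificate, for a fee.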
(root@Aliyun-Alex:~)# openssl req -new -key server.key -out server.csr
Enter pass phrase for server.key:
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [XX]:CN #
State or Province Name (full name) []:SH
Locality Name (eg, city) [Default City]:SH
Organization Name (eg, company) [Default Company Ltd]:
Organizational Unit Name (eg, section) []:
Common Name (eg, your name or your server's hostname) []:alex.gcx.com
Email Address []:111@163.com
Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
(root@Aliyun-Alex:~)# ls
3000 dump.rdb server.csr server.key
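Remove the passphrase from the key. If it is not removed, the application prompts for the passphrase when it loads the key, which gets in the way of automated deployment.
# Back up the key
(root@Aliyun-Alex:~)# cp server.key server.key.back
# Remove the passphrase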
(root@Aliyun-Alex:~)# openssl rsa -in server.key -out server.key
Enter pass phrase for server.key:
writing RSA key
Generate the self-signed certificate
(root@Aliyun-Alex:~)# openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt
Signature ok
subject=C = CN, ST = SH, L = SH, O = Default Company Ltd, CN = alex, emailAddress = 111@163.com
Getting Private key
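Generate the certificate in PEM format (optional). Some services need a PEM-format certificate to load properly; the following command produces one: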
openssl x509 -in server.crt -out server.pem -outform PEM
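For automated setups, the interactive steps above (passphrase-protected key, CSR prompts, passphrase removal, signing) can be collapsed into one non-interactive openssl call. This is a sketch rather than the procedure used in this post: the helper name make_self_signed_cert is made up, the subject fields simply mirror the answers typed above, and the key is written unencrypted from the start.

```shell
#!/bin/bash
# Create an unencrypted 2048-bit RSA key and a self-signed certificate valid for
# 365 days in one non-interactive step. Arguments: output directory, common name.
make_self_signed_cert() {
  local dir="$1" cn="$2"
  openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout "$dir/server.key" -out "$dir/server.crt" \
    -days 365 -subj "/C=CN/ST=SH/L=SH/CN=$cn" 2>/dev/null
}

# Demo in a temporary directory; print the subject of the resulting certificate.
cert_dir="$(mktemp -d)"
make_self_signed_cert "$cert_dir" alex.gcx.com
openssl x509 -in "$cert_dir/server.crt" -noout -subject
```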
Creating the certificate directory
# Create the directory
(root@Aliyun-Alex:~)# mkdir -p /data/cert
# Move the certificate files into the certificate directory
(root@Aliyun-Alex:~)# mv server.* /data/cert/
(root@Aliyun-Alex:~)# cd /data/cert/
(root@Aliyun-Alex:/data/cert)# ls
server.crt server.csr server.key server.key.back
# Set permissions
chmod -R 777 /data/cert
Update the certificate paths in harbor.yml
# vim /usr/local/harbor-v1.10.4/harbor.yml
certificate: /data/cert/server.crt
private_key: /data/cert/server.key
Running the install script to install Harbor
(root@Aliyun-Alex:~)# sh /usr/local/harbor/install.sh
[Step 0]: checking if docker is installed ...
Note: docker version: 19.03.12
[Step 1]: checking docker-compose is installed ...
Note: docker-compose version: 1.26.2
[Step 2]: loading Harbor images ...
...
[Step 5]: starting Harbor ...
Creating network "harbor-v1104_harbor" with the default driver
Creating harbor-log ... done
Creating registry ... done
Creating harbor-portal ... done
Creating redis ... done
Creating registryctl ... done
Creating harbor-db ... done
Creating harbor-core ... done
Creating nginx ... done
Creating harbor-jobservice ... done
✔ ----Harbor has been installed and started successfully.----
Open the site in a browser to view Harbor's admin page
Log in to Harbor from the terminal
(root@Aliyun-Alex:/usr/local/harbor)# docker login alex.gcx.com
Username: admin
Password:
Error response from daemon: Get https://alex.gcx.com/v2/: x509: certificate signed by unknown authority
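Login fails. As above, the request was redirected to the https address, which requires certificate verification. Our certificate is self-made, so the Docker client considers it untrusted and reports an error. To fix this, edit Docker's configuration file /etc/docker/daemon.json
vim /etc/docker/daemon.json
# Add the following entry (the double quotes may not show when rendered)
# It tells the Docker client that this domain may be accessed
"insecure-registries": ["https://alex.gcx.com"]
# Restart docker
systemctl restart docker
# Logging in again now succeeds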
(root@Aliyun-Alex:/usr/local)# docker login alex.gcx.com
Username: admin
Password:
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
Changes needed on other servers to access Harbor
# 1. Add a hosts entry
echo "172.19.67.12 alex.gcx.com" >> /etc/hosts
# 2. Add to /etc/docker/daemon.json
"insecure-registries": ["https://alex.gcx.com"]
# 3. Restart docker
systemctl restart docker
Operations: stopping and starting Harbor
To change Harbor's configuration, for example to enable https as done here, the steps are
# Enter the harbor directory
(root@Aliyun-Alex:~)# cd /usr/local/harbor
(root@Aliyun-Alex:/usr/local/harbor)# ls
common common.sh docker-compose.yml harbor.v1.10.4.tar.gz harbor.yml install.sh LICENSE prepare
# Stop the Harbor services (docker-compose)
(root@Aliyun-Alex:/usr/local/harbor)# docker-compose down -v
Stopping harbor-jobservice ... done
Stopping nginx ... done
Stopping harbor-core ... done
Stopping harbor-portal ... done
Stopping harbor-db ... done
Stopping redis ... done
Stopping registryctl ... done
Stopping registry ... done
Stopping harbor-log ... done
Removing harbor-jobservice ... done
Removing nginx ... done
Removing harbor-core ... done
Removing harbor-portal ... done
Removing harbor-db ... done
Removing redis ... done
Removing registryctl ... done
Removing registry ... done
Removing harbor-log ... done
Removing network harbor-v1104_harbor
# Edit harbor.yml and change the https settings
(root@Aliyun-Alex:/usr/local/harbor)# vim harbor.yml
# https related config
https:
  # https port for harbor, default is 443
  port: 443
  # The path of cert and key files for nginx
  certificate: /data/cert/server.crt
  private_key: /data/cert/server.key
# Run the pre-start preparation
(root@Aliyun-Alex:/usr/local/harbor)# ./prepare
prepare base dir is set to /usr/local/harbor-v1.10.4
Clearing the configuration file: /config/log/logrotate.conf
Clearing the configuration file: /config/log/rsyslog_docker.conf
Clearing the configuration file: /config/nginx/nginx.conf
Clearing the configuration file: /config/core/env
Clearing the configuration file: /config/core/app.conf
Clearing the configuration file: /config/registry/config.yml
Clearing the configuration file: /config/registry/root.crt
Clearing the configuration file: /config/registryctl/env
Clearing the configuration file: /config/registryctl/config.yml
Clearing the configuration file: /config/db/env
Clearing the configuration file: /config/jobservice/env
Clearing the configuration file: /config/jobservice/config.yml
Generated configuration file: /config/log/logrotate.conf
Generated configuration file: /config/log/rsyslog_docker.conf
Generated configuration file: /config/nginx/nginx.conf
Generated configuration file: /config/core/env
Generated configuration file: /config/core/app.conf
Generated configuration file: /config/registry/config.yml
Generated configuration file: /config/registryctl/env
Generated configuration file: /config/db/env
Generated configuration file: /config/jobservice/env
Generated configuration file: /config/jobservice/config.yml
loaded secret from file: /secret/keys/secretkey
Generated configuration file: /compose_location/docker-compose.yml
Clean up the input dir
# Start the services with docker-compose
(root@Aliyun-Alex:/usr/local/harbor)# docker-compose up -d
Creating network "harbor-v1104_harbor" with the default driver
Creating harbor-log ... done
Creating redis ... done
Creating registry ... done
Creating harbor-db ... done
Creating registryctl ... done
Creating harbor-portal ... done
Creating harbor-core ... done
Creating harbor-jobservice ... done
Creating nginx ... done
Opening the http URL http://alex.gcx.com:8002 in the browser again now redirects to the https URL