This lab follows: https://github.com/gjmzj/kubeasz
Official Kubernetes releases on GitHub: https://github.com/kubernetes/kubernetes/releases
Hardware and software requirements:
①CPU and memory: master at least 1C2G, 2C4G recommended; node at least 1C2G
②Linux: kernel 3.10 or later, CentOS 7 / RHEL 7 recommended
③Docker: at least 1.9, 1.12+ recommended
④etcd: at least 2.0, 3.0+ recommended
Node planning for a high-availability cluster:
①deploy node------x1 : the node that runs this set of ansible playbooks
②etcd nodes-------x3 : note that the etcd cluster must have an odd number of members (1, 3, 5, 7, ...)
③master nodes-----x2 : add more to match the actual cluster size; an extra master VIP (virtual IP) must also be planned
④lb nodes---------x2 : two load-balancer nodes running haproxy + keepalived
⑤node nodes-------x3 : the workers that actually carry application load; raise machine specs and node count as needed
Planning for the four hosts (plus a VIP):
Host          | Hostname | Cluster roles               |
192.168.1.200 | master   | deploy, etcd1, lb1, master1 |
192.168.1.201 | master2  | lb2, master2                |
192.168.1.202 | node1    | etcd2, node1                |
192.168.1.203 | node2    | etcd3, node2                |
192.168.1.250 | (none)   | master VIP                  |
I. Preparation
1: On all four machines, install the EPEL repo and the Python package, flush iptables, and put SELinux in permissive mode. (Note: this is a lab environment, so the firewall is switched off to avoid unrelated errors; do not copy this in production.)
yum install -y epel-release
yum install -y python
iptables -F
setenforce 0
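Note that iptables -F and setenforce 0 do not survive a reboot. If you want these lab-only settings to persist, a minimal sketch (again: lab only, never in production):
# Lab only: keep the firewall off and SELinux disabled across reboots
systemctl disable --now firewalld    # stop firewalld and remove it from boot, if it is installed
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config    # takes effect after the next reboot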
[On the deploy node]
2: Install ansible
[root@master ~]# yum -y install ansible
3: Generate an SSH key pair
[root@master ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:cfoSPSgeEkAkgY08UIVWK2t2eNJIrKph5wkRkZX7AKs root@master
The key's randomart image is:
+---[RSA 2048]----+
|BOB=+            |
|oB=o .           |
| oB + . .        |
|  +.O . *        |
|o.B B o S o      |
|Eo.+ + o o .     |
|oo  . . . .      |
|o.+  . .         |
|. o              |
+----[SHA256]-----+
4: Copy the public key to all four machines
[root@master ~]# for ip in 200 201 202 203; do ssh-copy-id 192.168.1.$ip; done
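ssh-copy-id prompts for the root password once per host. If all four hosts share the same root password and you want to skip the prompts, a sketch using sshpass (yum install -y sshpass; YOURPASS is a placeholder for your actual password):
for ip in 200 201 202 203; do
  sshpass -p 'YOURPASS' ssh-copy-id -o StrictHostKeyChecking=no root@192.168.1.$ip
done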
5: Verify that passwordless login works
[root@master ~]# ssh 192.168.1.200
Last login: Wed Dec 11 10:47:55 2019 from 192.168.1.2
[root@master ~]# exit
logout
Connection to 192.168.1.200 closed.
[root@master ~]# ssh 192.168.1.201
Last login: Wed Dec 11 10:48:00 2019 from 192.168.1.2
[root@master2 ~]# exit
logout
Connection to 192.168.1.201 closed.
[root@master ~]# ssh 192.168.1.202
Last login: Wed Dec 11 11:13:53 2019 from 192.168.1.200
[root@node1 ~]# exit
logout
Connection to 192.168.1.202 closed.
[root@master ~]# ssh 192.168.1.203
Last login: Wed Dec 11 10:48:20 2019 from 192.168.1.2
[root@node2 ~]# exit
logout
Connection to 192.168.1.203 closed.
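Logging in to each host by hand works, but a loop checks all four at once; BatchMode=yes makes ssh fail instead of prompting for a password, so any host missing the key shows up immediately:
for ip in 200 201 202 203; do
  ssh -o BatchMode=yes root@192.168.1.$ip hostname
done
# expected output, one hostname per line: master master2 node1 node2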
6: Download the easzup script, then use it to fetch the kubeasz code, binaries, and offline images
Script download link: https://pan.baidu.com/s/1GLoU9ntjUL2SP4R_Do7mlQ
Extraction code: 96eg
[root@master ~]# chmod +x easzup
[root@master ~]# ./easzup -D
[root@master ~]# ls /etc/ansible/
01.prepare.yml     03.docker.yml       06.network.yml        22.upgrade.yml  90.setup.yml  bin          down       pics
02.etcd.yml        04.kube-master.yml  07.cluster-addon.yml  23.backup.yml   99.clean.yml  dockerfiles  example    README.md
03.containerd.yml  05.kube-node.yml    11.harbor.yml         24.restore.yml  ansible.cfg   docs         manifests  roles tools
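If the Baidu link is inconvenient, the easzup script is also published on the kubeasz GitHub releases page. The release number below is an assumption; check the project's releases page for the version matching the k8s v1.15 binaries used in this article:
export release=2.0.2    # assumed version; see https://github.com/easzlab/kubeasz/releases
curl -C- -fLO --retry 3 https://github.com/easzlab/kubeasz/releases/download/${release}/easzup
chmod +x easzup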
7: Configure the cluster parameters in the hosts file
[root@master ~]# cd /etc/ansible
[root@master ansible]# cp example/hosts.multi-node hosts
[root@master ansible]# vim hosts
[etcd]          ## etcd node IPs
192.168.1.200 NODE_NAME=etcd1
192.168.1.202 NODE_NAME=etcd2
192.168.1.203 NODE_NAME=etcd3

[kube-master]   ## master node IPs
192.168.1.200
192.168.1.201

[kube-node]     ## worker node IPs
192.168.1.202
192.168.1.203

[ex-lb]         ## lb node IPs and the VIP
192.168.1.200 LB_ROLE=backup EX_APISERVER_VIP=192.168.1.250 EX_APISERVER_PORT=8443
192.168.1.201 LB_ROLE=master EX_APISERVER_VIP=192.168.1.250 EX_APISERVER_PORT=8443
8: After editing hosts, test connectivity
[root@master ansible]# ansible all -m ping
[DEPRECATION WARNING]: The TRANSFORM_INVALID_GROUP_CHARS settings is set to allow bad characters in group names by default, this will change, but still be user
configurable on deprecation. This feature will be removed in version 2.10. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
[WARNING]: Invalid characters were found in group names but not replaced, use -vvvv to see details

192.168.1.201 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
192.168.1.202 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
192.168.1.203 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
192.168.1.200 | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
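Ad-hoc ansible commands are also a quick way to verify the requirements listed at the top, for example that every host runs a 3.10+ kernel:
ansible all -a 'uname -r'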
II. Deploying the cluster
[On the deploy node] Manual installation, one playbook at a time
1: Install the CA and certificates
[root@master ansible]# ansible-playbook 01.prepare.yml
2: Install etcd
[root@master ansible]# ansible-playbook 02.etcd.yml
Check etcd health; an endpoint reporting "is healthy: successfully committed proposal" is working normally
[root@master ansible]# for ip in 200 202 203; do ETCDCTL_API=3 etcdctl --endpoints=https://192.168.1.$ip:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/etcd/ssl/etcd.pem --key=/etc/etcd/ssl/etcd-key.pem endpoint health; done
https://192.168.1.200:2379 is healthy: successfully committed proposal: took = 5.658163ms
https://192.168.1.202:2379 is healthy: successfully committed proposal: took = 6.384588ms
https://192.168.1.203:2379 is healthy: successfully committed proposal: took = 7.386942ms
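Besides endpoint health, you can list the cluster members through any single endpoint, reusing the same certificates; a sketch:
ETCDCTL_API=3 etcdctl --endpoints=https://192.168.1.200:2379 \
  --cacert=/etc/kubernetes/ssl/ca.pem \
  --cert=/etc/etcd/ssl/etcd.pem \
  --key=/etc/etcd/ssl/etcd-key.pem \
  member list    # should print three members: etcd1, etcd2, etcd3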
3: Install docker
[root@master ansible]# ansible-playbook 03.docker.yml
4: Install the master nodes
[root@master ansible]# ansible-playbook 04.kube-master.yml
Check the cluster component status
[root@master ansible]# kubectl get componentstatus
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}
etcd-2               Healthy   {"health":"true"}
etcd-1               Healthy   {"health":"true"}
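componentstatus has the short name cs, so the same check can be written as:
kubectl get cs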
5: Install the worker nodes
[root@master ansible]# ansible-playbook 05.kube-node.yml
List the nodes
[root@master ansible]# kubectl get nodes
NAME            STATUS                     ROLES    AGE     VERSION
192.168.1.200   Ready,SchedulingDisabled   master   4m45s   v1.15.0
192.168.1.201   Ready,SchedulingDisabled   master   4m45s   v1.15.0
192.168.1.202   Ready                      node     12s     v1.15.0
192.168.1.203   Ready                      node     12s     v1.15.0
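Ready,SchedulingDisabled on the two masters means kubeasz cordoned them so ordinary workloads stay off the control plane. If you ever wanted to allow scheduling on a master (not recommended for this HA setup), it would look like:
kubectl uncordon 192.168.1.200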
6: Deploy the cluster network
[root@master ansible]# ansible-playbook 06.network.yml
List the pods in the kube-system namespace; the flannel pods should be visible
[root@master ansible]# kubectl get pod -n kube-system
NAME                          READY   STATUS    RESTARTS   AGE
kube-flannel-ds-amd64-7bk5w   1/1     Running   0          61s
kube-flannel-ds-amd64-blcxx   1/1     Running   0          61s
kube-flannel-ds-amd64-c4sfx   1/1     Running   0          61s
kube-flannel-ds-amd64-f8pnz   1/1     Running   0          61s
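flannel runs as a DaemonSet, so one pod per node (four here) is expected; the desired/ready counts can be confirmed directly:
kubectl get ds -n kube-system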
7: Install the cluster add-ons
[root@master ansible]# ansible-playbook 07.cluster-addon.yml
List the services in the kube-system namespace
[root@master ansible]# kubectl get svc -n kube-system
NAME                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                       AGE
heapster                  ClusterIP   10.68.191.0     <none>        80/TCP                        13m
kube-dns                  ClusterIP   10.68.0.2       <none>        53/UDP,53/TCP,9153/TCP        15m
kubernetes-dashboard      NodePort    10.68.115.45    <none>        443:35294/TCP                 13m
metrics-server            ClusterIP   10.68.116.163   <none>        443/TCP                       15m
traefik-ingress-service   NodePort    10.68.106.241   <none>        80:23456/TCP,8080:26004/TCP   12m
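A NodePort service is reachable on any node's IP at the listed port. For example, the dashboard above maps 443 to 35294 (the port is assigned randomly and will differ on your cluster); -k skips verification of the self-signed certificate:
curl -k https://192.168.1.202:35294/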
[Automatic installation]
Runs all of the manual steps above in a single pass
[root@master ansible]# ansible-playbook 90.setup.yml
Check node/pod resource usage
[root@master ansible]# kubectl top node
NAME            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
192.168.1.200   58m          7%     960Mi           85%
192.168.1.201   34m          4%     1018Mi          91%
192.168.1.202   76m          9%     549Mi           49%
192.168.1.203   89m          11%    568Mi           50%
[root@master ansible]# kubectl top pod --all-namespaces
NAMESPACE     NAME                                          CPU(cores)   MEMORY(bytes)
kube-system   coredns-797455887b-9nscp                      5m           22Mi
kube-system   coredns-797455887b-k92wv                      5m           19Mi
kube-system   heapster-5f848f54bc-vvwzx                     1m           11Mi
kube-system   kube-flannel-ds-amd64-7bk5w                   3m           20Mi
kube-system   kube-flannel-ds-amd64-blcxx                   2m           19Mi
kube-system   kube-flannel-ds-amd64-c4sfx                   2m           18Mi
kube-system   kube-flannel-ds-amd64-f8pnz                   2m           10Mi
kube-system   kubernetes-dashboard-5c7687cf8-hnbdp          1m           22Mi
kube-system   metrics-server-85c7b8c8c4-6q4vj               1m           16Mi
kube-system   traefik-ingress-controller-766dbfdddd-98trv   4m           17Mi
View cluster info
[root@master ansible]# kubectl cluster-info
Kubernetes master is running at https://192.168.1.200:6443
CoreDNS is running at https://192.168.1.200:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
kubernetes-dashboard is running at https://192.168.1.200:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy
Metrics-server is running at https://192.168.1.200:6443/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
8: Test DNS
① Create an nginx deployment and service
[root@master ansible]# kubectl run nginx --image=nginx --expose --port=80
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
service/nginx created
deployment.apps/nginx created
② Start a busybox test pod; nginx resolves to the service's cluster IP, 10.68.243.55
[root@master ansible]# kubectl run busybox --rm -it --image=busybox /bin/sh
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
If you don't see a command prompt, try pressing enter.
/ # nslookup nginx.default.svc.cluster.local
Server:    10.68.0.2
Address:   10.68.0.2:53

Name:      nginx.default.svc.cluster.local
Address:   10.68.243.55
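While still inside the busybox shell, you can also confirm the service actually serves traffic, not just that the name resolves (busybox ships a minimal wget):
/ # wget -qO- http://nginx.default.svc.cluster.local
Exiting the shell deletes the busybox pod automatically (--rm). To clean up the test deployment and service afterwards:
kubectl delete deployment,svc nginx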
III. Adding a worker node, IP: 192.168.1.204
[On the deploy node]
1: Copy the public key to the new node
[root@master ansible]# ssh-copy-id 192.168.1.204
2: Edit the hosts file and add the new node's IP under [kube-node]
[root@master ansible]# vim hosts
[kube-node]
192.168.1.202
192.168.1.203
192.168.1.204
3: Run the add-node playbook, specifying the new node's IP
[root@master ansible]# ansible-playbook tools/02.addnode.yml -e NODE_TO_ADD=192.168.1.204
4: Verify that the node was added
[root@master ansible]# kubectl get node
NAME            STATUS                     ROLES    AGE     VERSION
192.168.1.200   Ready,SchedulingDisabled   master   9h      v1.15.0
192.168.1.201   Ready,SchedulingDisabled   master   9h      v1.15.0
192.168.1.202   Ready                      node     9h      v1.15.0
192.168.1.203   Ready                      node     9h      v1.15.0
192.168.1.204   Ready                      node     2m11s   v1.15.0
[root@master ansible]# kubectl get pod -n kube-system -o wide
NAME                                          READY   STATUS    RESTARTS   AGE   IP              NODE            NOMINATED NODE   READINESS GATES
coredns-797455887b-9nscp                      1/1     Running   0          31h   172.20.3.2      192.168.1.203   <none>           <none>
coredns-797455887b-k92wv                      1/1     Running   0          31h   172.20.2.2      192.168.1.202   <none>           <none>
heapster-5f848f54bc-vvwzx                     1/1     Running   1          31h   172.20.2.4      192.168.1.202   <none>           <none>
kube-flannel-ds-amd64-7bk5w                   1/1     Running   0          31h   192.168.1.202   192.168.1.202   <none>           <none>
kube-flannel-ds-amd64-blcxx                   1/1     Running   0          31h   192.168.1.200   192.168.1.200   <none>           <none>
kube-flannel-ds-amd64-c4sfx                   1/1     Running   0          31h   192.168.1.203   192.168.1.203   <none>           <none>
kube-flannel-ds-amd64-f8pnz                   1/1     Running   0          31h   192.168.1.201   192.168.1.201   <none>           <none>
kube-flannel-ds-amd64-vdd7n                   1/1     Running   0          21h   192.168.1.204   192.168.1.204   <none>           <none>
kubernetes-dashboard-5c7687cf8-hnbdp          1/1     Running   0          31h   172.20.3.3      192.168.1.203   <none>           <none>
metrics-server-85c7b8c8c4-6q4vj               1/1     Running   0          31h   172.20.2.3      192.168.1.202   <none>           <none>
traefik-ingress-controller-766dbfdddd-98trv   1/1     Running   0          31h   172.20.3.4      192.168.1.203   <none>           <none>
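The kube-flannel-ds-amd64-vdd7n pod on 192.168.1.204 confirms that the network DaemonSet followed the new node. To list everything scheduled on it:
kubectl get pod --all-namespaces -o wide | grep 192.168.1.204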