Hardware environment:
Three virtual machines:
192.168.99.129 master (kube-apiserver, kube-controller-manager, kube-proxy, kube-scheduler, kubelet, etcd, calico, docker)
192.168.99.130 slave1 (kube-proxy, kubelet, etcd proxy, calico, docker, dns)
192.168.99.131 slave2 (kube-proxy, kubelet, etcd proxy, calico, docker)
Software environment:
kubernetes 1.5.2
etcd 3.1.0
calico 0.23.1
【etcd】
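Calico requires an etcd proxy to run on every node, so etcd itself is deployed on the master host and an etcd proxy is deployed on each of the other nodes.
The etcd start command on master is shown below. (Recent etcd versions essentially use only port 2379, but some older programs that were integrated with etcd earlier still use port 4001, so I listen on both 2379 and 4001.)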
etcd --name infra1 --data-dir /var/lib/etcd --listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 --advertise-client-urls http://192.168.99.129:2379,http://192.168.99.129:4001 --listen-peer-urls http://0.0.0.0:2380 --initial-advertise-peer-urls http://192.168.99.129:2380 --initial-cluster-token etcd-cluster --initial-cluster 'infra1=http://192.168.99.129:2380' --initial-cluster-state new --enable-pprof >> /var/log/etcd.log 2>&1 &
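The etcd proxy start commands on the nodes are shown below. --initial-cluster has to list the peer URL of the real etcd member on master (192.168.99.129), not the node's own address.
On slave1 (192.168.99.130):
etcd --name infra-proxy1 --proxy=on --listen-client-urls http://0.0.0.0:2379 --initial-cluster 'infra1=http://192.168.99.129:2380' --enable-pprof >> /var/log/etcd.log 2>&1 &
On slave2 (192.168.99.131):
etcd --name infra-proxy2 --proxy=on --listen-client-urls http://0.0.0.0:2379 --initial-cluster 'infra1=http://192.168.99.129:2380' --enable-pprof >> /var/log/etcd.log 2>&1 &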
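An etcd proxy does not join the cluster; its --initial-cluster value only tells it which real members to forward requests to. As a minimal sketch of that relationship, the helper below builds the proxy command line from a map of real members. The function name proxy_command and the proxy names are illustrative assumptions, not part of etcd.

```python
def proxy_command(proxy_name, members, client_port=2379):
    """Build an etcd proxy command line as a list of arguments.

    members maps the name of each real etcd member to its peer URL.
    """
    initial_cluster = ",".join(
        f"{name}={url}" for name, url in sorted(members.items())
    )
    return [
        "etcd",
        "--name", proxy_name,
        "--proxy=on",
        "--listen-client-urls", f"http://0.0.0.0:{client_port}",
        "--initial-cluster", initial_cluster,
    ]


# The only real member in this setup is infra1 on the master host.
members = {"infra1": "http://192.168.99.129:2380"}
for proxy in ("infra-proxy1", "infra-proxy2"):  # hypothetical proxy names
    print(" ".join(proxy_command(proxy, members)))
```

Every proxy receives the same member list; only its own name differs.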
【kubernetes】
1. Add --allow_privileged=true to the start scripts of kube-apiserver and kubelet. Without it, deploying calico later fails with the following error:
The DaemonSet "calico-node" is invalid: spec.template.spec.containers[0].securityContext.privileged: Forbidden: disallowed by policy
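2. Add --network-plugin=cni and --network-plugin-dir=/etc/cni/net.d to the kubelet start script.
The start scripts for kube-apiserver and kubelet are as follows (the kubelet command is the one for 192.168.99.130; change --hostname-override on the other hosts):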
kube-apiserver --logtostderr=true --v=0 --etcd-servers=http://k8s-master:4001 --insecure-bind-address=0.0.0.0 --insecure-port=8080 --service-cluster-ip-range=10.254.0.0/16 --allow_privileged=true >> /var/log/kube-apiserver.log 2>&1 &
kubelet --logtostderr=true --v=0 --address=0.0.0.0 --api-servers=http://k8s-master:8080 --pod-infra-container-image=index.tenxcloud.com/google_containers/pause-amd64:3.0 --cluster-dns=10.254.159.10 --cluster-domain=cluster.local --hostname-override=192.168.99.130 --allow_privileged=true --network-plugin=cni --network-plugin-dir=/etc/cni/net.d >> /var/log/kubelet.log 2>&1 &
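3. Download https://github.com/containernetworking/cni/releases/download/v0.4.0/cni-v0.4.0.tgz, extract it, and copy loopback into /opt/cni/bin. If this step is skipped, pod creation fails with an error saying that loopback cannot be found.
4. Calico must be deployed on the master node as well as on every other node. If the master does not run calico, containers cannot reach the master. Because calico is deployed as a DaemonSet, starting kubelet on the master node is enough for calico to be deployed there.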
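Before creating pods it can help to confirm that the loopback plugin from step 3 is really in place. The following is a minimal sketch, assuming the plugin directory is /opt/cni/bin as above; the function name missing_cni_plugins is hypothetical.

```python
import os


def missing_cni_plugins(bin_dir="/opt/cni/bin", required=("loopback",)):
    """Return the required plugin names that are absent or not executable."""
    missing = []
    for name in required:
        path = os.path.join(bin_dir, name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            missing.append(name)
    return missing


# Run on each node; an empty list means nothing is missing.
print(missing_cni_plugins())
```

Other plugin names can be passed through the required argument once calico has installed its own binaries.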
【calico】
1. Download calico.yaml from http://docs.projectcalico.org/v2.0/getting-started/kubernetes/installation/hosted/calico.yaml
2. Change the etcd address in calico.yaml:
etcd_endpoints: "http://192.168.99.129:2379"
3. Deploy calico with the following command:
kubectl apply -f calico.yaml
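The edit in step 2 can also be scripted. This is a minimal sketch that assumes the manifest contains a single line of the form etcd_endpoints: "..." as shown above; the function name and the sample text are illustrative and are not taken from the real calico.yaml.

```python
import re


def set_etcd_endpoints(manifest_text, endpoints):
    """Replace the value of the etcd_endpoints key, keeping its indentation."""
    pattern = re.compile(r"^([ \t]*etcd_endpoints:[ \t]*).*$", re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: f'{m.group(1)}"{endpoints}"', manifest_text
    )
    if count == 0:
        raise ValueError("etcd_endpoints key not found")
    return new_text


# Hypothetical fragment standing in for the downloaded manifest.
sample = 'data:\n  etcd_endpoints: "http://127.0.0.1:2379"\n'
print(set_etcd_endpoints(sample, "http://192.168.99.129:2379"))
```

Working on the text rather than parsing the YAML leaves the rest of the multi-document manifest untouched.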
【Deploying centos and redis】
1. Deploy centos, pinned to node 192.168.99.130. centos-rcd.yaml is as follows:
apiVersion: v1
kind: ReplicationController
metadata:
  name: centos
  labels:
    name: centos
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: centos
    spec:
      containers:
      - name: centos
        image: index.tenxcloud.com/tenxcloud/docker-centos
        ports:
        - containerPort: 6379
      nodeSelector:
        kubernetes.io/hostname: "192.168.99.130"
2. Deploy redis, pinned to node 192.168.99.131. redis-rc.yaml is as follows:
apiVersion: v1
kind: ReplicationController
metadata:
  name: redis
  labels:
    k8s-app: redis
spec:
  replicas: 1
  selector:
    k8s-app: redis
  template:
    metadata:
      labels:
        k8s-app: redis
    spec:
      containers:
      - name: redis
        image: 10.10.30.166/public/redis:v1
        ports:
        - containerPort: 6379
          name: redis-tcp
          protocol: TCP
      nodeSelector:
        kubernetes.io/hostname: "192.168.99.131"
redis-svc.yaml is as follows:
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  selector:
    k8s-app: redis
  clusterIP: 10.254.159.20
  ports:
  - name: "1"
    port: 6379
    protocol: TCP
3. The resulting deployment looks like this:
[root@master redis]# kubectl get pods -o wide
NAME           READY     STATUS    RESTARTS   AGE       IP                NODE
centos-bpzkc   1/1       Running   0          23h       192.168.140.197   192.168.99.130
dns-99cqq      3/3       Running   0          1d        192.168.140.196   192.168.99.130
redis-c7wk3    1/1       Running   0          4m        192.168.140.82    192.168.99.131
[root@master redis]# kubectl get svc -o wide
NAME         CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE       SELECTOR
dns          10.254.159.10   <none>        53/UDP,53/TCP   1d        k8s-app=dns
kubernetes   10.254.0.1      <none>        443/TCP         2d        <none>
redis        10.254.159.20   <none>        6379/TCP        4m        k8s-app=redis
Routes on the master host:
[root@master redis]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.99.2    0.0.0.0         UG    100    0        0 eno16777736
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.99.0    0.0.0.0         255.255.255.0   U     100    0        0 eno16777736
192.168.122.0   0.0.0.0         255.255.255.0   U     0      0        0 virbr0
192.168.140.64  192.168.99.131  255.255.255.192 UG    0      0        0 eno16777736
192.168.140.192 192.168.99.130  255.255.255.192 UG    0      0        0 eno16777736
Routes on the slave1 host:
[root@slave1 bin]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.99.2    0.0.0.0         UG    100    0        0 eno16777736
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.99.0    0.0.0.0         255.255.255.0   U     100    0        0 eno16777736
192.168.122.0   0.0.0.0         255.255.255.0   U     0      0        0 virbr0
192.168.140.64  192.168.99.131  255.255.255.192 UG    0      0        0 eno16777736
192.168.140.192 0.0.0.0         255.255.255.192 U     0      0        0 *
192.168.140.196 0.0.0.0         255.255.255.255 UH    0      0        0 cali12b26626b64
192.168.140.197 0.0.0.0         255.255.255.255 UH    0      0        0 calic477824fb70
Routes on the slave2 host:
[root@slave2 bin]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.99.2    0.0.0.0         UG    100    0        0 eno16777736
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.99.0    0.0.0.0         255.255.255.0   U     100    0        0 eno16777736
192.168.122.0   0.0.0.0         255.255.255.0   U     0      0        0 virbr0
192.168.140.64  0.0.0.0         255.255.255.192 U     0      0        0 *
192.168.140.82  0.0.0.0         255.255.255.255 UH    0      0        0 calieb567fc0b5e
192.168.140.192 192.168.99.130  255.255.255.192 UG    0      0        0 eno16777736
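The routing tables above show one /26 block per node (genmask 255.255.255.192), with every other host routing that block to the owning node. The sketch below checks that reading against the pod IPs from kubectl get pods; it copies the two block routes from the master's table, and the function name next_hop is an illustrative assumption.

```python
import ipaddress

# Calico block routes copied from the master's routing table.
ROUTES = {
    "192.168.140.64/26": "192.168.99.131",   # block owned by slave2
    "192.168.140.192/26": "192.168.99.130",  # block owned by slave1
}


def next_hop(pod_ip):
    """Return the gateway the master's block routes pick for pod_ip, or None."""
    addr = ipaddress.ip_address(pod_ip)
    for cidr, gateway in ROUTES.items():
        if addr in ipaddress.ip_network(cidr):
            return gateway
    return None


pods = [
    ("centos", "192.168.140.197"),
    ("dns", "192.168.140.196"),
    ("redis", "192.168.140.82"),
]
for name, ip in pods:
    print(name, ip, "->", next_hop(ip))
```

Each pod resolves to the node that kubectl reports for it: centos and dns to 192.168.99.130, redis to 192.168.99.131.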
The iptables rules for redis on master, slave1 and slave2 are shown below; they are identical on all three hosts.
iptables -S -t nat | grep redis
-A KUBE-SEP-XAJWX3SXEKZG2YR7 -s 192.168.140.82/32 -m comment --comment "default/redis:1" -j KUBE-MARK-MASQ
-A KUBE-SEP-XAJWX3SXEKZG2YR7 -p tcp -m comment --comment "default/redis:1" -m tcp -j DNAT --to-destination 192.168.140.82:6379
-A KUBE-SERVICES -d 10.254.159.20/32 -p tcp -m comment --comment "default/redis:1 cluster IP" -m tcp --dport 6379 -j KUBE-SVC-XXJ2TMJIYSJJDBZG
-A KUBE-SVC-XXJ2TMJIYSJJDBZG -m comment --comment "default/redis:1" -j KUBE-SEP-XAJWX3SXEKZG2YR7
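These rules show that the redis cluster IP 10.254.159.20:6379 is DNAT-ed to 192.168.140.82:6379. I ran into a strange problem here whose cause I have not yet found: when the labels in redis-rc.yaml are k8s-app: redis, the iptables rules look like the ones above and everything works; when the labels are name: redis, only the single rule below is present. In that case the cluster IP is never translated to the pod IP, so access through the cluster IP cannot succeed.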
-A KUBE-SERVICES -d 10.254.159.20/32 -p tcp -m comment --comment "default/redis:1 cluster IP" -m tcp --dport 6379 -j KUBE-SVC-XXJ2TMJIYSJJDBZG
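One possible explanation, which is an assumption and was not verified on this cluster: if redis-svc.yaml kept its selector k8s-app: redis while the pod labels were changed to name: redis, the Service would no longer select any pod and would have no endpoints to DNAT to. That would be consistent with seeing only the KUBE-SERVICES rule. The sketch below shows how an equality-based selector is matched against pod labels; the function name selector_matches is illustrative.

```python
def selector_matches(selector, pod_labels):
    """Equality-based match: every selector key/value must be on the pod."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


service_selector = {"k8s-app": "redis"}  # selector from redis-svc.yaml

# Pod labelled k8s-app: redis (the working case).
print(selector_matches(service_selector, {"k8s-app": "redis"}))
# Pod labelled name: redis (the failing case, under the assumption above).
print(selector_matches(service_selector, {"name": "redis"}))
```

If this is the cause, kubectl get endpoints redis would list no addresses in the failing case, and changing the Service selector to match the new label should bring the KUBE-SEP rules back.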
【Verifying network connectivity】
1. On the master host, ping the IPs of the centos and redis pods
[root@master redis]# ping 192.168.140.197
PING 192.168.140.197 (192.168.140.197) 56(84) bytes of data.
64 bytes from 192.168.140.197: icmp_seq=1 ttl=63 time=1.55 ms
64 bytes from 192.168.140.197: icmp_seq=2 ttl=63 time=0.487 ms
[root@master redis]# ping 192.168.140.82
PING 192.168.140.82 (192.168.140.82) 56(84) bytes of data.
64 bytes from 192.168.140.82: icmp_seq=1 ttl=63 time=0.317 ms
64 bytes from 192.168.140.82: icmp_seq=2 ttl=63 time=0.502 ms
2. On the master host, telnet to the redis cluster IP
[root@master redis]# telnet 10.254.159.20 6379
Trying 10.254.159.20...
Connected to 10.254.159.20.
Escape character is '^]'.
3. On slave1, ping the centos and redis pods and access the redis cluster IP
[root@slave1 bin]# ping 192.168.140.197
PING 192.168.140.197 (192.168.140.197) 56(84) bytes of data.
64 bytes from 192.168.140.197: icmp_seq=1 ttl=64 time=0.329 ms
64 bytes from 192.168.140.197: icmp_seq=2 ttl=64 time=0.068 ms
[root@slave1 bin]# ping 192.168.140.82
PING 192.168.140.82 (192.168.140.82) 56(84) bytes of data.
64 bytes from 192.168.140.82: icmp_seq=1 ttl=63 time=0.291 ms
64 bytes from 192.168.140.82: icmp_seq=2 ttl=63 time=0.455 ms
[root@slave1 bin]# telnet 10.254.159.20 6379
Trying 10.254.159.20...
Connected to 10.254.159.20.
Escape character is '^]'.
4. Inside the centos container, ping the redis pod
[root@centos-bpzkc /]# ping 192.168.140.82
PING 192.168.140.82 (192.168.140.82) 56(84) bytes of data.
64 bytes from 192.168.140.82: icmp_seq=1 ttl=62 time=0.951 ms
5. Inside the centos container, resolve the redis name through DNS and connect to redis
[root@centos-bpzkc /]# nslookup redis
Server:         10.254.159.10
Address:        10.254.159.10#53

Name:   redis.default.svc.cluster.local
Address: 10.254.159.20
[root@centos-bpzkc /]# telnet redis 6379
Trying 10.254.159.20...
Connected to redis.
Escape character is '^]'.
6. Inside the centos container, access a service on the master host (kube-apiserver)
[root@centos-bpzkc /]# telnet 192.168.99.129 8080
Trying 192.168.99.129...
Connected to 192.168.99.129.
Escape character is '^]'.
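The telnet checks above only confirm that a TCP connection can be opened, so they can be repeated in a script. This is a minimal sketch using only the Python standard library; the function name tcp_reachable and the 3-second timeout are assumptions, and the targets are the redis cluster IP and the kube-apiserver address used above.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name not resolved
        return False


# Targets from the manual checks; run from a host or pod inside the cluster.
TARGETS = [("10.254.159.20", 6379), ("192.168.99.129", 8080)]

if __name__ == "__main__":
    for host, port in TARGETS:
        state = "open" if tcp_reachable(host, port) else "unreachable"
        print(host, port, state)
```

Running it on master, on each node and inside the centos pod covers the same paths as steps 2, 3, 5 and 6.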