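A few days ago, containers could no longer be pinged from other hosts, which cut one node off from the network and made it unreachable. The troubleshooting process was as follows.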
- First, confirm that the host node2 can ping the container
    [root@node2 ~]# ping 10.1.19.3
    PING 10.1.19.3 (10.1.19.3) 56(84) bytes of data.
    64 bytes from 10.1.19.3: icmp_seq=1 ttl=64 time=0.122 ms
    64 bytes from 10.1.19.3: icmp_seq=2 ttl=64 time=0.073 ms
The ping succeeds, so move on to the next step.
- Check whether the proxy machine (node1) can ping the container
    [root@node1 ~]# ping 10.1.19.3
    PING 10.1.19.3 (10.1.19.3) 56(84) bytes of data.
    ^C
    --- 10.1.19.3 ping statistics ---
    14 packets transmitted, 0 received, 100% packet loss, time 12999ms
- Check whether the flannel subnet configuration seen from the proxy machine is normal
    [root@node1 ~]# etcdctl ls /coreos.com/network/subnets
    /coreos.com/network/subnets/10.1.91.0-24
    /coreos.com/network/subnets/10.1.93.0-24
    /coreos.com/network/subnets/10.1.94.0-24
    /coreos.com/network/subnets/10.1.19.0-24
    /coreos.com/network/subnets/10.1.77.0-24
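Each subnet key encodes a node's lease as `<network>-<prefix length>`, and the container address 10.1.19.3 falls inside the `10.1.19.0-24` lease, so the registration itself looks fine. As a small sketch, a key can be turned into ordinary CIDR notation for easier comparison; the helper name `subnet_key_to_cidr` is made up for illustration and is not part of etcdctl or flannel.

```shell
# Convert a flannel subnet key such as
#   /coreos.com/network/subnets/10.1.19.0-24
# into CIDR form (10.1.19.0/24).
# subnet_key_to_cidr is a hypothetical helper, not an etcdctl or flannel command.
subnet_key_to_cidr() {
    key=${1##*/}                                 # drop the directory part of the key
    printf '%s/%s\n' "${key%-*}" "${key##*-}"    # network before the last "-", prefix after it
}

subnet_key_to_cidr /coreos.com/network/subnets/10.1.19.0-24
```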
- Go back to the host and check whether its routing table is complete
    [root@node2 ~]# route -n
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    0.0.0.0         192.168.19.51   0.0.0.0         UG    100    0        0 eth0
    10.1.19.0       0.0.0.0         255.255.255.0   U     0      0        0 docker0
    192.168.19.0    0.0.0.0         255.255.255.0   U     100    0        0 eth0
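The thing to look for in this table is a row whose interface is the flannel device. A minimal sketch of that check, run here against the captured table rather than a live host; the function name `has_route_via` is hypothetical, and `flannel0` is assumed to be the flannel interface name on this node.

```shell
# has_route_via reads `route -n` style output on stdin and succeeds when at
# least one row uses the given interface (the last column, Iface).
# The function name is hypothetical, chosen for this sketch.
has_route_via() {
    awk -v ifc="$1" '$NF == ifc { found = 1 } END { exit !found }'
}

# Routing table captured from node2; on a live host, pipe in `route -n` instead.
if has_route_via flannel0 <<'EOF'
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.19.51   0.0.0.0         UG    100    0        0 eth0
10.1.19.0       0.0.0.0         255.255.255.0   U     0      0        0 docker0
192.168.19.0    0.0.0.0         255.255.255.0   U     100    0        0 eth0
EOF
then
    echo "flannel0 route present"
else
    echo "flannel0 route missing"
fi
```

For the table shown, this reports the route as missing: only `eth0` and `docker0` appear in the Iface column.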
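- The routing table has no entry for the flannel network (10.1.0.0/16), so try restarting flannel; if the route is not created automatically, add it manually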
    [root@node2 ~]# route add -net 10.1.0.0 netmask 255.255.0.0 dev flannel0
    [root@node2 ~]# route -n
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    0.0.0.0         192.168.19.51   0.0.0.0         UG    100    0        0 eth0
    10.1.0.0        0.0.0.0         255.255.0.0     U     0      0        0 flannel0
    10.1.19.0       0.0.0.0         255.255.255.0   U     0      0        0 docker0
    192.168.19.0    0.0.0.0         255.255.255.0   U     100    0        0 eth0
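- Test the network again. If the container is still unreachable, note that traffic between the flannel interface (flannel0 on this host, or flannel.1 when the vxlan backend is used) and docker0 passes through the iptables FORWARD chain, so make sure that: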
- IP forwarding is enabled in the kernel (this takes effect immediately but is lost after a reboot)
    echo "1" > /proc/sys/net/ipv4/ip_forward
- Packets are not blocked by the iptables FORWARD rules
    sudo iptables -P FORWARD ACCEPT
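Before changing either setting, it can help to check the current forwarding state. A minimal sketch follows; the function name `forwarding_enabled` and its optional file argument are assumptions made for illustration (the argument exists only so the check can be pointed at another file).

```shell
# forwarding_enabled succeeds when the kernel's IPv4 forwarding flag is "1".
# The optional argument overrides the file to read; it defaults to the real
# /proc entry. The function name is hypothetical.
forwarding_enabled() {
    file=${1:-/proc/sys/net/ipv4/ip_forward}
    [ "$(cat "$file" 2>/dev/null)" = "1" ]
}

if forwarding_enabled; then
    echo "IPv4 forwarding is enabled"
else
    echo "IPv4 forwarding is disabled"
fi
```

To keep forwarding enabled across reboots, the usual approach is to set `net.ipv4.ip_forward = 1` in /etc/sysctl.conf and reload it with `sysctl -p`, assuming the distribution reads that file at boot.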
- Confirm that the container is now reachable from the proxy machine
    [root@node1 ~]# ping 10.1.19.3
    PING 10.1.19.3 (10.1.19.3) 56(84) bytes of data.
    64 bytes from 10.1.19.3: icmp_seq=1 ttl=61 time=0.444 ms
    64 bytes from 10.1.19.3: icmp_seq=2 ttl=61 time=0.288 ms
    ^C
    --- 10.1.19.3 ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 999ms
    rtt min/avg/max/mdev = 0.288/0.366/0.444/0.078 ms
That's all.