docker & k8s Troubleshooting Notes


    This post records problems encountered while rolling out docker and kubernetes, together with their solutions.

    Parts of it are excerpted from the internet; the rest are real problems hit in my own test environment. I will keep sharing docker/k8s issues that come up in development and maintenance, so that those who follow can avoid these pitfalls.

    kubernetes NodePort not accessible

    Environment:

    OS: CentOS 7.1
    Kubelet: 1.6.7
    Docker: 17.06-ce
    Calico: 2.3
    
    K8s Cluster: master, node-1, node-2
    

    Problem:

    There is a service A; to make it reachable from outside, its service type is set to NodePort, with port 31246.
    The pod backing A runs on node-1.

    Testing shows that external access to master:31246 and node-2:31246 fails; only node-1:31246 works.
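
    A quick reproduction from a host outside the cluster (hostnames follow the cluster layout above; curl's -m flag sets a timeout in seconds):

    curl -m 5 http://node-1:31246   # responds
    curl -m 5 http://node-2:31246   # times out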

    Cause:

    For safety, docker versions after 1.13 set the default policy of the iptables FORWARD chain to DROP, and add accept rules for containers attached to the docker0 bridge. Quoting the description in moby issue #14041:

    When docker starts, it enables net.ipv4.ip_forward without changing the iptables FORWARD chain default policy to DROP. This means that another machine on the same network as the docker host can add a route to their routing table, and directly address any containers running on that docker host.
    
    For example, if the docker0 subnet is 172.17.0.0/16 (the default subnet), and the docker host’s IP address is 192.168.0.10, from another host on the network run:
    
       $ ip route add 172.17.0.0/16 via 192.168.0.10
       $ nmap 172.17.0.0/16
    
    The above will scan for containers running on the host, and report IP addresses & running services found.
    
    To fix this, docker needs to set the FORWARD policy to DROP when it enables the net.ipv4.ip_forward sysctl parameter.
    

    The CNI plugin used by kubernetes is affected by this (CNI does not create corresponding rules in the FORWARD chain), so nodes other than the pod's host cannot forward the packets, and access fails.
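
    You can confirm the policy on an affected node (a quick check with stock iptables; your packet counters will differ):

    iptables -nvL FORWARD | head -1
    # Chain FORWARD (policy DROP 0 packets, 0 bytes)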

    Solution:

    If your security requirements are low, you can set the FORWARD chain's default policy back to ACCEPT:

    iptables -P FORWARD ACCEPT
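
    docker resets this policy whenever it restarts, so one way to persist the change is a systemd drop-in that reapplies it after docker starts; a sketch, assuming systemd-managed docker on CentOS 7:

    mkdir -p /etc/systemd/system/docker.service.d
    cat <<EOF > /etc/systemd/system/docker.service.d/forward-accept.conf
    [Service]
    ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT
    EOF
    systemctl daemon-reload && systemctl restart docker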
    

    google: network unreachable

    https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64/repodata/repomd.xml: [Errno 14] curl#7 - "Failed to connect to 2404:6800:4005:809::200e: Network is unreachable"
    

    A classic problem: you need your own proxy to reach Google!
    Setting a yum proxy on CentOS:

    vi /etc/yum.conf
    # add the following line:
    proxy=http://xxx.xx.x.xx:xxx # proxy address
    

    If you have no proxy, switch to the Aliyun mirror:

    cat > /etc/yum.repos.d/kubernetes.repo <<EOF
    [kubernetes]
    name=Kubernetes
    baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
    enabled=1
    gpgcheck=0
    repo_gpgcheck=0
    EOF
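
    With the mirror in place, the components install as usual (a sketch; pin versions if you need a specific release):

    yum install -y kubelet kubeadm kubectl
    systemctl enable kubelet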
    

    Disable swap

    F1213 10:20:53.304755    2266 server.go:261] failed to run Kubelet: Running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false. /proc/swaps contained:
    

    Run swapoff -a.
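
    To keep swap off across reboots, also comment out the swap entry in /etc/fstab; a sketch, assuming the entry has the word "swap" surrounded by whitespace:

    swapoff -a                                # turn swap off now
    sed -i '/\sswap\s/ s/^/#/' /etc/fstab     # keep it off after reboot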

    Errors when initializing the master

    [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
    [ERROR FileContent--proc-sys-net-ipv4-ip_forward]: /proc/sys/net/ipv4/ip_forward contents are not set to 1
    [preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
    

    Set both to 1 as the message suggests:

    echo "1" >/proc/sys/net/ipv4/ip_forward
    echo "1" >/proc/sys/net/bridge/bridge-nf-call-iptables
    

    kubeadm init hangs pulling images

    # kubeadm config images list --kubernetes-version v1.13.0   # list the images this version needs
    # pull the images from a reachable mirror
    docker pull mirrorgooglecontainers/kube-apiserver:v1.13.0
    docker pull mirrorgooglecontainers/kube-controller-manager:v1.13.0
    docker pull mirrorgooglecontainers/kube-scheduler:v1.13.0
    docker pull mirrorgooglecontainers/kube-proxy:v1.13.0
    docker pull mirrorgooglecontainers/pause:3.1
    docker pull mirrorgooglecontainers/etcd:3.2.24
    docker pull coredns/coredns:1.2.6
    
    # re-tag the images to the names kubeadm expects
    docker tag docker.io/mirrorgooglecontainers/kube-proxy:v1.13.0 k8s.gcr.io/kube-proxy:v1.13.0
    docker tag docker.io/mirrorgooglecontainers/kube-scheduler:v1.13.0 k8s.gcr.io/kube-scheduler:v1.13.0
    docker tag docker.io/mirrorgooglecontainers/kube-apiserver:v1.13.0 k8s.gcr.io/kube-apiserver:v1.13.0
    docker tag docker.io/mirrorgooglecontainers/kube-controller-manager:v1.13.0 k8s.gcr.io/kube-controller-manager:v1.13.0
    docker tag docker.io/mirrorgooglecontainers/etcd:3.2.24  k8s.gcr.io/etcd:3.2.24
    docker tag docker.io/mirrorgooglecontainers/pause:3.1  k8s.gcr.io/pause:3.1
    docker tag docker.io/coredns/coredns:1.2.6  k8s.gcr.io/coredns:1.2.6
    
    # remove the mirror tags
    docker rmi docker.io/mirrorgooglecontainers/kube-proxy:v1.13.0 
    docker rmi docker.io/mirrorgooglecontainers/kube-scheduler:v1.13.0 
    docker rmi docker.io/mirrorgooglecontainers/kube-apiserver:v1.13.0 
    docker rmi docker.io/mirrorgooglecontainers/kube-controller-manager:v1.13.0 
    docker rmi docker.io/mirrorgooglecontainers/etcd:3.2.24  
    docker rmi docker.io/mirrorgooglecontainers/pause:3.1 
    docker rmi docker.io/coredns/coredns:1.2.6  
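
    The same pull/tag/rmi sequence can also be written as a loop (a compact sketch over the image list above; names and versions unchanged):

    images=(kube-apiserver:v1.13.0 kube-controller-manager:v1.13.0 kube-scheduler:v1.13.0
            kube-proxy:v1.13.0 pause:3.1 etcd:3.2.24)
    for img in "${images[@]}"; do
        docker pull mirrorgooglecontainers/$img
        docker tag mirrorgooglecontainers/$img k8s.gcr.io/$img
        docker rmi mirrorgooglecontainers/$img
    done
    # coredns lives under its own organization
    docker pull coredns/coredns:1.2.6
    docker tag coredns/coredns:1.2.6 k8s.gcr.io/coredns:1.2.6
    docker rmi coredns/coredns:1.2.6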
    

    Finally, success:

    Your Kubernetes master has initialized successfully!
    
    To start using your cluster, you need to run the following as a regular user:
    
      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    You should now deploy a pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
      https://kubernetes.io/docs/concepts/cluster-administration/addons/
    
    You can now join any number of machines by running the following on each node
    as root:
    
      kubeadm join 192.168.232.204:6443 --token m2hxkd.scxjrxgew6pyhvmb --discovery-token-ca-cert-hash sha256:8b94cefbe54ae4b3d7201012db30966c53870aad55be80a2888ec0da178c3610
    
    
    

    Network configuration

    # /etc/hosts on my VMs
    192.168.232.204 k8a204
    192.168.232.203 k8a203
    192.168.232.202 k8a202
    

    Install your chosen network add-on as its manual describes, wait until DNS comes up, then add nodes; an example apply is sketched below.
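
    For example, for Calico (the add-on used later in this post) on a kubeadm cluster, something like the following; the manifest URLs are from the v3.3 docs of that era and should be checked against the current documentation:

    kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
    kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml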

    [root@localhost .kube]# kubectl get nodes
    NAME     STATUS   ROLES    AGE    VERSION
    k8a204   Ready    master   6m6s   v1.13.0
    [root@localhost .kube]# kubectl get nodes
    NAME     STATUS     ROLES    AGE     VERSION
    k8a203   NotReady   <none>   4s      v1.13.0
    k8a204   Ready      master   6m19s   v1.13.0
    

    Note: this step is slow; be patient.

    kubectl get pods --all-namespaces
    =============== output ===============
    NAMESPACE     NAME                             READY   STATUS              RESTARTS   AGE
    kube-system   coredns-86c58d9df4-2vdvx         1/1     Running             0          7m32s
    kube-system   coredns-86c58d9df4-88fjk         1/1     Running             0          7m32s
    kube-system   etcd-k8a204                      1/1     Running             0          6m39s
    kube-system   kube-apiserver-k8a204            1/1     Running             0          6m30s
    kube-system   kube-controller-manager-k8a204   1/1     Running             0          6m30s
    kube-system   kube-proxy-tl7g5                 1/1     Running             0          7m32s
    kube-system   kube-proxy-w2jgl                 0/1     ContainerCreating   0          95s
    kube-system   kube-scheduler-k8a204            1/1     Running             0          6m49s
    
    

    Node NotReady after joining

    Continuing from the previous issue:
    while a pod is in **ContainerCreating** status, be patient; but if nothing has changed after ten minutes, something has definitely gone wrong.
    The main problem: the node cannot pull the images. Confirm it from the pod's events (sketched below), then work around it as follows:
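
    Checking the events (standard kubectl; the pod name is a placeholder, take the real one from "kubectl get pods --all-namespaces"):

    kubectl -n kube-system describe pod <stuck-pod-name>
    # look for ErrImagePull / ImagePullBackOff under Events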

    1) On the master, save the images to tar files:

    docker save -o /opt/kube-pause.tar k8s.gcr.io/pause:3.1
    docker save -o /opt/kube-proxy.tar k8s.gcr.io/kube-proxy:v1.13.0
    docker save -o /opt/kube-flannel1.tar quay.io/coreos/flannel:v0.9.1
    docker save -o /opt/kube-flannel2.tar quay.io/coreos/flannel:v0.10.0-amd64
    docker save -o /opt/kube-calico1.tar quay.io/calico/cni:v3.3.2
    docker save -o /opt/kube-calico2.tar quay.io/calico/node:v3.3.2
    
    

    2) Copy the files to the node:

    scp /opt/*.tar root@192.168.232.203:/opt/
    
    3) On the node, load the images into docker:
    docker load -i /opt/kube-flannel1.tar
    docker load -i /opt/kube-flannel2.tar
    docker load -i /opt/kube-proxy.tar
    docker load -i /opt/kube-pause.tar
    docker load -i /opt/kube-calico1.tar
    docker load -i /opt/kube-calico2.tar
    
    4) Check the images on the node:
    docker images
    ============================================== output ======================================
    REPOSITORY               TAG                 IMAGE ID            CREATED             SIZE
    k8s.gcr.io/kube-proxy    v1.13.0             8fa56d18961f        9 days ago          80.2 MB
    quay.io/calico/node      v3.3.2              4e9be81e3a59        9 days ago          75.3 MB
    quay.io/calico/cni       v3.3.2              490d921fa49c        9 days ago          75.4 MB
    quay.io/coreos/flannel   v0.10.0-amd64       f0fad859c909        10 months ago       44.6 MB
    k8s.gcr.io/pause         3.1                 da86e6ba6ca1        11 months ago       742 kB
    quay.io/coreos/flannel   v0.9.1              2b736d06ca4c        13 months ago       51.3 MB
    

    Done; all services are running:

    [root@localhost .kube]# kubectl get pods --all-namespaces
    ==================================== output ========================================
    NAMESPACE     NAME                             READY   STATUS    RESTARTS   AGE
    kube-system   calico-node-4dsg5                1/2     Running   0          42m
    kube-system   calico-node-5dtk2                1/2     Running   0          41m
    kube-system   calico-node-78qvp                1/2     Running   0          41m
    kube-system   coredns-86c58d9df4-26vr7         1/1     Running   0          43m
    kube-system   coredns-86c58d9df4-s5ljf         1/1     Running   0          43m
    kube-system   etcd-k8a204                      1/1     Running   0          42m
    kube-system   kube-apiserver-k8a204            1/1     Running   0          42m
    kube-system   kube-controller-manager-k8a204   1/1     Running   0          42m
    kube-system   kube-proxy-8c7hs                 1/1     Running   0          41m
    kube-system   kube-proxy-dls8l                 1/1     Running   0          41m
    kube-system   kube-proxy-t65tc                 1/1     Running   0          43m
    kube-system   kube-scheduler-k8a204            1/1     Running   0          42m
    

    Recovering the master after a reboot

    swapoff -a
    # start all the containers
    # more concisely: docker start $(docker ps -aq)
    docker start $(docker ps -a | awk '{ print $1}' | tail -n +2)
    systemctl start kubelet
    # watch for startup errors
    journalctl -xefu kubelet
    # to have a container restart automatically whenever docker starts,
    # run it with: docker run --restart=always <image>
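
    A lighter-weight alternative is to enable both services at boot so no manual restart is needed (standard systemd; swap must still stay off, e.g. via the /etc/fstab edit shown earlier):

    systemctl enable docker kubelet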
    
    DNS resolution of kubernetes.default fails

    Installed busybox to test DNS and kept getting this error:

    kubectl exec -ti  busybox -- nslookup kubernetes.default
    ============================= output ============================
    Server:         10.96.0.10
    Address:        10.96.0.10:53
    ** server can't find kubernetes.default: NXDOMAIN
    *** Can't find kubernetes.default: No answer
    

    It turns out DNS resolution changed (or broke) in newer busybox releases; with an older busybox image (<= 1.28.4) the test passes.
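
    A minimal re-test with a pinned image (assumes no pod named busybox exists yet):

    kubectl run busybox --image=busybox:1.28.4 --restart=Never -- sleep 3600
    kubectl exec -ti busybox -- nslookup kubernetes.default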

    Regenerating the token after it expires

    # create a new token
    kubeadm token create
    # compute the CA cert hash for the new token
    openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
    # join the node with the new token and hash
    # substitute your own master address, token, and hash
    kubeadm join 192.168.232.204:6443 --token m87q91.gbcqhfx9ansvaf3o --discovery-token-ca-cert-hash sha256:fdd34ef6c801e382f3fb5b87bc9912a120bf82029893db121b9c8eae29e91c62
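
    Newer kubeadm releases can also print the complete join command in one step:

    kubeadm token create --print-join-command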
    
    
Original post: https://www.cnblogs.com/tylerzhou/p/10975062.html