My problem: while deploying k8s, kubelet would not start.
[root@jm228 ~]# kubeadm init --config=kubeadm-config.yaml --experimental-upload-certs --ignore-preflight-errors=all | tee kubeadm-init.log
Flag --experimental-upload-certs has been deprecated, use --upload-certs instead
[init] Using Kubernetes version: v1.15.1
[preflight] Running pre-flight checks
[WARNING FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[WARNING FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[WARNING FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[WARNING FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using existing etcd/peer certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/scheduler.conf"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp: lookup localhost on 114.114.114.114:53: no such host.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp: lookup localhost on 114.114.114.114:53: no such host.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp: lookup localhost on 114.114.114.114:53: no such host.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp: lookup localhost on 114.114.114.114:53: no such host.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp: lookup localhost on 114.114.114.114:53: no such host.

Unfortunately, an error has occurred:
    timed out waiting for the condition

This error is likely caused by:
    - The kubelet is not running
    - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
    - 'systemctl status kubelet'
    - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
    - 'docker ps -a | grep kube | grep -v pause'
    Once you have found the failing container, you can inspect its logs with:
    - 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
Checking the logs with journalctl -xeu kubelet:
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.075048 22402 server.go:416] Version: v1.17.4
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.075484 22402 plugins.go:100] No cloud provider specified.
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.075528 22402 server.go:821] Client rotation is on, will bootstrap in background
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.079892 22402 certificate_store.go:129] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.294977 22402 server.go:641] --cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.295878 22402 container_manager_linux.go:265] container manager verified user specified cgroup-root exists: []
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.295937 22402 container_manager_linux.go:270] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:dock
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.296161 22402 fake_topology_manager.go:29] [fake topologymanager] NewFakeManager
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.296182 22402 container_manager_linux.go:305] Creating device plugin manager: true
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.296230 22402 fake_topology_manager.go:39] [fake topologymanager] AddHintProvider HintProvider: &{kubelet.sock /var/lib/kubelet/device-plugins/ map[] {0 0} <nil> {{} [0 0 0]} 0x1b1d7
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.296299 22402 state_mem.go:36] [cpumanager] initializing new in-memory state store
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.296515 22402 state_mem.go:84] [cpumanager] updated default cpuset: ""
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.296538 22402 state_mem.go:92] [cpumanager] updated cpuset assignments: "map[]"
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.296561 22402 fake_topology_manager.go:39] [fake topologymanager] AddHintProvider HintProvider: &{{0 0} 0x6ea6db8 10000000000 0xc00090a840 <nil> <nil> <nil> <nil> map[memory:{{104857
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.296718 22402 kubelet.go:286] Adding pod path: /etc/kubernetes/manifests
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.296785 22402 kubelet.go:311] Watching apiserver
Mar 27 15:15:15 jm230 kubelet[22402]: E0327 15:15:15.307611 22402 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/kubelet.go:449: Failed to list *v1.Service: Get https://10.41.4.230:6443/api/v1/services?limit=500&resourceVersion=0: dia
Mar 27 15:15:15 jm230 kubelet[22402]: E0327 15:15:15.307642 22402 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/kubelet.go:458: Failed to list *v1.Node: Get https://10.41.4.230:6443/api/v1/nodes?fieldSelector=metadata.name%3Djm230&li
Mar 27 15:15:15 jm230 kubelet[22402]: E0327 15:15:15.307823 22402 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://10.41.4.230:6443/api/v1/pods?fieldSelector=spec.nodeName%3Djm
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.312694 22402 client.go:75] Connecting to docker on unix:///var/run/docker.sock
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.312745 22402 client.go:104] Start docker client with request timeout=2m0s
Mar 27 15:15:15 jm230 kubelet[22402]: W0327 15:15:15.328315 22402 docker_service.go:563] Hairpin mode set to "promiscuous-bridge" but kubenet is not enabled, falling back to "hairpin-veth"
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.328393 22402 docker_service.go:240] Hairpin mode set to "hairpin-veth"
Mar 27 15:15:15 jm230 kubelet[22402]: W0327 15:15:15.328522 22402 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d   # this warning only appears because the pod network has not been deployed yet; it can be ignored for now
Mar 27 15:15:15 jm230 kubelet[22402]: W0327 15:15:15.335658 22402 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.335769 22402 docker_service.go:255] Docker cri networking managed by cni
Mar 27 15:15:15 jm230 kubelet[22402]: W0327 15:15:15.335883 22402 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.385723 22402 docker_service.go:260] Docker Info: &{ID:2UFE:736H:UMTZ:W2OH:YPUJ:P5JR:4GBY:Z4QT:PCGW:FUJ5:M7SN:2PJC Containers:0 ContainersRunning:0 ContainersPaused:0 ContainersStoppe
Mar 27 15:15:15 jm230 kubelet[22402]: Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.]}
Mar 27 15:15:15 jm230 kubelet[22402]: I0327 15:15:15.385932 22402 docker_service.go:273] Setting cgroupDriver to systemd
Mar 27 15:15:15 jm230 kubelet[22402]: F0327 15:15:15.390830 22402 docker_service.go:414] Streaming server stopped unexpectedly: listen tcp: lookup localhost on 114.114.114.114:53: no such host
Mar 27 15:15:15 jm230 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
Mar 27 15:15:15 jm230 systemd[1]: Unit kubelet.service entered failed state.
Mar 27 15:15:15 jm230 systemd[1]: kubelet.service failed.
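The fatal line is the last kubelet entry: "lookup localhost on 114.114.114.114:53: no such host". So I first checked the DNS configuration with cat /etc/resolv.conf and found nothing wrong with it. Then I ran ping www.baidu.com, and the network was fine as well. Finally I looked at cat /etc/hosts, and that is where the problem showed up.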
[root@k8s-135 ~]# cat /etc/hosts
192.168.17.135 k8s-135
192.168.17.138 k8s-138
192.168.17.140 k8s-140
Normally the hosts file holds hostname definitions, one host per line. Each line has three parts separated by whitespace.
They are: IP address, hostname.domain, hostname (alias)
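To make that layout concrete, here is a minimal Python sketch that splits hosts-style text into its fields. The helper name parse_hosts is hypothetical (it is not part of any library), and the sample text is the hosts content shown above. It assumes the usual conventions that "#" starts a comment and that fields are separated by any run of spaces or tabs.

```python
def parse_hosts(text):
    """Parse hosts-file text into (ip, canonical_name, aliases) tuples.

    Hypothetical helper for illustration only.
    """
    entries = []
    for raw in text.splitlines():
        # Drop anything after '#' and trim surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            # An address with no name after it is not a usable entry.
            continue
        entries.append((fields[0], fields[1], fields[2:]))
    return entries


# The hosts content from this post: three nodes, no loopback entries.
sample = """\
192.168.17.135 k8s-135
192.168.17.138 k8s-138
192.168.17.140 k8s-140
"""

for ip, name, aliases in parse_hosts(sample):
    print(ip, name, aliases)
```

Note that none of the parsed entries carries the name localhost.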
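The first two lines were missing here, that is, the IPv4 and IPv6 loopback entries. Without them the machine cannot resolve localhost locally, so the lookup went to the external DNS server 114.114.114.114, which answered "no such host". Adding the two lines back fixes it. (Embarrassingly, I first assumed something had gone wrong in the k8s deployment itself and spent ages on that.)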
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
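As a quick sanity check before re-running kubeadm init, something like the following Python sketch could report which loopback entries are still missing. The function name missing_loopback is hypothetical, and the two sample strings are just the hosts content from this post before and after the fix. It only inspects the text it is given and does not modify any file.

```python
def missing_loopback(hosts_text):
    """Return the loopback addresses that do not map to 'localhost'.

    Hypothetical helper for illustration only.
    """
    wanted = {"127.0.0.1", "::1"}
    found = set()
    for raw in hosts_text.splitlines():
        # Ignore comments, then split the line into address and names.
        fields = raw.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] in wanted and "localhost" in fields[1:]:
            found.add(fields[0])
    return sorted(wanted - found)


broken = """\
192.168.17.135 k8s-135
192.168.17.138 k8s-138
192.168.17.140 k8s-140
"""

fixed = """\
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
""" + broken

print(missing_loopback(broken))
print(missing_loopback(fixed))
```

On a real node the same function could be fed the contents of /etc/hosts, for example missing_loopback(open("/etc/hosts").read()).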