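Pod Liveness Probes
Many programs gradually become unusable after running for a long time and can only be recovered by a restart. The Kubernetes container liveness probe mechanism detects this kind of problem and, based on the probe result combined with the restart policy, triggers the follow-up action.
A liveness probe is configured at the container level; the kubelet uses it to decide when a container needs to be restarted.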
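1. Configuring an exec probe
An exec probe judges a container's health by running a user-defined command inside the target container: an exit status of 0 means the check "succeeded", and any non-zero value means it "failed". The "spec.containers.livenessProbe.exec" field defines this kind of check. It has a single attribute, "command", which specifies the command to run.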
spec:
  containers:
  - name: liveness-exec-demo
    image: busybox
    args: ["/bin/sh", "-c", "touch /tmp/healthy;sleep 60; rm -rf /tmp/healthy;sleep 600"]
    livenessProbe:
      exec:
        command: ["test", "-e", "/tmp/healthy"]
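The manifest above starts a container from the busybox image that runs the command "touch /tmp/healthy;sleep 60; rm -rf /tmp/healthy;sleep 600". The command creates the file /tmp/healthy when the container starts and deletes it 60 seconds later.
The liveness probe runs "test -e /tmp/healthy" to check whether /tmp/healthy exists. While the file exists, the command returns exit status 0 and the probe passes. Once the file has been deleted, the command returns a non-zero status, and after enough consecutive failures the kubelet restarts the container according to the restart policy.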
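For reference, the fragment can be wrapped into a complete Pod manifest that could be applied directly. This is only a sketch: the Pod name is an assumption, and the timing fields simply spell out the documented defaults so the restart timing is easier to reason about.

```yaml
# Sketch of a complete Pod manifest for the exec probe.
# metadata.name is a hypothetical name chosen for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec-demo
spec:
  restartPolicy: Always          # the default for Pods, shown for clarity
  containers:
  - name: liveness-exec-demo
    image: busybox
    args: ["/bin/sh", "-c", "touch /tmp/healthy;sleep 60; rm -rf /tmp/healthy;sleep 600"]
    livenessProbe:
      exec:
        command: ["test", "-e", "/tmp/healthy"]
      periodSeconds: 10          # default: probe every 10 seconds
      failureThreshold: 3        # default: restart after 3 consecutive failures
```

With these values the probe starts failing at the first check after the file is removed at the 60-second mark, and the third consecutive failure arrives within about 30 seconds of the deletion, at which point the kubelet restarts the container.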
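2. Configuring an HTTP probe
An HTTP-based probe (HTTPGetAction) sends an HTTP GET request to the target container and judges the result by the response status code: any code of the form 2xx or 3xx counts as a pass. The "spec.containers.livenessProbe.httpGet" field defines this kind of check. Its available fields are as follows: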
kubectl explain deployment.spec.template.spec.containers.livenessProbe.httpGet
KIND:     Deployment
VERSION:  apps/v1

RESOURCE: httpGet <Object>

DESCRIPTION:
     HTTPGet specifies the http request to perform.

     HTTPGetAction describes an action based on HTTP Get requests.

FIELDS:
   host <string>
     Host name to connect to, defaults to the pod IP. You probably want to set
     "Host" in httpHeaders instead.

   httpHeaders <[]Object>
     Custom headers to set in the request. HTTP allows repeated headers.

   path <string>
     Path to access on the HTTP server.

   port <string> -required-
     Name or number of the port to access on the container. Number must be in
     the range 1 to 65535. Name must be an IANA_SVC_NAME.

   scheme <string>
     Scheme to use for connecting to the host. Defaults to HTTP.
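Below is an example from the manifest file livenesshttp.yaml. It sends an HTTP GET request to the path /actuator/health on port 8291, starting 60 seconds after the container starts and repeating every 5 seconds with a 1-second timeout. The container is restarted only after 30 consecutive failures: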
livenessProbe:
  httpGet:
    path: /actuator/health
    port: 8291
    scheme: HTTP
  initialDelaySeconds: 60
  timeoutSeconds: 1
  periodSeconds: 5
  successThreshold: 1
  failureThreshold: 30
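To show where such a livenessProbe fragment sits in a full manifest, here is a minimal sketch. The Pod name, image, port name, and custom header are all hypothetical and chosen only for illustration; the probe values are the ones from the fragment above. It also uses two fields from the field list that the fragment does not: a named port and "httpHeaders".

```yaml
# Sketch only: name, image, port name and header are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: liveness-http-demo                      # hypothetical Pod name
spec:
  containers:
  - name: app
    image: registry.example.com/demo-app:1.0    # hypothetical image serving HTTP on 8291
    ports:
    - name: management                          # hypothetical port name (IANA_SVC_NAME format)
      containerPort: 8291
    livenessProbe:
      httpGet:
        path: /actuator/health
        port: management                        # refers to the named port instead of 8291
        scheme: HTTP
        httpHeaders:
        - name: X-Probe-Source                  # hypothetical custom header
          value: kubelet
      initialDelaySeconds: 60                   # first probe 60 seconds after container start
      timeoutSeconds: 1
      periodSeconds: 5
      successThreshold: 1
      failureThreshold: 30                      # 30 failures x 5 seconds = about 150 seconds
```

With periodSeconds set to 5 and failureThreshold set to 30, the endpoint has to fail continuously for roughly 150 seconds before the container is restarted, which is a fairly tolerant setting for slow or briefly overloaded applications.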
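3. Configuring a TCP probe
A TCP-based liveness probe (TCPSocketAction) tries to open a TCP connection to a specific port of the container and judges the result by the outcome: if the connection is established, the check passes. Compared with an HTTP probe it is more efficient and uses fewer resources, but it is somewhat less precise, because a successful connection does not necessarily mean that the application behind it can actually serve content. The "spec.containers.livenessProbe.tcpSocket" field defines this kind of check. It has the following attributes: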
kubectl explain deployment.spec.template.spec.containers.livenessProbe.tcpSocket
KIND:     Deployment
VERSION:  apps/v1

RESOURCE: tcpSocket <Object>

DESCRIPTION:
     TCPSocket specifies an action involving a TCP port. TCP hooks not yet
     supported

     TCPSocketAction describes an action based on opening a socket

FIELDS:
   host <string>
     Optional: Host name to connect to, defaults to the pod IP.

   port <string> -required-
     Number or name of the port to access on the container. Number must be in
     the range 1 to 65535. Name must be an IANA_SVC_NAME.
Below is an example from the manifest file liveness-tcp.yaml. It tries to connect to port 80/tcp on the Pod IP and judges the result by whether the connection is established:
spec:
  containers:
  - name: liveness-tcp-demo
    image: nginx:1.18-alpine
    ports:
    - name: http
      containerPort: 80
    livenessProbe:
      tcpSocket:
        port: http
4. Liveness probe behavior attributes:
kubectl explain deployment.spec.template.spec.containers.livenessProbe
KIND:     Deployment
VERSION:  apps/v1

RESOURCE: livenessProbe <Object>

DESCRIPTION:
     Periodic probe of container liveness. Container will be restarted if the
     probe fails. Cannot be updated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

     Probe describes a health check to be performed against a container to
     determine whether it is alive or ready to receive traffic.

FIELDS:
   exec <Object>
     One and only one of the following should be specified. Exec specifies the
     action to take.

   failureThreshold <integer>
     Minimum consecutive failures for the probe to be considered failed after
     having succeeded. Defaults to 3. Minimum value is 1.

   httpGet <Object>
     HTTPGet specifies the http request to perform.

   initialDelaySeconds <integer>
     Number of seconds after the container has started before liveness probes
     are initiated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

   periodSeconds <integer>
     How often (in seconds) to perform the probe. Default to 10 seconds. Minimum
     value is 1.

   successThreshold <integer>
     Minimum consecutive successes for the probe to be considered successful
     after having failed. Defaults to 1. Must be 1 for liveness and startup.
     Minimum value is 1.

   tcpSocket <Object>
     TCPSocket specifies an action involving a TCP port. TCP hooks not yet
     supported

   timeoutSeconds <integer>
     Number of seconds after which the probe times out. Defaults to 1 second.
     Minimum value is 1. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
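The behavior attributes listed above sit next to the probe handler (exec, httpGet or tcpSocket) and apply to any of the three. The following sketch sets all of them explicitly on a TCP probe. The Pod name and the chosen values are assumptions for illustration, not recommendations.

```yaml
# Sketch only: metadata.name and the timing values are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: liveness-tcp-tuned       # hypothetical Pod name
spec:
  containers:
  - name: web
    image: nginx:1.18-alpine
    ports:
    - name: http
      containerPort: 80
    livenessProbe:
      tcpSocket:
        port: http               # exactly one handler per probe
      initialDelaySeconds: 5     # wait 5 seconds after container start before the first probe
      periodSeconds: 10          # probe every 10 seconds (the default)
      timeoutSeconds: 2          # a probe that takes longer than 2 seconds counts as a failure
      successThreshold: 1        # must be 1 for liveness probes
      failureThreshold: 3        # restart after 3 consecutive failures (the default)
```

As a rough rule, a container that stops responding is restarted after about failureThreshold x periodSeconds of continuous failure, which is about 30 seconds with the values above. Raising either value makes the probe more tolerant of short stalls at the cost of slower recovery.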