Readiness probe failed: Client.Timeout exceeded while awaiting headers


    Pods restart frequently causing periodic timeout errors

    After you complete your installation, you might find that some pods become not ready every few minutes. This issue can also make it difficult to log in to the console.

    Symptoms

    One or more pods experience repeated restarts that leave the pod or pods frequently in a not ready state. In addition, attempts to log in result in periodic 502 Bad Gateway or 504 Gateway Timeout errors.

    You can view the events for a pod that is frequently restarting by running the following command:

    kubectl describe pod <pod name> -n <namespace-name>
    

    Your output can include the following error message:

    Readiness probe failed: Get http://<host>:<port>/readinessProbe: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
    
    To view the pod logs, run the following command:

    kubectl logs <pod name> -n <namespace-name>
    

    The logs might show errors similar to the following sample log messages:

    [2020-01-23T19:59:23.036] [ERROR] [mcm-ui] [status] GET /readinessProbe  500
    [2020-01-23T19:59:29.064] [ERROR] [mcm-ui] [status] GET /readinessProbe  500
    [2020-01-23T19:59:38.087] [ERROR] [mcm-ui] [status] GET /readinessProbe  500
    [2020-01-23T20:01:53.096] [ERROR] [mcm-ui] [status] GET /readinessProbe  500
    [2020-01-23T20:01:59.111] [ERROR] [mcm-ui] [status] GET /readinessProbe  500
    [2020-01-23T20:02:08.137] [ERROR] [mcm-ui] [status] GET /readinessProbe  500
    [2020-01-23T20:11:39.951] [ERROR] [mcm-ui] [status] GET /readinessProbe  500
    [2020-01-23T20:11:50.184] [ERROR] [mcm-ui] [status] GET /readinessProbe  500
    [2020-01-23T20:11:59.207] [ERROR] [mcm-ui] [status] GET /readinessProbe  500
    [2020-01-23T20:12:08.232] [ERROR] [mcm-ui] [status] GET /readinessProbe  500
    [2020-01-23T20:13:53.051] [ERROR] [mcm-ui] [status] GET /readinessProbe  500
    

    Causes

    This issue occurs when the readiness probe for a pod fails repeatedly. While the pod is not ready, you might not be able to log in or use the console.
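As a rough sketch of why the outages come in bursts: the kubelet marks a pod not ready only after failureThreshold consecutive probe failures, one probe attempt per periodSeconds. Using the default values failureThreshold: 3 and periodSeconds: 10 from the example daemonset in the workaround below:

```shell
# Default probe values from the example daemonset (see the workaround).
FAILURE_THRESHOLD=3   # consecutive failures before the pod is marked not ready
PERIOD_SECONDS=10     # seconds between probe attempts

# Rough worst-case time from the first failed probe until the kubelet
# marks the pod not ready:
echo "$(( FAILURE_THRESHOLD * PERIOD_SECONDS )) seconds"
```

With the increased workaround values (periodSeconds: 30, timeoutSeconds: 10), a transient slow response is far less likely to accumulate three consecutive failures, so the pod stays ready through brief slowdowns.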

    Resolving the problem

    To reduce the frequency of the timeout errors, you can apply a DNS config patch or configure a workaround.

    Apply DNS config patch

    To help address the periodic timeout errors, you can apply an OpenShift DNS config patch for IBM Cloud Pak foundational services clusters. This patch addresses an issue in which requests from pods to services are delayed by up to 5 seconds; a normal response typically takes only milliseconds.
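For background only: the up-to-5-second delay pattern is characteristic of parallel DNS lookups from pods racing in conntrack, and mitigations in this class typically adjust the pod DNS configuration. The following is an illustrative sketch of such a dnsConfig change, not the content of the IBM interim fix; follow the fix README for the actual change:

```yaml
# Illustrative sketch only -- the actual interim fix may differ.
# Adds a resolver option that avoids reusing one socket for parallel
# A/AAAA lookups, a common mitigation for 5-second DNS delays.
spec:
  template:
    spec:
      dnsConfig:
        options:
        - name: single-request-reopen
```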

    This patch is available as an interim fix for IBM Cloud Pak foundational services. For more information about the patch, and to obtain it, go to IBM® Fix Central, which contains fixes and updates for IBM® products. See IBM Fix Central.

    To directly access the interim fix for this patch, see CS=3.2.4-fix-37137.

    To apply the patch, follow the README instructions that are included with the interim fix.

    Workaround

    To resolve the issue, manually adjust all liveness and readiness probes in the affected daemonset. The following steps use the auth-idp daemonset as an example.

    1. Install kubectl. See Installing the Kubernetes CLI (kubectl).

    2. Edit the auth-idp daemonset.

      kubectl edit ds auth-idp -n kube-system
      
    3. Locate the settings for each liveness and readiness probe for all containers in the daemonset. For example, the following section shows the readinessProbe settings for the platform-auth-service container:

           name: platform-auth-service
           ports:
           - containerPort: 8443
             hostPort: 8443
             name: http
             protocol: TCP
           readinessProbe:
             failureThreshold: 3
             httpGet:
               path: /
               port: 8443
               scheme: HTTPS
             periodSeconds: 10
             successThreshold: 1
             timeoutSeconds: 1
           resources:
             limits:
               cpu: "1"
               memory: 1Gi
             requests:
               cpu: 100m
               memory: 256Mi
      
    4. Set values for each probe setting for all containers in the daemonset to increase the initialDelaySeconds, periodSeconds, and timeoutSeconds settings. If the initialDelaySeconds setting is missing, add it. The following example shows the placement of these settings for a readiness probe:

       readinessProbe:
         failureThreshold: 3
         httpGet:
           path: /
           port: 8443
           scheme: HTTPS
         initialDelaySeconds: 420
         periodSeconds: 30
         successThreshold: 1
         timeoutSeconds: 10
      
    5. Save the file and wait until all the auth-idp pods restart. The pods might take a few minutes to restart.
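As an alternative to the interactive edit in step 2, the same probe values can be captured in a strategic merge patch file (hypothetical name probe-patch.yaml) and applied with `kubectl patch ds auth-idp -n kube-system --patch-file probe-patch.yaml`. This sketch covers only the platform-auth-service readiness probe from step 3; repeat the entry for every container and for the liveness probes:

```yaml
# probe-patch.yaml (hypothetical file name)
# Strategic merge patch: containers are matched by name, and only the
# listed probe fields are changed.
spec:
  template:
    spec:
      containers:
      - name: platform-auth-service
        readinessProbe:
          initialDelaySeconds: 420
          periodSeconds: 30
          timeoutSeconds: 10
```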

  • Original source: https://www.cnblogs.com/cheyunhua/p/15246305.html