• k8s: Job


    Job Controller

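    The Job Controller creates Pods according to the Job spec and keeps monitoring them until they finish successfully. If a Pod fails, restartPolicy (only OnFailure and Never are supported, not Always) determines how the task is retried: by restarting the container in place or by creating a new Pod.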

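    What Jobs are for
    By how long they run, containers fall into two categories: service containers and work containers.
    Service containers provide a service continuously and need to run all the time, for example an HTTP server or a daemon. Work containers run one-off tasks, such as batch programs, and exit when the task is done.
    Kubernetes Deployment, ReplicaSet, and DaemonSet all manage service containers; for work containers, we use a Job.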
     

    root@ubuntu:~/tenant# cat job.yaml 
    apiVersion: batch/v1 
    kind: Job 
    metadata:
     name: myjob
    spec:
     template:
      metadata:
        name: myjob
      spec:
       containers:
       - name: hello
         image: busybox
         command: ["echo","hello k8s job !"]
       restartPolicy: Never 

     
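    restartPolicy specifies when a container should be restarted. For a Job it can only be Never (a failed container is not restarted; the Job controller creates a new Pod instead) or OnFailure (the failed container is restarted inside the same Pod, so no new Pod is created, which saves resources). Other controllers, such as Deployment, only accept Always.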
     

    root@ubuntu:~/tenant#  kubectl get pods  -o wide
    NAME                             READY   STATUS              RESTARTS   AGE     IP               NODE      NOMINATED NODE   READINESS GATES
    busybox                          1/1     Running             0          46m     10.244.129.145   centos7   <none>           <none>
    example-foo-54dc4db9fc-lqz9j     1/1     Running             0          19d     10.244.29.26     bogon     <none>           <none>
    job-1-nginx-0                    0/1     Completed           0          22d     10.244.29.19     bogon     <none>           <none>
    myjob-pl75c                      0/1     ContainerCreating   0          8s      <none>           centos7   <none>           <none>
    nginx-ds-f7sjm                   1/1     Running             0          10m     10.244.29.23     bogon     <none>           <none>
    nginx-ds-ldlrq                   1/1     Running             0          10m     10.244.41.1      cloud     <none>           <none>
    nginx-ds-p8nqz                   1/1     Running             0          10m     10.244.243.195   ubuntu    <none>           <none>
    nginx-ds-xrt8b                   1/1     Running             0          10m     10.244.129.146   centos7   <none>           <none>
    test-job-default-nginx-0         1/1     Running             0          15d     10.244.29.3      bogon     <none>           <none>
    test-job-default-nginx-1         1/1     Running             0          15d     10.244.29.9      bogon     <none>           <none>
    test-job-default-nginx-2         1/1     Running             0          15d     10.244.29.19     bogon     <none>           <none>
    test-job-default-nginx-3         1/1     Running             0          15d     10.244.29.63     bogon     <none>           <none>
    test-job-default-nginx-4         1/1     Running             0          15d     10.244.29.1      bogon     <none>           <none>
    test-job-default-nginx-5         1/1     Running             0          15d     10.244.29.2      bogon     <none>           <none>
    test-job-v2-default-nginx-v2-0   1/1     Running             0          14d     10.244.29.20     bogon     <none>           <none>
    web-0                            1/1     Running             0          3h22m   10.244.129.142   centos7   <none>           <none>
    web-1                            1/1     Running             0          3h16m   10.244.129.143   centos7   <none>           <none>
    root@ubuntu:~/tenant# kubectl get job
    NAME    COMPLETIONS   DURATION   AGE
    myjob   1/1           12s        8m3s
    root@ubuntu:~/tenant# kubectl logs  myjob-pl75c 
    hello k8s job !
    root@ubuntu:~/tenant# 

    The above is what happens when the Pod succeeds. What if the Pod fails?
    Modify job.yaml to introduce a deliberate error:

    root@ubuntu:~/tenant# vi job.yaml 
    apiVersion: batch/v1
    kind: Job
    metadata:
     name: myjob
    spec:
     template:
      metadata:
        name: myjob
      spec:
       containers:
       - name: hello
         image: busybox
         command: ["invalid cmd","hello k8s job !"]
       restartPolicy: Never
    root@ubuntu:~/tenant# kubectl create -f job.yaml 
    job.batch/myjob created
    root@ubuntu:~/tenant# kubectl get pod
    NAME                             READY   STATUS              RESTARTS   AGE
    busybox                          1/1     Running             0          56m
    example-foo-54dc4db9fc-lqz9j     1/1     Running             0          19d
    job-1-nginx-0                    0/1     Completed           0          22d
    myjob-j6mtv                      0/1     ContainerCreating   0          9s
     
    root@ubuntu:~/tenant# kubectl get job
    NAME    COMPLETIONS   DURATION   AGE
    myjob   0/1           23s        23s
    root@ubuntu:~/tenant# kubectl get pod
    NAME                             READY   STATUS               RESTARTS   AGE
    busybox                          1/1     Running              0          57m
    example-foo-54dc4db9fc-lqz9j     1/1     Running              0          19d
    job-1-nginx-0                    0/1     Completed            0          22d
    myjob-j6mtv                      0/1     ContainerCannotRun   0          27s
    myjob-zrgmk                      0/1     ContainerCannotRun   0          15s
     
    root@ubuntu:~/tenant# kubectl get pod
    NAME                             READY   STATUS               RESTARTS   AGE
    busybox                          1/1     Running              0          57m
    example-foo-54dc4db9fc-lqz9j     1/1     Running              0          19d
    job-1-nginx-0                    0/1     Completed            0          22d
    myjob-j6mtv                      0/1     ContainerCannotRun   0          32s
    myjob-zrgmk                      0/1     ContainerCannotRun   0          20s
     
    root@ubuntu:~/tenant# kubectl get pod
    NAME                             READY   STATUS               RESTARTS   AGE
    busybox                          1/1     Running              0          57m
    example-foo-54dc4db9fc-lqz9j     1/1     Running              0          19d
    job-1-nginx-0                    0/1     Completed            0          22d
    myjob-j6mtv                      0/1     ContainerCannotRun   0          38s
    myjob-mdfpz                      0/1     ContainerCreating    0          4s
    myjob-zrgmk                      0/1     ContainerCannotRun   0          26s
     
    root@ubuntu:~/tenant# 
    root@ubuntu:~/tenant# kubectl get pod | grep myjob
    myjob-6kfq8                      0/1     ContainerCannotRun   0          65s
    myjob-j6mtv                      0/1     ContainerCannotRun   0          119s
    myjob-mdfpz                      0/1     ContainerCannotRun   0          85s
    myjob-zrgmk                      0/1     ContainerCannotRun   0          107s
    root@ubuntu:~/tenant# kubectl describe pods myjob-mdfpz
    Name:         myjob-mdfpz
    Namespace:    default
    Priority:     0
    Node:         centos7/10.10.16.251
    Start Time:   Thu, 29 Jul 2021 15:29:41 +0800
    Labels:       controller-uid=2fab27c7-2c65-425b-a698-1a4ffaa24448
                  job-name=myjob
    Annotations:  cni.projectcalico.org/podIP: 
                  cni.projectcalico.org/podIPs: 
    Status:       Failed
    IP:           10.244.129.150
    IPs:
      IP:           10.244.129.150
    Controlled By:  Job/myjob
    Containers:
      hello:
        Container ID:  docker://0b71696f5d71fb7c4ddb7fcb408c2141e890298092ef2701ce695f82d1ff242e
        Image:         busybox
        Image ID:      docker-pullable://docker.io/busybox@sha256:0f354ec1728d9ff32edcd7d1b8bbdfc798277ad36120dc3dc683be44524c8b60
        Port:          <none>
        Host Port:     <none>
        Command:
          invalid cmd
          hello k8s job !
        State:      Terminated
          Reason:   ContainerCannotRun
          Message:  oci runtime error: container_linux.go:235: starting container process caused "exec: "invalid cmd": executable file not found in $PATH"
    
          Exit Code:    127
          Started:      Thu, 29 Jul 2021 15:29:49 +0800
          Finished:     Thu, 29 Jul 2021 15:29:49 +0800
        Ready:          False
        Restart Count:  0
        Environment:    <none>
        Mounts:
          /var/run/secrets/kubernetes.io/serviceaccount from default-token-cfr6q (ro)
    Conditions:
      Type              Status
      Initialized       True 
      Ready             False 
      ContainersReady   False 
      PodScheduled      True 
    Volumes:
      default-token-cfr6q:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  default-token-cfr6q
        Optional:    false
    QoS Class:       BestEffort
    Node-Selectors:  <none>
    Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                     node.kubernetes.io/unreachable:NoExecute for 300s
    Events:
      Type     Reason     Age        From               Message
      ----     ------     ----       ----               -------
      Normal   Scheduled  <unknown>  default-scheduler  Successfully assigned default/myjob-mdfpz to centos7
      Normal   Pulling    2m21s      kubelet, centos7   Pulling image "busybox"
      Normal   Pulled     2m17s      kubelet, centos7   Successfully pulled image "busybox"
      Normal   Created    2m16s      kubelet, centos7   Created container hello
      Warning  Failed     2m15s      kubelet, centos7   Error: failed to start container "hello": Error response from daemon: oci runtime error: container_linux.go:235: starting container process caused "exec: "invalid cmd": executable file not found in $PATH"
    root@ubuntu:~/tenant# 

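    This explains one observation: why does kubectl get pod show so many failed Pods?
    The reason: when the first Pod starts, its container fails and exits. Because restartPolicy is Never, the failed container is not restarted. But the Job wants 1 completion and COMPLETIONS is still 0/1, so the Job controller starts a new Pod, and keeps doing so with an increasing back-off delay between attempts. In this example COMPLETIONS can never reach 1/1, so the retries continue until spec.backoffLimit (default 6) is exhausted, at which point the Job is marked as failed. To stop the retries earlier, delete the Job.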
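
    The retry ceiling can also be set explicitly. The manifest below is a minimal sketch, not the author's file: it reuses the failing Job from above and adds spec.backoffLimit and spec.activeDeadlineSeconds, both standard fields of a batch/v1 Job. The name myjob-limited and the values 2 and 60 are arbitrary choices for illustration.

    ```yaml
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: myjob-limited          # hypothetical name, not from the original article
    spec:
      backoffLimit: 2              # mark the Job failed after 2 retries (default: 6)
      activeDeadlineSeconds: 60    # fail the Job if it is still active after 60 seconds
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            command: ["invalid cmd", "hello k8s job !"]   # same deliberate error as above
          restartPolicy: Never
    ```

    Once either limit is hit, no further Pods should be created, and kubectl describe job myjob-limited should report a Failed condition on the Job.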

    What happens if restartPolicy is set to OnFailure? Let's try it: modify job.yaml and start the Job again.
     

    apiVersion: batch/v1
    kind: Job 
    metadata:
     name: myjob
    spec:
     template:
      metadata:
        name: myjob
      spec:
       containers:
       - name: hello
         image: busybox
         command: ["invalid cmd","hello k8s job !"]
       restartPolicy: OnFailure
    root@ubuntu:~/tenant# kubectl get pod | grep myjob
    myjob-f5kvm                      0/1     ContainerCreating   0          9s
    root@ubuntu:~/tenant# kubectl get pod | grep myjob
    myjob-f5kvm                      0/1     ContainerCreating   0          11s
    root@ubuntu:~/tenant# kubectl get pod | grep myjob
    myjob-f5kvm                      0/1     RunContainerError   0          12s
    root@ubuntu:~/tenant# kubectl get pod | grep myjob
    myjob-f5kvm                      0/1     RunContainerError   2          45s
    root@ubuntu:~/tenant#

    # RESTARTS is 2 and keeps increasing, which shows OnFailure is in effect: the failed container is restarted automatically and no new Pod is created.
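
    In the run above the container never starts, because the executable does not exist. A container that starts and then exits with a non-zero code is the more common kind of failure. A minimal sketch of that case, assuming the sh shipped in the busybox image; the Job name and the exit code are made up for illustration:

    ```yaml
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: myjob-exit1            # hypothetical name, not from the original article
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            # The shell starts normally, prints one line, then exits with status 1.
            command: ["sh", "-c", "echo 'hello k8s job !'; exit 1"]
          restartPolicy: OnFailure # the kubelet restarts the container inside the same Pod
    ```

    With OnFailure the same Pod should stay in place while its RESTARTS count grows, until the Job's retry limit is reached. Changing the last line to Never would instead make the Job controller create replacement Pods.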

    Job parallelism

    Sometimes we want several Pods to run at the same time to speed up a Job. This is set with parallelism.

    root@ubuntu:~/tenant# cat job.yaml 
    apiVersion: batch/v1
    kind: Job 
    metadata:
     name: myjob
    spec:
     parallelism: 2 ## run two Pods at the same time
     template:
      metadata:
        name: myjob
      spec:
       containers:
       - name: hello
         image: busybox
         command: ["echo","hello k8s job !"]
       restartPolicy: OnFailure
    root@ubuntu:~/tenant#  kubectl delete  -f job.yaml
    job.batch "myjob" deleted
    root@ubuntu:~/tenant#  kubectl create  -f job.yaml
    job.batch/myjob created
    root@ubuntu:~/tenant#  kubectl get pod | grep myjob
    myjob-nxw5t                      0/1     Completed           0          11s
    myjob-qsc9r                      0/1     ContainerCreating   0          11s
    root@ubuntu:~/tenant#  kubectl get jobs.batch 
    NAME    COMPLETIONS   DURATION   AGE
    myjob   2/1 of 2      14s        19s
    root@ubuntu:~/tenant#  kubectl get jobs
    NAME    COMPLETIONS   DURATION   AGE
    myjob   2/1 of 2      14s        26s
    root@ubuntu:~/tenant#  kubectl get pod | grep myjob
    myjob-nxw5t                      0/1     Completed   0          30s
    myjob-qsc9r                      0/1     Completed   0          30s
    root@ubuntu:~/tenant#  kubectl get pod | grep myjob
    myjob-nxw5t                      0/1     Completed   0          36s
    myjob-qsc9r                      0/1     Completed   0          36s
    root@ubuntu:~/tenant#  kubectl get pod | grep myjob
    myjob-nxw5t                      0/1     Completed   0          43s
    myjob-qsc9r                      0/1     Completed   0          43s
    root@ubuntu:~/tenant# kubectl logs  myjob-nxw5t
    hello k8s job !
    root@ubuntu:~/tenant# kubectl logs   myjob-qsc9r 
    hello k8s job !
    root@ubuntu:~/tenant# 

    We can also use completions to set the total number of Pods that must complete successfully.

    root@ubuntu:~/tenant# cat job.yaml 
    apiVersion: batch/v1
    kind: Job 
    metadata:
     name: myjob
    spec:
     parallelism: 2 ## run two Pods at the same time
     completions: 4
     template:
      metadata:
        name: myjob
      spec:
       containers:
       - name: hello
         image: busybox
         command: ["echo","hello k8s job !"]
       restartPolicy: OnFailure
     
    root@ubuntu:~/tenant#  kubectl create  -f job.yaml
    job.batch/myjob created
    root@ubuntu:~/tenant#  kubectl get pod | grep myjob
    myjob-27fss                      0/1     Completed           0          19s
    myjob-dqfgw                      0/1     Completed           0          19s
    myjob-m954l                      0/1     ContainerCreating   0          4s
    myjob-x9bps                      0/1     ContainerCreating   0          9s
    root@ubuntu:~/tenant#  kubectl get pod | grep myjob
    myjob-27fss                      0/1     Completed           0          25s
    myjob-dqfgw                      0/1     Completed           0          25s
    myjob-m954l                      0/1     ContainerCreating   0          10s
    myjob-x9bps                      0/1     Completed           0          15s
    root@ubuntu:~/tenant#  kubectl get pod | grep myjob
    myjob-27fss                      0/1     Completed   0          28s
    myjob-dqfgw                      0/1     Completed   0          28s
    myjob-m954l                      0/1     Completed   0          13s
    myjob-x9bps                      0/1     Completed   0          18s
    root@ubuntu:~/tenant# kubectl logs  myjob-m954l
    hello k8s job !
    root@ubuntu:~/tenant# kubectl logs   myjob-dqfgw 
    hello k8s job !
    root@ubuntu:~/tenant# 
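
    Because echo finishes almost instantly, the effect of the parallelism limit is hard to see in the listing above. A sketch that makes it observable, assuming a 10-second sleep is long enough to watch; the sleep and the Job name are additions for illustration, not part of the original example:

    ```yaml
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: myjob-waves            # hypothetical name, not from the original article
    spec:
      parallelism: 2               # at most two Pods run at the same time
      completions: 4               # the Job is complete once four Pods have succeeded
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            # Print the greeting, then stay alive long enough to be observed.
            command: ["sh", "-c", "echo 'hello k8s job !'; sleep 10"]
          restartPolicy: OnFailure
    ```

    Watching with kubectl get pod -w should show no more than two of these Pods running at once, with new ones started as earlier ones complete, until four have succeeded.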


     

  • Original article: https://www.cnblogs.com/dream397/p/15075445.html