• Open-sourcing pipeline, a Docker-based task scheduler: a distributed task scheduler more powerful than `quartz`


    pipeline: a distributed task scheduler

    Goal: a Docker-based distributed task scheduler, more powerful than quartz and xxl-job.

    Tasks can be packaged as Docker images, or you can pick an existing image together with a custom script, and let the pipeline framework schedule them.

    Source code: https://github.com/jadepeng/docker-pipeline

    Architecture

    (architecture diagram)

    • pipeline master: the central node that manages and schedules tasks
    • pipeline agent: the node that executes tasks; when it receives a task, it invokes Docker to run the pipeline task

    Features && TODO List

    • [x] Distributed framework: high availability, service registration and status tracking
    • [x] Agents execute tasks
    • [x] Rolling log API
    • [x] Run legacy pipeline tasks
    • [x] Scheduled tasks (fixed rate and cron expressions)
    • [ ] Quick task creation: run python, node and other scripts directly
      • [x] Base images for python, java, etc.
      • [x] Quick Docker-image task API
      • [ ] Quick script-task creation
    • [ ] Schedule tasks by resource quota (memory, CPU); running a task requires specifying a quota
    • [ ] Add labels to agents so tasks can be scheduled onto agents with a given label, e.g. gpu=true
    • [ ] Add a task-management web UI for submitting tasks, viewing run logs, etc.
      • [x] Reuse the Tencent bk-job web UI
      • [ ] Adapt the bk-job frontend to pipeline

    Progress

    2021.07.31

    • Support for scheduled tasks (fixed rate and cron expressions)
    • Added a distributed MongoDB lock so that, with multiple masters, only one master schedules tasks at a time (sketched below)
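
    The idea behind the lock is a lease stored in a MongoDB collection: whichever master holds an unexpired lease does the scheduling for that round. Below is a minimal sketch of that pattern using Spring Data's MongoTemplate and a hypothetical scheduler_lock collection; the project's actual implementation may differ.

    import java.time.Instant;

    import org.springframework.dao.DuplicateKeyException;
    import org.springframework.data.mongodb.core.FindAndModifyOptions;
    import org.springframework.data.mongodb.core.MongoTemplate;
    import org.springframework.data.mongodb.core.query.Criteria;
    import org.springframework.data.mongodb.core.query.Query;
    import org.springframework.data.mongodb.core.query.Update;

    public class SchedulerLock {

        private final MongoTemplate mongoTemplate;
        private final String nodeId;

        public SchedulerLock(MongoTemplate mongoTemplate, String nodeId) {
            this.mongoTemplate = mongoTemplate;
            this.nodeId = nodeId;
        }

        /**
         * Tries to grab (or renew) the scheduler lease. Only the master that already
         * owns the lock document, or any master after the lease has expired, succeeds;
         * the others back off and skip scheduling for this round.
         */
        public boolean tryAcquire(long leaseMillis) {
            long now = Instant.now().toEpochMilli();
            Criteria criteria = new Criteria().orOperator(
                    Criteria.where("owner").is(nodeId),       // we already hold the lease
                    Criteria.where("expiresAt").lt(now))      // or the previous lease expired
                    .and("_id").is("pipeline-scheduler");
            Update update = new Update()
                    .set("owner", nodeId)
                    .set("expiresAt", now + leaseMillis);
            FindAndModifyOptions options = FindAndModifyOptions.options().upsert(true).returnNew(true);
            try {
                // Atomic findAndModify: at most one master wins the update/upsert.
                return mongoTemplate.findAndModify(query(criteria), update, options,
                        LockDocument.class, "scheduler_lock") != null;
            } catch (DuplicateKeyException e) {
                // Another master created the lock document first.
                return false;
            }
        }

        private static Query query(Criteria criteria) {
            return new Query(criteria);
        }

        /** Shape of the lock document (hypothetical, for illustration only). */
        public static class LockDocument {
            public String id;
            public String owner;
            public long expiresAt;
        }
    }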

    2021.07.28

    • Added the ability to run legacy pipeline tasks
    • Added the log API

    2021.07.27

    • Imported the bk-job UI; adaptation still pending

    2021.07.21

    • Master dispatches tasks to agents for execution
    • Agent launches Docker to execute the task

    2021.07.19

    • Scaffolded the project with JHipster
    • Implemented the distributed setup

    Data Structure

    A pipeline job:

    • supports multiple PipelineTasks
    • a PipelineTask contains multiple Steps
    @Data
    public class Pipeline {

        @Id
        private String id;

        private String name;

        @JSONField(name = "pipeline")
        private List<PipelineTask> pipelineTasks = new ArrayList<>();

        private List<Network> networks = Lists.newArrayList(new Network());

        private List<Volume> volumes = Lists.newArrayList(new Volume());

        private String startNode;

        /**
         * Schedule type:
         *      1) CRON: set cronExpression
         *      2) FIX_RATE: set fixRateInSeconds
         */
        private ScheduleType scheduleType = ScheduleType.NONE;

        /**
         * Cron expression, effective when scheduleType=CRON
         */
        private String cronExpression;

        /**
         * Fixed-rate interval in seconds (run every N seconds), effective when scheduleType=FIX_RATE
         */
        private int fixRateInSeconds;

        /**
         * Whether the pipeline should be scheduled; it is only triggered when true
         */
        @Indexed
        private boolean enableTrigger;

        private long lastTriggerTime;

        @Indexed
        private long nextTriggerTime;

        /**
         * Execution timeout
         */
        private int executorTimeout;

        /**
         * Number of retries on failure
         */
        private int executorFailRetryCount;

        /**
         * Memory limit
         */
        private String memory;

        /**
         * CPU limit
         */
        private String cpu;

        @Data
        @Builder
        public static class PipelineTask {

            /**
             * Name
             */
            String name;

            /**
             * Alias
             */
            String alias;

            /**
             * Depended-on PipelineTasks; all of them must finish before this one can run
             */
            List<String> dependencies;

            /**
             * Steps of the task, executed in order
             */
            List<Step> steps;
        }

        @Data
        public static class Network {
            String name = "pipeline_default";
            String driver = "bridge";
        }

        @Data
        public static class Volume {
            String name = "pipeline_default";
            String driver = "local";
        }

        @Data
        public static class StepNetwork {
            private String name;
            private List<String> aliases = Lists.newArrayList("default");

            public StepNetwork(String name) {
                this.name = name;
            }
        }

    }
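
    As a quick illustration of how these fields fit together, here is a minimal sketch that assembles a fixed-rate pipeline in memory (the names are placeholders; the Step type and persistence are omitted):

    // Illustrative only: assembles a Pipeline object using the fields shown above.
    Pipeline pipeline = new Pipeline();
    pipeline.setName("sample-pipeline");                      // placeholder name
    pipeline.setScheduleType(ScheduleType.FIX_RATE);          // or ScheduleType.CRON plus setCronExpression(...)
    pipeline.setFixRateInSeconds(360);                        // run every 360 seconds
    pipeline.setEnableTrigger(true);                          // only pipelines with enableTrigger=true are scheduled

    Pipeline.PipelineTask task = Pipeline.PipelineTask.builder()
            .name("docker image pipeline")
            .dependencies(new ArrayList<>())                  // no upstream PipelineTasks
            .steps(new ArrayList<>())                         // Step list omitted for brevity
            .build();
    pipeline.getPipelineTasks().add(task);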
    
        
    
    

    Example:

    {
        "_id" : "29103d5e4a77409b9f6050eea8110bb3",
        "name" : "docker image pipeline",
        "pipelineTasks" : [ 
            {
                "name" : "docker image pipeline",
                "steps" : [ 
                    {
                        "name" : "defaultJob",
                        "image" : "java-pipeline:1.0.1",
                        "workingDir" : "/workspace",
                        "environment" : {},
                        "networks" : [ 
                            {
                                "name" : "pipeline_default",
                                "aliases" : [ 
                                    "default"
                                ]
                            }
                        ],
                        "onSuccess" : false,
                        "authConfig" : {}
                    }
                ]
            }
        ],
        "networks" : [ 
            {
                "name" : "pipeline_default",
                "driver" : "bridge"
            }
        ],
        "volumes" : [ 
            {
                "name" : "pipeline_default",
                "driver" : "local"
            }
        ],
        "cronExpression" : "0 0 * * * ?",
        "fixRateInSeconds" : 0,
        "scheduleType" : "CRON",
        "enableTrigger" : true,
        "lastTriggerTime" : 1627744509047,
        "nextTriggerTime" : 1627747200000,
        "executorTimeout" : 0,
        "executorFailRetryCount" : 0,
        "isAvailable" : 1,
        "runningPipelines" : [],
        "finishedPipeliens" : [],
        "created_by" : "admin",
        "created_date" : "2021-07-20T04:33:16.477Z",
        "last_modified_by" : "system",
        "last_modified_date" : "2021-07-31T15:15:09.048Z"
    }
    
    

    Usage

    Installation and Deployment

    Build

    Build with Maven:

    mvn package -DskipTests
    

    Deploying the master

    Modify the master's prod configuration file application-prod.yml as needed.

    It covers the Kafka settings, the server port, the MongoDB address, and the JWT secret.

    MongoDB collections and initial data are created automatically; no manual import is required.

    kafka:
        producer:
          bootstrap-servers: 127.0.0.1:9092
          retries: 3
          batch-size: 2000
          buffer-memory: 33554432
        consumer:
          group-id: consumer-pipeline
          auto-offset-reset: earliest
          enable-auto-commit: true
          bootstrap-servers: 172.31.161.38:9092
    
    server:
      port: 8080
    
    spring:
      data:
        mongodb:
          uri: mongodb://127.0.0.1:28017
          database: pipeline
    
    jhipster:
      security:
        authentication:
          jwt:
            base64-secret:
    

    Note that the master's JWT secret must be identical to the agent's.
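
    The base64-secret is simply a random key encoded in base64 (JHipster's JWT setup typically expects a key long enough for HS512, i.e. at least 512 bits; treat the exact length requirement as an assumption). A small sketch for generating a value to paste into both configs:

    import java.security.SecureRandom;
    import java.util.Base64;

    public class GenerateJwtSecret {
        public static void main(String[] args) {
            byte[] key = new byte[64];                  // 512 bits, enough for HS512
            new SecureRandom().nextBytes(key);
            // Paste the printed value into jhipster.security.authentication.jwt.base64-secret
            // of BOTH the master and the agent configuration files.
            System.out.println(Base64.getEncoder().encodeToString(key));
        }
    }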

    Once configured, start the master:

    nohup java -jar pipeline-master-$version.jar --spring.profiles.active=prod &
    

    application-prod.yml can be placed in the same directory as the jar.

    Deploying the agent

    Modify the agent's prod configuration file application-prod.yml as needed.

    It covers:

    • the eureka defaultZone, which points to the master's address
    • the server port
    • the Docker connection settings (see the client sketch after the config below)
      • docker-tls-verify: whether to enable TLS verification
      • docker-cert-path: the CA certificate path used when TLS verification is enabled
      • pipeline-log-path: the path where run logs are stored
    
    eureka:
      instance:
        prefer-ip-address: true
      client:
        service-url:
          defaultZone: http://admin:${jhipster.registry.password}@127.0.0.1:8080/eureka/
    
    server:
      port: 8081
    
    application:
      docker-server: 
      docker-tls-verify: true
      docker-cert-path: /mnt/parastor/pipeline/ca/
      pipeline-log-path: /mnt/parastor/pipeline/logs/
    
    
    jhipster:
      security:
        authentication:
          jwt:
            base64-secret:
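
    The docker-server, docker-tls-verify and docker-cert-path entries describe how the agent reaches the Docker daemon. A hedged sketch of how these settings could be wired into a Docker client using the docker-java library (an assumption; the project may construct its client differently):

    import com.github.dockerjava.api.DockerClient;
    import com.github.dockerjava.core.DefaultDockerClientConfig;
    import com.github.dockerjava.core.DockerClientBuilder;

    public class AgentDockerClientFactory {

        /**
         * Maps the agent's application-prod.yml settings onto a docker-java client.
         * dockerServer is e.g. "tcp://127.0.0.1:2376" or "unix:///var/run/docker.sock"
         * (the docker-server value is intentionally left blank above, so supply your own).
         */
        public static DockerClient create(String dockerServer, boolean tlsVerify, String certPath) {
            DefaultDockerClientConfig config = DefaultDockerClientConfig.createDefaultConfigBuilder()
                    .withDockerHost(dockerServer)
                    .withDockerTlsVerify(tlsVerify)     // docker-tls-verify
                    .withDockerCertPath(certPath)       // docker-cert-path, used when TLS verification is on
                    .build();
            return DockerClientBuilder.getInstance(config).build();
        }
    }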
    

    Running a legacy pipeline task

    POST /api/pipelines/exec-old
    
    
    Body:
    
    {
    	"networks":[
    		{
    			"driver":"bridge",
    			"name":"pipeline_network_3eac4b36209a41e58a5f22dd403fee50"
    		}
    	],
    	"pipeline":[
    		{
    			"alias":"Word",
    			"dependencies":[],
    			"name":"pipeline_task_3eac4b36209a41e58a5f22dd403fee50_1",
    			"nextPipelines":[],
    			"steps":[
    				{
    					"alias":"Word",
    					"auth_config":{},
    					"command":[
    						"echo $CI_SCRIPT | base64 -d | /bin/bash -e"
    					],
    					"entrypoint":[
    						"/bin/bash",
    						"-c"
    					],
    					"environment":{
    						"CI_SCRIPT":"CmlmIFsgLW4gIiRDSV9ORVRSQ19NQUNISU5FIiBdOyB0aGVuCmNhdCA8PEVPRiA+ICRIT01FLy5uZXRyYwptYWNoaW5lICRDSV9ORVRSQ19NQUNISU5FCmxvZ2luICRDSV9ORVRSQ19VU0VSTkFNRQpwYXNzd29yZCAkQ0lfTkVUUkNfUEFTU1dPUkQKRU9GCmNobW9kIDA2MDAgJEhPTUUvLm5ldHJjCmZpCnVuc2V0IENJX05FVFJDX1VTRVJOQU1FCnVuc2V0IENJX05FVFJDX1BBU1NXT1JECnVuc2V0IENJX1NDUklQVAplY2hvICsgamF2YSAtY3AgL2RhdGF2b2x1bWUvcGRmX3RvX3dvcmQvcGRmYm94X3V0aWwtMS4wLVNOQVBTSE9ULmphciBjb20uaWZseXRlay5pbmRleGVyLlJ1bm5lciAtLWlucHV0UERGIC9kYXRhdm9sdW1lL2V4dHJhY3QvZjkyYzJhNzViYWU4NGJiMDg4MzIwNWRiM2YyZGFlNzkvcGRmL2VjNWMwYjk0M2QwYjRmNDI5MzcyMmE1ZGRjNjFlNjZkL0hTNy5wZGYgLS1vdXRwdXRXb3JkIC9kYXRhdm9sdW1lL2V4dHJhY3QvZjkyYzJhNzViYWU4NGJiMDg4MzIwNWRiM2YyZGFlNzkvcGRmVG9Xb3JkL2VjNWMwYjk0M2QwYjRmNDI5MzcyMmE1ZGRjNjFlNjZkLyAtLXNjaGVtYUlucHV0UGF0aCAvZGF0YXZvbHVtZS9leHRyYWN0L2ticWEvZjkyYzJhNzViYWU4NGJiMDg4MzIwNWRiM2YyZGFlNzkgLS1lbnRpdHlJbmRleFBhdGggL2RhdGF2b2x1bWUvZXh0cmFjdC9mOTJjMmE3NWJhZTg0YmIwODgzMjA1ZGIzZjJkYWU3OS9wZGZUb1dvcmQvZW50aXR5IC0tZmllbGRJbmRleFBhdGggL2RhdGF2b2x1bWUvZXh0cmFjdC9mOTJjMmE3NWJhZTg0YmIwODgzMjA1ZGIzZjJkYWU3OS9wZGZUb1dvcmQvZmllbGQgLS10eXBlIGx1Y2VuZSAtLW91dHB1dCAvZGF0YXZvbHVtZS9leHRyYWN0L2Y5MmMyYTc1YmFlODRiYjA4ODMyMDVkYjNmMmRhZTc5L3BkZlRvV29yZC9lYzVjMGI5NDNkMGI0ZjQyOTM3MjJhNWRkYzYxZTY2ZC9lbnRpdHlJbmZvLnR4dApqYXZhIC1jcCAvZGF0YXZvbHVtZS9wZGZfdG9fd29yZC9wZGZib3hfdXRpbC0xLjAtU05BUFNIT1QuamFyIGNvbS5pZmx5dGVrLmluZGV4ZXIuUnVubmVyIC0taW5wdXRQREYgL2RhdGF2b2x1bWUvZXh0cmFjdC9mOTJjMmE3NWJhZTg0YmIwODgzMjA1ZGIzZjJkYWU3OS9wZGYvZWM1YzBiOTQzZDBiNGY0MjkzNzIyYTVkZGM2MWU2NmQvSFM3LnBkZiAtLW91dHB1dFdvcmQgL2RhdGF2b2x1bWUvZXh0cmFjdC9mOTJjMmE3NWJhZTg0YmIwODgzMjA1ZGIzZjJkYWU3OS9wZGZUb1dvcmQvZWM1YzBiOTQzZDBiNGY0MjkzNzIyYTVkZGM2MWU2NmQvIC0tc2NoZW1hSW5wdXRQYXRoIC9kYXRhdm9sdW1lL2V4dHJhY3Qva2JxYS9mOTJjMmE3NWJhZTg0YmIwODgzMjA1ZGIzZjJkYWU3OSAtLWVudGl0eUluZGV4UGF0aCAvZGF0YXZvbHVtZS9leHRyYWN0L2Y5MmMyYTc1YmFlODRiYjA4ODMyMDVkYjNmMmRhZTc5L3BkZlRvV29yZC9lbnRpdHkgLS1maWVsZEluZGV4UGF0aCAvZGF0YXZvbHVtZS9leHRyYWN0L2Y5MmMyYTc1YmFlODRiYjA4ODMyMDVkYjNmMmRhZTc5L3BkZlRvV29yZC9maWVsZCAtLXR5cGUgbHVjZW5lIC0tb3V0cHV0IC9kYXRhdm9sdW1lL2V4dHJhY3QvZjkyYzJhNzViYWU4NGJiMDg4MzIwNWRiM2YyZGFlNzkvcGRmVG9Xb3JkL2VjNWMwYjk0M2QwYjRmNDI5MzcyMmE1ZGRjNjFlNjZkL2VudGl0eUluZm8udHh0Cg=="
    					},
    					"image":"registry.iflyresearch.com/aimind/java:v1.0.0",
    					"name":"pipeline_task_3eac4b36209a41e58a5f22dd403fee50_1",
    					"networks":[
    						{
    							"aliases":[
    								"default"
    							],
    							"name":"pipeline_network_3eac4b36209a41e58a5f22dd403fee50"
    						}
    					],
    					"on_success":true,
    					"volumes":[
    						"pipeline_default:/aimind",
    						"/mnt/parastor/aimind/shared/:/share",
    						"/mnt/parastor/aimind/pipeline-jobs/2021/07/26/3eac4b36209a41e58a5f22dd403fee50:/workspace",
    						"/mnt/parastor/aimind/datavolumes/carmaster:/datavolume"
    					],
    					"working_dir":"/workspace"
    				}
    			]
    		}
    	],
    	"volumes":[
    		{
    			"driver":"local",
    			"name":"pipeline_default"
    		}
    	]
    }
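
    Note how the step above carries its work: the shell script is passed through the CI_SCRIPT environment variable as base64 and decoded at runtime by `echo $CI_SCRIPT | base64 -d | /bin/bash -e`. A small sketch of producing such a payload (the script content below is a placeholder):

    import java.nio.charset.StandardCharsets;
    import java.util.Base64;

    public class CiScriptEncoder {
        public static void main(String[] args) {
            // Placeholder script; the real payload is whatever the step should execute.
            String script = "set -e\njava -version\n";
            String ciScript = Base64.getEncoder().encodeToString(script.getBytes(StandardCharsets.UTF_8));
            // Put this value into the step's environment as CI_SCRIPT; inside the container
            // it is decoded and executed via: echo $CI_SCRIPT | base64 -d | /bin/bash -e
            System.out.println(ciScript);
        }
    }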
    

    A successful call returns:

    {
        "retcode": "000000",
        "desc": "成功",
        "data": {
            "id": "8137f344-f52d-4595-bdbb-425363847b61",
        }
    }
    

    The job's logs can then be fetched by this id.

    Fetching job execution logs

    GET /api/pipelines/jobLog/{jobid}/
    

    Result:

    {
        "retcode": "000000",
        "desc": "成功",
        "data": {
            "currentTask": null,
            "logs": [
                {
                    "id": "e76a686f68b64c0783b7721b058be137",
                    "jobId": "8137f344-f52d-4595-bdbb-425363847b61",
                    "status": "FINISHED",
                    "taskName": "pipeline_task_3eac4b36209a41e58a5f22dd403fee50_1",
                    "exitedValue": 0,
                    "logs": [
                        "proc "pipeline_task_3eac4b36209a41e58a5f22dd403fee50_1" started",
                        "pipeline_task_3eac4b36209a41e58a5f22dd403fee50_1:+ java -cp /datavolume/pdf_to_word/pdfbox_util-1.0-SNAPSHOT.jar com.iflytek.indexer.Runner --inputPDF /datavolume/extract/f92c2a75bae84bb0883205db3f2dae79/pdf/ec5c0b943d0b4f4293722a5ddc61e66d/HS7.pdf --outputWord /datavolume/extract/f92c2a75bae84bb0883205db3f2dae79/pdfToWord/ec5c0b943d0b4f4293722a5ddc61e66d/ --schemaInputPath /datavolume/extract/kbqa/f92c2a75bae84bb0883205db3f2dae79 --entityIndexPath /datavolume/extract/f92c2a75bae84bb0883205db3f2dae79/pdfToWord/entity --fieldIndexPath /datavolume/extract/f92c2a75bae84bb0883205db3f2dae79/pdfToWord/field --type lucene --output /datavolume/extract/f92c2a75bae84bb0883205db3f2dae79/pdfToWord/ec5c0b943d0b4f4293722a5ddc61e66d/entityInfo.txt",
                        "proc "pipeline_task_3eac4b36209a41e58a5f22dd403fee50_1" exited with status 0"
                    ]
                }
            ],
            "exitedValue": 0,
            "status": "FINISHED",
            "pipelineJobSt": 1627477250599,
            "pipelineJobFt": 1627477274299
        }
    }
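
    A minimal client sketch for polling this endpoint with Java's built-in HttpClient; the host, port and bearer token are assumptions (the API sits behind the JWT configuration shown earlier):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class JobLogClient {
        public static void main(String[] args) throws Exception {
            String jobId = "8137f344-f52d-4595-bdbb-425363847b61";   // id returned by the exec call
            String token = "<jwt-token>";                            // assumption: obtained from the master's auth endpoint
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://127.0.0.1:8080/api/pipelines/jobLog/" + jobId + "/"))
                    .header("Authorization", "Bearer " + token)
                    .GET()
                    .build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            // The body is the JSON shown above; inspect data.status for FINISHED.
            System.out.println(response.body());
        }
    }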
    

    Periodic tasks

    If a pipeline needs to run periodically, set enableTrigger to true and choose either CRON or FIX_RATE scheduling (a sketch of deriving the next trigger time from these settings follows the examples below):

    • FIX_RATE: run at a fixed rate; the interval in seconds is configured via fixRateInSeconds

    Example: run every 360 seconds:

    {
        // pipeline ...
        "pipelineTasks" : [ ],
        "fixRateInSeconds" : 360,
        "scheduleType" : "FIX_RATE",
        "enableTrigger" : true
    }
    
    • CRON: run according to a cron expression, configured via cronExpression.

    Example: run at the top of every hour:

    {
        // pipeline ...
        "pipelineTasks" : [ ],
        "cronExpression" : "0 0 * * * ?",
        "scheduleType" : "CRON",
        "enableTrigger" : true
    }
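
    Conceptually, the scheduler only needs to keep nextTriggerTime up to date for every pipeline with enableTrigger=true. Below is a hedged sketch of how the next trigger could be derived from the two schedule types, using Spring's CronExpression for the CRON case (an assumption about the parsing library; the field semantics follow the data structure above):

    import java.time.LocalDateTime;
    import java.time.ZoneId;

    import org.springframework.scheduling.support.CronExpression;

    public class NextTriggerCalculator {

        /** Returns the next trigger time in epoch millis, or -1 when the pipeline is not scheduled. */
        public static long nextTriggerTime(String scheduleType, String cronExpression,
                                           int fixRateInSeconds, long lastTriggerTime) {
            long now = System.currentTimeMillis();
            switch (scheduleType) {
                case "FIX_RATE":
                    // First run: fire one interval from now; afterwards, last run plus the interval.
                    long base = lastTriggerTime > 0 ? lastTriggerTime : now;
                    return base + fixRateInSeconds * 1000L;
                case "CRON":
                    LocalDateTime next = CronExpression.parse(cronExpression).next(LocalDateTime.now());
                    return next == null ? -1
                            : next.atZone(ZoneId.systemDefault()).toInstant().toEpochMilli();
                default:
                    return -1;   // ScheduleType.NONE: nothing to schedule
            }
        }
    }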
    

    More features to come.

  • Original article: https://www.cnblogs.com/xiaoqi/p/docker-pipeline.html