• Python APScheduler: Installation, Usage, and Source Code Analysis


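    Our project uses APScheduler as its core scheduling module, so we investigated it and analyzed parts of its source code.

    1. Why APScheduler?

    I was sold on one line: APScheduler is to Python what Quartz is to Java. In practice it has held up well.

    2. Installation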

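    # Install with pip
    $ pip install apscheduler
    # Install from source: download the source archive listed at
    # https://pypi.python.org/pypi/APScheduler/#downloads and unpack it,
    # then run the following inside the unpacked directory
    $ python setup.py install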

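    3. APScheduler has four main components

    1) triggers: decide when a job runs

    2) job stores: hold the jobs (in memory or persisted)

    3) executors: run the jobs (implemented on concurrent.futures thread pools or process pools)

    4) schedulers: coordinate the other components (the most commonly used are the background and blocking variants)

    Let's start with an example: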

    # -*- coding:utf-8 -*-
    from datetime import datetime, timedelta
    from apscheduler.schedulers.background import BackgroundScheduler
    from apscheduler.jobstores.redis import RedisJobStore
    from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
    from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor
    from apscheduler.events import EVENT_JOB_MAX_INSTANCES, EVENT_JOB_ERROR, EVENT_JOB_MISSED


    class ScheduleFactory(object):
        # Shared scheduler: created on first use, then reused by every instance
        _scheduler = None

        def __init__(self):
            if ScheduleFactory._scheduler is None:
                ScheduleFactory._scheduler = ScheduleFactory.get_instance()
            self.scheduler = ScheduleFactory._scheduler

        @staticmethod
        def get_instance():
            jobstores = {
                # RedisJobStore opens its own connection from these keyword arguments (db 2)
                'redis': RedisJobStore(db=2, host='10.94.99.56', port=6379),
                'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')
            }
            executors = {
                'default': ThreadPoolExecutor(max_workers=30),
                'processpool': ProcessPoolExecutor(max_workers=30)
            }
            job_defaults = {
                'coalesce': False,
                'max_instances': 3
            }
            # APScheduler 3.x spells this option "daemon" (2.x called it "daemonic")
            scheduler = BackgroundScheduler(jobstores=jobstores, executors=executors, job_defaults=job_defaults, daemon=False)
            return scheduler

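    Note: in the example above, ScheduleFactory is implemented as a singleton, so every instance shares one globally unique scheduler.

    4. Choosing a scheduler

    Only two scenarios are covered here:

    1) BackgroundScheduler: the scheduler runs in the background of the process that created it, and scheduling stops as soon as that process exits. Use it when the scheduler is embedded in a service, for example Django.

    2) BlockingScheduler: start() blocks the process that created the scheduler. Use it when scheduling is the only thing the program does.

    Once you have chosen a scheduler:

    1) scheduler.start() starts the scheduler.

    2) scheduler.shutdown() stops the scheduler. By default it waits until all currently running jobs have finished before returning; pass wait=False to skip the wait.

    The program now looks like this: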

    class ScheduleFactory(object):
        # Shared scheduler: created on first use, then reused by every instance
        _scheduler = None

        def __init__(self):
            if ScheduleFactory._scheduler is None:
                ScheduleFactory._scheduler = ScheduleFactory.get_instance()
            self.scheduler = ScheduleFactory._scheduler

        @staticmethod
        def get_instance():
            jobstores = {
                # RedisJobStore opens its own connection from these keyword arguments (db 2)
                'redis': RedisJobStore(db=2, host='10.94.99.56', port=6379),
                'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')
            }
            executors = {
                'default': ThreadPoolExecutor(max_workers=30),
                'processpool': ProcessPoolExecutor(max_workers=30)
            }
            job_defaults = {
                'coalesce': False,
                'max_instances': 3
            }
            # APScheduler 3.x spells this option "daemon" (2.x called it "daemonic")
            scheduler = BackgroundScheduler(jobstores=jobstores, executors=executors, job_defaults=job_defaults, daemon=False)
            # scheduler = BlockingScheduler(jobstores=jobstores, executors=executors, job_defaults=job_defaults)
            return scheduler

        def start(self):
            self.scheduler.start()

        def shutdown(self):
            # Waits for running jobs by default; use shutdown(wait=False) to return at once
            self.scheduler.shutdown()

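    5. Choosing job stores

    There are two broad options:

    1) Non-persistent

    Available store: MemoryJobStore

    Suitable when you rarely start and stop the scheduler and can tolerate losing scheduled runs.

    2) Persistent

    Available stores: SQLAlchemyJobStore, RedisJobStore, MongoDBJobStore, ZooKeeperJobStore

    Suitable when you cannot afford to lose scheduled runs.

    Job stores are configured with a dictionary, for example:

    jobstores = {
        'redis': RedisJobStore(db=2, host='10.94.99.56', port=6379),
        'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')
    }

    Each key is the name (alias) you give a store. When you add a job later, you can tell it which store to use by that name; the jobs here all use the store with key=default.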

    def add_job(self, job_func, interval, id, job_func_params=None):
        # jobstore='default' selects the store registered under that key
        self.scheduler.add_job(job_func, jobstore='default', trigger='interval', seconds=interval, id=id, kwargs=job_func_params, executor='default', misfire_grace_time=30)

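    6. Choosing an executor

    Only two are covered: the thread pool and the process pool. The default executor is a thread pool. max_workers is the real upper bound on how many jobs run at once; if it is too small for the number of jobs you add, scheduled runs may be missed.

    Also, for CPU-bound jobs Python threads do not reach the configured degree of parallelism (because of the GIL), so choose this number according to your actual workload.

    We usually set the concurrency to 30, which is fine for typical scenarios.

    executors = {
        'default': ThreadPoolExecutor(max_workers=30),
        'processpool': ProcessPoolExecutor(max_workers=30)
    }

    Likewise, you can pick the executor when calling add_job:

    def add_job(self, job_func, interval, id, job_func_params=None):
        # executor='default' selects the executor registered under that key
        self.scheduler.add_job(job_func, jobstore='default', trigger='interval', seconds=interval, id=id, kwargs=job_func_params, executor='default', misfire_grace_time=30)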
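Since the pool executors are built on concurrent.futures, the effect of a small max_workers can be sketched with the standard library alone. The pool size of 2, the six tasks, and the 50 ms sleep are arbitrary values chosen for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

lock = threading.Lock()
state = {'running': 0, 'peak': 0}


def slow_task(_):
    with lock:
        state['running'] += 1
        state['peak'] = max(state['peak'], state['running'])
    time.sleep(0.05)                    # stand-in for real work
    with lock:
        state['running'] -= 1


# Six tasks are submitted, but only two worker threads exist
with ThreadPoolExecutor(max_workers=2) as pool:
    list(pool.map(slow_task, range(6)))

print(state['peak'])                    # never more than max_workers
```

Tasks beyond the pool size wait in the pool's queue. For a scheduled job, a long enough wait means it starts later than planned, which is how an undersized pool can lead to missed runs.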

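    7. Choosing a trigger

    This is the simplest choice. There are three built-in triggers, and they need no setup:

    1) date: run once at a specific point in time

    2) interval: run at a fixed interval

    3) cron: run on a cron-style schedule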

    8. For the API to add, remove, modify, and query jobs, see the user guide

    http://apscheduler.readthedocs.io/en/latest/userguide.html#choosing-the-right-scheduler-job-store-s-executor-s-and-trigger-s

    9. Troubleshooting

    1)2017-07-24 14:06:28,480 [apscheduler.executors.default:120] [WARNING]- Run time of job "etl_func (trigger: interval[0:01:00], next run at: 2017-07-24 14:07:27 CST)" was missed by 0:00:01.245424

    The corresponding source code is:

    def run_job(job, jobstore_alias, run_times, logger_name):
        """
        Called by executors to run the job. Returns a list of scheduler events to be dispatched by the
        scheduler.
    
        """
        events = []
        logger = logging.getLogger(logger_name)
        for run_time in run_times:
            # See if the job missed its run time window, and handle
            # possible misfires accordingly
            if job.misfire_grace_time is not None:
                difference = datetime.now(utc) - run_time
                grace_time = timedelta(seconds=job.misfire_grace_time)
                if difference > grace_time:
                    events.append(JobExecutionEvent(EVENT_JOB_MISSED, job.id, jobstore_alias,
                                                    run_time))
                    logger.warning('Run time of job "%s" was missed by %s', job, difference)
                    continue
    
            logger.info('Running job "%s" (scheduled at %s)', job, run_time)
            try:
                retval = job.func(*job.args, **job.kwargs)
            except:
                exc, tb = sys.exc_info()[1:]
                formatted_tb = ''.join(format_tb(tb))
                events.append(JobExecutionEvent(EVENT_JOB_ERROR, job.id, jobstore_alias, run_time,
                                                exception=exc, traceback=formatted_tb))
                logger.exception('Job "%s" raised an exception', job)
            else:
                events.append(JobExecutionEvent(EVENT_JOB_EXECUTED, job.id, jobstore_alias, run_time,
                                                retval=retval))
                logger.info('Job "%s" executed successfully', job)
    
        return events

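    The parameter to note here is misfire_grace_time, which defaults to 1 second. If a job actually starts more than misfire_grace_time after its scheduled run time, APScheduler logs a warning and skips that run!

    Why does this happen?

    1) The executor does not have enough concurrency for the number of jobs you added

    2) misfire_grace_time is simply too small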
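The misfire check in run_job reduces to a single comparison. The helper below is a standalone restatement of it for experimenting with grace values; `is_misfire` is a hypothetical name, and the timestamps reuse the delay from the log line above.

```python
from datetime import datetime, timedelta, timezone


def is_misfire(scheduled_at, started_at, misfire_grace_time):
    """Return True when a run would be skipped as missed.

    Same comparison as in run_job; a grace time of None disables the check.
    """
    if misfire_grace_time is None:
        return False
    return started_at - scheduled_at > timedelta(seconds=misfire_grace_time)


scheduled = datetime(2017, 7, 24, 14, 6, 27, tzinfo=timezone.utc)
started = scheduled + timedelta(seconds=1.245424)    # delay reported in the warning

print(is_misfire(scheduled, started, 1))      # True: 1.245 s exceeds the 1 s default
print(is_misfire(scheduled, started, 30))     # False: well inside a 30 s grace window
print(is_misfire(scheduled, started, None))   # False: check disabled
```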

    2) If you use trigger=interval together with something like misfire_grace_time=30, and the scheduler first starts at, say, 10:50, the scheduled time and the actual execution can be off by as much as a minute.

    To fix this, set the first run time with next_run_time and round it to a whole minute, for example:

    def add_job(self, job_func, interval, id, job_func_params=None):
        # First run: the start of the next minute (seconds and microseconds zeroed)
        next_run_time = (datetime.now() + timedelta(minutes=1)).replace(second=0, microsecond=0)
        self.scheduler.add_job(job_func, jobstore='default', trigger='interval', seconds=interval, id=id, kwargs=job_func_params, executor='default', next_run_time=next_run_time, misfire_grace_time=30)

    3)2017-07-25 11:02:00,003 [apscheduler.scheduler:962] [WARNING]- Execution of job "rule_func (trigger: interval[0:01:00], next run at: 2017-07-25 11:02:00 CST)" skipped: maximum number of running instances reached (1)

    The corresponding source code is:

             for job in due_jobs:
                        # Look up the job's executor
                        try:
                            executor = self._lookup_executor(job.executor)
                        except:
                            self._logger.error(
                                'Executor lookup ("%s") failed for job "%s" -- removing it from the '
                                'job store', job.executor, job)
                            self.remove_job(job.id, jobstore_alias)
                            continue
    
                        run_times = job._get_run_times(now)
                        run_times = run_times[-1:] if run_times and job.coalesce else run_times
                        if run_times:
                            try:
                                executor.submit_job(job, run_times)
                            except MaxInstancesReachedError:
                                self._logger.warning(
                                    'Execution of job "%s" skipped: maximum number of running '
                                    'instances reached (%d)', job, job.max_instances)
                                event = JobSubmissionEvent(EVENT_JOB_MAX_INSTANCES, job.id,
                                                           jobstore_alias, run_times)
                                events.append(event)
                           

    And the source of submit_job:

        with self._lock:
                if self._instances[job.id] >= job.max_instances:
                    raise MaxInstancesReachedError(job)
    
                self._do_submit_job(job, run_times)
                self._instances[job.id] += 1
    
    

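    What does this mean? A run is refused when the number of instances of that job already running has reached max_instances: the exception is raised and that run is skipped. Take job1 and its 10:00 run: at execution time the scheduler finds that two runs have been queued for it. How can that happen? It can: the 09:59 run may not have executed successfully but was persisted, so it is attempted again at 10:00.

    max_instances defaults to 1. If you want to tolerate this situation, raise it, for example max_instances=3.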
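The counting logic can be reproduced outside APScheduler. The class below is a simplified sketch modeled on the submit_job excerpt, not the library's code: `InstanceCounter`, `MaxInstancesReached`, and the explicit `release` call are hypothetical names standing in for the executor's internal bookkeeping.

```python
import threading
from collections import defaultdict


class MaxInstancesReached(Exception):
    """Hypothetical stand-in for APScheduler's MaxInstancesReachedError."""


class InstanceCounter:
    def __init__(self):
        self._lock = threading.Lock()
        self._instances = defaultdict(int)

    def acquire(self, job_id, max_instances):
        # Same check as in submit_job: refuse when the limit is already reached
        with self._lock:
            if self._instances[job_id] >= max_instances:
                raise MaxInstancesReached(job_id)
            self._instances[job_id] += 1

    def release(self, job_id):
        # Called when a run finishes, freeing one slot
        with self._lock:
            self._instances[job_id] -= 1


counter = InstanceCounter()
counter.acquire('job1', max_instances=1)        # the 09:59 run starts and keeps running
try:
    counter.acquire('job1', max_instances=1)    # the 10:00 run is refused
    skipped = False
except MaxInstancesReached:
    skipped = True

counter.release('job1')                         # the 09:59 run finishes
counter.acquire('job1', max_instances=1)        # a later run is accepted again
```

With max_instances=3 the second acquire would succeed, which is why raising the limit makes the warning disappear for jobs whose runs overlap.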

    10. To monitor your scheduling, APScheduler provides a listener mechanism that reports these abnormal events. You only need to register a listener.

        # Method of ScheduleFactory: register the listener for the three event types
        def add_err_listener(self):
            self.scheduler.add_listener(err_listener, EVENT_JOB_MAX_INSTANCES | EVENT_JOB_MISSED | EVENT_JOB_ERROR)

    # Module-level callback; RobotSender is imported from alarmkits.send_robot (see the final code)
    def err_listener(ev):
        msg = ''
        if ev.code == EVENT_JOB_ERROR:
            msg = ev.traceback
        elif ev.code == EVENT_JOB_MISSED:
            msg = 'missed job, job_id:%s, schedule_run_time:%s' % (ev.job_id, ev.scheduled_run_time)
        elif ev.code == EVENT_JOB_MAX_INSTANCES:
            msg = 'reached maximum of running instances, job_id:%s' % (ev.job_id)
        rs = RobotSender()
        rs.send(
            "https://oapi.dingtalk.com/robot/send?access_token=499ca69a2b45402c00503acea611a6ae6a2f1bacb0ca4d33365595d768bb2a58",
            u"[apscheduler scheduling error] details: %s" % (msg),
            '15210885002',
            False
        )

    The final code

    # -*- coding:utf-8 -*-
    from datetime import datetime, timedelta
    from apscheduler.schedulers.background import BackgroundScheduler
    from apscheduler.schedulers.blocking import BlockingScheduler
    from apscheduler.jobstores.redis import RedisJobStore
    from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
    from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor
    from apscheduler.events import EVENT_JOB_MAX_INSTANCES, EVENT_JOB_ERROR, EVENT_JOB_MISSED
    from alarmkits.send_robot import RobotSender


    class ScheduleFactory(object):
        # Shared scheduler: created on first use, then reused by every instance
        _scheduler = None

        def __init__(self):
            if ScheduleFactory._scheduler is None:
                ScheduleFactory._scheduler = ScheduleFactory.get_instance()
            self.scheduler = ScheduleFactory._scheduler

        @staticmethod
        def get_instance():
            jobstores = {
                # RedisJobStore opens its own connection from these keyword arguments (db 2)
                'redis': RedisJobStore(db=2, host='10.94.99.56', port=6379),
                'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')
            }
            executors = {
                'default': ThreadPoolExecutor(max_workers=30),
                'processpool': ProcessPoolExecutor(max_workers=30)
            }
            job_defaults = {
                'coalesce': False,
                'max_instances': 3
            }
            # APScheduler 3.x spells this option "daemon" (2.x called it "daemonic")
            scheduler = BackgroundScheduler(jobstores=jobstores, executors=executors, job_defaults=job_defaults, daemon=False)
            # scheduler = BlockingScheduler(jobstores=jobstores, executors=executors, job_defaults=job_defaults)
            return scheduler

        def start(self):
            self.scheduler.start()

        def shutdown(self):
            self.scheduler.shutdown()

        def add_job(self, job_func, interval, id, job_func_params=None):
            # First run: the start of the next minute (seconds and microseconds zeroed)
            next_run_time = (datetime.now() + timedelta(minutes=1)).replace(second=0, microsecond=0)
            self.scheduler.add_job(
                    job_func,
                    jobstore='default',
                    trigger='interval',
                    seconds=interval,
                    id=id,
                    kwargs=job_func_params,
                    executor='default',
                    next_run_time=next_run_time,
                    misfire_grace_time=30,
                    max_instances=3
            )

        def remove_job(self, id):
            self.scheduler.remove_job(id)

        def modify_job(self, id, interval):
            # Changing the interval means replacing the trigger, which is what reschedule_job does;
            # scheduler.modify_job only accepts job attributes and rejects "seconds"
            self.scheduler.reschedule_job(id, trigger='interval', seconds=interval)

        def add_err_listener(self):
            self.scheduler.add_listener(err_listener, EVENT_JOB_MAX_INSTANCES | EVENT_JOB_MISSED | EVENT_JOB_ERROR)


    def err_listener(ev):
        msg = ''
        if ev.code == EVENT_JOB_ERROR:
            msg = ev.traceback
        elif ev.code == EVENT_JOB_MISSED:
            msg = 'missed job, job_id:%s, schedule_run_time:%s' % (ev.job_id, ev.scheduled_run_time)
        elif ev.code == EVENT_JOB_MAX_INSTANCES:
            msg = 'reached maximum of running instances, job_id:%s' % (ev.job_id)
        rs = RobotSender()
        rs.send(
            "https://oapi.dingtalk.com/robot/send?access_token=499ca69a2b45402c00503acea611a6ae6a2f1bacb0ca4d33365595d768bb2a58",
            u"[apscheduler scheduling error] details: %s" % (msg),
            '15210885002',
            False
        )

            

  • Original article: https://www.cnblogs.com/zhuminghui/p/9145319.html