• Running a Scrapy crawl repeatedly in Python

The Twisted reactor cannot be restarted once it has stopped, so simply calling a crawler in an outer loop does not work. The pattern below keeps a single reactor alive and uses CrawlerRunner inside an inlineCallbacks generator to chain one crawl after another:

    from twisted.internet import reactor, defer, task
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.log import configure_logging
    import logging
    from scrapy.utils.project import get_project_settings


    # Log to the console
    configure_logging()
    # CrawlerRunner picks up the project settings from settings.py
    runner = CrawlerRunner(get_project_settings())

    @defer.inlineCallbacks
    def crawl():
        while True:
            logging.info("new cycle starting")
            # "xxxxx" is the spider's name as registered in the project
            yield runner.crawl("xxxxx")
            # Pause 1 s between runs. time.sleep() would block the
            # reactor, so use a non-blocking Deferred-based delay.
            yield task.deferLater(reactor, 1, lambda: None)

    crawl()
    reactor.run()
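
If you only need a fixed pause between crawl starts, twisted.internet.task.LoopingCall is a slightly more compact alternative to the hand-rolled while loop. A minimal sketch, assuming the same placeholder spider name "xxxxx" and a standard Scrapy project layout:

    from twisted.internet import reactor, task
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.log import configure_logging
    from scrapy.utils.project import get_project_settings

    configure_logging()
    runner = CrawlerRunner(get_project_settings())

    # runner.crawl() returns a Deferred; LoopingCall waits for it to
    # fire before scheduling the next call, so crawls never overlap.
    loop = task.LoopingCall(runner.crawl, "xxxxx")
    loop.start(1.0)  # at least 1 second between crawl starts

    reactor.run()

Because runner.crawl() returns a Deferred, LoopingCall will not schedule the next run until the previous crawl has finished, so a slow crawl simply stretches the interval instead of piling up overlapping runs.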
    

      

  • Original post: https://www.cnblogs.com/winstonsias/p/12106667.html