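I wrote a scheduled crawl job. The approach is fairly crude: delete the previous crawl output and re-crawl everything from scratch, once a day at 3 o'clock. Luckily the site is small...

    import os
    import time

    startTime = 3  # hour of day (local time) at which the crawl starts

    while True:
        now = time.time()
        if time.localtime(now).tm_hour == startTime:
            # Throw away the previous crawl and start over.
            os.system("rm -rf crawled")
            command = "bash bin/nutch crawl urls.txt -dir crawled -depth 2"
            os.system(command)
            # Sleep past the end of the start hour so a fast crawl
            # is not repeated several times on the same day.
            time.sleep(60 * 60)
        else:
            # Not time yet: check again in half an hour.
            time.sleep(30 * 60)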
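Polling every half hour means the crawl starts at some arbitrary minute within the 3 o'clock hour. A possible variant, sketched below, computes how long to sleep until the next 03:00 and wakes up once. The helper names `seconds_until`, `crawl_once` and `run_forever` are made up for this illustration, and the sketch assumes the same `bin/nutch` command and `crawled` directory as the script above. It uses naive local time, so it does not account for daylight saving changes.

```python
import datetime
import shutil
import subprocess
import time

START_HOUR = 3  # same start hour as the script above
CRAWL_DIR = "crawled"
# Same Nutch invocation as above, split into an argument list.
CRAWL_CMD = ["bash", "bin/nutch", "crawl", "urls.txt",
             "-dir", CRAWL_DIR, "-depth", "2"]


def seconds_until(hour, now):
    """Seconds from `now` until the clock next reads hour:00."""
    target = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed (or is right now): use tomorrow's.
        target += datetime.timedelta(days=1)
    return (target - now).total_seconds()


def crawl_once():
    """Delete the old crawl output, then run a fresh crawl."""
    shutil.rmtree(CRAWL_DIR, ignore_errors=True)
    subprocess.call(CRAWL_CMD)


def run_forever():
    """Sleep until the next start hour, crawl, repeat."""
    while True:
        time.sleep(seconds_until(START_HOUR, datetime.datetime.now()))
        crawl_once()
```

Calling `run_forever()` from the Nutch directory starts the loop. Because the sleep is recomputed after each crawl, a crawl that runs long simply pushes the next one to the following 03:00.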