Identifying the data source when one crawler project has multiple spiders

Problem: when a single crawler project contains several spiders that scrape data, how can the pipeline tell which spider an item came from?

Solutions:

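Method 1: in the spider's parse function, add an identifying field to the scraped data:
```
def parse(self, response):
    item = {}  # the scraped item (a plain dict here)
    item["come_from"] = "spider_name"  # tag the item with the spider it came from
    yield item
```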
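A slightly fuller sketch of method 1, assuming the item is a plain dict and that the tag is read from the spider's own `name` attribute instead of a hard-coded string. The class `QuotesSpider`, the `url` field, and the `FakeResponse` stand-in are hypothetical. In a real project the class would subclass `scrapy.Spider` and Scrapy would supply `response`.

```python
class FakeResponse:
    """Hypothetical stand-in for a Scrapy response, for illustration only."""

    def __init__(self, url):
        self.url = url


class QuotesSpider:
    # In a real project this would be: class QuotesSpider(scrapy.Spider)
    name = "quotes"  # hypothetical spider name

    def parse(self, response):
        item = {"url": response.url}
        # Tag the item with the spider's name so the pipeline can tell where it came from.
        item["come_from"] = self.name
        yield item


items = list(QuotesSpider().parse(FakeResponse("https://example.com/page/1")))
print(items)
```

Reading the tag from `self.name` keeps the field in sync if the spider is renamed later.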
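Method 2: in the process_item function in pipelines.py, check the attributes of the spider argument:
```
class MyspiderPipeline(object):
    def process_item(self, item, spider):
        if spider.name == "spider_name":  # spider_name is the name you gave your spider
            pass  # handle items from this spider here
        return item  # pass the item on to any later pipeline
```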
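As a rough illustration of method 2, the sketch below branches on `spider.name` to route items from different spiders. The spider names `news` and `blog` and the `table` field are hypothetical, and `SimpleNamespace` stands in for a real spider object so the snippet runs without Scrapy installed.

```python
from types import SimpleNamespace


class MyspiderPipeline:
    def process_item(self, item, spider):
        # Scrapy passes the spider that produced the item as the second argument.
        if spider.name == "news":  # hypothetical spider name
            item["table"] = "news_articles"  # hypothetical destination
        elif spider.name == "blog":  # hypothetical spider name
            item["table"] = "blog_posts"  # hypothetical destination
        # Return the item so that any later pipeline still receives it.
        return item


# Stand-in for a real spider object; only the `name` attribute is used here.
fake_spider = SimpleNamespace(name="news")
result = MyspiderPipeline().process_item({"title": "hello"}, fake_spider)
print(result)
```

This approach needs no extra field on the item, because the pipeline reads the source directly from the spider that Scrapy hands to it.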
Original article: https://www.cnblogs.com/zhiliang9408/p/10003555.html