Two ways to pass arguments to a scrapy.Request callback
1. Pass the arguments with a lambda
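    # Methods of the spider class; DmozItem is the project's Item subclass.
    def parse(self, response):
        for sel in response.xpath('//li[@class="clearfix"]/div[@class="list_con"]'):
            item = DmozItem()
            item['href'] = sel.xpath('h2/a/@href').extract()[0]
            # it=item binds the current item to this lambda at definition time.
            yield scrapy.Request(
                item['href'],
                callback=lambda response, it=item: self.others_parse(response, it),
                dont_filter=True,
            )
            # Also emits the item now, before others_parse has set 'url';
            # drop this line if only the completed item is wanted.
            yield item

    def others_parse(self, response, it):
        it['url'] = response.url
        yield it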
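The `it=item` default argument in the lambda above matters because a Python lambda looks up free variables when it is called, not when it is created. The following is a minimal sketch in plain Python, without Scrapy, that shows the difference. The function names and the string standing in for a response are hypothetical and used only for illustration.

```python
def make_callbacks_late(items):
    # Each lambda reads `item` when it is called, after the loop has ended,
    # so every callback sees the last value of the loop variable.
    callbacks = []
    for item in items:
        callbacks.append(lambda response: (response, item))
    return callbacks


def make_callbacks_bound(items):
    # The default argument `it=item` is evaluated when the lambda is created,
    # so each callback keeps the item of its own iteration.
    callbacks = []
    for item in items:
        callbacks.append(lambda response, it=item: (response, it))
    return callbacks


items = ["a", "b", "c"]
late = [cb("resp") for cb in make_callbacks_late(items)]
bound = [cb("resp") for cb in make_callbacks_bound(items)]

print(late)   # [('resp', 'c'), ('resp', 'c'), ('resp', 'c')]
print(bound)  # [('resp', 'a'), ('resp', 'b'), ('resp', 'c')]
```

Scrapy invokes callbacks only after the responses arrive, by which time the loop in `parse` has usually moved on, so without the default argument every callback would likely receive the last item.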
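2. In some cases you may want to pass arguments to a callback so that a later callback can receive them. You can use the Request.meta attribute for this: whatever is stored in the request's meta dict is available as response.meta in the callback.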
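    # Methods of the spider class; DmozItem is the project's Item subclass.
    def parse(self, response):
        for sel in response.xpath('//li[@class="clearfix"]/div[@class="list_con"]'):
            item = DmozItem()
            item['href'] = sel.xpath('h2/a/@href').extract()[0]
            # The callback must be the bound method self.others_parse;
            # a bare others_parse would raise NameError.
            request = scrapy.Request(item['href'], callback=self.others_parse, dont_filter=True)
            # Attach the item to the request so the callback can read it back.
            request.meta['item'] = item
            yield request

    def others_parse(self, response):
        item = response.meta['item']
        item['other_url'] = response.url
        yield item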