• https://scrapingclub.com/exercise/basic_login/


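    Problem encountered: csrftoken and cfduid live in request.headers, and I kept looking for a way to read request.headers in Scrapy. In scrapy shell, running fetch and then inspecting request.headers returns the right content.
    In a Scrapy project, though, I didn't know how to write the equivalent code. I found the response.request.headers form online, but the result it returns contains no cookies.
    The csrfmiddlewaretoken in formdata is hidden in the HTML, so it can be read directly. What remains is getting csrftoken and cfduid to build the cookie.
    cfduid can't be read from response.headers, and I didn't know how to read request.headers, so I gave up on cfduid and sent only csrftoken. I tried it and it worked...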
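    # Module-level imports needed by the spider.
    import re
    import scrapy

    # Method of the spider class; `header` and `parse_login` are defined elsewhere in the spider.
    def parse(self, response):
        # The login page response sets csrftoken through a Set-Cookie header.
        pattern = re.compile('csrftoken=(.*?);')
        csrftoken = pattern.findall(response.headers.get("set-cookie").decode("utf-8"))[0]
        cookie = {
            # '__cfduid': 'd67f5270ed84c0000af9c771fdee950631551004073',  # omitted; login works without it
            '_ga': 'GA1.2.2009295084.1551004056',
            '_gid': 'GA1.2.513859849.1551004056',
            'csrftoken': csrftoken,
        }
        return scrapy.FormRequest(
            'https://scrapingclub.com/exercise/basic_login/',
            cookies=cookie,
            headers=header,
            callback=self.parse_login,
            formdata={
                'name': 'scrapingclub',
                'password': 'scrapingclub',
                # Hidden form field; its value comes from the page HTML.
                'csrfmiddlewaretoken': response.css('form input[name="csrfmiddlewaretoken"]::attr(value)').get(),
            },
        )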
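The regex above only matches when `csrftoken=...` is followed by a semicolon, and it looks at a single Set-Cookie value. As a rough alternative sketch (not the original author's approach), the standard library's `http.cookies.SimpleCookie` can parse each Set-Cookie value instead. The helper name `extract_cookie` is hypothetical, introduced here only for illustration.

```python
from http.cookies import SimpleCookie


def extract_cookie(set_cookie_headers, name):
    """Return the value of cookie `name` from raw Set-Cookie values, or None.

    `set_cookie_headers` is an iterable of header values (bytes or str),
    one entry per Set-Cookie header line.
    """
    for raw in set_cookie_headers:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        jar = SimpleCookie()
        jar.load(raw)  # parses "key=value; Path=/; ..." into morsels
        if name in jar:
            return jar[name].value
    return None
```

Inside a spider callback this could presumably be called as `extract_cookie(response.headers.getlist("Set-Cookie"), "csrftoken")`, since Scrapy's `Headers.getlist` returns every value of a repeated header rather than just one.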
  • Original article: https://www.cnblogs.com/bamboozone/p/10455321.html