• Python 3: using the requests library to read a locally saved cookie file for login-free access


    1.  Reading the local cookie file saved by selenium to access Zhihu

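    This section reads the local cookies saved in the earlier post http://www.cnblogs.com/strivepy/p/9233389.html and uses them to access Zhihu's user settings page. The JSON file saved by selenium has the following format: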

    [{"domain": "www.zhihu.com", "expiry": 1527855266.402958, "httpOnly": false, "name": "tgw_l7_route", "path": "/", "secure": false, "value": "200d77f3369d188920b797ddf09ec8d1"},
     {"domain": ".zhihu.com", "expiry": 1622462366.40309, "httpOnly": false, "name": "d_c0", "path": "/", "secure": false, "value": "\"AFAkkY_hrg2PTvLVtweW-Ok8mRLKop4IJZY=|1527854371\""},
     {"domain": ".zhihu.com", "httpOnly": false, "name": "_xsrf", "path": "/", "secure": false, "value": "7da6b4e4-c77d-47a4-81fa-68b1262235c8"}, ... (remaining entries omitted)]

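    The file contains many fields that are not needed here, such as path and secure. When loading the cookies, only the name and value attributes of each cookie have to be read. The code lives in a module named zhihu.py: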

    # -*- coding: utf-8 -*-

    import json
    import os

    import requests
    from requests.cookies import RequestsCookieJar

    # Cookie file written earlier by cookiesload.py (the selenium script)
    COOKIE_FILE = os.path.join(os.getcwd(), 'cookies', 'cookies.txt')


    def parse_index():
        url = 'https://www.zhihu.com/settings/account'
        headers = {
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.146 Safari/537.36'
        }
        cookies = getcookies_decode_to_dict()
        # cookies = getcookies_decode_to_cookiejar()
        if cookies is None:
            # No cookie file: a request without cookies would not be logged in
            return
        # The cookies parameter of requests.get() only accepts a dict or a CookieJar object
        response = requests.get(url=url, headers=headers, cookies=cookies)
        print(response.url)
        print(response.text)


    def getcookies_decode_to_dict():
        # Check the file itself, not just the cookies/ directory
        if not os.path.exists(COOKIE_FILE):
            print('Cookie file not found; run cookiesload.py first')
            return None
        cookies_dict = {}
        with open(COOKIE_FILE, 'r', encoding='utf-8') as f:
            cookies = json.load(f)
        # Keep only the name and value of each cookie
        for cookie in cookies:
            cookies_dict[cookie['name']] = cookie['value']
        return cookies_dict


    def getcookies_decode_to_cookiejar():
        if not os.path.exists(COOKIE_FILE):
            print('Cookie file not found; run cookiesload.py first')
            return None
        cookiejar = RequestsCookieJar()
        with open(COOKIE_FILE, 'r', encoding='utf-8') as f:
            cookies = json.load(f)
        for cookie in cookies:
            cookiejar.set(cookie['name'], cookie['value'])
        return cookiejar


    if __name__ == '__main__':
        parse_index()
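
    parse_index() prints response.url as well as the page text. One way to use that value is to check whether the request ended up somewhere other than the settings page. The following is only a minimal sketch: it assumes that a request without valid cookies is redirected to a different URL, which is an assumption about the site's behaviour that the original does not state. The helper name was_redirected_away and the sign-in URL are hypothetical, made up for illustration.

```python
from urllib.parse import urlsplit


def was_redirected_away(requested_url, final_url):
    """Return True if the final URL's host or path differs from the requested one.

    Query strings and a trailing slash are ignored in the comparison.
    """
    requested, final = urlsplit(requested_url), urlsplit(final_url)
    return (requested.netloc, requested.path.rstrip('/')) != (
        final.netloc, final.path.rstrip('/'))


settings_url = 'https://www.zhihu.com/settings/account'
# Hypothetical final URL, invented for this example only
signin_url = 'https://www.zhihu.com/signin?next=%2Fsettings%2Faccount'

print(was_redirected_away(settings_url, settings_url))  # False
print(was_redirected_away(settings_url, signin_url))    # True
```

    Inside parse_index() this could be called as was_redirected_away(url, response.url) before printing the page text.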

    The printed page source shows that the Zhihu user settings page was fetched successfully.
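
    The cookie jar built in zhihu.py keeps only name and value. If the domain and path saved by selenium should be preserved too, and cookies whose expiry has already passed should be skipped, something like the following might work. It is a sketch, not the original author's code: build_cookiejar and its now parameter are hypothetical names, and the sample data is a trimmed copy of the JSON shown earlier. It relies on RequestsCookieJar.set() accepting domain and path as keyword arguments.

```python
import json
import time

from requests.cookies import RequestsCookieJar


def build_cookiejar(selenium_cookies, now=None):
    """Build a RequestsCookieJar from selenium-style cookie dicts.

    Keeps each cookie's domain and path, and skips cookies whose
    'expiry' timestamp is not later than `now` (defaults to the current time).
    """
    now = time.time() if now is None else now
    jar = RequestsCookieJar()
    for cookie in selenium_cookies:
        expiry = cookie.get('expiry')  # session cookies have no expiry
        if expiry is not None and expiry <= now:
            continue  # skip cookies that have already expired
        jar.set(
            cookie['name'],
            cookie['value'],
            domain=cookie.get('domain', ''),
            path=cookie.get('path', '/'),
        )
    return jar


# Trimmed copy of the sample file contents shown earlier
sample = json.loads('''[
  {"domain": "www.zhihu.com", "expiry": 1527855266.402958,
   "name": "tgw_l7_route", "path": "/",
   "value": "200d77f3369d188920b797ddf09ec8d1"},
  {"domain": ".zhihu.com", "name": "_xsrf", "path": "/",
   "value": "7da6b4e4-c77d-47a4-81fa-68b1262235c8"}
]''')

# A fixed `now` keeps the example deterministic
jar = build_cookiejar(sample, now=1600000000)
print(jar.get('_xsrf'))
print(jar.get('tgw_l7_route'))
```

    The returned jar can be passed to requests.get() through the cookies parameter, in the same way as the result of getcookies_decode_to_cookiejar().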

  • Original article: https://www.cnblogs.com/strivepy/p/9233437.html