• Common Python Libraries: A Personal Summary of Requests

    Introduction

    pip install requests
    

    Usage

    The basic workflow with Requests:

    1. Import the package
    2. Send a GET or POST request

    Sending requests

    Parameters

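    The parameters used most often in this post are:

    • url: the request URL (required)
    • params: a dict of query-string parameters appended to the URL
    • data: form data as a dict, or a raw request body as a string or bytes
    • json: an object that Requests serializes into a JSON request body
    • files: files to upload as multipart/form-data
    • headers: a dict of request headers
    • cookies: a dict (or CookieJar) of cookies to send with the request
    • stream: if True, the response body is not downloaded until it is read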

    GET requests

    1. A request without parameters (the url argument is required)
    import requests
    
    r = requests.get(url="https://github.com/timeline.json")
    print(r.text)
    
    {"message":"Hello there, wayfaring stranger. If you’re reading this then you probably didn’t see our blog post a couple of years back announcing that this API would go away: http://git.io/17AROg Fear not, you should be able to get what you need from the shiny new Events API instead.","documentation_url":"https://developer.github.com/v3/activity/events/#list-public-events"}
    
    1. A request with parameters
    import requests
    
    params = {"key1":"python", "key2":"java"}
    
    r = requests.get(url="http://httpbin.org/get", params=params)
    
    print('url is {}'.format(r.url))
    print('The status code is {}'.format(r.status_code))
    print('The cookie info is {}'.format(r.cookies))
    print('return body is {}'.format(r.json()))
    
    url is http://httpbin.org/get?key1=python&key2=java
    The status code is 200
    The cookie info is <RequestsCookieJar[]>
    return body is {'args': {'key1': 'python', 'key2': 'java'}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.21.0'}, 'origin': '114.94.175.75, 114.94.175.75', 'url': 'https://httpbin.org/get?key1=python&key2=java'}
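    The way params ends up in the URL can be checked without sending anything. The sketch below is a minimal illustration that builds a request with requests.Request(...).prepare() and inspects the resulting URL; the second request, with a list value and a None value, is an added example that is not part of the original post.

```python
import requests

# Build the request without sending it, so no network access is needed.
params = {"key1": "python", "key2": "java"}
prepared = requests.Request("GET", "http://httpbin.org/get", params=params).prepare()
print(prepared.url)

# A list value is expanded into repeated keys, and a None value is dropped.
multi_params = {"lang": ["python", "java"], "skip": None}
multi = requests.Request("GET", "http://httpbin.org/get", params=multi_params).prepare()
print(multi.url)
```

    The first URL matches the r.url value printed above, which suggests that requests.get() applies the same encoding before sending.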
    
    1. Downloading an image
    import requests
    from io import BytesIO
    from PIL import Image
    
    r = requests.get('https://pic3.zhimg.com/247d9814fec770e2c85cc858525208b2_is.jpg')
    # r.content holds the response body as bytes; wrap it in BytesIO so PIL can read it like a file
    i = Image.open(BytesIO(r.content))
    i.show()
    

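    POST requests: uploading forms, text, files, and images

    A POST request can carry its data in several ways, such as submitting a form or sending text, files, or images.

    • POST request submitted as a form:

      • Pass the data to the data parameter of post()
    • POST request submitted as JSON text

      • Serialize the data with json.dumps() and pass the result to the data parameter
      • Or pass the data directly to the json parameter of post()
    • POST request that uploads a single file

      • Pass the file object to the files parameter of post()
    • POST request that uploads multiple files

      • Put the files in a list of tuples, each of the form (form_field_name, file_info), then pass the list to the files parameter of post()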
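    The options listed above differ mainly in the request body and the Content-Type header that Requests produces. The following sketch prepares three requests without sending them and prints both values; the variable names are illustrative only.

```python
import json

import requests

url = "http://httpbin.org/post"
payload = {"some": "data"}

# data=dict   -> form-encoded body
form = requests.Request("POST", url, data=payload).prepare()
# data=str    -> the string is sent as-is and no Content-Type is set
raw = requests.Request("POST", url, data=json.dumps(payload)).prepare()
# json=dict   -> Requests serializes the dict and sets application/json
as_json = requests.Request("POST", url, json=payload).prepare()

for label, req in [("data=dict", form), ("data=str", raw), ("json=dict", as_json)]:
    print(label, "|", req.headers.get("Content-Type"), "|", req.body)
```

    If the server expects a JSON content type, the json parameter is usually the simpler choice; with data=json.dumps(...) the header has to be added by hand.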
    1. Sending a POST request as a form
    import requests
    
    # Build the form data with dict methods (a quick review, since they are commonly used)
    data = {}
    data.setdefault('custname', 'woodman')
    data.update({'custtel':'13012345678','custemail':'woodman@11.com', 'size':'small'})
    print(data)
    
    r = requests.post(url='http://httpbin.org/post', data=data)
    r.json()
    
    {'custname': 'woodman', 'custtel': '13012345678', 'custemail': 'woodman@11.com', 'size': 'small'}
    
    
    
    
    
    {'args': {},
     'data': '',
     'files': {},
     'form': {'custemail': 'woodman@11.com',
      'custname': 'woodman',
      'custtel': '13012345678',
      'size': 'small'},
     'headers': {'Accept': '*/*',
      'Accept-Encoding': 'gzip, deflate',
      'Content-Length': '74',
      'Content-Type': 'application/x-www-form-urlencoded',
      'Host': 'httpbin.org',
      'User-Agent': 'python-requests/2.21.0'},
     'json': None,
     'origin': '114.94.175.75, 114.94.175.75',
     'url': 'https://httpbin.org/post'}
    
    1. Sending a POST request with a JSON body
    import requests
    import json
    
    url = 'https://api.github.com/some/endpoint'
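    payload = {}
    payload.setdefault('some', 'data')
    
    # Option 1: serialize the dict yourself and pass the string via data.
    # Requests sends the string as-is and does not set a Content-Type header for it.
    r = requests.post(url, data=json.dumps(payload))
    print(r.text)
    
    # Option 2: pass the dict via json; Requests serializes it and sets
    # Content-Type: application/json.
    r1 = requests.post(url, json=payload)
    print(r1.text)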
    
    {"message":"Not Found","documentation_url":"https://developer.github.com/v3"}
    {"message":"Not Found","documentation_url":"https://developer.github.com/v3"}
    
    1. Uploading a single file with POST
    # Upload a single file
    url = 'http://httpbin.org/post'
    # Open the file in binary mode, which is less error-prone
    files = {'file': open('report.txt', 'rb')}
    # The filename, content type, and extra headers can also be set explicitly
    # files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
    r = requests.post(url, files=files)
    r.encoding = 'utf-8'
    print(r.text)
    
    ---------------------------------------------------------------------------
    
    FileNotFoundError                         Traceback (most recent call last)
    
    <ipython-input-24-0f0c20495d4f> in <module>()
          2 url = 'http://httpbin.org/post'
          3 # Open the file in binary mode, which is less error-prone
    ----> 4 files = {'file': open('report.txt', 'rb')}
          5 # The filename, content type, and extra headers can also be set explicitly
          6 # files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
    
    
    FileNotFoundError: [Errno 2] No such file or directory: 'report.txt'
    
    The run above fails only because report.txt does not exist in the working directory; create the file (or point to an existing one) before running the example.
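    A variant that does not depend on an existing report.txt is sketched below. It writes a temporary file, opens it in a with block so the handle is always closed, and prepares the multipart request without sending it. The temporary directory and the file content are assumptions made for the illustration.

```python
import os
import tempfile

import requests

url = "http://httpbin.org/post"

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create a small stand-in for report.txt (hypothetical content).
    path = os.path.join(tmp_dir, "report.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("some report text\n")

    # Binary mode; the with block closes the file even if an error occurs.
    with open(path, "rb") as f:
        # prepare() reads the file and builds the multipart body in memory.
        prepared = requests.Request("POST", url, files={"file": f}).prepare()

print(prepared.headers["Content-Type"])
print(len(prepared.body), "bytes in the multipart body")
```

    To actually upload the file, requests.post(url, files={'file': f}) can be called inside the same with block instead of preparing the request.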
    
    1. Uploading multiple files
    url = 'http://httpbin.org/post'
    multiple_files = [
       ('images', ('foo.png', open('foo.png', 'rb'), 'image/png')),
       ('images', ('bar.png', open('bar.png', 'rb'), 'image/png'))]
    r = requests.post(url, files=multiple_files)
    print(r.text)
    
    
    ---------------------------------------------------------------------------
    
    FileNotFoundError                         Traceback (most recent call last)
    
    <ipython-input-25-f6b08b6db381> in <module>()
          1 url = 'http://httpbin.org/post'
          2 multiple_files = [
    ----> 3    ('images', ('foo.png', open('foo.png', 'rb'), 'image/png')),
          4    ('images', ('bar.png', open('bar.png', 'rb'), 'image/png'))]
          5 r = requests.post(url, files=multiple_files)
    
    
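    FileNotFoundError: [Errno 2] No such file or directory: 'foo.png'
    
    This run fails for the same kind of reason: foo.png and bar.png do not exist in the working directory.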
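    The list-of-tuples structure can be tried without real image files by using in-memory byte streams. This is only a sketch: the byte strings below are placeholders, not valid PNG data, and the request is prepared rather than sent.

```python
import io

import requests

url = "http://httpbin.org/post"

# Each tuple is (form_field_name, (filename, file_object, content_type)).
multiple_files = [
    ("images", ("foo.png", io.BytesIO(b"placeholder bytes 1"), "image/png")),
    ("images", ("bar.png", io.BytesIO(b"placeholder bytes 2"), "image/png")),
]

prepared = requests.Request("POST", url, files=multiple_files).prepare()

# Both files are sent as separate parts under the same form field name.
part_count = prepared.body.count(b'name="images"')
print(part_count, "parts named 'images'")
```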
    
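    1. Uploading a string as a file
    url = 'http://httpbin.org/post'
    # The file content is given as a string; \n marks the line breaks in the CSV data
    files = {'file': ('report.csv', 'some,data,to,send\nanother,row,to,send\n')}
    r = requests.post(url, files=files)
    print(r.text)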
    
    {
      "args": {}, 
      "data": "", 
      "files": {
        "file": "some,data,to,send\nanother,row,to,send\n"
      }, 
      "form": {}, 
      "headers": {
        "Accept": "*/*", 
        "Accept-Encoding": "gzip, deflate", 
        "Content-Length": "184", 
        "Content-Type": "multipart/form-data; boundary=c0c362abb4044e30928b8f66c8ac1c40", 
        "Host": "httpbin.org", 
        "User-Agent": "python-requests/2.21.0"
      }, 
      "json": null, 
      "origin": "114.94.175.75, 114.94.175.75", 
      "url": "https://httpbin.org/post"
    }
    

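    Managing headers and cookies in GET and POST requests

    Use r.headers and r.cookies to read the headers and cookies of a GET or POST response.
    To change the headers or cookies sent with a request, pass the headers or cookies parameter to get() or post(); both take a dict.

    1. Customizing request headers
      Two of the more important headers are User-Agent and Referer.

    If a site sometimes responds that the service is unavailable, try adding these two headers.

    Note: Requests manages headers on its own, so in most cases you do not need to set them. Requests does not change its behavior based on which custom headers are specified; they are simply passed along in the final request.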

    import requests
    
    url = 'https://api.github.com/some/endpoint'
    headers = {'User-Agent':'my-app/0.0.1'}
    
    r = requests.get(url=url, headers=headers)
    print(r.headers)
    print(r.text)
    
    {'Date': 'Tue, 15 Oct 2019 11:21:11 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Server': 'GitHub.com', 'Status': '404 Not Found', 'X-RateLimit-Limit': '60', 'X-RateLimit-Remaining': '58', 'X-RateLimit-Reset': '1571142057', 'X-GitHub-Media-Type': 'github.v3; format=json', 'Access-Control-Expose-Headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type', 'Access-Control-Allow-Origin': '*', 'Strict-Transport-Security': 'max-age=31536000; includeSubdomains; preload', 'X-Frame-Options': 'deny', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '1; mode=block', 'Referrer-Policy': 'origin-when-cross-origin, strict-origin-when-cross-origin', 'Content-Security-Policy': "default-src 'none'", 'Content-Encoding': 'gzip', 'X-GitHub-Request-Id': '08C7:769F:13B990D:1A07D30:5DA5ABA6'}
    {"message":"Not Found","documentation_url":"https://developer.github.com/v3"}
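    The headers that are actually sent can be inspected before sending by preparing the request through a Session, which merges its default headers with the custom ones. The sketch below is illustrative; the Referer value is an assumption chosen for the example.

```python
import requests

session = requests.Session()
custom_headers = {
    "User-Agent": "my-app/0.0.1",
    "Referer": "https://github.com/",  # hypothetical value for illustration
}
request = requests.Request(
    "GET", "https://api.github.com/some/endpoint", headers=custom_headers
)

# prepare_request() merges the session defaults with the custom headers.
prepared = session.prepare_request(request)
for name, value in prepared.headers.items():
    print("{}: {}".format(name, value))
```

    The custom User-Agent replaces the default python-requests one, while defaults such as Accept and Accept-Encoding are kept.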
    
    1. Customizing cookies
    # Pass the cookies directly as a dict
    url = 'http://httpbin.org/cookies'
    cookies = {"cookies_are":'working'}
    r = requests.get(url, cookies=cookies)
    # r.cookies holds the cookies set by the response, as a RequestsCookieJar
    print(r.cookies)
    print(r.text)
    
    <RequestsCookieJar[]>
    {
      "cookies": {
        "cookies_are": "working"
      }
    }
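    In the output above, r.cookies is empty because it only contains cookies set by the server, while the cookies passed in are sent in the request's Cookie header. A minimal sketch that shows this header without sending the request:

```python
import requests

url = "http://httpbin.org/cookies"
cookies = {"cookies_are": "working"}

# Prepare the request only; the dict is turned into a Cookie header.
prepared = requests.Request("GET", url, cookies=cookies).prepare()
cookie_header = prepared.headers["Cookie"]
print(cookie_header)
```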
    

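    Sessions and cookie persistence

    When you send several requests to the same host, a requests.Session() object lets you keep cookies and other session state across those requests, so that what you send stays consistent with what the server has set.

    1. Creating a session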
    import requests
    
    s = requests.Session()
    r = s.get(url='https://github.com/timeline.json')
    print(r.text)
    
    # Using with ensures the session is closed, even if an error occurs
    with requests.Session() as s:
        r = s.get(url='https://github.com/timeline.json')
        print(r.json())
    
    {"message":"Hello there, wayfaring stranger. If you’re reading this then you probably didn’t see our blog post a couple of years back announcing that this API would go away: http://git.io/17AROg Fear not, you should be able to get what you need from the shiny new Events API instead.","documentation_url":"https://developer.github.com/v3/activity/events/#list-public-events"}
    {'message': 'Hello there, wayfaring stranger. If you’re reading this then you probably didn’t see our blog post a couple of years back announcing that this API would go away: http://git.io/17AROg Fear not, you should be able to get what you need from the shiny new Events API instead.', 'documentation_url': 'https://developer.github.com/v3/activity/events/#list-public-events'}
    

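    Note that parameters passed at the method level are not persisted across requests, even when you use a session.

    1. Method-level parameters are not persisted across requests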
    s = requests.Session()
    
    r = s.get('http://httpbin.org/cookies', cookies={'from-my': 'browser'})
    print("Using the GET for session and the response is {}".format(r.text))
    
    r1 = s.get('http://httpbin.org/cookies')
    print(r1.text)
    
    
    
    Using the GET for session and the response is {
      "cookies": {
        "from-my": "browser"
      }
    }
    
    <html>
    <head><title>502 Bad Gateway</title></head>
    <body bgcolor="white">
    <center><h1>502 Bad Gateway</h1></center>
    <hr><center>nginx</center>
    </body>
    </html>
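    The second response above is a 502 error page from the server, so it does not show the behavior being described. The difference between method-level and session-level cookies can be sketched offline instead, by preparing two requests through the same session and comparing their Cookie headers. The cookie name session-level is made up for this illustration.

```python
import requests

url = "http://httpbin.org/cookies"
s = requests.Session()

# A cookie stored on the session itself is sent with every request.
s.cookies.set("session-level", "kept")

# A cookie passed at the method level applies to that one request only.
first = s.prepare_request(requests.Request("GET", url, cookies={"from-my": "browser"}))
second = s.prepare_request(requests.Request("GET", url))

print("first: ", first.headers.get("Cookie"))
print("second:", second.headers.get("Cookie"))
```

    To keep a value across requests, store it on the session (for example in s.cookies or s.headers) rather than passing it to a single get() or post() call.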
    

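    Commonly used attributes and methods of the Response object

    Commonly used attributes and methods of the Response class:

    Response.url the final URL of the response
    Response.status_code the HTTP status code of the response
    Response.text the response body decoded as text
    Response.json() the response body parsed as JSON
    Response.ok whether the request succeeded; True when status_code < 400
    Response.headers the response headers
    Response.cookies the cookies set by the response
    Response.elapsed the time elapsed between sending the request and receiving the response
    Response.links the links parsed from the Link response header, i.e. a parsed form of Response.headers.get('link')
    Response.raw the raw socket response; the request must be made with stream=True
    Response.content the response body as bytes, mostly used for non-text responses
    Response.iter_content() iterates over the response data in chunks
    Response.history the history of redirected requests
    Response.reason the textual reason for the status, e.g. "Not Found" or "OK"
    Response.close() closes and releases the connection; the 'raw' object cannot be accessed afterwards. It rarely needs to be called explicitly.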

    %%debug
    
    import requests
    
    r = requests.get('http://www.baidu.com')
    
    print('Status code:', r.status_code)
    print('Request succeeded:', r.ok)
    print('Reason phrase:', r.reason)
    print('Redirect history:', r.history)
    print('Links from the Link header:', r.links)
    print('Elapsed time:', r.elapsed)
    # r.raw only yields content when the request is made with stream=True; otherwise read() returns b''
    print('Raw socket response:', r.raw)
    print('Raw socket response body:', r.raw.read())
    
    NOTE: Enter 'c' at the ipdb>  prompt to continue execution.
    > <string>(3)<module>()
    
    ipdb> r
    Status code: 200
    Request succeeded: True
    Reason phrase: OK
    Redirect history: []
    Links from the Link header: {}
    Elapsed time: 0:00:00.042962
    Raw socket response: <urllib3.response.HTTPResponse object at 0x0656CA50>
    Raw socket response body: b''
    --Return--
    None
    > <string>(15)<module>()
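    In the run above r.raw.read() returns b'' because the request was not made with stream=True. The sketch below shows streaming with iter_content() against a small local HTTP server so that it does not depend on an external site. The Handler class, BODY content, and chunk size are assumptions made for the illustration.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

BODY = b"hello from a local test server\n" * 4


class Handler(BaseHTTPRequestHandler):
    """Hypothetical handler that always returns BODY as plain text."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: the OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:{}/".format(server.server_port)

with requests.Session() as s:
    s.trust_env = False  # ignore proxy settings from the environment for this local call
    # stream=True defers downloading the body until it is read.
    with s.get(url, stream=True, timeout=5) as r:
        status = r.status_code
        reason = r.reason
        chunks = list(r.iter_content(chunk_size=16))

server.shutdown()
server.server_close()

downloaded = b"".join(chunks)
print(status, reason, len(downloaded), "bytes read in", len(chunks), "chunks")
```

    Reading in chunks like this is mainly useful for large downloads, where loading the whole body through r.content would use too much memory.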
    

    References

    1. https://zhuanlan.zhihu.com/p/33288426
  • Original article: https://www.cnblogs.com/Tcorner/p/12856934.html