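Learning requests

I. Introduction

requests is an HTTP library written in Python and released under the Apache 2.0 license. Its API is more concise than that of the urllib2 module.

Requests supports HTTP keep-alive and connection pooling, cookie-based session persistence, file uploads, automatic detection of the response content encoding, and automatic encoding of internationalized URLs and POST data.

It is a high-level wrapper around Python's built-in modules that makes network requests far more human-friendly. With Requests you can easily perform most of the HTTP operations a browser performs.

Requests is built to meet the needs of today's web:

Keep-Alive & connection pooling
International domain names and URLs
Sessions with persistent cookies
Browser-style SSL verification
Automatic content decoding
Basic/Digest authentication
Elegant key/value cookies
Automatic decompression
Unicode response bodies
HTTP(S) proxy support
Multipart file uploads
Streaming downloads
Connection timeouts
Chunked requests
.netrc support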

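requests mainly provides the following classes:
    requests.Request
    requests.Response
    requests.Session: persists cookies, connection pooling, and configuration across requests
    requests.HTTPError: the exception raised by Response.raise_for_status()

requests mainly exposes the following functions and modules:
    requests.request
    requests.get
    requests.post
    requests.cookies
    requests.sessions
    requests.ssl
    requests.head
    requests.put
    requests.delete
    requests.options
    requests.session
    requests.patch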

II. Functions defined in the requests module

1. request

Help on function request in module requests.api:

request(method, url, **kwargs)
    Constructs and sends a :class:`Request <Request>`.
    
    :param method: method for the new :class:`Request` object: ``GET``, ``OPTIONS``, ``HEAD``, ``POST``, ``PUT``, ``PATCH``, or ``DELETE``.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary, list of tuples or bytes to send
        in the query string for the :class:`Request`.
    :param data: (optional) Dictionary, list of tuples, bytes, or file-like
        object to send in the body of the :class:`Request`.
    :param json: (optional) A JSON serializable Python object to send in the body of the :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
    :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
        ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
        or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
        defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
        to add for the file.
    :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How many seconds to wait for the server to send data
        before giving up, as a float, or a :ref:`(connect timeout, read
        timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
    :param verify: (optional) Either a boolean, in which case it controls whether we verify
            the server's TLS certificate, or a string, in which case it must be a path
            to a CA bundle to use. Defaults to ``True``.
    :param stream: (optional) if ``False``, the response content will be immediately downloaded.
    :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
    :return: :class:`Response <Response>` object
    :rtype: requests.Response

help(requests.request)

A simple example:

>>> import requests
>>> req = requests.request('GET', 'https://httpbin.org/get')
>>> req
<Response [200]>

2. get

Help on function get in module requests.api:

get(url, params=None, **kwargs)
    Sends a GET request.
    
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary, list of tuples or bytes to send
        in the query string for the :class:`Request`.
    :param \*\*kwargs: Optional arguments that ``request`` takes.
    :return: :class:`Response <Response>` object
    :rtype: requests.Response

help(requests.get)

A simple example:

import requests
# requests.get = get(url, params=None, **kwargs)
url = "http://www.bjgjwy.net/"
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'

# Pass the User-Agent defined above as a request header
response = requests.get(url, headers={'User-Agent': user_agent})  # response is a <class 'requests.models.Response'>
print(response.text)                        # response.text is a str, response.content is bytes

3. post

Help on function post in module requests.api:

post(url, data=None, json=None, **kwargs)
    Sends a POST request.
    
    :param url: URL for the new :class:`Request` object.
    :param data: (optional) Dictionary, list of tuples, bytes, or file-like
        object to send in the body of the :class:`Request`.
    :param json: (optional) json data to send in the body of the :class:`Request`.
    :param \*\*kwargs: Optional arguments that ``request`` takes.
    :return: :class:`Response <Response>` object
    :rtype: requests.Response

help(requests.post)

A simple example:

#requests.post = post(url, data=None, json=None, **kwargs)
>>> payload = {'key1': 'value1', 'key2': 'value2'}
>>> r = requests.post('http://httpbin.org/post', data = payload)
>>> print(r.text)
{
  "args": {},
  "data": "",
  "files": {},
  "form": {
    "key1": "value1",
    "key2": "value2"
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Content-Length": "23",
    "Content-Type": "application/x-www-form-urlencoded",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.26.0",
    "X-Amzn-Trace-Id": "Root=1-6106c6ed-6b89c461168de0fc642b5bdd"
  },
  "json": null,
  "origin": "183.8.9.128",
  "url": "http://httpbin.org/post"
}
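The output above shows the payload arriving as form fields because it was passed through data=. As a rough sketch of how data= and json= differ, the snippet below prepares both variants without sending them and prints the headers and body requests would transmit. It assumes only that requests is installed; no network access is needed.

```python
import json

import requests

payload = {'key1': 'value1', 'key2': 'value2'}
url = 'http://httpbin.org/post'

# Prepare (but do not send) the same payload as form data and as JSON.
form_req = requests.Request('POST', url, data=payload).prepare()
json_req = requests.Request('POST', url, json=payload).prepare()

# data= form-encodes the dict, matching the "form" field in the output above.
print(form_req.headers['Content-Type'])    # application/x-www-form-urlencoded
print(form_req.body)                       # key1=value1&key2=value2
print(form_req.headers['Content-Length'])  # 23

# json= serializes the dict and sets a JSON content type.
print(json_req.headers['Content-Type'])    # application/json
print(json.loads(json_req.body))           # the original dict, round-tripped
```

With json= there is no need to call json.dumps yourself, and the server sees the payload in its JSON body rather than in form fields.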

4. Summary

# HTTP request types
# GET
r = requests.get('https://github.com/timeline.json')
# POST
r = requests.post("http://m.ctrip.com/post")
# PUT
r = requests.put("http://m.ctrip.com/put")
# DELETE
r = requests.delete("http://m.ctrip.com/delete")
# HEAD
r = requests.head("http://m.ctrip.com/head")
# OPTIONS
r = requests.options("http://m.ctrip.com/get")

# Reading the response body
print(r.content)  # raw bytes; non-ASCII text such as Chinese shows up as escaped byte sequences
print(r.text)     # decoded text (str)

# Passing parameters in the URL
payload = {'keyword': '香港', 'salecityid': '2'}
r = requests.get("http://m.ctrip.com/webapp/tourvisa/visa_list", params=payload)
print(r.url)  # e.g. http://m.ctrip.com/webapp/tourvisa/visa_list?keyword=%E9%A6%99%E6%B8%AF&salecityid=2

# Get / change the response encoding
r = requests.get('https://github.com/timeline.json')
print(r.encoding)
r.encoding = 'utf-8'  # override the encoding before reading r.text

# JSON handling
r = requests.get('https://github.com/timeline.json')
print(r.json())  # json() is built into Response; importing json is not required

# Custom request headers
url = 'http://m.ctrip.com'
headers = {
    'User-Agent': ('Mozilla/5.0 (Linux; Android 4.2.1; en-us; '
                   'Nexus 4 Build/JOP40D) AppleWebKit/535.19 (KHTML, '
                   'like Gecko) Chrome/18.0.1025.166 Mobile Safari/535.19')
}
r = requests.post(url, headers=headers)
print(r.request.headers)

# More complex POST request
import json

url = 'http://m.ctrip.com'
payload = {'some': 'data'}
r = requests.post(url, data=json.dumps(payload))  # to send a JSON string instead of a form-encoded dict, serialize the payload with json.dumps first

# POST a multipart-encoded file
url = 'http://m.ctrip.com'
files = {'file': open('report.xls', 'rb')}
r = requests.post(url, files=files)

# Response status code
r = requests.get('http://m.ctrip.com')
print(r.status_code)

# Response headers
r = requests.get('http://m.ctrip.com')
print(r.headers)
print(r.headers['Content-Type'])
print(r.headers.get('content-type'))  # two ways to read a single response header; names are case-insensitive

# Cookies
url = 'http://example.com/some/cookie/setting/url'
r = requests.get(url)
r.cookies['example_cookie_name']    # read a cookie

url = 'http://m.ctrip.com/cookies'
cookies = dict(cookies_are='working')
r = requests.get(url, cookies=cookies)  # send cookies

# GitHub redirects all HTTP requests to HTTPS:
>>> r = requests.get('http://github.com')
>>> r.url
'https://github.com/'
>>> r.status_code
200
>>> r.history
[<Response [301]>]

# If you use GET, OPTIONS, POST, PUT, PATCH or DELETE, you can disable redirect handling with the allow_redirects parameter:
>>> r = requests.get('http://github.com', allow_redirects=False)
>>> r.status_code
301
>>> r.history
[]

# Set a timeout (in seconds)
r = requests.get('http://m.ctrip.com', timeout=0.001)

# Use a proxy
proxies = {
           "http": "http://10.10.1.10:3128",
           "https": "http://10.10.1.100:4444",
          }
r = requests.get('http://m.ctrip.com', proxies=proxies)

# If the proxy requires a username and password, write it like this:
proxies = {
    "http": "http://user:pass@10.10.1.10:3128/",
}
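The params example in the summary can be checked without touching the network. The sketch below only prepares the request, so it shows the URL that requests would build from the dictionary; it assumes requests is installed and reuses the URL and payload from the summary.

```python
import requests

payload = {'keyword': '香港', 'salecityid': '2'}

# Prepare the request without sending it, then inspect the final URL.
prepared = requests.Request(
    'GET',
    'http://m.ctrip.com/webapp/tourvisa/visa_list',
    params=payload,
).prepare()

# Non-ASCII values are UTF-8 encoded and percent-escaped in the query string.
print(prepared.url)
# http://m.ctrip.com/webapp/tourvisa/visa_list?keyword=%E9%A6%99%E6%B8%AF&salecityid=2
```

The query parameters follow the insertion order of the dictionary, and the Chinese value is percent-encoded rather than sent as raw characters.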

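5. Practical use

(1) Accessing a page directly with a known cookie

Characteristics:

Simple, but you must first log in through a browser.

How it works:

Put simply, a cookie is stored on the client that makes the request, and the server uses it to tell clients apart. Because HTTP is stateless, a server that receives several requests at once cannot tell which of them came from the same client. Yet visiting a page that is only visible after login requires the client to prove to the server that it is the client that just logged in. A cookie therefore identifies the client and carries its state, such as the login status.

Of course, this also means that anyone who obtains another client's cookie can impersonate that client when talking to the server. That is the opening our program uses.

First log in with a browser and look up the cookie in the developer tools. Then have the program send requests that carry this cookie, so that it passes as the browser that just logged in and receives pages that are only visible after login.

Steps:

1. Log in with a browser and copy the cookie string

Log in with the browser first. Then open the developer tools and switch to the Network tab. In the Name column on the left find the current URL, select the Headers tab on the right, and look under Request Headers, which includes the cookie the site issued to the browser: it is the string that follows the Cookie field name. Copy it, because the code needs it.

Note: it is best to log in right before running your program. If you logged in too early, or closed the browser, the copied cookie may well have expired.

2. Write the code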

import requests

headers = {
'Cookie': 'Hm_lvt_6dfe3c8f195b43b8e667a2a2e5936122=1619613970; Hm_lvt_9a6989efd45cf2d0fd1001009b528352=1628663333; PHPSESSID=v3j4e701lbbo2vqj0anic5c8r6; username=test_spider; _identity-frontend=e996a1b5148c9ad539c3fef0cda920f86aba775e47e22204b90777063e2b079aa:2:{i:0;s:18:"_identity-frontend";i:1;s:19:"[194185,"",2592000]";}; Hm_lpvt_9a6989efd45cf2d0fd1001009b528352=1628663346',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36'
}
url = 'https://www.biquwx.la/modules/article/bookcase.php'
res = requests.get(url=url,headers=headers)

print(res.content.decode("utf-8"))  # use the public content attribute instead of the private _content
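Instead of pasting the whole string into a Cookie header, the copied text can also be split into a dictionary and passed through the cookies= parameter. The following is a minimal sketch under stated assumptions: parse_cookie_header is a hypothetical helper (not part of requests), the cookie string is made up, and the naive split does not handle values that themselves contain a semicolon, such as the _identity-frontend value in the listing above.

```python
def parse_cookie_header(raw):
    """Turn 'a=1; b=2' (as copied from the browser) into {'a': '1', 'b': '2'}."""
    cookies = {}
    for pair in raw.split(';'):
        pair = pair.strip()
        if not pair or '=' not in pair:
            continue  # skip empty or malformed fragments
        name, value = pair.split('=', 1)  # a value may itself contain '='
        cookies[name.strip()] = value.strip()
    return cookies


# Hypothetical cookie string for illustration only.
raw_cookie = 'PHPSESSID=abc123; username=test_spider; token=a=b'
cookies = parse_cookie_header(raw_cookie)
print(cookies)

# The dictionary could then be sent with, for example:
#   requests.get(url, cookies=cookies, headers={'User-Agent': '...'})
# (not executed here because it needs network access)
```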

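(2) Logging in programmatically and keeping the login state with a session

How it works:
The program first sends a login request to the site, that is, it submits the form containing the login data (username, password, and so on).

A session is similar to a cookie in that it also lets the server recognize the client. Put simply, each client's interaction with the server is treated as one session. Within the same session the server naturally knows whether this client has logged in.
Steps:
1. Find the page the form is submitted to

Again use the browser's developer tools. Switch to the Network tab and tick Preserve log (important!). Log in to the site in the browser, then find the page the form was submitted to in the Name column on the left. How? Look at the Headers tab on the right. First, in the General section, Request Method should be POST. Second, near the bottom there should be a section called Form Data that shows the username, password, and other values you just entered. You can also check the Name column: an entry containing the word login may be the form submission page (but not necessarily!).
One point to stress: the page the form is submitted to is usually not the page where you type the username and password, which is why you need the tools to find it.

2. Find the data to submit
Although you only typed a username and password in the browser, the form contains more data than that. Form Data shows everything that has to be submitted.

3. Write the code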

import requests

# Data to POST when logging in
data = {
'LoginForm[username]': 'test_spider',
'LoginForm[password]': 'test_spiders',
'action': 'login',
'submit':  ' 登  录 '
}

# Request headers
headers = {'User-agent':'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36'}

# URL the login form is submitted to (visible in the developer tools)
login_url = 'https://www.biquwx.la/login.php'

# Create the Session and attach the headers so every request made through it sends them
session = requests.Session()
session.headers.update(headers)

# Send the login request through the session; afterwards the session stores the cookies
# Inspect them with print(session.cookies.get_dict())
resp = session.post(login_url, data=data)

# A page that is only accessible after login
url = 'https://www.biquwx.la/modules/article/bookcase.php'

# Request it through the same session
resp = session.get(url)

print(resp.content.decode('utf-8'))
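To see what "the session stores the cookie" means in practice, the sketch below sets a cookie on a Session by hand and prepares (but does not send) a follow-up request. The cookie value and the User-Agent string are made-up placeholders standing in for what a real login would produce; only the requests package is assumed.

```python
import requests

session = requests.Session()
session.headers.update({'User-Agent': 'demo-agent/1.0'})  # placeholder value
session.cookies.set('PHPSESSID', 'abc123')  # stands in for the cookie a login would set

# Build a request for the protected page and let the session merge its state in.
req = requests.Request('GET', 'https://www.biquwx.la/modules/article/bookcase.php')
prepared = session.prepare_request(req)

# Both the session-wide header and the stored cookie are attached automatically.
print(prepared.headers['User-Agent'])  # demo-agent/1.0
print(prepared.headers['Cookie'])      # PHPSESSID=abc123
```

This merging is what session.get() does internally before sending, which is why the second request in the listing above is treated as logged in.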

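(3) Using a headless browser

Characteristics:

Powerful and able to handle almost any page, but it makes the code slow.

How it works:

If the program can drive a browser to visit the site, operations such as logging in become easy. In Python the Selenium library can control a browser: the operations written in code (open a page, click, and so on) are carried out faithfully by the browser. The controlled browser can be Firefox, Chrome, and others, but the one most commonly used is PhantomJS, a headless browser (one without a user interface). In other words, once filling in the username and password, clicking the login button, and opening another page are written into the program, PhantomJS really does log you in and hands the response back to you.

Steps:

1. Install the selenium library and the PhantomJS browser

2. Find the login input boxes, buttons, and other elements in the page source

Because the operations happen inside a headless browser, you must locate the input boxes before you can type into them, and locate the login button before you can click it.

Open the login page in a browser, move the cursor to the username text box, right-click, and choose Inspect Element. The page source on the right then shows which element the text box is. The password text box and the login button can be found the same way.

3. Decide how to locate those elements in the program

Selenium provides find_element(s)_by_xxx methods to locate input boxes, buttons, and other elements in a page, where xxx can be id, name, tag_name, class_name, xpath (an XPath expression), and so on. Which one to use still depends on the actual page source.

Commonly used attributes of webdriver.PhantomJS: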

['add_cookie', 'application_cache', 'back', 'close', 'create_web_element', 'current_url', 'current_window_handle', 'delete_all_cookies',
 'delete_cookie', 'desired_capabilities', 'execute', 'execute_async_script', 'execute_script', 'file_detector', 'file_detector_context',
 'find_element', 'find_element_by_class_name', 'find_element_by_css_selector', 'find_element_by_id', 'find_element_by_link_text', 
 'find_element_by_name', 'find_element_by_partial_link_text', 'find_element_by_tag_name', 'find_element_by_xpath', 'find_elements', 
 'find_elements_by_class_name', 'find_elements_by_css_selector', 'find_elements_by_id', 'find_elements_by_link_text', 'find_elements_by_name',
 'find_elements_by_partial_link_text', 'find_elements_by_tag_name', 'find_elements_by_xpath', 'forward', 'fullscreen_window', 'get', 
 'get_cookie', 'get_cookies', 'get_log', 'get_screenshot_as_base64', 'get_screenshot_as_file', 'get_screenshot_as_png', 'get_window_position',
 'get_window_rect', 'get_window_size', 'implicitly_wait', 'log_types', 'maximize_window', 'minimize_window', 'mobile', 'name', 'orientation',
 'page_source', 'quit', 'refresh', 'save_screenshot', 'set_page_load_timeout', 'set_script_timeout', 'set_window_position', 'set_window_rect',
 'set_window_size', 'start_client', 'start_session', 'stop_client', 'switch_to', 'switch_to_active_element', 'switch_to_alert', 
 'switch_to_default_content', 'switch_to_frame', 'switch_to_window', 'title', 'window_handles']

4. Write the code

from selenium import webdriver
from time import sleep

# Create a browser object, pointing selenium at the driver executable
pjs_obj = webdriver.PhantomJS(executable_path='/root/python/requests/phantomjs-2.1.1-linux-x86_64/bin/phantomjs')
# Calling get on the browser object is equivalent to opening the URL by hand
pjs_obj.get('https://www.biquwx.la/')
sleep(2)

# Locate the username text box (found with the developer tools)
username = pjs_obj.find_element_by_id('username')
# Typing into the text box is equivalent to entering the account by hand
username.send_keys('test_spider')
sleep(2)

# Locate the password text box (found with the developer tools)
password = pjs_obj.find_element_by_id('password')
# Typing into the text box is equivalent to entering the password by hand
password.send_keys('test_spiders')

btn = pjs_obj.find_element_by_class_name('int')
# Equivalent to clicking the button by hand
btn.click()
sleep(10)

# Take a screenshot
pjs_obj.save_screenshot('1.png')

# Other code can go here, for example fetching the source of the final page
# Run JavaScript that scrolls to the bottom of the page (this triggers loading of more dynamic content)
js = 'window.scrollTo(0,document.body.scrollHeight)'
pjs_obj.execute_script(js)  # execute_script runs JavaScript given as a string
sleep(2)
pjs_obj.execute_script(js)
sleep(2)

# Save the content of the current page
html_source = pjs_obj.page_source  # page_source holds the HTML source of the page currently loaded in the browser
with open('./source.html', 'w', encoding='utf-8') as fp:
    fp.write(html_source)
pjs_obj.quit()

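Visiting the Chouti site (dig.chouti.com)

# The login form is a modal dialog, so a real browser is used; download the browser driver first
from selenium import webdriver
from time import sleep

# Create a browser object, pointing selenium at the driver executable
# (raw string so the backslashes in the Windows path are not treated as escapes)
pjs_obj = webdriver.Chrome(executable_path=r'D:\Ware\installwinsoft\chromedriver_win32\chromedriver.exe')
pjs_obj.maximize_window()

# Calling get on the browser object is equivalent to opening the URL by hand
pjs_obj.get('https://dig.chouti.com/')
sleep(2)

btn1 = pjs_obj.find_element_by_id('login_btn')
# Equivalent to clicking the button by hand
btn1.click()
sleep(4)

# Locate the account text box (found with the developer tools)
username = pjs_obj.find_element_by_name("phone")
# Typing into the text box is equivalent to entering the account by hand
username.send_keys('1xxxxxxxxxx')
sleep(2)

# Locate the password text box (found with the developer tools)
password = pjs_obj.find_element_by_name("password")
# Typing into the text box is equivalent to entering the password by hand
password.send_keys('spiders123456')
sleep(2)

# Because this is a modal dialog, selenium could not click the login button directly, so the click is done with JavaScript
btn = 'document.getElementsByClassName("btn-large")[0].click()'
pjs_obj.execute_script(btn)
sleep(10)
pjs_obj.save_screenshot('1.png')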


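6. CAPTCHA handling

(1) Text-entry CAPTCHAs

This kind of CAPTCHA verifies the user by having them type the letters, digits, or Chinese characters shown in an image, as in the figure below:

Approach: this is the simplest kind. Recognize the content and type it into the input box. The recognition technique is OCR, and here we recommend the third-party Python library tesserocr. For CAPTCHAs with little background interference (as in Figure 2), the library can recognize them directly. For CAPTCHAs with a noisy background, direct recognition has a low success rate, so preprocess the image first: convert it to grayscale, then binarize it, and only then run recognition. This raises the recognition rate considerably.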
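The grayscale and binarization steps can be sketched in plain Python on a tiny grid of RGB pixels. This is only an illustration of the arithmetic, not the tesserocr workflow: the luminance weights are the common ITU-R BT.601 ones, the threshold of 128 is an assumed value that would need tuning per CAPTCHA, and a real script would more likely use an imaging library such as Pillow on the actual image file.

```python
def to_grayscale(pixels):
    """Convert rows of (r, g, b) tuples to rows of 0-255 luminance values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in pixels
    ]


def binarize(gray, threshold=128):
    """Map each luminance value to pure white (255) or pure black (0)."""
    return [[255 if value >= threshold else 0 for value in row] for row in gray]


# A made-up 2x3 image: light background pixels and dark character pixels.
image = [
    [(250, 250, 250), (30, 30, 30), (200, 180, 190)],
    [(90, 100, 110), (255, 255, 255), (10, 20, 30)],
]

binary = binarize(to_grayscale(image))
print(binary)  # [[255, 0, 255], [0, 255, 0]]
```

After binarization the faint background noise collapses to white while the darker glyph pixels stay black, which is what makes the OCR step more reliable.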

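(2) Slider CAPTCHAs

Here a candidate piece is dragged in a straight line to the correct position, as in the figure below.

Approach: this kind is somewhat more complex, but it can still be handled. The obvious idea is to imitate how a person drags the slider: click the button, see where the gap is, and drag the puzzle piece to the gap to complete the verification.
Step 1: click the button. Notice that before the button is clicked neither the gap nor the puzzle piece is shown; they only appear after the click. This gives us a way to locate the gap.
Step 2: drag to the gap. We know the piece should be dragged to the gap, but how is that distance expressed as a number? The observation from step 1 lets us find the gap. Compare the pixels of the two images against a threshold: wherever the difference at a position exceeds the threshold, the two images differ there. Start scanning at the right edge of the puzzle piece and move from left to right, stopping at the first position that differs. That position is the left edge of the gap, so use selenium to drag the piece there. How can the two images be saved automatically? Find the element, read its location and size, compute top, bottom, left, right = location['y'], location['y'] + size['height'], location['x'], location['x'] + size['width'], take a screenshot, and crop it with those four values. See the selenium documentation for details. Crop one image before clicking the button and one after. Finally, the drag has to imitate a human: accelerate first, then decelerate. These CAPTCHAs check behavioral features, and a human cannot move at a perfectly constant speed, so a constant-speed drag is judged to be a machine and fails the verification.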
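The three calculations in step 2 (the crop box, the gap position, and a drag that speeds up and then slows down) can be sketched without a browser. Everything here is an assumption made for illustration: the images are small grids of grayscale values rather than real screenshots, the threshold of 60 is arbitrary, and build_track uses a simple symmetric ramp rather than any particular site's expected motion profile. All three function names are hypothetical.

```python
def crop_box(location, size):
    """Return (left, top, right, bottom) from an element's location and size dicts."""
    left, top = location['x'], location['y']
    return (left, top, left + size['width'], top + size['height'])


def find_gap_left(before, after, start_x=0, threshold=60):
    """Scan columns left to right from start_x; return the first x where
    any pixel differs by more than threshold, or None if nothing differs."""
    height, width = len(before), len(before[0])
    for x in range(start_x, width):
        for y in range(height):
            if abs(before[y][x] - after[y][x]) > threshold:
                return x
    return None


def build_track(distance, steps=10):
    """Split distance into integer moves that grow, peak, then shrink."""
    weights = [min(i + 1, steps - i) for i in range(steps)]  # 1,2,..,peak,..,2,1
    total = sum(weights)
    track = [distance * w // total for w in weights]
    track[steps // 2] += distance - sum(track)  # put the rounding remainder at the peak
    return track


# Made-up data: a uniform image, and the same image with a dark gap at x = 3..4.
before = [[200] * 6 for _ in range(3)]
after = [row[:] for row in before]
for y in range(3):
    after[y][3] = after[y][4] = 80

box = crop_box({'x': 10, 'y': 20}, {'width': 100, 'height': 50})
gap_x = find_gap_left(before, after, start_x=1)
track = build_track(100)
print(box)    # (10, 20, 110, 70)
print(gap_x)  # 3
print(track)  # [3, 6, 10, 13, 16, 20, 13, 10, 6, 3]
```

In a real script each value in track would be fed to a browser drag one step at a time, for example with Selenium's ActionChains (click_and_hold, repeated move_by_offset, then release), with short pauses in between.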

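(3) SMS verification codes

(4) Click-based text CAPTCHAs and icon selection

Text CAPTCHA: a text prompt asks the user to click the positions of the matching characters in the image.
Icon selection: a set of images is shown and the user must click one or more of them as instructed. It relies on the difficulty of general object recognition to keep machines out.
The two work on the same principle: one gives text and asks you to click that text in the image, the other gives an image and asks you to click the images with the same content.
There is no particularly good method for these two. The only option is a third-party recognition service that identifies the matching content. One such service is Chaojiying (超级鹰): send it the CAPTCHA and it returns the coordinates to click.
Then use selenium to simulate the clicks. The image is obtained in the same way as described above.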

III. The requests.Request class

Help on class Request in module requests.models:

class Request(RequestHooksMixin)
 |  Request(method=None, url=None, headers=None, files=None, data=None, params=None, auth=None, cookies=None, hooks=None, json=None)
 |  
 |  A user-created :class:`Request <Request>` object.
 |  
 |  Used to prepare a :class:`PreparedRequest <PreparedRequest>`, which is sent to the server.
 |  
 |  :param method: HTTP method to use.
 |  :param url: URL to send.
 |  :param headers: dictionary of headers to send.
 |  :param files: dictionary of {filename: fileobject} files to multipart upload.
 |  :param data: the body to attach to the request. If a dictionary or
 |      list of tuples ``[(key, value)]`` is provided, form-encoding will
 |      take place.
 |  :param json: json for the body to attach to the request (if files or data is not specified).
 |  :param params: URL parameters to append to the URL. If a dictionary or
 |      list of tuples ``[(key, value)]`` is provided, form-encoding will
 |      take place.
 |  :param auth: Auth handler or (user, pass) tuple.
 |  :param cookies: dictionary or CookieJar of cookies to attach to this request.
 |  :param hooks: dictionary of callback hooks, for internal usage.
 |  
 |  Usage::
 |  
 |    >>> import requests
 |    >>> req = requests.Request('GET', 'https://httpbin.org/get')
 |    >>> req.prepare()
 |    <PreparedRequest [GET]>
 |  
 |  Method resolution order:
 |      Request
 |      RequestHooksMixin
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, method=None, url=None, headers=None, files=None, data=None, params=None, auth=None, cookies=None, hooks=None, json=None)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __repr__(self)
 |      Return repr(self).
 |  
 |  prepare(self)
 |      Constructs a :class:`PreparedRequest <PreparedRequest>` for transmission and returns it.
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from RequestHooksMixin:
 |  
 |  deregister_hook(self, event, hook)
 |      Deregister a previously registered hook.
 |      Returns True if the hook existed, False if not.
 |  
 |  register_hook(self, event, hook)
 |      Properly register a hook.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from RequestHooksMixin:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)

help(requests.Request)

1. Methods defined by requests.Request:
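As a small sketch of the three public methods listed in the help output (prepare, register_hook, deregister_hook), the snippet below builds a Request, attaches a response hook, and prepares it without sending anything. The hook function log_response and the query parameter are illustrative names, not part of requests.

```python
import requests


def log_response(response, *args, **kwargs):
    """Hypothetical hook: would run once a response for this request arrives."""
    print('got', response.status_code)


req = requests.Request(
    'GET',
    'https://httpbin.org/get',
    params={'page': '1'},
    headers={'Accept': 'application/json'},
)
req.register_hook('response', log_response)

# prepare() turns the user-level Request into the PreparedRequest that is sent.
prepared = req.prepare()
print(prepared.method)             # GET
print(prepared.url)                # https://httpbin.org/get?page=1
print(prepared.headers['Accept'])  # application/json
hook_attached = log_response in prepared.hooks['response']
print(hook_attached)               # True

# deregister_hook reports whether the hook was registered.
removed = req.deregister_hook('response', log_response)
removed_again = req.deregister_hook('response', log_response)
print(removed, removed_again)      # True False
```

To actually send the prepared request, pass it to a session, for example requests.Session().send(prepared); that call needs network access and is left out here.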

IV. The requests.Response class

Help on class Response in module requests.models:

class Response(builtins.object)
 |  The :class:`Response <Response>` object, which contains a
 |  server's response to an HTTP request.
 |  
 |  Methods defined here:
 |  
 |  __bool__(self)
 |      Returns True if :attr:`status_code` is less than 400.
 |      
 |      This attribute checks if the status code of the response is between
 |      400 and 600 to see if there was a client error or a server error. If
 |      the status code, is between 200 and 400, this will return True. This
 |      is **not** a check to see if the response code is ``200 OK``.
 |  
 |  __enter__(self)
 |  
 |  __exit__(self, *args)
 |  
 |  __getstate__(self)
 |  
 |  __init__(self)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __iter__(self)
 |      Allows you to use a response as an iterator.
 |  
 |  __nonzero__(self)
 |      Returns True if :attr:`status_code` is less than 400.
 |      
 |      This attribute checks if the status code of the response is between
 |      400 and 600 to see if there was a client error or a server error. If
 |      the status code, is between 200 and 400, this will return True. This
 |      is **not** a check to see if the response code is ``200 OK``.
 |  
 |  __repr__(self)
 |      Return repr(self).
 |  
 |  __setstate__(self, state)
 |  
 |  close(self)
 |      Releases the connection back to the pool. Once this method has been
 |      called the underlying ``raw`` object must not be accessed again.
 |      
 |      *Note: Should not normally need to be called explicitly.*
 |  
 |  iter_content(self, chunk_size=1, decode_unicode=False)
 |      Iterates over the response data.  When stream=True is set on the
 |      request, this avoids reading the content at once into memory for
 |      large responses.  The chunk size is the number of bytes it should
 |      read into memory.  This is not necessarily the length of each item
 |      returned as decoding can take place.
 |      
 |      chunk_size must be of type int or None. A value of None will
 |      function differently depending on the value of `stream`.
 |      stream=True will read data as it arrives in whatever size the
 |      chunks are received. If stream=False, data is returned as
 |      a single chunk.
 |      
 |      If decode_unicode is True, content will be decoded using the best
 |      available encoding based on the response.
 |  
 |  iter_lines(self, chunk_size=512, decode_unicode=False, delimiter=None)
 |      Iterates over the response data, one line at a time.  When
 |      stream=True is set on the request, this avoids reading the
 |      content at once into memory for large responses.
 |      
 |      .. note:: This method is not reentrant safe.
 |  
 |  json(self, **kwargs)
 |      Returns the json-encoded content of a response, if any.
 |      
 |      :param \*\*kwargs: Optional arguments that ``json.loads`` takes.
 |      :raises ValueError: If the response body does not contain valid json.
 |  
 |  raise_for_status(self)
 |      Raises :class:`HTTPError`, if one occurred.
 |  
 |  ----------------------------------------------------------------------
 |  Readonly properties defined here:
 |  
 |  apparent_encoding
 |      The apparent encoding, provided by the chardet library.
 |  
 |  content
 |      Content of the response, in bytes.
 |  
 |  is_permanent_redirect
 |      True if this Response one of the permanent versions of redirect.
 |  
 |  is_redirect
 |      True if this Response is a well-formed HTTP redirect that could have
 |      been processed automatically (by :meth:`Session.resolve_redirects`).
 |  
 |  links
 |      Returns the parsed header links of the response, if any.
 |  
 |  next
 |      Returns a PreparedRequest for the next request in a redirect chain, if there is one.
 |  
 |  ok
 |      Returns True if :attr:`status_code` is less than 400, False if not.
 |      
 |      This attribute checks if the status code of the response is between
 |      400 and 600 to see if there was a client error or a server error. If
 |      the status code is between 200 and 400, this will return True. This
 |      is **not** a check to see if the response code is ``200 OK``.
 |  
 |  text
 |      Content of the response, in unicode.
 |      
 |      If Response.encoding is None, encoding will be guessed using
 |      ``chardet``.
 |      
 |      The encoding of the response content is determined based solely on HTTP
 |      headers, following RFC 2616 to the letter. If you can take advantage of
 |      non-HTTP knowledge to make a better guess at the encoding, you should
 |      set ``r.encoding`` appropriately before accessing this property.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __attrs__ = ['_content', 'status_code', 'headers', 'url', 'history', '...

jar = requests.cookies.RequestsCookie

help(requests.Response)

1. Methods defined by requests.Response:
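The behaviour of ok, json() and raise_for_status() described in the help output can be tried offline. The sketch below builds Response objects by hand, which is an assumption made purely for demonstration: in normal use requests creates the Response for you, and make_response is a hypothetical helper, not a library function.

```python
import io

import requests


def make_response(status_code, body=b''):
    """Build a Response by hand for illustration (requests normally does this)."""
    resp = requests.Response()
    resp.status_code = status_code
    resp.raw = io.BytesIO(body)  # stands in for the network stream
    resp.encoding = 'utf-8'
    return resp


ok_resp = make_response(200, '{"name": "香港"}'.encode('utf-8'))
print(ok_resp.ok)      # True: status code is below 400
print(ok_resp.text)    # body decoded with resp.encoding
print(ok_resp.json())  # body parsed as JSON

missing = make_response(404)
print(missing.ok)      # False
caught_status = None
try:
    missing.raise_for_status()  # raises HTTPError for 4xx and 5xx codes
except requests.HTTPError as exc:
    caught_status = exc.response.status_code
print(caught_status)   # 404
```

Calling raise_for_status() right after a request is a common way to turn error status codes into exceptions instead of checking status_code by hand.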

V. The requests.Session class

Help on class Session in module requests.sessions:

class Session(SessionRedirectMixin)
 |  A Requests session.
 |  
 |  Provides cookie persistence, connection-pooling, and configuration.
 |  
 |  Basic Usage::
 |  
 |    >>> import requests
 |    >>> s = requests.Session()
 |    >>> s.get('https://httpbin.org/get')
 |    <Response [200]>
 |  
 |  Or as a context manager::
 |  
 |    >>> with requests.Session() as s:
 |    ...     s.get('https://httpbin.org/get')
 |    <Response [200]>
 |  
 |  Method resolution order:
 |      Session
 |      SessionRedirectMixin
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __enter__(self)
 |  
 |  __exit__(self, *args)
 |  
 |  __getstate__(self)
 |  
 |  __init__(self)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __setstate__(self, state)
 |  
 |  close(self)
 |      Closes all adapters and as such the session
 |  
 |  delete(self, url, **kwargs)
 |      Sends a DELETE request. Returns :class:`Response` object.
 |      
 |      :param url: URL for the new :class:`Request` object.
 |      :param \*\*kwargs: Optional arguments that ``request`` takes.
 |      :rtype: requests.Response
 |  
 |  get(self, url, **kwargs)
 |      Sends a GET request. Returns :class:`Response` object.
 |      
 |      :param url: URL for the new :class:`Request` object.
 |      :param \*\*kwargs: Optional arguments that ``request`` takes.
 |      :rtype: requests.Response
 |  
 |  get_adapter(self, url)
 |      Returns the appropriate connection adapter for the given URL.
 |      
 |      :rtype: requests.adapters.BaseAdapter
 |  
 |  head(self, url, **kwargs)
 |      Sends a HEAD request. Returns :class:`Response` object.
 |      
 |      :param url: URL for the new :class:`Request` object.
 |      :param \*\*kwargs: Optional arguments that ``request`` takes.
 |      :rtype: requests.Response
 |  
 |  merge_environment_settings(self, url, proxies, stream, verify, cert)
 |      Check the environment and merge it with some settings.
 |      
 |      :rtype: dict
 |  
 |  mount(self, prefix, adapter)
 |      Registers a connection adapter to a prefix.
 |      
 |      Adapters are sorted in descending order by prefix length.
 |  
 |  options(self, url, **kwargs)
 |      Sends a OPTIONS request. Returns :class:`Response` object.
 |      
 |      :param url: URL for the new :class:`Request` object.
 |      :param \*\*kwargs: Optional arguments that ``request`` takes.
 |      :rtype: requests.Response
 |  
 |  patch(self, url, data=None, **kwargs)
 |      Sends a PATCH request. Returns :class:`Response` object.
 |      
 |      :param url: URL for the new :class:`Request` object.
 |      :param data: (optional) Dictionary, list of tuples, bytes, or file-like
 |          object to send in the body of the :class:`Request`.
 |      :param \*\*kwargs: Optional arguments that ``request`` takes.
 |      :rtype: requests.Response
 |  
 |  post(self, url, data=None, json=None, **kwargs)
 |      Sends a POST request. Returns :class:`Response` object.
 |      
 |      :param url: URL for the new :class:`Request` object.
 |      :param data: (optional) Dictionary, list of tuples, bytes, or file-like
 |          object to send in the body of the :class:`Request`.
 |      :param json: (optional) json to send in the body of the :class:`Request`.
 |      :param \*\*kwargs: Optional arguments that ``request`` takes.
 |      :rtype: requests.Response
 |  
 |  prepare_request(self, request)
 |      Constructs a :class:`PreparedRequest <PreparedRequest>` for
 |      transmission and returns it. The :class:`PreparedRequest` has settings
 |      merged from the :class:`Request <Request>` instance and those of the
 |      :class:`Session`.
 |      
 |      :param request: :class:`Request` instance to prepare with this
 |          session's settings.
 |      :rtype: requests.PreparedRequest
 |  
 |  put(self, url, data=None, **kwargs)
 |      Sends a PUT request. Returns :class:`Response` object.
 |      
 |      :param url: URL for the new :class:`Request` object.
 |      :param data: (optional) Dictionary, list of tuples, bytes, or file-like
 |          object to send in the body of the :class:`Request`.
 |      :param \*\*kwargs: Optional arguments that ``request`` takes.
 |      :rtype: requests.Response
 |  
 |  request(self, method, url, params=None, data=None, headers=None, cookies=None, files=None, auth=None, timeout=None, allow_redirects=True, proxies=None, hooks=None, stream=None, verify=None, cert=None, json=None)
 |      Constructs a :class:`Request <Request>`, prepares it and sends it.
 |      Returns :class:`Response <Response>` object.
 |      
 |      :param method: method for the new :class:`Request` object.
 |      :param url: URL for the new :class:`Request` object.
 |      :param params: (optional) Dictionary or bytes to be sent in the query
 |          string for the :class:`Request`.
 |      :param data: (optional) Dictionary, list of tuples, bytes, or file-like
 |          object to send in the body of the :class:`Request`.
 |      :param json: (optional) json to send in the body of the
 |          :class:`Request`.
 |      :param headers: (optional) Dictionary of HTTP Headers to send with the
 |          :class:`Request`.
 |      :param cookies: (optional) Dict or CookieJar object to send with the
 |          :class:`Request`.
 |      :param files: (optional) Dictionary of ``'filename': file-like-objects``
 |          for multipart encoding upload.
 |      :param auth: (optional) Auth tuple or callable to enable
 |          Basic/Digest/Custom HTTP Auth.
 |      :param timeout: (optional) How long to wait for the server to send
 |          data before giving up, as a float, or a :ref:`(connect timeout,
 |          read timeout) <timeouts>` tuple.
 |      :type timeout: float or tuple
 |      :param allow_redirects: (optional) Set to True by default.
 |      :type allow_redirects: bool
 |      :param proxies: (optional) Dictionary mapping protocol or protocol and
 |          hostname to the URL of the proxy.
 |      :param stream: (optional) whether to immediately download the response
 |          content. Defaults to ``False``.
 |      :param verify: (optional) Either a boolean, in which case it controls whether we verify
 |          the server's TLS certificate, or a string, in which case it must be a path
 |          to a CA bundle to use. Defaults to ``True``. When set to
 |          ``False``, requests will accept any TLS certificate presented by
 |          the server, and will ignore hostname mismatches and/or expired
 |          certificates, which will make your application vulnerable to
 |          man-in-the-middle (MitM) attacks. Setting verify to ``False`` 
 |          may be useful during local development or testing.
 |      :param cert: (optional) if String, path to ssl client cert file (.pem).
 |          If Tuple, ('cert', 'key') pair.
 |      :rtype: requests.Response
 |  
 |  send(self, request, **kwargs)
 |      Send a given PreparedRequest.
 |      
 |      :rtype: requests.Response
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __attrs__ = ['headers', 'cookies', 'auth', 'proxies', 'hooks', 'params...
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from SessionRedirectMixin:
 |  
 |  get_redirect_target(self, resp)
 |      Receives a Response. Returns a redirect URI or ``None``
 |  
 |  rebuild_auth(self, prepared_request, response)
 |      When being redirected we may want to strip authentication from the
 |      request to avoid leaking credentials. This method intelligently removes
 |      and reapplies authentication where possible to avoid credential loss.
 |  
 |  rebuild_method(self, prepared_request, response)
 |      When being redirected we may want to change the method of the request
 |      based on certain specs or browser behavior.
 |  
 |  rebuild_proxies(self, prepared_request, proxies)
 |      This method re-evaluates the proxy configuration by considering the
 |      environment variables. If we are redirected to a URL covered by
 |      NO_PROXY, we strip the proxy configuration. Otherwise, we set missing
 |      proxy keys for this URL (in case they were stripped by a previous
 |      redirect).
 |      
 |      This method also replaces the Proxy-Authorization header where
 |      necessary.
 |      
 |      :rtype: dict
 |  
 |  resolve_redirects(self, resp, req, stream=False, timeout=None, verify=True, cert=None, proxies=None, yield_requests=False, **adapter_kwargs)
 |      Receives a Response. Returns a generator of Responses or Requests.
 |  
 |  should_strip_auth(self, old_url, new_url)
 |      Decide whether Authorization header should be removed when redirecting
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from SessionRedirectMixin:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)

help(requests.Session)
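
The help output above says that `prepare_request` merges the settings of a `Request` with those of the `Session`. A minimal sketch of that merge, assuming a placeholder URL, a made-up header name, made-up query parameters, and a made-up cookie (none of these come from the original article). Nothing is sent over the network; the request is only prepared so the merged result can be inspected.

```python
import requests


def build_prepared_request():
    """Prepare (but do not send) a GET request using session-level defaults."""
    with requests.Session() as session:
        # Session-level settings apply to every request made through the session.
        session.headers.update({"X-Demo": "1"})    # hypothetical header
        session.params = {"lang": "en"}            # hypothetical default query parameter
        session.cookies.set("token", "abc123")     # hypothetical cookie

        req = requests.Request(
            "GET",
            "https://example.com/search",          # placeholder URL
            params={"q": "requests"},              # per-request query parameter
        )
        # prepare_request merges the Request with the Session's settings
        # and returns a PreparedRequest that session.send() could transmit.
        return session.prepare_request(req)


prepared = build_prepared_request()
print(prepared.method)
print(prepared.url)
print(prepared.headers["X-Demo"])
print(prepared.headers.get("Cookie"))
```

To actually transmit the prepared request, it would be passed to `session.send(prepared, timeout=...)` while the session is still open; the sketch stops before that step so it can run offline.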

1. The requests.Session class defines the following methods:
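
Among these methods, `mount` registers a connection adapter for a URL prefix and `get_adapter` returns the adapter that will handle a given URL. The sketch below shows one possible use: attaching a retrying `HTTPAdapter` to a single host. The host names and the retry settings are assumptions made for illustration, and no request is sent.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_session():
    """Create a session whose requests to one host go through a retrying adapter."""
    # Retry policy: illustrative values, not a recommendation.
    retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[502, 503, 504])
    adapter = HTTPAdapter(max_retries=retry, pool_connections=10, pool_maxsize=10)

    session = requests.Session()
    # Adapters are matched by prefix, longest prefix first, so this adapter
    # takes precedence over the default "https://" adapter for this host only.
    session.mount("https://api.example.com/", adapter)   # placeholder host
    return session, adapter


session, adapter = make_session()
print(session.get_adapter("https://api.example.com/v1/items") is adapter)
print(session.get_adapter("https://other.example.org/") is adapter)
session.close()
```

With a session built this way, a call such as `session.get(url, timeout=(3.05, 27))` would pass a (connect timeout, read timeout) tuple as described in the `request` docstring, while `verify` and `cert` keep the defaults documented there.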

Original source: https://www.cnblogs.com/windyrainy/p/15156151.html
