• Python: requests / urllib3 connection pools


    0. Contents

    1. References

    2. pool_connections defaults to 10; each site host gets its own pool
      (4) Analysis
      host A>>host B>>host A page2>>host A page3
      With only one pool (one host) kept, the TCP source ports show that only the fourth get reuses an existing connection.

    3. pool_maxsize defaults to 10; each site host gets its own pool, and within that pool multiple connections to the same host can be kept to serve multiple threads
      (4) Analysis
      When the threads start, the number of connections opened to a given host is not limited by pool_maxsize, but afterwards only min(number of threads, pool_maxsize) connections are retained.
      Subsequent threads (at the application layer) do not care which specific connection (transport-layer source port) they end up using.

    1. References

    [Reposted translation] Notes on the requests library's connection pool

    Requests' secret: pool_connections and pool_maxsize

    Python - Trying out urllib3 -- using an HTTP connection pool

      Captured with Wireshark:

      every request to http://ent.qq.com/a/20111216/******.htm uses src port 13136, so the port (and hence the TCP connection) is being reused
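      That experiment is easy to approximate with urllib3 directly. A minimal sketch (the /sohu/*.html paths are placeholders standing in for the long-gone ent.qq.com pages):

      # Minimal sketch of the referenced experiment: several GETs to the same
      # host through one urllib3 pool, so a Wireshark capture shows a single
      # src port for all of them. Paths are placeholders, not real pages.
      import urllib3

      pool = urllib3.HTTPConnectionPool('www.sohu.com', port=80, maxsize=1)
      for path in ('/sohu/1.html', '/sohu/2.html', '/sohu/3.html'):
          r = pool.request('GET', path)
          print(path, r.status)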

    2. pool_connections defaults to 10; each site host gets its own pool

    (1) Code

    #!/usr/bin/env python
    # -*- coding: UTF-8 -*-
    
    import time
    import requests
    from threading import Thread
    
    import logging
    
    logging.basicConfig()
    logging.getLogger().setLevel(logging.DEBUG)
    # target the urllib3 logger directly; the output below comes from
    # "urllib3.connectionpool", not from the old vendored
    # "requests.packages.urllib3" name
    requests_log = logging.getLogger("urllib3")
    requests_log.setLevel(logging.DEBUG)
    requests_log.propagate = True
    
    url_sohu_1 = 'http://www.sohu.com/sohu/1.html'
    url_sohu_2 = 'http://www.sohu.com/sohu/2.html'
    url_sohu_3 = 'http://www.sohu.com/sohu/3.html'
    url_sohu_4 = 'http://www.sohu.com/sohu/4.html'
    url_sohu_5 = 'http://www.sohu.com/sohu/5.html'
    url_sohu_6 = 'http://www.sohu.com/sohu/6.html'
    
    url_news_1 = 'http://news.163.com/air/'
    url_news_2 = 'http://news.163.com/domestic/'
    url_news_3 = 'http://news.163.com/photo/'
    url_news_4 = 'http://news.163.com/shehui/'
    url_news_5 = 'http://news.163.com/uav/5/'
    url_news_6 = 'http://news.163.com/world/6/'
    
    s = requests.Session()
    # keep at most one host pool alive; visiting a second host evicts the first
    s.mount('http://', requests.adapters.HTTPAdapter(pool_connections=1))
    s.get(url_sohu_1)   # host A: new connection
    s.get(url_news_1)   # host B: new connection, the sohu pool is evicted
    s.get(url_sohu_2)   # host A again: pool recreated, another new connection
    s.get(url_sohu_3)   # host A: only now is a pooled connection reused

    (2) Log output

    DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): www.sohu.com              #host A
    DEBUG:urllib3.connectionpool:http://www.sohu.com:80 "GET /sohu/1.html HTTP/1.1" 404 None
    DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): news.163.com         #host B
    DEBUG:urllib3.connectionpool:http://news.163.com:80 "GET /air/ HTTP/1.1" 200 None
    DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): www.sohu.com         #host A  
    DEBUG:urllib3.connectionpool:http://www.sohu.com:80 "GET /sohu/2.html HTTP/1.1" 404 None  #host A page2
    DEBUG:urllib3.connectionpool:http://www.sohu.com:80 "GET /sohu/3.html HTTP/1.1" 404 None  #host A page3

    (3) Wireshark capture (how to filter HTTPS traffic? match the TCP SYN: ping m.10010.com for the server IP, then filter with tcp.flags == 0x0002 and ip.dst == 157.255.128.111)

    (4) Analysis

    host A>>host B>>host A page2>>host A page3

    With only one pool (one host) kept, visiting host B evicts host A's pool; in the log, the connection counter for www.sohu.com restarts at (1). The TCP source ports show that only the fourth get reuses an existing connection. A sketch of watching this from Python follows.
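    The eviction can also be observed by peeking at the adapter's pool manager. A minimal sketch: HTTPAdapter.poolmanager and its pools container are urllib3 internals and may change between versions:

    # Sketch: with pool_connections=1, the pool manager's internal
    # RecentlyUsedContainer never holds more than one host pool.
    adapter = requests.adapters.HTTPAdapter(pool_connections=1)
    s = requests.Session()
    s.mount('http://', adapter)
    for url in (url_sohu_1, url_news_1, url_sohu_2, url_sohu_3):
        s.get(url)
        print(len(adapter.poolmanager.pools))  # prints 1 each time: the older pool was evicted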

    3. pool_maxsize defaults to 10; each site host gets its own pool, and within that pool multiple connections to the same host can be kept to serve multiple threads

    (1) Code

    # each helper issues a GET on the shared session, optionally sleeps,
    # then issues a second GET
    def thread_get(url):
        s.get(url)

    def thread_get_wait_3s(url):
        s.get(url)
        time.sleep(3)
        s.get(url)

    def thread_get_wait_5s(url):
        s.get(url)
        time.sleep(5)
        s.get(url)

    s = requests.Session()
    # each host pool keeps at most 2 idle connections
    s.mount('http://', requests.adapters.HTTPAdapter(pool_maxsize=2))
    t1 = Thread(target=thread_get_wait_5s, args=(url_sohu_1,))
    t2 = Thread(target=thread_get, args=(url_news_1,))
    t3 = Thread(target=thread_get_wait_3s, args=(url_sohu_2,))
    t4 = Thread(target=thread_get_wait_5s, args=(url_sohu_3,))
    t1.start()
    t2.start()
    t3.start()
    t4.start()
    t1.join()
    t2.join()
    t3.join()
    t4.join()

    (2) Log output

    DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): www.sohu.com    #pool_sohu_connection_1_port_54805
    DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): news.163.com    #pool_163_connection_1_port_54806
    DEBUG:urllib3.connectionpool:Starting new HTTP connection (2): www.sohu.com    #pool_sohu_connection_2_port_54807
    DEBUG:urllib3.connectionpool:Starting new HTTP connection (3): www.sohu.com    #pool_sohu_connection_3_port_54808
    DEBUG:urllib3.connectionpool:http://news.163.com:80 "GET /air/ HTTP/1.1" 200 None
    DEBUG:urllib3.connectionpool:http://www.sohu.com:80 "GET /sohu/3.html HTTP/1.1" 404 None
    DEBUG:urllib3.connectionpool:http://www.sohu.com:80 "GET /sohu/2.html HTTP/1.1" 404 None
    DEBUG:urllib3.connectionpool:http://www.sohu.com:80 "GET /sohu/1.html HTTP/1.1" 404 None
    WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: www.sohu.com  #pool_sohu_connection_1_port_54805 discarded: 3 connections to the sohu host were opened at first, but ultimately only pool_maxsize=2 can be kept
    DEBUG:urllib3.connectionpool:http://www.sohu.com:80 "GET /sohu/2.html HTTP/1.1" 404 None     #pool_sohu_connection_2_port_54807 after 3s, sohu/2 reuses its own original port
    DEBUG:urllib3.connectionpool:http://www.sohu.com:80 "GET /sohu/3.html HTTP/1.1" 404 None     #pool_sohu_connection_2_port_54807 after 5s, sohu/3 reuses sohu/2's original port
    DEBUG:urllib3.connectionpool:http://www.sohu.com:80 "GET /sohu/1.html HTTP/1.1" 404 None     #pool_sohu_connection_3_port_54808 after 5s, sohu/1 reuses sohu/3's original port

    (3) Wireshark capture
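    No capture is reproduced here, but the source port can also be read from Python itself. A sketch relying on urllib3 / http.client internals (raw.connection and .sock), which may change between versions:

    # Sketch: read the local (source) port behind a response without Wireshark.
    r = s.get(url_sohu_1, stream=True)          # stream=True keeps the connection attached
    print(r.raw.connection.sock.getsockname())  # e.g. ('192.168.1.2', 54805)
    r.content                                   # drain the body so the connection returns to the pool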

    (4) Analysis

    When the threads start, the number of connections opened to a given host is not limited by pool_maxsize, but afterwards only min(number of threads, pool_maxsize) connections are retained.

    Subsequent threads (at the application layer) do not care which specific connection (transport-layer source port) they end up using; see the sketch below for capping the connection count outright.
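    If the extra connections and the "pool is full" warning are unwanted, the adapter can be told to block instead. A sketch using HTTPAdapter's pool_block parameter with the same threads as above:

    # Sketch: pool_block=True caps connections per host at pool_maxsize;
    # surplus threads wait for a free connection instead of opening extra
    # connections that are later discarded.
    s = requests.Session()
    s.mount('http://', requests.adapters.HTTPAdapter(pool_maxsize=2, pool_block=True))

    With this mounted, the "Connection pool is full, discarding connection" warning disappears, at the cost of threads waiting on one another for a free connection.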

  • Original post: https://www.cnblogs.com/my8100/p/7342010.html