• How powerful are Python coroutines?


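      I scraped listing pages from a certain ×× website to measure how much of a speedup coroutines actually give. I won't post the site's link here.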

    import time

    import requests
    from bs4 import BeautifulSoup as sb  # the 'lxml' parser needs the lxml package installed

    url = 'http://www.××××.com/html/part/index27_'
    url_list = []
    headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.78 Safari/537.36'}

    start = time.time()

    # Fetch the listing pages one after another.
    for i in range(2, 47):
        print('get page ' + str(i))
        # Pass headers by keyword: the second positional argument of requests.get is params.
        res = requests.get(url + str(i) + '.html', headers=headers)
        res.encoding = 'gb2312'
        soup = sb(res.text, 'lxml')  # parse once and reuse
        div = soup.find('div', class_="box list channel")
        for li in div.find_all('li'):
            urls = 'http://www.××××.com' + li.a.get('href')
            url_list.append(urls)
            print(urls)
    print(url_list)
    print(time.time() - start)

    The crawl took 111.7 s.

    Now let's try coroutines:

      

    # Patch the standard library before requests is imported,
    # so that its sockets become cooperative.
    from gevent import monkey
    monkey.patch_all()

    import time

    import gevent
    import requests
    from bs4 import BeautifulSoup as sb  # the 'lxml' parser needs the lxml package installed

    url = 'http://www.231ka.com/html/part/index27_'
    url_list = []
    headers = {
        'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.78 Safari/537.36'}

    for i in range(2, 47):
        url_list.append(url + str(i) + '.html')

    def get(page_url):
        print('get data from :' + page_url)
        # Pass headers by keyword: the second positional argument of requests.get is params.
        res = requests.get(page_url, headers=headers)
        res.encoding = 'gb2312'
        soup = sb(res.text, 'lxml')  # parse once and reuse
        div = soup.find('div', class_="box list channel")
        for li in div.find_all('li'):
            ur = 'http://www.231ka.com' + li.a.get('href')
            print(ur)

    start = time.time()

    # Spawn one greenlet per page, then wait for all of them to finish.
    task = []
    for page_url in url_list:
        task.append(gevent.spawn(get, page_url))
    gevent.joinall(task)

    print(time.time() - start)

    Result: 55.6 s

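    In other words, with both versions running in a single thread, switching to coroutines cut the run time roughly in half (111.7 s down to 55.6 s), and all it took was gevent, a third-party coroutine library for Python. The gain comes from overlapping the time spent waiting on network responses: once monkey.patch_all() has run, a greenlet that blocks on a socket yields to another greenlet instead of stalling the whole program.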
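
    To see why overlapping waits helps without touching a real website, here is a minimal sketch. It is not the gevent code above: it swaps in asyncio from the standard library, and asyncio.sleep stands in for network latency. The names fake_fetch, run_sequential, run_concurrent and timed, and the 0.1 s delay, are assumptions made up for illustration.

```python
import asyncio
import time

async def fake_fetch(page, delay=0.1):
    # asyncio.sleep stands in for waiting on a network response (assumption).
    await asyncio.sleep(delay)
    return page

async def run_sequential(pages):
    # Finish each simulated request before starting the next one.
    return [await fake_fetch(p) for p in pages]

async def run_concurrent(pages):
    # Start every simulated request at once; the waits overlap in one thread.
    return await asyncio.gather(*(fake_fetch(p) for p in pages))

def timed(coro_func, pages):
    # Run one of the coroutines above and measure its wall-clock time.
    start = time.perf_counter()
    result = asyncio.run(coro_func(pages))
    return result, time.perf_counter() - start

pages = list(range(2, 12))
seq_result, seq_time = timed(run_sequential, pages)
con_result, con_time = timed(run_concurrent, pages)
print(f'sequential: {seq_time:.2f} s, concurrent: {con_time:.2f} s')
```

    In this simulation the sequential run takes about the sum of all the delays, while the concurrent run takes about as long as a single delay. A real crawl gains less than that, since parsing and the server's own response time do not overlap as cleanly.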

    Awesome.

  • Original article: https://www.cnblogs.com/peter1994/p/7507235.html