• How much faster are Python coroutines?


      I scraped some pages from a certain website to measure how much of a speedup coroutines actually give. I won't post the link to the site.

    import time

    import requests
    from bs4 import BeautifulSoup as sb  # the 'lxml' parser needs the lxml package installed

    url = 'http://www.××××.com/html/part/index27_'
    url_list = []
    # Browser-like User-Agent, defined once instead of on every iteration.
    headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.78 Safari/537.36'}

    start = time.time()

    # Fetch list pages 2..46 one after another and collect the links on each.
    for i in range(2, 47):
        print('get page ' + str(i))
        # Pass headers by keyword: the second positional argument of requests.get is params.
        res = requests.get(url + str(i) + '.html', headers=headers)
        res.encoding = 'gb2312'
        div = sb(res.text, 'lxml').find('div', class_="box list channel")
        for li in div.find_all('li'):
            urls = 'http://www.××××.com' + li.a.get('href')
            url_list.append(urls)
            print(urls)
    print(url_list)
    print(time.time() - start)

    The sequential crawl took 111.7 s.

    Now let's try coroutines:

      

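    # Patch the standard library first, before requests is imported,
    # so that its blocking socket calls cooperate with gevent.
    from gevent import monkey
    monkey.patch_all()

    import time

    import gevent
    import requests
    from bs4 import BeautifulSoup as sb  # the 'lxml' parser needs the lxml package installed

    url = 'http://www.231ka.com/html/part/index27_'
    url_list = []

    # Build the URLs of list pages 2..46 up front.
    for i in range(2, 47):
        url_list.append(url + str(i) + '.html')

    headers = {
        'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.78 Safari/537.36'}

    def get(url):
        """Fetch one list page and print every link found on it."""
        print('get data from :' + url)
        # Pass headers by keyword: the second positional argument of requests.get is params.
        res = requests.get(url, headers=headers)
        res.encoding = 'gb2312'
        div = sb(res.text, 'lxml').find('div', class_="box list channel")
        for li in div.find_all('li'):
            ur = 'http://www.231ka.com' + li.a.get('href')
            print(ur)

    start = time.time()

    # Spawn one greenlet per page, then wait for all of them to finish.
    task = []
    for page_url in url_list:
        task.append(gevent.spawn(get, page_url))
    gevent.joinall(task)

    print(time.time() - start)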

    Result: 55.6 s
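To put the two measurements side by side, here is a small arithmetic sketch. The only inputs are the two timings reported in this post; the variable names are made up for illustration.

```python
# Timings reported in the post, in seconds.
sequential_s = 111.7
coroutine_s = 55.6

speedup = sequential_s / coroutine_s         # how many times faster the gevent run was
time_saved = 1 - coroutine_s / sequential_s  # fraction of wall-clock time removed

print(f"speedup: {speedup:.2f}x, time saved: {time_saved:.1%}")
# prints "speedup: 2.01x, time saved: 50.2%"
```

These are single runs against a live site, so the ratio is a rough indication rather than a precise benchmark.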

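    In other words, with both versions running in a single thread, switching to coroutines cut the run time roughly in half (111.7 s down to 55.6 s), and it took nothing more than a third-party Python coroutine library (gevent). The gain comes from overlapping the waits: while one request is waiting on the network, another greenlet gets to run.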
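The overlap effect can be reproduced without touching any website. The sketch below is not the gevent code from this post: it swaps in the standard library's asyncio so it runs with no third-party packages, and it uses `asyncio.sleep` as a stand-in for network latency. `DELAY`, `PAGES`, and `fake_fetch` are made-up names and values for illustration only.

```python
import asyncio
import time

DELAY = 0.05  # simulated network latency per "page", in seconds (made-up value)
PAGES = 10    # number of simulated pages (made-up value)

async def fake_fetch(page):
    # asyncio.sleep stands in for waiting on a server. It hands control back
    # to the event loop, much as a patched socket call does under gevent.
    await asyncio.sleep(DELAY)
    return page

async def run_sequential():
    # Finish each fetch before starting the next one: the waits add up.
    return [await fake_fetch(p) for p in range(PAGES)]

async def run_concurrent():
    # Start every fetch, then wait for all of them: the waits overlap.
    return await asyncio.gather(*(fake_fetch(p) for p in range(PAGES)))

def timed(coro_fn):
    # Run one coroutine function to completion and measure wall-clock time.
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start

seq_result, seq_time = timed(run_sequential)
con_result, con_time = timed(run_concurrent)
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

With pure waiting and no parsing work, the concurrent run should take about one `DELAY` instead of `PAGES` of them. The real crawl improved only about 2x, presumably because the server's own response times and the HTML parsing (which does not overlap in a single thread) limit how much can run at once.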

    Pretty impressive.

  • Original article: https://www.cnblogs.com/peter1994/p/7507235.html