• CSDN刷阅读数


    今天我们来盘一下csdn,做一个小程序,为什么做这个呢?今天小编看着我的博客的阅读数,唉,惨不忍睹,没办法,只能想一些........呃呃呃呃,你懂的。

    话不多说,分析一波csdn的阅读数,计数原理是每次进入页面记作一次,所以我们很简单的构建一个访问的小爬虫就好了,那么开始操作。

     1 import  requests
     2 import  time
     3 from lxml import etree
     4 import  random
     5  6 def post_article():
     7     '''下面url换成自己的,获取自己所有博客的链接'''
     8     response = requests.get(url='me_url',headers = getHeaders())
     9     text = response.content.decode('utf-8')
    10     html = etree.HTML(text)
    11     urls = html.xpath('//h4/a/@href')
    12     for url in urls:
    13         article_url.append(url)
    14         
    15 def access_url():
    16     '''访问其中一个url,随机从自己的博客中选中进行访问'''
    17     try:
    18         url = random.choice(article_url)
    19         response = requests.get(url, headers=getHeaders())
    20         time.sleep(2)
    21     except Exception as e :
    22         print(e)

    根据上面的代码,你的博客阅读数会蹭蹭的上涨,唉,想想都泪奔,要靠这种,

    我们下面写一下注意的就可以,设置headers,还有睡眠时间等,频繁的访问会使服务器拒绝为你增加阅读数,you ok?(散装英语).

    再加上我们设置的headers:

     1 def getHeaders():
     2     user_agent_list = [ 
     3         "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/22.0.1207.1 Safari/537.1" 
     4         "Mozilla/5.0 (X11; CrOS i686 2268.111.0) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.57 Safari/536.11", 
     5         "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.6 (KHTML, like Gecko) Chrome/20.0.1092.0 Safari/536.6", 
     6         "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.6 (KHTML, like Gecko) Chrome/20.0.1090.0 Safari/536.6", 
     7         "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/19.77.34.5 Safari/537.1", 
     8         "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.9 Safari/536.5", 
     9         "Mozilla/5.0 (Windows NT 6.0) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.36 Safari/536.5", 
    10         "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3", 
    11         "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3", 
    12         "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3", 
    13         "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1062.0 Safari/536.3", 
    14         "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1062.0 Safari/536.3", 
    15         "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3", 
    16         "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3", 
    17         "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3", 
    18         "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.0 Safari/536.3", 
    19         "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.24 (KHTML, like Gecko) Chrome/19.0.1055.1 Safari/535.24", 
    20         "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/535.24 (KHTML, like Gecko) Chrome/19.0.1055.1 Safari/535.24"
    21     ]
    22     UserAgent = random.choice(user_agent_list)
    23     headers = {'User-Agent': UserAgent}
    24     return headers

    主程序代码块:

     1 if __name__ == '__main__':
     2     index = 0
     3     post_article()
     4     print('进行到这了。。。')
     5     while True:
     6         access_url()
     7         print(index)
     8         index += 1
     9         '''自己随意设计的次数'''
    10         if index == 100000:
    11             break

    这个小爬虫就这么出来了,不要过度使用,只为学习技术,有任何纠纷跟我无关(瑟瑟发抖)。

     

  • 相关阅读:
    记一次HTTP劫持故障排查
    前面任意字符+固定字符+任意字符+固定字符匹配
    php-fpm启动,重启,终止操作
    crontab防止脚本周期内未执行完重复执行
    js深拷贝和浅拷贝
    vue 异步刷新页面,
    vue强制刷新组件
    js判断终端以及APP应用判断
    微信返回上一页的按钮会强制性使用页面缓存,不刷新页面
    json键和值转数组
  • 原文地址:https://www.cnblogs.com/xbhog/p/11745793.html
Copyright © 2020-2023  润新知