Learning Progress Bar 42

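For Python 3, urllib is a very important module that makes it easy to access the web programmatically, much as a browser does. For Python 3 crawlers it is practically indispensable, because it takes care of opening and processing URLs.

urllib.request is a submodule of urllib that can open and handle more complicated URLs. The official documentation describes it as a module that "defines functions and classes which help in opening URLs (mostly HTTP) in a complex world": basic and digest authentication, redirections, cookies and more.

The urllib.request.urlopen() function opens a URL and returns an http.client.HTTPResponse object. Calling read() on that object returns the response body as bytes, which is then decoded and printed with print().

urllib.request.urlopen(url, data=None, [timeout, ]*, cafile=None, capath=None, cadefault=False, context=None)

For HTTP and HTTPS URLs, this function returns a http.client.HTTPResponse object slightly modified.
(Source: https://docs.python.org/3/library/urllib.request.html)

decode('utf-8') converts the bytes returned by read() into a str, assuming the page is encoded in UTF-8. Without this step the output is a raw bytes literal, and non-ASCII text appears garbled.

2. Simulating a browser to fetch pages

Some websites check whether a request carries browser-like request headers (such as User-Agent) to decide whether it comes from a crawler, and use this check as an anti-crawling measure.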
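The steps described above (open the URL, read the body, decode it, and send browser-like headers) can be sketched roughly as follows. This is only a minimal illustration, not the original author's code: the helper names build_request and fetch_text, the User-Agent string, and the 10-second timeout are all assumptions made for the example.

```python
import urllib.request

# Hypothetical User-Agent value, chosen only for illustration.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
}


def build_request(url, headers=None):
    """Wrap a URL in a Request object that carries browser-like headers."""
    return urllib.request.Request(url, headers=headers or BROWSER_HEADERS)


def fetch_text(url, encoding="utf-8", timeout=10):
    """Open the URL, read the raw body, and decode the bytes into a str."""
    request = build_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read()    # raw bytes of the response body
    return body.decode(encoding)  # str, assuming the page uses `encoding`
```

Calling print(fetch_text(url)) with the address of a real page prints the decoded HTML. If the page is not actually UTF-8 encoded, decode() may raise UnicodeDecodeError, so the encoding argument should match the page.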

Original post: https://www.cnblogs.com/hhw12345/p/14910353.html