>>> import os
>>> os.getcwd()
'C:\Python33'
>>> os.chdir('E:\python\mmy')
>>> os.getcwd()
'E:\python\mmy'
>>> import urllib.request
>>> urllib.request.urlopen('http://image.edai.com/avatar/000/88/14/23_avatar_middle.jpg')
<http.client.HTTPResponse object at 0x00000000032E0FD0>
>>> response = urllib.request.urlopen('http://image.edai.com/avatar/000/88/14/23_avatar_middle.jpg')
>>> response.getcode()
200
>>> response.geturl()
'http://image.edai.com/avatar/000/88/14/23_avatar_middle.jpg'
>>> response.info()
<http.client.HTTPMessage object at 0x00000000032ED6A0>
>>> print(response.info())
Expires: Thu, 31 Dec 2037 23:55:55 GMT
Date: Sat, 30 Jan 2016 13:18:38 GMT
Server: nginx/0.8.42
Content-Type: image/jpeg
Content-Length: 8053
Last-Modified: Thu, 08 Jan 2015 06:46:11 GMT
Pragma: public
Accept-Ranges: bytes
Age: 1
X-Via: 1.1 scxx84:1 (Cdn Cache Server V2.0)
Connection: close
Cache-Control: public, must-revalidate, proxy-revalidate

>>> pic = response.read()
>>> with open('liuhui.jpg', 'wb') as f:
        f.write(pic)

8053
>>>
liuhui.jpg is now on the local disk!
The session above was run in IDLE. The code that actually does the work is just this:
import urllib.request

response = urllib.request.urlopen('http://image.edai.com/avatar/000/88/14/23_avatar_middle.jpg')
pic = response.read()
with open('liuhui.jpg', 'wb') as f:
    f.write(pic)
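The same download can be wrapped in a small reusable function. This is only a sketch: the name `download`, its parameters, and the 10-second default timeout are my own choices for illustration, not part of the original code. It streams the response to disk with `shutil.copyfileobj` instead of reading it all into memory first, and the `with` blocks close both the response and the file.

```python
import os
import shutil
import urllib.request


def download(url, dest_path, timeout=10):
    """Save the body of `url` to `dest_path` and return the file size in bytes."""
    # urlopen() responses are context managers, so the connection is closed
    # automatically when the block exits.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(dest_path, 'wb') as f:
            # Copy in chunks rather than holding the whole body in memory.
            shutil.copyfileobj(response, f)
    return os.path.getsize(dest_path)

# Example call (needs network access):
# download('http://image.edai.com/avatar/000/88/14/23_avatar_middle.jpg', 'liuhui.jpg')
```

For a small avatar image, reading everything with `response.read()` is perfectly fine. Streaming mainly pays off for larger files.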
Notes:
1. urlopen() accepts either a URL string or a Request object. Passing a string is effectively the same as carrying out two steps:
(1) req = urllib.request.Request('http://image.edai.com/avatar/000/88/14/23_avatar_middle.jpg');
(2) response = urllib.request.urlopen(req).
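2. For an HTTP URL, urllib.request.urlopen() returns an http.client.HTTPResponse object, that is, the response received by the client. In HTTP every client request is answered with a response, and that response carries header information in addition to the body. The returned object therefore also provides the following three important methods:
(1) getcode(): the HTTP status code
(2) geturl(): the URL of the resource actually retrieved (after a redirect this can differ from the URL that was requested)
(3) info(): the HTTP header information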
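The two notes above can be combined into one short sketch: build the Request explicitly, open it, and collect the three pieces of metadata. The function names `build_request` and `summarize_response` and the `User-Agent` value are hypothetical, chosen only for this illustration. Since Python 3.9 the documentation marks `getcode()`, `geturl()` and `info()` as deprecated in favor of the `status`, `url` and `headers` attributes, so on newer versions those attributes may be the better choice.

```python
import urllib.request


def build_request(url, user_agent='example-client/0.1'):
    """Step (1): wrap the URL in a Request object. Nothing is sent yet."""
    # The User-Agent value is a made-up placeholder for illustration.
    return urllib.request.Request(url, headers={'User-Agent': user_agent})


def summarize_response(req, timeout=10):
    """Step (2): open the request and gather status, final URL and headers."""
    with urllib.request.urlopen(req, timeout=timeout) as response:
        headers = response.info()  # header lookup is case-insensitive
        return {
            'status': response.getcode(),
            'url': response.geturl(),
            'content_type': headers.get('Content-Type'),
            'content_length': headers.get('Content-Length'),
        }

# Example call (needs network access):
# req = build_request('http://image.edai.com/avatar/000/88/14/23_avatar_middle.jpg')
# print(summarize_response(req))
```

Building the Request yourself is mostly useful when you need to attach headers or form data. For a plain GET, passing the URL string straight to urlopen() does the same job.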