• A simple wrapper around Python's urllib2 HTTP request module


    This article shares a simple wrapper class around urllib2, Python 2's HTTP request module.

    Reposted from: http://www.jbxue.com/article/16585.html

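    The SendRequest class below wraps urllib2: it builds the request (method, authentication, extra headers), sends it, and exposes the response body, status code, and headers through getter methods.
    Example: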

    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    import base64
    import urllib
    import urllib2
    import time


    class SendRequest:
        '''
        Build and send an HTTP request, then expose details of the response.
        e.g. set the authorization type, request type..
        e.g. get the html content, status code, cookie..
        SendRequest('http://10.75.0.103:8850/2/photos/square/type.json',
                    data='source=216274069', type='POST', auth='base',
                    user='zl2010', password='111111')
        '''

        def __init__(self, url, data=None, type='GET', auth=None, user=None, password=None, cookie=None, **header):
            '''
            url: request url, an error is raised if it is empty
            data: url-encoded string for POST or GET, e.g. from urllib.urlencode()
            type: GET, POST
            auth: optional, if given it must be 'base' or 'cookie'
            user: user for auth
            password: password for auth
            cookie: cookie string, required when auth is 'cookie'
            other header info, passed as keyword arguments:
            e.g. referer='www.sina.com.cn'
            'user-agent' is not a valid keyword name, so pass it as
            user_agent='...' or as **{'user-agent': '...'}
            '''
            self.url = url
            self.data = data
            self.type = type
            self.auth = auth
            self.user = user
            self.password = password
            self.cookie = cookie

            # header keys are strings, so they must be quoted;
            # dict.get() returns None when the key is missing
            self.referer = header.get('referer')
            self.user_agent = header.get('user-agent', header.get('user_agent'))

            self.setup_request()
            self.send_request()

        def setup_request(self):
            '''
            set up the request object
            '''
            # string exceptions are not allowed since Python 2.6, so raise ValueError
            if not self.url:
                raise ValueError('The url should not be empty!')

            # set request type
            if self.type == 'POST':
                # urllib2 sends a GET when data is None, so fall back to an empty body
                self.Req = urllib2.Request(self.url, self.data or '')
            elif self.type == 'GET':
                if self.data is None:
                    self.Req = urllib2.Request(self.url)
                else:
                    self.Req = urllib2.Request(self.url + '?' + self.data)
            else:
                # fail early: self.Req would otherwise be undefined below
                raise ValueError('The http request type is NOT supported: %s' % self.type)

            # set auth type
            if self.auth == 'base':
                if self.user is None or self.password is None:
                    raise ValueError('The user or password was not given!')
                # b64encode, unlike encodestring, adds no newlines to strip
                auth_info = 'Basic ' + base64.b64encode(self.user + ':' + self.password)
                self.Req.add_header("Authorization", auth_info)
            elif self.auth == 'cookie':
                if self.cookie is None:
                    raise ValueError('The cookie was not given!')
                self.Req.add_header("Cookie", self.cookie)
            else:
                pass  # add other auth types here

            # set other header info
            if self.referer:
                self.Req.add_header('referer', self.referer)
            if self.user_agent:
                self.Req.add_header('user-agent', self.user_agent)

    
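        def send_request(self):
            '''
            send the request and store the response details
            '''
            # defaults, so the getters still work when the request fails
            self.source = None
            self.goal_url = None
            self.code = None
            self.head_dict = {}
            try:
                # get a response object
                self.Res = urllib2.urlopen(self.Req)
                self.source = self.Res.read()
                self.goal_url = self.Res.geturl()
                self.code = self.Res.getcode()
                self.head_dict = self.Res.info().dict
                self.Res.close()
            except urllib2.HTTPError as e:
                # the server answered with an error status; keep only the code
                self.code = e.code
                print(e)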
    
        def get_code(self):
            return self.code

        def get_url(self):
            return self.goal_url

        def get_source(self):
            return self.source

        def get_header_info(self):
            return self.head_dict

        # header lookups return None when the header is absent
        def get_cookie(self):
            return self.head_dict.get('set-cookie')

        def get_content_type(self):
            return self.head_dict.get('content-type')

        def get_expires_time(self):
            return self.head_dict.get('expires')

        def get_server_name(self):
            return self.head_dict.get('server')

        def __del__(self):
            pass


    # __all__ must list names as strings, not the objects themselves
    __all__ = ['SendRequest']
    
    if __name__ == '__main__':
        # Example of using the SendRequest class
        value = {'source': '216274069'}
        data = urllib.urlencode(value)
        url = 'http://10.75.0.103:8850/2/photos/square/type.json'
        user = 'wz_0001'
        password = '111111'
        auth = 'base'
        req_type = 'POST'  # renamed from "type" to avoid shadowing the builtin
        t2 = time.time()
        rs = SendRequest('http://www.google.com')
        #rs = SendRequest(url, data=data, type=req_type, auth=auth, user=user, password=password)
        print('t2: ' + str(time.time() - t2))
        print('---------------get_code()---------------')
        print(rs.get_code())
        print('---------------get_url()---------------')
        print(rs.get_url())
        print('---------------get_source()---------------')
        print(rs.get_source())
        print('---------------get_cookie()---------------')
        print(rs.get_cookie())
        rs = None
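
The class above targets Python 2, where urllib2 lives. For readers on Python 3, the request-building half of the same idea can be sketched with urllib.request and urllib.parse. This is only a sketch under stated assumptions, not a port of the original class: the function names basic_auth_header and build_request, the example.com URL, and the Aladdin credentials (the sample pair from the HTTP Basic authentication RFC) are all made up for illustration, and the request is built but never sent.

```python
import base64
import urllib.parse
import urllib.request


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic Authorization header."""
    # b64encode works on bytes and adds no newlines.
    token = base64.b64encode("{}:{}".format(user, password).encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_request(url, params=None, method="GET", user=None, password=None,
                  cookie=None, headers=None):
    """Build (but do not send) a urllib.request.Request.

    Hypothetical helper: it mirrors the setup_request() step of the
    article's class, with a dict of parameters instead of a
    pre-encoded string.
    """
    if not url:
        raise ValueError("The url must not be empty")
    method = method.upper()
    if method not in ("GET", "POST"):
        raise ValueError("Unsupported request type: %s" % method)

    encoded = urllib.parse.urlencode(params) if params else None
    if method == "POST":
        # In Python 3 the request body must be bytes.
        body = (encoded or "").encode("ascii")
        req = urllib.request.Request(url, data=body, method="POST")
    else:
        full_url = url + "?" + encoded if encoded else url
        req = urllib.request.Request(full_url, method="GET")

    if user is not None and password is not None:
        req.add_header("Authorization", basic_auth_header(user, password))
    if cookie:
        req.add_header("Cookie", cookie)
    for name, value in (headers or {}).items():
        req.add_header(name, value)
    return req


# Build a POST request with Basic auth and a Referer header (nothing is sent).
req = build_request(
    "http://example.com/api",
    params={"source": "216274069"},
    method="POST",
    user="Aladdin",
    password="open sesame",
    headers={"Referer": "http://example.com/"},
)
print(req.get_method())
print(req.data)
print(req.get_header("Authorization"))
```

Passing the returned object to urllib.request.urlopen() would send it. Keeping construction separate from sending makes the header and body logic easy to check without network access, which the original class does not allow because its constructor sends the request immediately.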
  • Original article: https://www.cnblogs.com/study100/p/3539515.html