• Cookies, HTTP headers, and other details to watch when simulating a browser with HttpClient


    Reposted from: http://resolute.javaeye.com/blog/491701

     

    commons-httpclient is an open-source Apache project that provides a pure-Java HTTP client. It makes it easy to send HTTP requests, receive HTTP responses, manage cookies automatically, and so on.

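    The contact-list library needs the following features: automatic cookie management, setting HTTP headers, sending HTTP requests, receiving HTTP responses, following HTTP redirects, and logging HTTP requests and responses. The sections below explain how each one is implemented: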

    1. Automatic cookie management
    public EmailImporter(String email, String password, String encoding) {
        ......
        client = new HttpClient();
        // handle cookies the way mainstream browsers do
        client.getParams().setCookiePolicy(CookiePolicy.BROWSER_COMPATIBILITY);
        // send all cookies in one Cookie header instead of one header per cookie
        client.getParams().setParameter("http.protocol.single-cookie-header", true);
    }

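    Here the HttpClient cookie policy is set to CookiePolicy.BROWSER_COMPATIBILITY, which tells the Java client to handle cookies automatically the way a browser does. You can also adjust cookies by hand at run time. For example: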
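    For comparison, the same idea (cookies stored automatically from responses, plus a manual adjustment) can be sketched with the JDK's own java.net.CookieManager instead of commons-httpclient. This is only an illustration under that assumption: the class name, host, and cookie values are made up, and CookiePolicy here is java.net.CookiePolicy, not the commons-httpclient class of the same name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.HttpCookie;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of contact-list or commons-httpclient.
final class JdkCookieSketch {
    /** Feeds one Set-Cookie response header to a CookieManager, adds one cookie by hand,
     *  and returns everything the cookie store now holds. */
    static List<HttpCookie> storeFromResponse(String url, String setCookieValue) {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        URI uri = URI.create(url);
        Map<String, List<String>> responseHeaders = new HashMap<>();
        responseHeaders.put("Set-Cookie", Collections.singletonList(setCookieValue));
        try {
            // automatic part: the manager parses and stores cookies from response headers
            manager.put(uri, responseHeaders);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // manual part: add a time-stamped cookie ourselves
        HttpCookie manual = new HttpCookie("CkTst", "G" + System.currentTimeMillis());
        manual.setPath("/");
        manager.getCookieStore().add(uri, manual);
        return manager.getCookieStore().getCookies();
    }
}
```

    Calling storeFromResponse("http://mail.example.com/login", "sid=abc123; Path=/") should leave two cookies in the store: the one parsed from the header and the one added by hand.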

    Before logging in to Hotmail, a cookie carrying the current time has to be set:
    client.getState().addCookie(new Cookie("login.live.com", "CkTst", "G" + new Date().getTime()));

    However, httpclient does not seem to provide a way to delete individual cookies, so I added two cookie-management methods: one keeps only the named cookies, the other removes the named cookies:
    protected void retainCookies(String[] cookieNames) {
        // contains() works on unsorted input; Arrays.binarySearch requires a sorted array
        List<String> names = Arrays.asList(cookieNames);
        ArrayList<Cookie> retainCookies = new ArrayList<Cookie>();
        for (Cookie cookie : client.getState().getCookies()) {
            if (names.contains(cookie.getName())) {
                retainCookies.add(cookie);
            }
        }
        // replace the whole cookie set with the filtered one
        client.getState().clearCookies();
        client.getState().addCookies(retainCookies.toArray(new Cookie[0]));
    }

    protected void removeCookies(String[] cookieNames) {
        List<String> names = Arrays.asList(cookieNames);
        ArrayList<Cookie> retainCookies = new ArrayList<Cookie>();
        for (Cookie cookie : client.getState().getCookies()) {
            if (!names.contains(cookie.getName())) {
                retainCookies.add(cookie);
            }
        }
        client.getState().clearCookies();
        client.getState().addCookies(retainCookies.toArray(new Cookie[0]));
    }
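    The retain/remove logic itself does not depend on HttpClient, so it can be tried out in isolation. The sketch below is a JDK-only illustration in which cookies are reduced to a name-to-value map; the class and method names are hypothetical. It looks names up in a HashSet, which works for names given in any order, whereas Arrays.binarySearch only gives defined results on a sorted array.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class: cookies are modelled as name -> value entries.
final class CookieNameFilterSketch {
    /**
     * Returns the entries whose name is listed in names (keep == true)
     * or the entries whose name is NOT listed (keep == false).
     */
    static Map<String, String> filter(Map<String, String> cookies, String[] names, boolean keep) {
        Set<String> listed = new HashSet<>(Arrays.asList(names));
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : cookies.entrySet()) {
            if (listed.contains(entry.getKey()) == keep) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }
}
```

    With cookies a, b, and c and the unsorted name list {"c", "a"}, filter(..., true) keeps a and c, and filter(..., false) keeps only b.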

    2. Setting HTTP headers:

    Setting the HTTP headers makes the mail server believe it is talking to a browser, which reduces the chance of the request being refused:
    private void setHeaders(HttpMethod method) {
        // Accept value as sent by Firefox 3
        method.setRequestHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        method.setRequestHeader("Accept-Language", "zh-cn");
        method.setRequestHeader("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3");
        // encoding is the charset passed to the EmailImporter constructor
        method.setRequestHeader("Accept-Charset", encoding);
        method.setRequestHeader("Keep-Alive", "300");
        method.setRequestHeader("Connection", "Keep-Alive");
        method.setRequestHeader("Cache-Control", "no-cache");
    }

    In addition, set the Referer on both GET and POST requests, and set the Content-Type on POST requests:
    protected String doPost(String actionUrl, NameValuePair[] params, String referer) throws HttpException, IOException {
        ......
        method.setRequestHeader("Referer", referer);
        method.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
        ......
    }
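    The Content-Type above promises the server a form-encoded body. To make concrete what such a body looks like on the wire, here is a JDK-only sketch that encodes name/value pairs with java.net.URLEncoder. The class name, method name, and sample values are hypothetical; in the article's code the encoding is left to PostMethod.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical illustration class; shows the application/x-www-form-urlencoded format only.
final class FormBodySketch {
    /** Encodes alternating name, value, name, value, ... strings as a form body. */
    static String encode(String charset, String... namesAndValues) {
        if (namesAndValues.length % 2 != 0) {
            throw new IllegalArgumentException("expected name/value pairs");
        }
        StringBuilder body = new StringBuilder();
        try {
            for (int i = 0; i < namesAndValues.length; i += 2) {
                if (body.length() > 0) {
                    body.append('&'); // pairs are separated by '&'
                }
                body.append(URLEncoder.encode(namesAndValues[i], charset))
                    .append('=')
                    .append(URLEncoder.encode(namesAndValues[i + 1], charset));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charset, e);
        }
        return body.toString();
    }
}
```

    For example, encode("UTF-8", "login", "a b@example.com", "passwd", "p&q=1") yields login=a+b%40example.com&passwd=p%26q%3D1: spaces become "+", and reserved characters such as "@", "&", and "=" are percent-escaped.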

    3. Sending HTTP requests and receiving HTTP responses.

    contact-list only uses GET and POST requests, which I wrapped in two simple helpers:
    protected String doGet(String url, String referer) throws HttpException, IOException {
        GetMethod method = new GetMethod(url);
        setHeaders(method);
        method.setRequestHeader("Referer", referer);
        // log request
        try {
            client.executeMethod(method);
            String responseStr = readInputStream(method.getResponseBodyAsStream());
            // log response
            lastUrl = method.getURI().toString();
            return responseStr;
        } finally {
            // always return the connection, even if the request fails
            method.releaseConnection();
        }
    }

    protected String doPost(String actionUrl, NameValuePair[] params, String referer) throws HttpException, IOException {
        PostMethod method = new PostMethod(actionUrl);
        setHeaders(method);
        method.setRequestHeader("Referer", referer);
        method.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
        method.setRequestBody(params);
        // log request
        String responseStr;
        try {
            client.executeMethod(method);
            responseStr = readInputStream(method.getResponseBodyAsStream());
            // log response
        } finally {
            method.releaseConnection();
        }
        // POST redirects are not followed automatically, so check for Location here
        if (method.getResponseHeader("Location") != null) {
            // do redirect
        } else {
            lastUrl = method.getURI().toString();
            return responseStr;
        }
    }

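    4. HTTP redirects. There are two main kinds. The first is driven by the Location response header:
    String location = method.getResponseHeader("Location").getValue();
    // doGet also takes a referer; passing the redirecting URL is an assumption
    String currentUrl = method.getURI().toString();
    if (location.startsWith("http")) {
        // absolute URL: follow it as-is
        return doGet(location, currentUrl);
    } else {
        // relative URL: prepend the host that answered the request
        return doGet("http://" + getResponseHost(method) + location, currentUrl);
    }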
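    As a possible alternative to the prefix check and string concatenation, the JDK's java.net.URI can resolve a Location value against the URL that was requested. The sketch below is an illustration only (class name, method name, and URLs are hypothetical); besides absolute and root-relative values it also handles path-relative ones and keeps the original scheme. Note that URI.create throws IllegalArgumentException for values containing illegal characters such as spaces.

```java
import java.net.URI;

// Hypothetical illustration class; not part of contact-list.
final class RedirectResolveSketch {
    /** Resolves a Location header value against the URL of the request that produced it. */
    static String resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }
}
```

    With the request URL http://mail.example.com/login/index.jsp, the value /inbox resolves to http://mail.example.com/inbox, next.jsp resolves to http://mail.example.com/login/next.jsp, and an absolute value such as https://other.example.com/home?x=1 is returned unchanged.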

    The second is driven by window.location.replace in JavaScript.
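    The article does not show how the JavaScript case is detected. One simple approach, sketched here as an assumption rather than as the author's method, is to search the response body with a regular expression for a window.location.replace call that takes a quoted string literal. The class and method names are hypothetical, and the pattern will not match URLs assembled from variables or string concatenation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of contact-list.
final class JsRedirectSketch {
    // matches window.location.replace("...") or window.location.replace('...')
    private static final Pattern REPLACE_CALL = Pattern.compile(
            "window\\.location\\.replace\\(\\s*[\"']([^\"']+)[\"']\\s*\\)");

    /** Returns the URL passed to the first window.location.replace(...) call, or null if none. */
    static String findRedirectTarget(String html) {
        Matcher matcher = REPLACE_CALL.matcher(html);
        return matcher.find() ? matcher.group(1) : null;
    }
}
```

    The returned URL can then be fetched with a GET request in the same way as a Location redirect.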

     

    5. Logging requests and responses. This is very important for debugging:
    private void logGetRequest(GetMethod method) throws URIException {
        logger.debug("do get request: " + method.getURI().toString());
        logger.debug("header:\n" + getHeadersStr(method.getRequestHeaders()));
        logger.debug("cookie:\n" + getCookieStr());
    }

    private void logGetResponse(GetMethod method, String responseStr) throws URIException {
        logger.debug("do get response: " + method.getURI().toString());
        logger.debug("header:\n" + getHeadersStr(method.getResponseHeaders()));
        logger.debug("body:\n" + responseStr);
    }

    private void logPostRequest(PostMethod method) throws URIException {
        logger.debug("do post request: " + method.getURI().toString());
        logger.debug("header:\n" + getHeadersStr(method.getRequestHeaders()));
        logger.debug("body:\n" + getPostBody(method.getParameters()));
        logger.debug("cookie:\n" + getCookieStr());
    }

    private void logPostResponse(PostMethod method, String responseStr) throws URIException {
        logger.debug("do post response: " + method.getURI().toString());
        logger.debug("header:\n" + getHeadersStr(method.getResponseHeaders()));
        logger.debug("body:\n" + responseStr);
    }

    // one "Name: value" line per header
    private String getHeadersStr(Header[] headers) {
        StringBuilder builder = new StringBuilder();
        for (Header header : headers) {
            builder.append(header.getName()).append(": ").append(header.getValue()).append("\n");
        }
        return builder.toString();
    }

    // one "name:value" line per POST parameter
    private String getPostBody(NameValuePair[] postValues) {
        StringBuilder builder = new StringBuilder();
        for (NameValuePair pair : postValues) {
            builder.append(pair.getName()).append(":").append(pair.getValue()).append("\n");
        }
        return builder.toString();
    }

    // one "domain:name=value;path;expiry;secure;" line per cookie held by the client
    private String getCookieStr() {
        Cookie[] cookies = client.getState().getCookies();
        StringBuilder builder = new StringBuilder();
        for (Cookie cookie : cookies) {
            builder.append(cookie.getDomain()).append(":")
                   .append(cookie.getName()).append("=").append(cookie.getValue()).append(";")
                   .append(cookie.getPath()).append(";")
                   .append(cookie.getExpiryDate()).append(";")
                   .append(cookie.getSecure()).append(";\n");
        }
        return builder.toString();
    }
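    Besides hand-written helpers like these, HttpClient 3.x does its own logging through Commons Logging, including a wire log of the header lines it sends and receives. The sketch below switches that on through system properties, assuming the Commons Logging SimpleLog backend is used; the class and method names are hypothetical, and the call has to happen before the first HttpClient is created.

```java
// Hypothetical illustration class; only sets system properties read by Commons Logging.
final class WireLogSketch {
    static void enableWireLogging() {
        // route Commons Logging to its built-in SimpleLog implementation
        System.setProperty("org.apache.commons.logging.Log",
                "org.apache.commons.logging.impl.SimpleLog");
        System.setProperty("org.apache.commons.logging.simplelog.showdatetime", "true");
        // log request and response header lines as they go over the wire
        System.setProperty("org.apache.commons.logging.simplelog.log.httpclient.wire.header", "debug");
        // log HttpClient's own debug messages
        System.setProperty("org.apache.commons.logging.simplelog.log.org.apache.commons.httpclient", "debug");
    }
}
```

    This does not replace the helpers above, which also print the cookie state, but it is a quick way to see exactly what was sent when a login flow misbehaves.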

  • Original URL: https://www.cnblogs.com/yjbjingcha/p/6986200.html