• UTF-8 Encoding Problems with HttpClient POST


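    http://www.360doc.com/content/09/0915/15/61497_6003890.shtml

    In practice, however, calling HttpClient in the most basic way turned out not to encode POST parameters as UTF-8. The articles I found online did not get to the heart of the matter, so I read through some of the commons-httpclient 3.0.1 source code. The first thing I found was the generateRequestEntity() method in PostMethod: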
        /**
         * Generates a request entity from the post parameters, if present.   Calls
         * {@link EntityEnclosingMethod#generateRequestBody()} if parameters have not been set.
         *
         * @since 3.0
         */
        protected RequestEntity generateRequestEntity() {
            if (!this.params.isEmpty()) {
                // Use a ByteArrayRequestEntity instead of a StringRequestEntity.
                // This is to avoid potential encoding issues.   Form url encoded strings
                // are ASCII by definition but the content type may not be.   Treating the content
                // as bytes allows us to keep the current charset without worrying about how
                // this charset will effect the encoding of the form url encoded string.
                String content = EncodingUtil.formUrlEncode(getParameters(), getRequestCharSet());
                ByteArrayRequestEntity entity = new ByteArrayRequestEntity(
                    EncodingUtil.getAsciiBytes(content),
                    FORM_URL_ENCODED_CONTENT_TYPE
                );
                return entity;
            } else {
                return super.generateRequestEntity();
            }
        }

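    It turns out that HTTP request parameters added as NameValuePair objects are ultimately converted into a RequestEntity and submitted to the HTTP server. Next, in EntityEnclosingMethod, the parent class of PostMethod, I found the following code: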
        /**
         * Returns the request's charset.   The charset is parsed from the request entity's
         * content type, unless the content type header has been set manually.
         *
         * @see RequestEntity#getContentType()
         *
         * @since 3.0
         */
        public String getRequestCharSet() {
            if (getRequestHeader("Content-Type") == null) {
                // check the content type from request entity
                // We can't call getRequestEntity() since it will probably call
                // this method.
                if (this.requestEntity != null) {
                    return getContentCharSet(
                        new Header("Content-Type", requestEntity.getContentType()));
                } else {
                    return super.getRequestCharSet();
                }
            } else {
                return super.getRequestCharSet();
            }
        }
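
    The lookup above can be approximated with the standard library alone. This is only a sketch of the idea, not HttpClient's actual implementation: resolveCharset is a hypothetical helper, and the "ISO-8859-1" fallback is an assumption about the HttpClient 3.x default. It shows why a form Content-Type that carries no charset parameter ends up with the default encoding.

```java
import java.util.Locale;

public class CharsetLookupSketch {

    // Hypothetical helper (not part of HttpClient): return the charset parameter
    // of a Content-Type value, or the given default when none is present.
    public static String resolveCharset(String contentType, String defaultCharset) {
        if (contentType == null) {
            return defaultCharset;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                if (!value.isEmpty()) {
                    return value;
                }
            }
        }
        return defaultCharset;
    }

    public static void main(String[] args) {
        // Assumed default; the form content type itself names no charset.
        String fallback = "ISO-8859-1";
        System.out.println(resolveCharset("application/x-www-form-urlencoded", fallback));
        System.out.println(resolveCharset("application/x-www-form-urlencoded; charset=UTF-8", fallback));
    }
}
```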


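    Solution

    The two code fragments above show how HttpClient derives the request encoding (charset) from "Content-Type", and how that encoding is then applied when the submitted content is encoded. Following this principle, we only need to override getRequestCharSet() so that it returns the name of the encoding (charset) we need. This resolves the garbled-text problem when submitting a POST request in UTF-8 or any other non-default encoding.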
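
    To see why the charset matters, the effect can be reproduced without HttpClient. The sketch below uses java.net.URLEncoder as a stand-in for EncodingUtil.formUrlEncode (an assumption: the two are treated as equivalent only for this illustration), and encodeParam is a hypothetical helper. With UTF-8 each Chinese character becomes three percent-encoded bytes; with ISO-8859-1 the characters cannot be represented and are replaced by "?" before encoding, so the original text is lost before the request is even sent.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class FormEncodingSketch {

    // Hypothetical helper: form-url-encode one name/value pair with the given charset.
    public static String encodeParam(String name, String value, String charset) {
        try {
            return URLEncoder.encode(name, charset) + "=" + URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String value = "\u4e2d\u6587"; // the two characters used in the test below
        System.out.println(encodeParam("TEXT", value, "UTF-8"));      // TEXT=%E4%B8%AD%E6%96%87
        System.out.println(encodeParam("TEXT", value, "ISO-8859-1")); // TEXT=%3F%3F
    }
}
```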

    Test

    First, deploy a page named test.jsp under Tomcat's ROOT web app to serve as the test page. The main code fragment is as follows:
    <%@ page contentType="text/html;charset=UTF-8"%>
    <%@ page session="false" %>
    <%
    request.setCharacterEncoding("UTF-8");
    String val = request.getParameter("TEXT");
    System.out.println(">>>> The result is " + val);
    %>
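
    The request.setCharacterEncoding("UTF-8") call matters because the server has to decode the body with the same charset the client used. The sketch below imitates that step with java.net.URLDecoder. It is not Tomcat's actual code path: decodeValue is a hypothetical helper, and ISO-8859-1 is assumed here as the fallback a servlet container uses when no encoding is specified.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class FormDecodingSketch {

    // Hypothetical helper: decode one percent-encoded form value with the given charset.
    public static String decodeValue(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String encoded = "%E4%B8%AD%E6%96%87"; // UTF-8 bytes of the two test characters

        // Matching charset: the two original characters come back.
        String ok = decodeValue(encoded, "UTF-8");
        System.out.println(ok.length()); // 2

        // Mismatched charset: each byte becomes its own character, so the text is garbled.
        String garbled = decodeValue(encoded, "ISO-8859-1");
        System.out.println(garbled.length()); // 6
    }
}
```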


    Next, write a test class. The main code is as follows:
        public static void main(String[] args) throws IOException {
            String url = "http://localhost:8080/test.jsp";
            PostMethod postMethod = new UTF8PostMethod(url);
            // Fill in the value of each form field
            NameValuePair[] data = {
                    new NameValuePair("TEXT", "中文"),
            };
            // Put the form values into postMethod
            postMethod.setRequestBody(data);
            // Execute postMethod, then release the connection when done
            HttpClient httpClient = new HttpClient();
            try {
                httpClient.executeMethod(postMethod);
            } finally {
                postMethod.releaseConnection();
            }
        }
        
        //Inner class for UTF-8 support
        public static class UTF8PostMethod extends PostMethod{
            public UTF8PostMethod(String url){
                super(url);
            }
            @Override
            public String getRequestCharSet() {
                //return super.getRequestCharSet();
                return "UTF-8";
            }
        }


    Run the test program, and Tomcat's console output correctly prints ">>>> The result is 中文".
  • Original article: https://www.cnblogs.com/svennee/p/4078787.html