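Recently I learned that some friends are stuck on this problem again, so I am sharing the solution I used in the past, in the hope that it helps anyone who has wandered into this line of work.
Principle: HTML gives you no way to set the encoding of the submitted data, but XML does. So we carry the data in an XML document, and the problem goes away.
Core JS code: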
// Note: create the xmlDoc and xmlHttp objects in a more robust way in real code; this is only a simple demo.
function doSubmit() {
    var str = document.getElementById("input1").value;
    // Assume str is the data you want to submit.
    alert(str);
    // Possible ProgIDs: "MSXML2.DOMDocument", "Microsoft.XMLDOM", "MSXML.DOMDocument", "MSXML3.DOMDocument"
    var xmlDoc = new ActiveXObject("MSXML2.DOMDocument");
    // Initialize the XML document object.
    xmlDoc.loadXML("<html></html>");
    xmlDoc.documentElement.text = str; // carry the data as element content
    // If carrying the data in attributes is more convenient, use the following instead:
    // xmlDoc.documentElement.setAttribute("name", "msg");
    // xmlDoc.documentElement.setAttribute("value", str);
    alert(xmlDoc.xml); // inspect the generated XML
    // Possible ProgIDs: "MSXML2.XMLHttp.5.0", "MSXML2.XMLHttp.4.0", "MSXML2.XMLHttp.3.0", "MSXML2.XMLHttp", "Microsoft.XMLHttp"
    var xmlHttp = new ActiveXObject("MSXML2.XMLHttp.5.0");
    var url = "servlet/MyServlet?time=" + (new Date()).getTime();
    xmlHttp.open("POST", url, false);
    xmlHttp.send(xmlDoc); // send the XML document object
    alert(xmlHttp.responseText);
}
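The note at the top of doSubmit asks for a more robust way to create the objects. One possible shape, offered only as a sketch: try each ProgID from the comment lists in order and keep the first one that can be created. The names createFirstAvailable, XML_HTTP_IDS, and the factory parameter are hypothetical and not part of the original code; the factory is passed in so the fallback logic does not depend on ActiveXObject being present.

```javascript
// Hypothetical helper: try each ProgID in order and return the first object
// that can be created. `factory` is injected; in old Internet Explorer it
// would wrap `new ActiveXObject(id)`.
function createFirstAvailable(progIds, factory) {
  for (var i = 0; i < progIds.length; i++) {
    try {
      return factory(progIds[i]);
    } catch (e) {
      // This ProgID is not registered on the machine; try the next one.
    }
  }
  throw new Error("None of the ProgIDs could be created: " + progIds.join(", "));
}

// ProgID list copied from the comment in doSubmit.
var XML_HTTP_IDS = [
  "MSXML2.XMLHttp.5.0", "MSXML2.XMLHttp.4.0", "MSXML2.XMLHttp.3.0",
  "MSXML2.XMLHttp", "Microsoft.XMLHttp"
];
```

In Internet Explorer this would be called as createFirstAvailable(XML_HTTP_IDS, function (id) { return new ActiveXObject(id); }), and the same helper could be reused with the DOMDocument ProgIDs.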
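Core servlet/action code:
// SAXReader and Document here appear to be dom4j classes (org.dom4j.io.SAXReader, org.dom4j.Document)
// Read the XML data sent by the AJAX request
SAXReader xmlReader = new SAXReader();
Document document = null;
try {
    document = xmlReader.read(request.getInputStream());
} catch (Exception ex) {
    System.err.println("Failed to read XML; the request may not contain XML data.");
    ex.printStackTrace();
    return; // assumes a void handler method; without a document the calls below would throw a NullPointerException
}
System.out.println("Received XML data: " + document.asXML());
// Parse the XML
String str = document.getRootElement().getText();
System.out.println("Parsed data: " + str);
// Return the result
response.setContentType("text/html; charset=UTF-8"); // GBK also works; this declares the encoding of the response
response.getWriter().print("服务器返回信息:成功啦!no(∩_∩)o...哈哈!"); // returning Chinese text is no longer a problem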
---------------------------
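When parameters are passed in a URL, they often include Chinese parameter values or URL addresses, and conversion errors show up when the backend processes them. Some sending pages use GB2312 while the receiving page uses UTF-8, so the received parameter may not match what was sent. A URL encoded with a server-side urlEncode function also differs from one encoded on the client with JavaScript's encodeURI function.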
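The mismatch described above can be seen with a single character. This is only an illustrative sketch: the GB2312 bytes for "中" (D6 D0) are written out by hand as a percent-encoded string, since JavaScript's built-in functions only produce UTF-8.

```javascript
// "中" is the bytes E4 B8 AD in UTF-8 but D6 D0 in GB2312.
var utf8Encoded = encodeURIComponent("中");
console.log(utf8Encoded); // "%E4%B8%AD"

// The same character as a GB2312 page would percent-encode it.
var gbEncoded = "%D6%D0";

// A UTF-8 decoder cannot interpret the GB2312 bytes, so decoding fails.
var decodeError = null;
try {
  decodeURIComponent(gbEncoded);
} catch (e) {
  decodeError = e;
}
console.log(decodeError instanceof URIError); // true
```

The sender and the receiver therefore have to agree on one encoding before the percent-encoded value can be read back correctly.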
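escape() method: encodes the given string using the ISO Latin character set. All spaces, punctuation, special characters, and other non-ASCII characters are converted to %xx escapes, where xx is the hexadecimal code of the character in the character set table. For example, a space becomes %20. The unescape method does the reverse. Characters not encoded by this method: ASCII letters and digits, and @ * _ + - . /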
English reference: MSDN JScript Reference: The escape method returns a string value (in Unicode format) that contains the contents of [the argument]. All spaces, punctuation, accented characters, and any other non-ASCII characters are replaced with %xx encoding, where xx is equivalent to the hexadecimal number representing the character. For example, a space is returned as "%20."
Mozilla Developer Core Javascript Guide: The escape and unescape functions let you encode and decode strings. The escape function returns the hexadecimal encoding of an argument in the ISO Latin character set. The unescape function returns the ASCII string for the specified hexadecimal encoding value.
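A few calls make the behavior of escape and unescape concrete. This is a small sketch with sample strings chosen for illustration; it relies only on the global escape and unescape functions.

```javascript
// Spaces and Latin-1 characters become %xx.
var escapedSpace = escape("a b");
var escapedAccent = escape("é");
console.log(escapedSpace);  // "a%20b"
console.log(escapedAccent); // "%E9"

// Characters above 255 become %uxxxx instead of UTF-8 bytes.
var escapedChinese = escape("中文");
console.log(escapedChinese); // "%u4E2D%u6587"

// These punctuation characters are left untouched.
var escapedSafe = escape("@*_+-./");
console.log(escapedSafe); // "@*_+-./"

// unescape reverses both forms.
var roundTrip = unescape(escapedChinese);
console.log(roundTrip); // "中文"
```

Because the %uxxxx form is not standard percent-encoding, a server has to handle it specially, which is one reason the later notes recommend the encodeURI family instead.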
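encodeURI() method: converts a URI string into escape-style (percent-encoded) text using UTF-8. Characters not encoded by this method: ASCII letters and digits, and ; , / ? : @ & = + $ - _ . ! ~ * ' ( ) #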
English reference: MSDN JScript Reference: The encodeURI method returns an encoded URI. If you pass the result to decodeURI, the original string is returned. The encodeURI method does not encode the following characters: ":", "/", ";", and "?". Use encodeURIComponent to encode these characters.
Mozilla Developer Core Javascript Guide: Encodes a Uniform Resource Identifier (URI) by replacing each instance of certain characters by one, two, or three escape sequences representing the UTF-8 encoding of the character.
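As a quick illustration of encodeURI on a whole address, the sketch below uses a made-up example.com URL. The delimiters that give the URL its structure survive, while the space and the Chinese characters are percent-encoded as UTF-8.

```javascript
// Illustrative URL only; the host and parameter names are invented.
var fullUrl = "http://example.com/my page.html?q=中文&lang=zh#top";

var encodedUrl = encodeURI(fullUrl);
console.log(encodedUrl);
// "http://example.com/my%20page.html?q=%E4%B8%AD%E6%96%87&lang=zh#top"

// decodeURI restores the original string.
var decodedUrl = decodeURI(encodedUrl);
console.log(decodedUrl === fullUrl); // true
```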
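encodeURIComponent() method: converts a URI string into percent-encoded text using UTF-8. Compared with encodeURI(), it encodes more characters, such as /. So if a string contains several parts of a URI, do not encode it with this method; once the / characters are encoded, the URL will no longer work. Characters not encoded by this method: ASCII letters and digits, and - _ . ! ~ * ' ( )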
English reference: MSDN JScript Reference: The encodeURIComponent method returns an encoded URI. If you pass the result to decodeURIComponent, the original string is returned. Because the encodeURIComponent method encodes all characters, be careful if the string represents a path such as /folder1/folder2/default.html. The slash characters will be encoded and will not be valid if sent as a request to a web server. Use the encodeURI method if the string contains more than a single URI component.
Mozilla Developer Core Javascript Guide: Encodes a Uniform Resource Identifier (URI) component by replacing each instance of certain characters by one, two, or three escape sequences representing the UTF-8 encoding of the character.
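The difference between the two functions shows up on the path used in the MSDN text. The sketch below also encodes a single parameter value that happens to contain URI delimiters; the example.com address and the value are invented for illustration.

```javascript
var samplePath = "/folder1/folder2/default.html";

// encodeURIComponent escapes the slashes, so the path stops being a path.
var pathAsComponent = encodeURIComponent(samplePath);
console.log(pathAsComponent); // "%2Ffolder1%2Ffolder2%2Fdefault.html"

// encodeURI leaves the slashes alone.
var pathAsUri = encodeURI(samplePath);
console.log(pathAsUri); // "/folder1/folder2/default.html"

// For one parameter value, escaping & and = is exactly what is wanted.
var rawValue = "中文&x=1";
var searchUrl = "http://example.com/search?q=" + encodeURIComponent(rawValue);
console.log(searchUrl);
// "http://example.com/search?q=%E4%B8%AD%E6%96%87%26x%3D1"
```

In short, encodeURI suits a complete address and encodeURIComponent suits each individual value placed into it.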
Also, encodeURI and encodeURIComponent were introduced in JavaScript 1.5, whereas escape has existed since JavaScript 1.0.
English note: The escape() method does not encode the + character which is interpreted as a space on the server side as well as generated by forms with spaces in their fields. Due to this shortcoming, you should avoid use of escape() whenever possible. The best alternative is usually encodeURIComponent().
Use of the encodeURI() method is a bit more specialized than escape() in that it encodes for URIs as opposed to the querystring, which is part of a URL. Use this method when you need to encode a string to be used for any resource that uses URIs and needs certain characters to remain unencoded. Note that this method does not encode the ' character, as it is a valid character within URIs.
Lastly, the encodeURIComponent() method should be used in most cases when encoding a single component of a URI. This method will encode certain chars that would normally be recognized as special chars for URIs so that many components may be included. Note that this method does not encode the ' character, as it is a valid character within URIs.
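The + problem mentioned in the note is easy to reproduce, and the recommended fix is to run every key and value through encodeURIComponent. The buildQuery helper below is a hypothetical name introduced for this sketch, not something from the quoted references.

```javascript
// escape leaves + alone, so a server that treats + as a space reads "1 1=2".
var plusWithEscape = escape("1+1=2");
console.log(plusWithEscape); // "1+1%3D2"

// encodeURIComponent encodes + as %2B, so it survives the round trip.
var plusWithComponent = encodeURIComponent("1+1=2");
console.log(plusWithComponent); // "1%2B1%3D2"

// Hypothetical helper: build a query string from an object, encoding each
// key and value separately.
function buildQuery(params) {
  var parts = [];
  for (var key in params) {
    if (Object.prototype.hasOwnProperty.call(params, key)) {
      parts.push(encodeURIComponent(key) + "=" + encodeURIComponent(params[key]));
    }
  }
  return parts.join("&");
}

var sampleQuery = buildQuery({ name: "张三", expr: "1+1" });
console.log(sampleQuery); // "name=%E5%BC%A0%E4%B8%89&expr=1%2B1"
```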
--------------------
In fact, Chinese text in a URL can be handled with a pair of encoding functions,
escape and unescape, described as follows:
Encodes String objects so they can be read on all computers.
escape(charString)
The required charString argument is any String object or literal to be encoded.
The escape method returns a string value (in Unicode format) that contains the contents of charString. All spaces, punctuation, accented characters, and any other non-ASCII characters are replaced with %xx encoding, where xx is equivalent to the hexadecimal number representing the character. For example, a space is returned as "%20."
Characters with a value greater than 255 are stored using the %uxxxx format.
=============================================
Decodes String objects encoded with the escape method.
unescape(charString)
The required charString argument is a String object or literal to be decoded.
The unescape method returns a string value that contains the contents of charString. All characters encoded with the %xx hexadecimal form are replaced by their ASCII character set equivalents.
Characters encoded in %uxxxx format (Unicode characters) are replaced with the Unicode character with hexadecimal encoding xxxx.
Note: The unescape method should not be used to decode Uniform Resource Identifiers (URI). Use decodeURI and decodeURIComponent methods instead.
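The reason for that note can be shown with one character. In this sketch a UTF-8 percent-encoded "中" is decoded both ways: decodeURIComponent reassembles the three bytes into one character, while unescape turns each %xx into its own character.

```javascript
var encodedChar = encodeURIComponent("中"); // "%E4%B8%AD"

// Correct: the three UTF-8 bytes are decoded back into one character.
var decodedChar = decodeURIComponent(encodedChar);
console.log(decodedChar === "中"); // true

// Wrong tool: unescape maps each %xx to a separate character code,
// producing three Latin-1 characters instead of the original one.
var garbledChar = unescape(encodedChar);
console.log(garbledChar.length); // 3
console.log(garbledChar === "\u00E4\u00B8\u00AD"); // true
```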