Why non-Western-European text comes out garbled
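Until a content type or character encoding has been set, HttpServletResponse encodes output as ISO-8859-1 by default. Chinese characters cannot be represented in that encoding, so writing Chinese text directly produces garbled output in the browser.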
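The effect of the ISO-8859-1 default can be reproduced with plain Java, outside any servlet container. The following is a minimal sketch; the class name Latin1Demo and the method throughLatin1 are made up for illustration and do not come from the servlet API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the servlet API.
class Latin1Demo {
    // Encode text as a writer using ISO-8859-1 would, then decode the bytes again.
    static String throughLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // ASCII characters exist in ISO-8859-1, so they survive unchanged.
        System.out.println(throughLatin1("Hello"));
        // The two Chinese characters (U+4E2D, U+6587) have no ISO-8859-1 byte,
        // so each one is replaced by '?'.
        System.out.println(throughLatin1("\u4e2d\u6587"));
    }
}
```

The information is lost at encoding time, which is why the encoding has to be changed before the response writer is obtained.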
There are two ways to change the encoding that HttpServletResponse uses for output:
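Method 1: call setLocale() on the response
If the browser sends an Accept-Language header, HttpServletRequest's getLocale() returns a Locale object representing the language the client prefers. HttpServletResponse's setLocale() sets the locale of the response, and a locale carries both language and encoding information. Language information is normally conveyed through the Content-Language response header, and setLocale() sets that header as well.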
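To make the Accept-Language negotiation concrete, here is a rough sketch that uses only java.util.Locale. It is not how any particular container implements getLocale(); the class AcceptLanguageDemo, the SUPPORTED list and the Locale.US fallback are assumptions chosen for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; a container does similar parsing inside getLocale().
class AcceptLanguageDemo {
    // Locales this imaginary application can serve (an assumption for the example).
    static final List<Locale> SUPPORTED = Arrays.asList(Locale.CHINA, Locale.TAIWAN, Locale.US);

    // Pick the best supported locale for an Accept-Language value,
    // falling back to Locale.US when nothing matches.
    static Locale pick(String acceptLanguage) {
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
        Locale match = Locale.lookup(ranges, SUPPORTED);
        return match != null ? match : Locale.US;
    }

    public static void main(String[] args) {
        Locale locale = pick("zh-TW,zh;q=0.8,en;q=0.5");
        System.out.println(locale);                 // zh_TW (Java's Locale.toString() form)
        System.out.println(locale.toLanguageTag()); // zh-TW (the form used in HTTP headers)
    }
}
```

Note the two spellings: web.xml locale entries use the underscore form, while HTTP headers such as Content-Language use the hyphenated language tag.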
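Steps:
(1) In web.xml, map each locale to a character encoding. Without a matching entry, setLocale() falls back to a container-dependent mapping, so the response may keep the default ISO-8859-1 or use some other encoding instead of UTF-8. A sample mapping appears in the web.xml below.
(2) With the mapping in place, call any one of the following: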
response.setLocale(Locale.TAIWAN);
response.setLocale(new Locale("zh", "TW"));
response.setLocale(Locale.CHINA);
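The Content-Language header of the HTTP response is then set to the language tag of that locale (zh-TW for the first two calls, zh-CN for the third), and the character encoding is set to UTF-8.
At that point, HttpServletResponse's getCharacterEncoding() returns UTF-8.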
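The lookup that setLocale() performs against the web.xml mapping can be pictured roughly as follows. This is a simplified model and not container code: the class LocaleEncodingMappingDemo, the map contents and the ISO-8859-1 fallback are assumptions for illustration, and a real container may fall back to its own built-in mapping instead.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical, simplified model of the locale-to-encoding lookup.
class LocaleEncodingMappingDemo {
    // Mirrors the locale-encoding-mapping entries declared in web.xml.
    static final Map<String, String> MAPPINGS = new HashMap<>();
    static {
        MAPPINGS.put("zh_CN", "UTF-8");
        MAPPINGS.put("zh_TW", "UTF-8");
    }

    // Use the mapped encoding; assume ISO-8859-1 when no entry matches.
    static String encodingFor(Locale locale) {
        return MAPPINGS.getOrDefault(locale.toString(), "ISO-8859-1");
    }

    // Content-Language carries the hyphenated language tag of the locale.
    static String contentLanguageFor(Locale locale) {
        return locale.toLanguageTag();
    }

    public static void main(String[] args) {
        System.out.println(encodingFor(Locale.TAIWAN));        // UTF-8
        System.out.println(contentLanguageFor(Locale.TAIWAN)); // zh-TW
        System.out.println(encodingFor(Locale.JAPAN));         // ISO-8859-1 (no entry in this sketch)
    }
}
```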
example
<web-app xmlns="http://java.sun.com/xml/ns/j2ee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         version="2.4"
         xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd">
  <display-name>Archetype Created Web Application</display-name>

  <!-- Maps each locale to the character encoding that setLocale() applies to the response -->
  <locale-encoding-mapping-list>
    <locale-encoding-mapping>
      <locale>zh_CN</locale>
      <encoding>UTF-8</encoding>
    </locale-encoding-mapping>
    <locale-encoding-mapping>
      <locale>zh_TW</locale>
      <encoding>UTF-8</encoding>
    </locale-encoding-mapping>
  </locale-encoding-mapping-list>
</web-app>
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
  <form method="get" action="world">
    Name: <input type="text" name="name"><br>
    <button>Send GET request</button>
  </form><br><br>
  <form method="post" action="world">
    Name: <input type="text" name="name"><br>
    <button>Send POST request</button>
  </form>
</body>
</html>
package com.test;

import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

@WebServlet("/world")
public class MyServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        handle(request, response);
    }

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        handle(request, response);
    }

    // GET and POST are handled identically, so both delegate here.
    private void handle(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        String name = request.getParameter("name");
        if (name != null) {
            // Assumes the container decoded the parameter as ISO-8859-1:
            // recover the raw bytes and decode them as UTF-8.
            name = new String(name.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
        }
        System.out.println("name: " + name);

        // Set the content type and locale before obtaining the writer.
        response.setContentType("text/html");
        response.setLocale(Locale.TAIWAN);
        System.out.println(response.getCharacterEncoding()); // UTF-8
        response.getWriter().write("Hello, " + name);
    }
}
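The servlet's request-side repair, new String(name.getBytes("ISO-8859-1"), "UTF-8"), only helps when the container really did decode the parameter as ISO-8859-1. The sketch below, with the made-up class TranscodeDemo and helper names, shows both the successful round trip and what happens when the same repair is applied to text that was already decoded correctly.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class illustrating the re-decoding trick.
class TranscodeDemo {
    // What a container yields when it decodes UTF-8 bytes as ISO-8859-1.
    static String misdecode(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // The repair step: recover the raw bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        String garbled = misdecode(original);
        System.out.println(repair(garbled).equals(original)); // true: the round trip is lossless
        // Applying the repair to text that was never mis-decoded destroys it.
        System.out.println(repair(original)); // ??
    }
}
```

For that reason the repair is worth guarding: if the container already decodes parameters as UTF-8, or request.setCharacterEncoding("UTF-8") is called before reading a POST body, the extra conversion should be dropped.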
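Method 2: call setCharacterEncoding() or setContentType() on the response
When setContentType() is called with a charset parameter, the charset value is automatically applied as if setCharacterEncoding() had been called with it.
The browser needs to know how to handle the response, so it must be told the content type. setContentType() sets the Content-Type response header; you only need to supply the MIME (Multipurpose Internet Mail Extensions) type. Because the encoding and the content type usually have to be set together, passing the charset parameter in the same call, for example setContentType("text/html; charset=UTF-8"), is a convenient and common practice.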
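How a charset parameter is picked out of a Content-Type value can be sketched in a few lines of plain Java. This is an illustration only, not the container's parser; ContentTypeDemo and charsetOf are hypothetical names.

```java
// Hypothetical demo class; a simplified parser for illustration.
class ContentTypeDemo {
    private static final String KEY = "charset=";

    // Returns the charset parameter of a Content-Type value, or null when absent.
    static String charsetOf(String contentType) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Parameter names are case-insensitive.
            if (p.regionMatches(true, 0, KEY, 0, KEY.length())) {
                String value = p.substring(KEY.length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=UTF-8")); // UTF-8
        System.out.println(charsetOf("text/html"));                // null
    }
}
```

A value without a charset parameter, such as "text/html", carries no encoding information, so the encoding then has to come from setCharacterEncoding() or setLocale().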
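If the charset has been set explicitly through setCharacterEncoding() or setContentType(), setLocale() no longer changes the character encoding; it still sets the Content-Language header.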
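The precedence between an explicit charset and setLocale() can be modelled with a small state object. This is a deliberately simplified sketch under stated assumptions, not the servlet API: ResponseEncodingModel is a hypothetical class, and the mappedEncoding argument stands in for the web.xml locale-encoding lookup.

```java
import java.util.Locale;

// Hypothetical, simplified model of a response's encoding state.
class ResponseEncodingModel {
    private String encoding = "ISO-8859-1"; // default before anything is set
    private boolean explicitCharset = false;
    private Locale locale;

    // An explicit charset always takes effect and is remembered as explicit.
    void setCharacterEncoding(String charset) {
        encoding = charset;
        explicitCharset = true;
    }

    // The locale is always recorded; the encoding changes only when
    // no charset was set explicitly beforehand.
    void setLocale(Locale newLocale, String mappedEncoding) {
        locale = newLocale;
        if (!explicitCharset) {
            encoding = mappedEncoding;
        }
    }

    String getCharacterEncoding() {
        return encoding;
    }

    String getContentLanguage() {
        return locale == null ? null : locale.toLanguageTag();
    }

    public static void main(String[] args) {
        ResponseEncodingModel response = new ResponseEncodingModel();
        response.setCharacterEncoding("Big5");
        response.setLocale(Locale.CHINA, "UTF-8");
        System.out.println(response.getCharacterEncoding()); // Big5
        System.out.println(response.getContentLanguage());   // zh-CN
    }
}
```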
example
The HTML page is the same as the one above.
package com.test;

import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

@WebServlet("/world")
public class MyServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        handle(request, response);
    }

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        handle(request, response);
    }

    // GET and POST are handled identically, so both delegate here.
    private void handle(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        String name = request.getParameter("name");
        if (name != null) {
            // Assumes the container decoded the parameter as ISO-8859-1:
            // recover the raw bytes and decode them as UTF-8.
            name = new String(name.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
        }
        System.out.println("name: " + name);

        // Set the content type and encoding before obtaining the writer.
        response.setContentType("text/html");
        response.setCharacterEncoding("UTF-8");
        System.out.println(response.getCharacterEncoding()); // UTF-8
        response.getWriter().write("Hello, " + name);
    }
}
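Note
Whichever of the two methods you use, you must also set the response content type, for example response.setContentType("text/html"), for the garbling to go away.
If the content type is not set, the response encoding is still applied: response.getCharacterEncoding() returns UTF-8 as expected and the writer encodes the text as UTF-8. In HTTP, however, the charset is communicated to the client as part of the Content-Type header, so without a content type the browser is never told the encoding, has to guess, and the text can still appear garbled.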