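This post sends HTTP requests with Java's native API rather than the Apache library. The reason is that the third-party library changes too quickly: every release brings sizable changes. For a programmer that causes more trouble than it saves; for example, code written against an earlier version can no longer be reused.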
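How it works
The Java code for sending GET and POST requests is much the same; only a few parameter settings differ. The steps are:
1. Create a Uniform Resource Locator (java.net.URL) and open a connection from it (java.net.URLConnection)
2. Set the request parameters
3. Send the request (this is where GET and POST differ)
4. Read the response as an input stream
5. Close the input stream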
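The five steps above can be sketched in a single method. This is a minimal illustration rather than the code used later in this post: the class name FetchSketch, the method name fetch, and the timeout values are assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class FetchSketch {

    /** Reads the whole response body of the given URL as UTF-8 text. */
    public static String fetch(String address) throws IOException {
        // Step 1: build the URL and derive a connection from it (nothing is sent yet)
        URL url = new URL(address);
        URLConnection connection = url.openConnection();

        // Step 2: set request parameters before the connection is established
        connection.setRequestProperty("Accept", "*/*");
        connection.setConnectTimeout(5000); // illustrative value, in milliseconds
        connection.setReadTimeout(5000);    // illustrative value, in milliseconds

        // Steps 3 and 4: getInputStream() sends the request (a GET for HTTP URLs)
        // and exposes the response body as an input stream
        StringBuilder body = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
        } // Step 5: try-with-resources closes the reader and the underlying stream
        return body.toString();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder address; pass any reachable URL as the first argument
        String address = args.length > 0 ? args[0] : "http://localhost:8080/OneHttpServer/";
        System.out.println(fetch(address));
    }
}
```

Using try-with-resources for step 5 is a design choice for the sketch; an explicit finally block achieves the same thing.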
Fetching the Baidu home page
Here is the code:
package test;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.URL;
import java.net.URLConnection;
import java.util.List;
import java.util.Map;

public class HttpRequest {

    /**
     * Sends a GET request to the given URL.
     *
     * @param url   the URL to send the request to
     * @param param the request parameters, in the form name1=value1&name2=value2
     * @return the response of the remote resource the URL points to
     */
    public static String sendGet(String url, String param) {
        String result = "";
        BufferedReader in = null;
        try {
            String urlNameString = url + "?" + param;
            URL realUrl = new URL(urlNameString);
            // Open a connection to the URL
            URLConnection connection = realUrl.openConnection();
            // Set the common request properties
            connection.setRequestProperty("accept", "*/*");
            connection.setRequestProperty("connection", "Keep-Alive");
            connection.setRequestProperty("user-agent",
                    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;SV1)");
            // Establish the actual connection
            connection.connect();
            // Get all response header fields
            Map<String, List<String>> map = connection.getHeaderFields();
            // Print every response header field
            for (String key : map.keySet()) {
                System.out.println(key + "--->" + map.get(key));
            }
            // Wrap the response in a BufferedReader and read it line by line
            in = new BufferedReader(new InputStreamReader(
                    connection.getInputStream()));
            String line;
            while ((line = in.readLine()) != null) {
                result += line;
                System.out.println("@2" + line);
            }
        } catch (Exception e) {
            System.out.println("Exception while sending GET request: " + e);
            e.printStackTrace();
        }
        // Close the input stream in a finally block
        finally {
            try {
                if (in != null) {
                    in.close();
                }
            } catch (Exception e2) {
                e2.printStackTrace();
            }
        }
        return result;
    }

    /**
     * Sends a POST request to the given URL.
     *
     * @param url   the URL to send the request to
     * @param param the request parameters, in the form name1=value1&name2=value2
     * @return the response of the remote resource the URL points to
     */
    public static String sendPost(String url, String param) {
        PrintWriter out = null;
        BufferedReader in = null;
        String result = "";
        try {
            URL realUrl = new URL(url);
            // Open a connection to the URL
            URLConnection conn = realUrl.openConnection();
            // Set the common request properties
            conn.setRequestProperty("accept", "*/*");
            conn.setRequestProperty("connection", "Keep-Alive");
            conn.setRequestProperty("user-agent",
                    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;SV1)");
            // setDoOutput(true) is required to send a POST body;
            // setDoInput(true) is already the default
            conn.setDoOutput(true);
            conn.setDoInput(true);
            // Get the output stream of the URLConnection
            out = new PrintWriter(conn.getOutputStream());
            // Send the request parameters
            out.print(param);
            // Flush the output buffer
            out.flush();
            // Wrap the response in a BufferedReader and read it line by line
            in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()));
            String line;
            while ((line = in.readLine()) != null) {
                result += line;
            }
        } catch (Exception e) {
            System.out.println("Exception while sending POST request: " + e);
            e.printStackTrace();
        }
        // Close the output and input streams in a finally block
        finally {
            try {
                if (out != null) {
                    out.close();
                }
                if (in != null) {
                    in.close();
                }
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Send a GET request
        //String s=HttpRequest.sendGet("https://www.baidu.com", "key=123&v=456");
        // Send a POST request with an empty body
        String s = HttpRequest.sendPost("https://www.baidu.com", "");
        System.out.println("@1" + s);
        // Send a POST request with parameters
        //String sr=HttpRequest.sendPost("https://www.baidu.com", "key=123&v=456");
        //System.out.println(sr);
    }
}
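The two methods print different output on the console: in this test, only the POST request returned the whole HTML page. With a GET request, the server returns a page that asks the client to redirect to the http URL. This is an HTML meta refresh rather than an HTTP 3xx redirect, so URLConnection does not follow it automatically: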
<noscript><meta http-equiv="refresh" content="0;url=http://www.baidu.com/"></noscript>
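If a program needs to follow such a page-level redirect itself, one option is to pull the target URL out of the returned HTML. The sketch below is only an illustration: the class name MetaRefresh and the regular expression are assumptions, and the pattern only handles tags written in the same attribute order as the one above, not arbitrary HTML.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetaRefresh {

    // Matches <meta http-equiv="refresh" content="N;url=TARGET"> and captures TARGET.
    // Deliberately narrow: it assumes double quotes and this attribute order.
    private static final Pattern REFRESH = Pattern.compile(
            "<meta\\s+http-equiv=\"refresh\"\\s+content=\"\\d+;\\s*url=([^\"]+)\"",
            Pattern.CASE_INSENSITIVE);

    /** Returns the refresh target if the HTML contains a matching meta tag. */
    public static Optional<String> target(String html) {
        Matcher m = REFRESH.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String html = "<noscript><meta http-equiv=\"refresh\" "
                + "content=\"0;url=http://www.baidu.com/\"></noscript>";
        // Prints the extracted target, or a fallback message when there is none
        System.out.println(target(html).orElse("no meta refresh found"));
    }
}
```

The extracted URL can then be requested again with the same code that sent the first request.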
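Tip:
After httpUrlConnection.setDoOutput(true), you can use conn.getOutputStream().write().
After httpUrlConnection.setDoInput(true), you can use conn.getInputStream().read().
A GET request has no use for conn.getOutputStream(), because its parameters are appended directly to the URL; that is why doOutput defaults to false.
A POST request (for example, a file upload) may need to send a large amount of data to the server. That data goes in the HTTP body, so after opening the connection you have to write it to the server.
Because conn.getInputStream() is always used to read the server's response, doInput defaults to true.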
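The defaults described in the tip can be checked without sending anything, since openConnection() only creates the connection object. A minimal sketch; the class name DoOutputDefaults and the helper method defaults are assumptions for the example, and the address is just a placeholder:

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

public class DoOutputDefaults {

    /** Returns {doOutput, doInput} of a freshly opened, not yet connected connection. */
    public static boolean[] defaults(String address) throws IOException {
        // openConnection() does not contact the server; it only builds the object
        URLConnection conn = new URL(address).openConnection();
        return new boolean[] { conn.getDoOutput(), conn.getDoInput() };
    }

    public static void main(String[] args) throws IOException {
        boolean[] d = defaults("http://localhost:8080/OneHttpServer/");
        System.out.println("doOutput default: " + d[0]); // false
        System.out.println("doInput default:  " + d[1]); // true

        // The flag must be switched on before getOutputStream() can be used
        URLConnection conn = new URL("http://localhost:8080/OneHttpServer/").openConnection();
        conn.setDoOutput(true);
        System.out.println("doOutput after setDoOutput(true): " + conn.getDoOutput()); // true
    }
}
```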
Each method is covered in detail below.
Using GET:
package test;

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

public class HttpGetRequest {

    /**
     * Main
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        System.out.println(doGet());
    }

    /**
     * Get Request
     * @return the response body
     * @throws Exception
     */
    public static String doGet() throws Exception {
        URL localURL = new URL("http://localhost:8080/OneHttpServer/");
        URLConnection connection = localURL.openConnection();
        HttpURLConnection httpURLConnection = (HttpURLConnection)connection;

        httpURLConnection.setRequestProperty("Accept-Charset", "utf-8");
        httpURLConnection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");

        InputStream inputStream = null;
        InputStreamReader inputStreamReader = null;
        BufferedReader reader = null;
        StringBuffer resultBuffer = new StringBuffer();
        String tempLine = null;

        // getResponseCode() sends the request; treat 3xx and above as failure
        if (httpURLConnection.getResponseCode() >= 300) {
            throw new Exception("HTTP request failed, response code is " + httpURLConnection.getResponseCode());
        }

        try {
            // Read the response body line by line
            inputStream = httpURLConnection.getInputStream();
            inputStreamReader = new InputStreamReader(inputStream);
            reader = new BufferedReader(inputStreamReader);

            while ((tempLine = reader.readLine()) != null) {
                resultBuffer.append(tempLine);
            }
        } finally {
            // Close the streams in reverse order of creation
            if (reader != null) {
                reader.close();
            }
            if (inputStreamReader != null) {
                inputStreamReader.close();
            }
            if (inputStream != null) {
                inputStream.close();
            }
        }

        return resultBuffer.toString();
    }
}
Using POST:
package test;

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

public class HttpPostRequest {

    /**
     * Main
     * @param args
     * @throws Exception
     */
    public static void main(String[] args) throws Exception {
        System.out.println(doPost());
    }

    /**
     * Post Request
     * @return the response body
     * @throws Exception
     */
    public static String doPost() throws Exception {
        String parameterData = "username=nickhuang&blog=http://www.cnblogs.com/nick-huang/";

        URL localURL = new URL("http://localhost:8080/OneHttpServer/");
        URLConnection connection = localURL.openConnection();
        HttpURLConnection httpURLConnection = (HttpURLConnection)connection;

        // A POST body is written to the output stream, so doOutput must be enabled
        httpURLConnection.setDoOutput(true);
        httpURLConnection.setRequestMethod("POST");
        httpURLConnection.setRequestProperty("Accept-Charset", "utf-8");
        httpURLConnection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        // Content-Length counts bytes, not characters
        httpURLConnection.setRequestProperty("Content-Length",
                String.valueOf(parameterData.getBytes("UTF-8").length));

        OutputStream outputStream = null;
        OutputStreamWriter outputStreamWriter = null;
        InputStream inputStream = null;
        InputStreamReader inputStreamReader = null;
        BufferedReader reader = null;
        StringBuffer resultBuffer = new StringBuffer();
        String tempLine = null;

        try {
            // Write the request body using the same charset as the Content-Length above
            outputStream = httpURLConnection.getOutputStream();
            outputStreamWriter = new OutputStreamWriter(outputStream, "UTF-8");

            outputStreamWriter.write(parameterData);
            outputStreamWriter.flush();

            // getResponseCode() sends the request; treat 3xx and above as failure
            if (httpURLConnection.getResponseCode() >= 300) {
                throw new Exception("HTTP request failed, response code is " + httpURLConnection.getResponseCode());
            }

            // Read the response body line by line
            inputStream = httpURLConnection.getInputStream();
            inputStreamReader = new InputStreamReader(inputStream);
            reader = new BufferedReader(inputStreamReader);

            while ((tempLine = reader.readLine()) != null) {
                resultBuffer.append(tempLine);
            }
        } finally {
            if (outputStreamWriter != null) {
                outputStreamWriter.close();
            }
            if (outputStream != null) {
                outputStream.close();
            }
            if (reader != null) {
                reader.close();
            }
            if (inputStreamReader != null) {
                inputStreamReader.close();
            }
            if (inputStream != null) {
                inputStream.close();
            }
        }

        return resultBuffer.toString();
    }
}
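Encapsulation & reuse
The request logic can be wrapped in a class of its own. Any class instance can then hold a reference to a requestor and use it to carry out a thread's fetch task.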
package test;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;
import java.util.Iterator;
import java.util.Map;

public class HttpRequestor {

    private String charset = "utf-8";
    private Integer connectTimeout = null;
    private Integer socketTimeout = null;
    private String proxyHost = null;
    private Integer proxyPort = null;

    /**
     * Do GET request
     * @param url
     * @return the response body
     * @throws Exception
     * @throws IOException
     */
    public String doGet(String url) throws Exception {
        URL localURL = new URL(url);
        URLConnection connection = openConnection(localURL);
        HttpURLConnection httpURLConnection = (HttpURLConnection)connection;

        httpURLConnection.setRequestProperty("Accept-Charset", charset);
        httpURLConnection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");

        InputStream inputStream = null;
        InputStreamReader inputStreamReader = null;
        BufferedReader reader = null;
        StringBuffer resultBuffer = new StringBuffer();
        String tempLine = null;

        // getResponseCode() sends the request; treat 3xx and above as failure
        if (httpURLConnection.getResponseCode() >= 300) {
            throw new Exception("HTTP request failed, response code is " + httpURLConnection.getResponseCode());
        }

        try {
            inputStream = httpURLConnection.getInputStream();
            inputStreamReader = new InputStreamReader(inputStream);
            reader = new BufferedReader(inputStreamReader);

            while ((tempLine = reader.readLine()) != null) {
                resultBuffer.append(tempLine);
            }
        } finally {
            if (reader != null) {
                reader.close();
            }
            if (inputStreamReader != null) {
                inputStreamReader.close();
            }
            if (inputStream != null) {
                inputStream.close();
            }
        }

        return resultBuffer.toString();
    }

    /**
     * Do POST request
     * @param url
     * @param parameterMap
     * @return the response body
     * @throws Exception
     */
    public String doPost(String url, Map parameterMap) throws Exception {

        /* Translate parameter map to parameter data string (values are not URL-encoded here) */
        StringBuffer parameterBuffer = new StringBuffer();
        if (parameterMap != null) {
            Iterator iterator = parameterMap.keySet().iterator();
            String key = null;
            String value = null;
            while (iterator.hasNext()) {
                key = (String)iterator.next();
                if (parameterMap.get(key) != null) {
                    value = (String)parameterMap.get(key);
                } else {
                    value = "";
                }

                parameterBuffer.append(key).append("=").append(value);
                if (iterator.hasNext()) {
                    parameterBuffer.append("&");
                }
            }
        }

        System.out.println("POST parameter : " + parameterBuffer.toString());

        URL localURL = new URL(url);
        URLConnection connection = openConnection(localURL);
        HttpURLConnection httpURLConnection = (HttpURLConnection)connection;

        httpURLConnection.setDoOutput(true);
        httpURLConnection.setRequestMethod("POST");
        httpURLConnection.setRequestProperty("Accept-Charset", charset);
        httpURLConnection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        // Content-Length counts bytes in the configured charset, not characters
        httpURLConnection.setRequestProperty("Content-Length",
                String.valueOf(parameterBuffer.toString().getBytes(charset).length));

        OutputStream outputStream = null;
        OutputStreamWriter outputStreamWriter = null;
        InputStream inputStream = null;
        InputStreamReader inputStreamReader = null;
        BufferedReader reader = null;
        StringBuffer resultBuffer = new StringBuffer();
        String tempLine = null;

        try {
            // Write the body with the same charset used for Content-Length
            outputStream = httpURLConnection.getOutputStream();
            outputStreamWriter = new OutputStreamWriter(outputStream, charset);

            outputStreamWriter.write(parameterBuffer.toString());
            outputStreamWriter.flush();

            if (httpURLConnection.getResponseCode() >= 300) {
                throw new Exception("HTTP request failed, response code is " + httpURLConnection.getResponseCode());
            }

            inputStream = httpURLConnection.getInputStream();
            inputStreamReader = new InputStreamReader(inputStream);
            reader = new BufferedReader(inputStreamReader);

            while ((tempLine = reader.readLine()) != null) {
                resultBuffer.append(tempLine);
            }
        } finally {
            if (outputStreamWriter != null) {
                outputStreamWriter.close();
            }
            if (outputStream != null) {
                outputStream.close();
            }
            if (reader != null) {
                reader.close();
            }
            if (inputStreamReader != null) {
                inputStreamReader.close();
            }
            if (inputStream != null) {
                inputStream.close();
            }
        }

        return resultBuffer.toString();
    }

    private URLConnection openConnection(URL localURL) throws IOException {
        URLConnection connection;
        if (proxyHost != null && proxyPort != null) {
            Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(proxyHost, proxyPort));
            connection = localURL.openConnection(proxy);
        } else {
            connection = localURL.openConnection();
        }
        // Apply the configured timeouts; the original never called renderRequest
        renderRequest(connection);
        return connection;
    }

    /**
     * Render request according to the settings
     * @param connection
     */
    private void renderRequest(URLConnection connection) {
        if (connectTimeout != null) {
            connection.setConnectTimeout(connectTimeout);
        }
        if (socketTimeout != null) {
            connection.setReadTimeout(socketTimeout);
        }
    }

    /*
     * Getter & Setter
     */
    public Integer getConnectTimeout() {
        return connectTimeout;
    }

    public void setConnectTimeout(Integer connectTimeout) {
        this.connectTimeout = connectTimeout;
    }

    public Integer getSocketTimeout() {
        return socketTimeout;
    }

    public void setSocketTimeout(Integer socketTimeout) {
        this.socketTimeout = socketTimeout;
    }

    public String getProxyHost() {
        return proxyHost;
    }

    public void setProxyHost(String proxyHost) {
        this.proxyHost = proxyHost;
    }

    public Integer getProxyPort() {
        return proxyPort;
    }

    public void setProxyPort(Integer proxyPort) {
        this.proxyPort = proxyPort;
    }

    public String getCharset() {
        return charset;
    }

    public void setCharset(String charset) {
        this.charset = charset;
    }
}
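The doPost method above joins keys and values as they are, so a value containing a space, "&", or "=" would produce a malformed form body. One way to handle this, sketched here under the assumption that the parameters are plain strings, is to URL-encode each key and value with java.net.URLEncoder. The class name FormEncoder and its encode method are made up for this example and are not part of HttpRequestor.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class FormEncoder {

    /** Builds an application/x-www-form-urlencoded body from the given parameters. */
    public static String encode(Map<String, String> params) throws UnsupportedEncodingException {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            // Treat a null value as an empty string, as doPost does
            String value = entry.getValue() == null ? "" : entry.getValue();
            body.append(URLEncoder.encode(entry.getKey(), "UTF-8"))
                .append('=')
                .append(URLEncoder.encode(value, "UTF-8"));
        }
        return body.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // LinkedHashMap keeps insertion order so the output is predictable
        Map<String, String> params = new LinkedHashMap<>();
        params.put("username", "Nick Huang");
        params.put("blog", "http://www.cnblogs.com/nick-huang/");
        System.out.println(encode(params));
        // prints username=Nick+Huang&blog=http%3A%2F%2Fwww.cnblogs.com%2Fnick-huang%2F
    }
}
```

On the server side, request.getParameter() decodes these values again, so the servlet sees the original strings.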
Test code for HttpRequestor
Client code:
package test;

import java.util.HashMap;
import java.util.Map;

public class Call {

    public static void main(String[] args) throws Exception {

        /* Post Request */
        Map dataMap = new HashMap();
        dataMap.put("username", "Nick Huang");
        dataMap.put("blog", "IT");
        System.out.println(new HttpRequestor().doPost("http://localhost:8080/OneHttpServer/", dataMap));

        /* Get Request */
        System.out.println(new HttpRequestor().doGet("http://localhost:8080/OneHttpServer/"));
    }
}
Server code:
import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class LoginServlet extends HttpServlet {

    private static final long serialVersionUID = 1L;

    public LoginServlet() {
        super();
    }

    protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        this.doPost(request, response);
    }

    protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        String username = request.getParameter("username");
        String blog = request.getParameter("blog");

        System.out.println(username);
        System.out.println(blog);

        response.setContentType("text/plain; charset=UTF-8");
        response.setCharacterEncoding("UTF-8");
        response.getWriter().write("It is ok!");
    }
}
web.xml
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://java.sun.com/xml/ns/javaee" xmlns:web="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" id="WebApp_ID" version="2.5">
  <display-name>OneHttpServer</display-name>
  <welcome-file-list>
    <welcome-file>LoginServlet</welcome-file>
  </welcome-file-list>
  <servlet>
    <description></description>
    <display-name>LoginServlet</display-name>
    <servlet-name>LoginServlet</servlet-name>
    <servlet-class>LoginServlet</servlet-class>
  </servlet>
  <servlet-mapping>
    <servlet-name>LoginServlet</servlet-name>
    <url-pattern>/LoginServlet</url-pattern>
  </servlet-mapping>
</web-app>