Two approaches are shown below, one using HttpClient and one using URLConnection; both also deal with the garbled-character (encoding) problem.
Tested on a real device, the HttpClient approach seems more stable and usually downloads the page successfully, while the URLConnection approach often fails to fetch any data over an EDGE network.
HttpClient approach:
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.http.client.ResponseHandler;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.BasicResponseHandler;
import org.apache.http.impl.client.DefaultHttpClient;

public String getHtml(String url) throws IOException, URISyntaxException {
    URI u = new URI(url);
    DefaultHttpClient httpclient = new DefaultHttpClient();
    HttpGet httpget = new HttpGet(u);
    // BasicResponseHandler returns the response body as a String (and throws on non-2xx status)
    ResponseHandler<String> responseHandler = new BasicResponseHandler();
    String content = httpclient.execute(httpget, responseHandler);
    // Without a declared charset the body is decoded as ISO-8859-1; re-decode it as UTF-8,
    // otherwise the text is garbled
    content = new String(content.getBytes("ISO-8859-1"), "UTF-8");
    return content;
}
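
An alternative to the byte round-trip above is to read the entity with an explicit charset. This is a minimal sketch, assuming the target page is UTF-8; the method name getHtmlUtf8 is only illustrative, and EntityUtils comes from the same Apache HttpClient package already used above:

import java.io.IOException;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;

// Sketch: decode the body as UTF-8 directly, so no ISO-8859-1 round trip is needed
public String getHtmlUtf8(String url) throws IOException {
    DefaultHttpClient httpclient = new DefaultHttpClient();
    HttpGet httpget = new HttpGet(url); // HttpGet also accepts the URL as a plain String
    HttpResponse response = httpclient.execute(httpget);
    // EntityUtils.toString uses the charset declared in Content-Type when present,
    // and falls back to the charset given here (UTF-8) otherwise
    return EntityUtils.toString(response.getEntity(), "UTF-8");
}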
URLConnection approach:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

public String getHTML(String url) {
    BufferedReader in = null;
    try {
        URL newUrl = new URL(url);
        URLConnection connect = newUrl.openConnection();
        connect.setRequestProperty("User-Agent",
                "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");
        // The target page is encoded as UTF-8, so decode the stream with an explicit charset
        in = new BufferedReader(new InputStreamReader(connect.getInputStream(), "UTF-8"));
        StringBuilder html = new StringBuilder();
        String readLine;
        while ((readLine = in.readLine()) != null) {
            html.append(readLine);
        }
        return html.toString();
    } catch (MalformedURLException me) {
        // invalid URL: fall through and return null
    } catch (IOException ioe) {
        // I/O failure: fall through and return null
    } finally {
        if (in != null) {
            try { in.close(); } catch (IOException ignored) { }
        }
    }
    return null;
}
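
Both helpers block on the network, so on Android they should be called off the UI thread. A minimal usage sketch (the URL is a placeholder and the log tag is arbitrary):

// Sketch: fetch the page on a background thread and log what came back
new Thread(new Runnable() {
    @Override
    public void run() {
        try {
            String html = getHtml("http://www.example.com/"); // placeholder URL
            android.util.Log.d("Downloader", "fetched " + html.length() + " chars");
        } catch (Exception e) {
            android.util.Log.e("Downloader", "download failed", e);
        }
    }
}).start();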