• Android: Two Ways to Download HTML Source, HttpClient and URLConnection


    The two methods below use HttpClient and URLConnection respectively, and both deal with the garbled-character (encoding) problem.

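    In tests on a real device, the HttpClient approach appeared to be the more stable one and generally managed to download the page, whereas URLConnection often failed to retrieve any data over an EDGE network.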

    HttpClient approach:

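    import java.io.IOException;
    import java.net.URI;
    import java.net.URISyntaxException;
    import org.apache.http.client.ResponseHandler;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.BasicResponseHandler;
    import org.apache.http.impl.client.DefaultHttpClient;

    public String getHtml(String url) throws IOException, URISyntaxException {
      URI u = new URI(url);

      DefaultHttpClient httpclient = new DefaultHttpClient();
      HttpGet httpget = new HttpGet(u);

      ResponseHandler<String> responseHandler = new BasicResponseHandler();
      String content = httpclient.execute(httpget, responseHandler);
      // When the response declares no charset, the body is decoded as ISO-8859-1.
      // Re-decode the same bytes as UTF-8; without this step the text is garbled.
      // This only helps if the page really is UTF-8 and the server sent no charset.
      content = new String(content.getBytes("ISO-8859-1"), "UTF-8");
      return content;
    }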
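
    The re-decoding step above works only when the page is UTF-8 and the server does not declare a charset. As a rough sketch of a more general approach, the charset can be read from the Content-Type header and used to decode the raw bytes, with UTF-8 as the fallback. The class CharsetSketch and its methods charsetFrom and decode are hypothetical names introduced here for illustration; they are not part of HttpClient or the Android SDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetSketch {
    // Hypothetical helper: returns the charset named in a Content-Type value
    // such as "text/html; charset=UTF-8", or the fallback if none is usable.
    static Charset charsetFrom(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive check for the "charset=" parameter
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name
                    return fallback;
                }
            }
        }
        return fallback;
    }

    // Hypothetical helper: decodes a response body using the declared charset,
    // assuming UTF-8 when the header does not name one.
    static String decode(byte[] body, String contentType) {
        return new String(body, charsetFrom(contentType, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] body = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(body, "text/html"));
        System.out.println(charsetFrom("text/html; charset=iso-8859-1", StandardCharsets.UTF_8));
    }
}
```

    With HttpClient, the raw bytes and the header value would have to come from the response entity rather than from BasicResponseHandler, which returns an already decoded String.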


    URLConnection approach:

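    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.MalformedURLException;
    import java.net.URL;
    import java.net.URLConnection;

    public String getHTML(String url) {
      BufferedReader in = null;
      try {
        URL newUrl = new URL(url);
        URLConnection connect = newUrl.openConnection();
        connect.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");
        // The target page is encoded as UTF-8
        in = new BufferedReader(new InputStreamReader(connect.getInputStream(), "UTF-8"));

        StringBuilder html = new StringBuilder();
        String readLine;
        while ((readLine = in.readLine()) != null) {
          // readLine() strips line terminators, so the result contains no newlines
          html.append(readLine);
        }
        return html.toString();
      } catch (MalformedURLException me) {
        // Invalid URL: fall through and return null
      } catch (IOException ioe) {
        // Network or read failure: fall through and return null
      } finally {
        if (in != null) {
          try { in.close(); } catch (IOException ignored) { }
        }
      }
      return null;
    }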
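
    The cause of the URLConnection failures on EDGE is not established here. One thing that may be worth trying on slow networks is setting explicit connect and read timeouts, so that a stalled request fails with an exception instead of hanging or silently returning nothing. The following is a minimal sketch under that assumption; the class TimeoutFetchSketch, the method fetch, and the timeout values are hypothetical and chosen only for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URLConnection;

class TimeoutFetchSketch {
    // Hypothetical helper: downloads a UTF-8 page with explicit timeouts.
    // Failures are reported to the caller as IOException rather than as null.
    static String fetch(String url, int connectTimeoutMs, int readTimeoutMs) throws IOException {
        URLConnection conn = URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(connectTimeoutMs); // max wait to establish the connection
        conn.setReadTimeout(readTimeoutMs);       // max wait for each read to return data
        BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"));
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            // Reading in chunks keeps the page's original line breaks
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } finally {
            in.close();
        }
    }
}
```

    A call such as TimeoutFetchSketch.fetch(url, 10000, 30000) would give up after 10 seconds if no connection is made, or after 30 seconds if a read stalls. Suitable values depend on the network and are a judgment call.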

  • Original article: https://www.cnblogs.com/mumue/p/2433986.html