• 16. Fetching web page data from the network


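    When fetching web page data over the network, the server may compress the page with GZIP, which reduces the amount of data sent over the wire and speeds up browsing. The response therefore has to be checked when it is read: if it is GZIP-encoded (the Content-Encoding response header is gzip), the stream must be wrapped in a GZIPInputStream and decompressed before it is decoded; otherwise the retrieved data may come out garbled.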
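To illustrate why the decompression step matters, here is a minimal offline sketch. It does not touch the network: an in-memory GZIP byte array stands in for a compressed response body. The class name GzipRoundTrip and its helper methods are hypothetical names introduced only for this illustration, and UTF-8 is an assumed charset.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipRoundTrip {

    // Compress a string into GZIP bytes (stands in for a compressed response body).
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    // Decompress GZIP bytes and decode the result as UTF-8 text.
    static String gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[1024];
            int len;
            while ((len = in.read(buffer)) != -1) {
                out.write(buffer, 0, len);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body>hello</body></html>";
        byte[] compressed = gzip(html);

        // Decoding the compressed bytes directly gives garbage, not the page.
        String decodedDirectly = new String(compressed, StandardCharsets.UTF_8);
        System.out.println("direct decode matches: " + decodedDirectly.equals(html));

        // Decompressing first recovers the original text.
        System.out.println("gunzip decode matches: " + gunzip(compressed).equals(html));
    }
}
```

Decoding the compressed bytes directly is what produces the "garbled" output described above; wrapping the stream in a GZIPInputStream first avoids it.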

                     

     The following sample code fetches web page data from the network:

     

    package com.ljq.test;

    import java.io.ByteArrayOutputStream;
    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.zip.GZIPInputStream;

    /**
     * Fetches web page data from the network.
     *
     * @author jiqinlin
     */
    public class InternetTest2 {

        public static void main(String[] args) throws Exception {
            String result = "";
            //URL url = new URL("http://www.sohu.com");
            URL url = new URL("http://www.ku6.com/");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setConnectTimeout(6 * 1000); // connection timeout: 6 seconds
            if (conn.getResponseCode() != 200) throw new RuntimeException("Request to URL failed");
            InputStream is = conn.getInputStream(); // input stream returned by the server
            String contentEncoding = conn.getContentEncoding(); // null if the header is absent
            // The charset is hard-coded to GBK here; adjust it to match the target page.
            if ("gzip".equalsIgnoreCase(contentEncoding)) {
                result = readDataForZgip(is, "GBK");
            } else {
                result = readData(is, "GBK");
            }
            conn.disconnect();
            System.out.println(result);
            System.err.println("ContentEncoding: " + contentEncoding);
        }

        // First parameter: the input stream; second parameter: the charset name
        public static String readData(InputStream inStream, String charsetName) throws Exception {
            ByteArrayOutputStream outStream = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int len = -1;
            // Copy the stream into memory one buffer at a time
            while ((len = inStream.read(buffer)) != -1) {
                outStream.write(buffer, 0, len);
            }
            byte[] data = outStream.toByteArray();
            outStream.close();
            inStream.close();
            return new String(data, charsetName);
        }

        // First parameter: the GZIP-compressed input stream; second parameter: the charset name
        public static String readDataForZgip(InputStream inStream, String charsetName) throws Exception {
            // Wrap the raw stream so that reads return decompressed bytes
            GZIPInputStream gzipStream = new GZIPInputStream(inStream);
            ByteArrayOutputStream outStream = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int len = -1;
            while ((len = gzipStream.read(buffer)) != -1) {
                outStream.write(buffer, 0, len);
            }
            byte[] data = outStream.toByteArray();
            outStream.close();
            gzipStream.close(); // also closes the underlying inStream
            return new String(data, charsetName);
        }

    }
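The sample above hard-codes "GBK" as the charset, which can also lead to garbled text on pages that use a different encoding. One possible refinement, sketched here as an assumption rather than as part of the original code, is to read the charset from the Content-Type response header (available through conn.getContentType()) and fall back to a default when none is declared. The class name CharsetSniffer and the method charsetFromContentType are hypothetical.

```java
import java.nio.charset.Charset;
import java.util.Locale;

public class CharsetSniffer {

    // Extract the charset from a Content-Type value such as "text/html; charset=GBK".
    // Returns the fallback when the header is missing, has no charset parameter,
    // or names a charset this JVM does not support.
    static String charsetFromContentType(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String param = part.trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    if (!name.isEmpty() && Charset.isSupported(name)) {
                        return name;
                    }
                } catch (IllegalArgumentException e) {
                    // Illegal charset name: ignore it and use the fallback
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetFromContentType("text/html; charset=utf-8", "GBK"));
        System.out.println(charsetFromContentType("text/html", "GBK"));
    }
}
```

With such a helper, the calls in main could pass charsetFromContentType(conn.getContentType(), "GBK") instead of the literal "GBK". Pages that declare their encoding only in an HTML meta tag would still need separate handling.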
  • Original article: https://www.cnblogs.com/linjiqin/p/2064736.html