• A small web crawler example


    package com.textPa.two;
    
    import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileWriter;
    import java.nio.charset.Charset;
    
    import org.apache.http.HttpEntity;
    import org.apache.http.HttpResponse;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.apache.http.util.EntityUtils;
    
    public class RetrivePage {
        
        public static void main(String[] args) {
            CloseableHttpClient httpClient = HttpClients.createDefault();
    //        HttpGet getHttp = new HttpGet("http://www.baidu.com");
            HttpGet getHttp = new HttpGet("http://club.news.sohu.com/zz0578/thread/4bqnexpi3no");
            String content = null;
            BufferedWriter writer = null;
            
            HttpResponse response;
            try {
                response = httpClient.execute(getHttp);
                HttpEntity entity = response.getEntity();
                
                if(entity!=null){
                    content = EntityUtils.toString(entity,Charset.forName("GBK"));
                    System.out.println(content);
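                    // Escape the backslash: "\b" in a Java string literal is a backspace character
                    File file = new File("d:\\baidu.html");
                    // Note: FileWriter encodes with the platform default charset
                    writer = new BufferedWriter(new FileWriter(file));
                    writer.write(content);
                    writer.flush();
                    writer.close();
                    System.out.println("File created successfully");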
                }
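            } catch (Exception e) {
                // Print the failure instead of silently swallowing it
                e.printStackTrace();
            } finally {
                // Release the connection resources held by the client
                try { httpClient.close(); } catch (Exception ignored) { }
            }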
        }
        
    }
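
The example above decodes the response as a hard-coded GBK and then writes it with FileWriter, which uses the platform default encoding, so the saved file can come out garbled on a machine whose default is not GBK. Below is a minimal sketch of one way to keep the two steps consistent, using only the JDK. The class name PageSaver and both method names are hypothetical and not part of the original post; the Content-Type parsing is deliberately simple and only looks for a charset= parameter.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

// Hypothetical helper (not from the original post); depends only on the JDK.
final class PageSaver {

    private PageSaver() {}

    /**
     * Returns the charset named in a Content-Type value such as
     * "text/html; charset=GBK", or the fallback when the value is null,
     * has no charset parameter, or names a charset this JVM cannot use.
     */
    static Charset charsetFromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both malformed and unsupported charset names
                    return fallback;
                }
            }
        }
        return fallback;
    }

    /** Writes the page text to target, encoding it with the given charset. */
    static void save(Path target, String content, Charset charset) throws IOException {
        Files.write(target, content.getBytes(charset));
    }
}
```

In the crawler, the value of the response's Content-Type header could be passed to charsetFromContentType with GBK as the fallback, and the same charset could then be used both for EntityUtils.toString and for save, so that decoding and writing agree.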

    The two jar packages this example requires are available at the link below:

    http://pan.baidu.com/s/1nuFuDUL

  • Original post: https://www.cnblogs.com/wangxiangstudy/p/5850123.html