• Daily summary


    1. Example: fetching data with HttpClient

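    // Connect to the site and simulate a login.
    // Assumes the enclosing class declares the fields true_url, ip_url, cookie,
    // location and mainhtml, and imports Apache HttpClient 4.x (org.apache.http.*).
    public int getConnect(String user, String key) throws Exception {
        // First send a GET request to obtain the cookie and the __VIEWSTATE value
        HttpGet getLogin = new HttpGet(true_url);
        // Step 1: fetch the HTML of the login page
        String loginhtml = "";
        HttpResponse loginResponse = new DefaultHttpClient().execute(getLogin);
        if (loginResponse.getStatusLine().getStatusCode() == 200) {
            HttpEntity entity = loginResponse.getEntity();
            loginhtml = EntityUtils.toString(entity);
            // Read the cookie from the response headers
            cookie = loginResponse.getFirstHeader("Set-Cookie").getValue();
            System.out.println("cookie= " + cookie);
        }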
    
        // Step 2: simulate the login
        // Send a POST request with automatic redirect handling disabled
        HttpPost httpPost = new HttpPost(true_url);
        httpPost.getParams().setParameter(ClientPNames.HANDLE_REDIRECTS, false);

        // Set the request headers for the POST
        httpPost.setHeader("User-Agent",
                "Mozilla/5.0 (Windows NT 6.3; WOW64; Trident/7.0; rv:11.0) like Gecko");
        httpPost.setHeader("Referer", true_url);
        httpPost.setHeader("Cookie", cookie);
    
        // Build the form data
        List<NameValuePair> params = new ArrayList<NameValuePair>();

        // __VIEWSTATE changes between page loads, so it is extracted from the login page each time
        params.add(new BasicNameValuePair("__VIEWSTATE",
                getViewState(loginhtml)));
        params.add(new BasicNameValuePair("Button1", ""));
        params.add(new BasicNameValuePair("hidPdrs", ""));
        params.add(new BasicNameValuePair("hidsc", ""));
        params.add(new BasicNameValuePair("lbLanguage", ""));
        // "%D1%A7%C9%FA" is the percent-encoded GB2312 form of "学生" (student).
        // UrlEncodedFormEntity encodes values itself, so this pre-encoded string may be double-encoded.
        params.add(new BasicNameValuePair("RadioButtonList1", "%D1%A7%C9%FA"));
        params.add(new BasicNameValuePair("txtUserName", user));
        params.add(new BasicNameValuePair("TextBox2", key));
        params.add(new BasicNameValuePair("txtSecretCode", "")); // ( ╯□╰ ) silly Zhengfang: the CAPTCHA turns out not to be required
    
        // Set the form encoding, send the request, and read the response status code
        httpPost.setEntity(new UrlEncodedFormEntity(params, "gb2312"));
        HttpResponse response = new DefaultHttpClient().execute(httpPost);
        int status = response.getStatusLine().getStatusCode();
        // A 200 means no redirect was issued, so return immediately
        if (status == 200) {
            return status;
        }
        System.out.println("Status= " + status);

        // A redirect is signalled by status 302 (or 301)
        if (status == 302 || status == 301) {
            // Read the value of the Location header
            location = response.getFirstHeader("Location").getValue();
            System.out.println(location);
            // Step 3: fetch the main page of the management system
            // GET request to the address given by Location
            HttpGet httpGet = new HttpGet(ip_url + location);
            httpGet.setHeader("Referer", true_url);
            httpGet.setHeader("Cookie", cookie);

            // HTML of the main page
            mainhtml = "";
            HttpResponse httpResponseget = new DefaultHttpClient()
                    .execute(httpGet);
            if (httpResponseget.getStatusLine().getStatusCode() == 200) {
                HttpEntity entity = httpResponseget.getEntity();
                mainhtml = EntityUtils.toString(entity);
            }
        }
        return status;
    }
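The login code above calls getViewState(loginhtml), but that helper is not shown in the post. The following is only a minimal sketch of how such a helper might look, using a regular expression from the standard library. The class name ViewStateExtractor and the sample HTML are hypothetical, and the pattern assumes the name attribute appears before the value attribute in the hidden input, which is how ASP.NET pages usually render this field.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the original post does not show its own implementation.
class ViewStateExtractor {

    // Matches <input ... name="__VIEWSTATE" ... value="..."> and captures the value.
    // Assumption: the name attribute comes before the value attribute.
    private static final Pattern VIEW_STATE = Pattern.compile(
            "<input[^>]*name=\"__VIEWSTATE\"[^>]*value=\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);

    // Returns the __VIEWSTATE value found in the page, or "" if none is present.
    static String getViewState(String html) {
        if (html == null) {
            return "";
        }
        Matcher m = VIEW_STATE.matcher(html);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        // Made-up sample markup, for illustration only
        String html = "<form><input type=\"hidden\" name=\"__VIEWSTATE\" "
                + "value=\"dDwtMTIzNDU2Nzg5Ozs+\" /></form>";
        System.out.println(getViewState(html));
    }
}
```

Returning an empty string when the field is missing keeps the caller free of null checks. For pages with less predictable markup, an HTML parser would likely be more robust than a regular expression.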
  • Original article: https://www.cnblogs.com/chenghaixiang/p/14912255.html