• How to simulate a request that restarts the router in Java (often needed by web crawlers), and how to download images in Java


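    When we run a web crawler at the office or at home to scrape the data we need, the target site often has a defence mechanism that answers our HTTP requests with a 500 error. As long as the requests come from the same IP, no data comes back. The only option then is to restart the router so that the IP address changes and the crawler can work normally again.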
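
    Before restarting anything, the crawler has to decide that its IP is actually blocked. A minimal sketch of that decision, assuming that several consecutive 500 responses mean a block; the class name `BlockDetector` and the threshold are hypothetical and not part of the original code:

```java
// Hypothetical helper: guesses when the crawler's current IP has been blocked.
final class BlockDetector {
    private final int threshold;       // assumed: consecutive 500s that count as "blocked"
    private int consecutiveErrors = 0;

    BlockDetector(int threshold) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.threshold = threshold;
    }

    /** Records one HTTP status code; returns true once the IP looks blocked. */
    boolean record(int statusCode) {
        if (statusCode == 500) {
            consecutiveErrors++;
        } else {
            consecutiveErrors = 0;     // any other status means requests still get through
        }
        return consecutiveErrors >= threshold;
    }

    /** Call after the router has been restarted and a new IP obtained. */
    void reset() {
        consecutiveErrors = 0;
    }
}
```

    The crawler would call `record(er)` with each status code it receives and trigger the router restart only when it returns true.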

    The following code simulates the router's restart command by sending a request over a Socket:

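    // Needs java.io.BufferedOutputStream, java.net.InetAddress, java.net.Socket, java.net.URL.
    protected void rebotadsl() {
        try {
            URL target = new URL(baseURL);
            InetAddress address = InetAddress.getByName(target.getHost());
            // Connect to the router's web admin interface (port 8080 here).
            Socket client = new Socket(address, 8080);
            try {
                BufferedOutputStream sender = new BufferedOutputStream(client.getOutputStream());
                // luyou holds the router credentials as Base64 of "user:password",
                // e.g. YWRtaW46d2FuZzIwMDU=
                // The Disconnect parameter is the URL-encoded GBK label of the
                // router's "disconnect" button.
                String cmd = "GET "
                        + "/userRpm/StatusRpm.htm?Disconnect=%B6%CF%20%CF%DF&wan=1"
                        + " HTTP/1.0\r\n"
                        + "User-Agent: myselfHttp/1.0\r\n"
                        + "Accept: www/source; text/html; image/gif; */*\r\n"
                        + "Authorization: Basic " + luyou + "\r\n"
                        + "\r\n";
                // Write the byte array's own length, not the String's char count.
                byte[] bytes = cmd.getBytes();
                sender.write(bytes, 0, bytes.length);
                sender.flush();
                System.out.println("Disconnect request sent; the router is dropping the line");
            } finally {
                client.close();
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }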
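
    The `luyou` value in the Authorization header is the Base64 encoding of `user:password`; the sample value YWRtaW46d2FuZzIwMDU= decodes to admin:wang2005. A small sketch of how such a value could be produced, assuming Java 8 or later for `java.util.Base64`; the class name `BasicAuthToken` is hypothetical:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: builds the token that follows "Authorization: Basic ".
final class BasicAuthToken {
    /** Returns Base64("user:password") for use in an HTTP Basic auth header. */
    static String encode(String user, String password) {
        String credentials = user + ":" + password;
        return Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }
}
```

    With this, `luyou` could be set to `BasicAuthToken.encode("admin", "wang2005")` instead of a hand-encoded string.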

    Of course, we still need some logic around this function, for example to make sure that two router restarts are not too close together.
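
    One possible shape for that logic is a throttle that refuses a restart if the previous one was too recent. This is only a sketch: the class name `RestartThrottle` and the choice of interval are assumptions, and the current time is passed in so the behaviour is easy to check.

```java
// Hypothetical helper: enforces a minimum gap between two router restarts.
final class RestartThrottle {
    private final long minIntervalMillis;  // assumed minimum gap, chosen by the caller
    private long lastRestartMillis;
    private boolean restartedBefore = false;

    RestartThrottle(long minIntervalMillis) {
        if (minIntervalMillis < 0) {
            throw new IllegalArgumentException("interval must be >= 0");
        }
        this.minIntervalMillis = minIntervalMillis;
    }

    /**
     * Returns true and records nowMillis if a restart is allowed now;
     * returns false if the previous restart was less than the minimum gap ago.
     */
    synchronized boolean tryAcquire(long nowMillis) {
        if (restartedBefore && nowMillis - lastRestartMillis < minIntervalMillis) {
            return false;
        }
        restartedBefore = true;
        lastRestartMillis = nowMillis;
        return true;
    }
}
```

    The crawler would then guard the call, for example `if (throttle.tryAcquire(System.currentTimeMillis())) rebotadsl();`.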


    Downloading images in Java:

    /**
     * Requests the image URL from the server and saves the image to disk.
     * The bytes pass through a String decoded as ISO8859-1, which maps every
     * byte to exactly one char, so the binary data is preserved.
     */
    public void sendPic(String url, String story, String name) {
        setURL(url);
        HttpClient http = new HttpClient();
        http.getHttpConnectionManager().getParams().setConnectionTimeout(100000);
        GetMethod get = null;
        try {
            get = new GetMethod(url);
        } catch (IllegalArgumentException ex) {
            Log.logException("URL contains illegal characters", ex);
            setStatus(baseURL, ERROR);
            _body.setLength(0);
            return;
        }
        get.getParams().setParameter(HttpMethodParams.SO_TIMEOUT, 100000);
        get.setFollowRedirects(false);
        int er = 0;
        try {
            get.addRequestHeader("user-agent", useragent);
            er = http.executeMethod(get);
            System.out.println("server return code " + er);
        } catch (Exception ex) {
            System.out.println("Failed to send the image URL to the server");
            try {
                Thread.sleep(120000); // wait two minutes before the single retry
            } catch (InterruptedException e) {
                // ignored: retry right away
            }
            try {
                er = http.executeMethod(get);
            } catch (Exception e) {
                System.out.println("Cannot connect to the server; the program will exit");
                System.exit(0);
            }
        }

        if (er != 200) {
            // Nothing was downloaded; do not write stale _body contents to disk.
            _body.setLength(0);
            get.releaseConnection();
            return;
        }

        // Read the data sent back by the server (at most 204800 bytes).
        byte buffer[] = new byte[20480];
        byte tbuf[] = new byte[204800];
        try {
            InputStream is = get.getResponseBodyAsStream();
            int tl = 0;
            while (is != null) {
                int l = is.read(buffer);
                if (l < 0 || l + tl > 204800)
                    break; // end of stream, or the image does not fit the buffer
                System.arraycopy(buffer, 0, tbuf, tl, l);
                tl += l;
            }
            _body.setLength(0);
            _body.append(new String(tbuf, 0, tl, "ISO8859-1"));
        } catch (IOException ex) {
            System.out.println("Error while reading the server response into a String");
        } finally {
            get.releaseConnection();
        }

        // Save the image to disk
        File outputfile = new File(story, name);
        try {
            FileOutputStream fos = new FileOutputStream(outputfile);
            try {
                fos.write(_body.toString().getBytes("ISO8859-1"));
            } finally {
                fos.close();
            }
        } catch (IOException ex) {
            System.out.println("I/O error while saving to local disk");
        }
    }
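
    The method above routes the image bytes through a String and caps them at 204800 bytes. As an alternative sketch (not the original author's approach), the response stream can be copied straight to a file using only the JDK. The class name `ImageDownloader` and its parameters are hypothetical, and the timeouts simply mirror the values used above:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: streams an image to disk without buffering it in a String.
final class ImageDownloader {
    /** Copies every byte of the stream to target, replacing an existing file. */
    static long save(InputStream in, Path target) throws IOException {
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Fetches url and stores the body at target; returns false on a non-200 status. */
    static boolean download(String url, Path target, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(100000);
        conn.setReadTimeout(100000);
        conn.setInstanceFollowRedirects(false);
        conn.setRequestProperty("User-Agent", userAgent);
        try {
            if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
                return false;
            }
            try (InputStream in = conn.getInputStream()) {
                save(in, target);
            }
            return true;
        } finally {
            conn.disconnect();
        }
    }
}
```

    Because the bytes never become a String, no charset is involved and there is no fixed size limit on the image.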
    	
    
    // Returns true if the URL already exists in the database
    public boolean URLisExist(String url) {
        ResultSet rs = null;
        int count = 0;
        try {
            _prepGetCount.setString(1, url);
            rs = _prepGetCount.executeQuery();
            if (rs.next()) {
                count = rs.getInt("qty");
            }
        } catch (Exception ex) {
            System.out.println("Error in URLisExist");
        } finally {
            // Close the ResultSet on success as well as on failure
            try {
                if (rs != null) {
                    rs.close();
                }
            } catch (Exception e1) {
                System.out.println("Error while closing rs");
            }
        }
        return count >= 1;
    }


  • Original article: https://www.cnblogs.com/zfyouxi/p/4182117.html