• [Big Data Series] Setting up a Hadoop development environment on Windows and using the API for basic operations


    Preface

    After the Hadoop cluster is up, set up a Java project on Windows to test it by operating on files in HDFS.

    Version 1

    package com.slp.hadoop274.hdfs;
    
    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URL;
    import java.net.URLConnection;
    
    import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
    import org.junit.Test;
    
    /**
     * Basic HDFS operations.
     * @author sangliping
     */
    public class TestHDFS {
    	
    
        /**
         * Read an HDFS file via java.net.URL.
         * @throws IOException 
         */
    	@Test
    	public void readFile() throws IOException{
    		URL url = new URL("hdfs://192.168.181.201:8020/user/sanglp/hadoop/copyFromLocal");
    		URLConnection con = url.openConnection();
    		InputStream is = con.getInputStream();
    		byte[] buf = new byte[is.available()];
    		is.read(buf);
    		is.close();
    		String str = new String(buf,"UTF-8");
    		System.out.println(str);
    	}
    }
    

  Running this test fails because java.net.URL does not recognize the hdfs protocol.

    Version 2

    package com.slp.hadoop274.hdfs;
    
    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URL;
    import java.net.URLConnection;
    
    import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
    import org.junit.Test;
    
    /**
     * Basic HDFS operations.
     * @author sangliping
     */
    public class TestHDFS {
    	
    	static{
    		// Register the HDFS stream handler factory; otherwise java.net.URL cannot resolve the hdfs protocol
    		URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    	}
        /**
         * Read an HDFS file via java.net.URL.
         * @throws IOException 
         */
    	@Test
    	public void readFile() throws IOException{
    		URL url = new URL("hdfs://192.168.181.201:8020/user/sanglp/hadoop/copyFromLocal");
    		URLConnection con = url.openConnection();
    		InputStream is = con.getInputStream();
    		byte[] buf = new byte[is.available()];
    		is.read(buf);
    		is.close();
    		String str = new String(buf,"UTF-8");
    		System.out.println(str);
    	}
    }
    

  Now the contents of the HDFS file copyFromLocal are printed correctly.

      Note: copying the log4j.properties file from the etc directory of the unpacked Hadoop distribution into the project's src folder makes the console output friendlier.

     Version 3

      The methods below belong to the same TestHDFS class as above and additionally use org.apache.hadoop.conf.Configuration, org.apache.hadoop.fs.FileSystem, org.apache.hadoop.fs.Path, org.apache.hadoop.fs.FSDataInputStream, org.apache.hadoop.fs.FSDataOutputStream, org.apache.hadoop.io.IOUtils and java.io.ByteArrayOutputStream.

    	/**
    	 * Read an HDFS file via the Hadoop API (fails: the NameNode is not configured).
    	 * @throws IOException 
    	 * java.lang.IllegalArgumentException: Wrong FS: hdfs://192.168.181.201:8020/user/sanglp/hadoop/copyFromLocal, expected: file:///
    	 * at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:649)
    	 */
    	@Test
    	public void readFileByApiWrong() throws IOException{
    		Configuration con = new Configuration();
    		FileSystem fs = FileSystem.get(con);
    		Path p = new Path("hdfs://192.168.181.201:8020/user/sanglp/hadoop/copyFromLocal");
    		FSDataInputStream fis = fs.open(p);
    		ByteArrayOutputStream baos = new ByteArrayOutputStream();
    		byte[] buf = new byte[1024];
    		int len = -1;
    		while((len = fis.read(buf)) != -1){
    			baos.write(buf, 0, len);
    		}
    		fis.close();
    		baos.close();
    		System.out.println(new String(baos.toByteArray(),"UTF-8"));
    	}
    	
    

  This version fails because the NameNode is not specified: with a plain Configuration, fs.defaultFS still points at the local file system (file:///), so FileSystem.get returns a LocalFileSystem that rejects the hdfs:// path with the Wrong FS exception shown above.
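
  Version 4 below fixes this by setting fs.defaultFS. As an alternative sketch (an assumption, not part of the original post), the FileSystem can also be obtained from the NameNode URI directly; this requires an extra import of java.net.URI:

    	/**
    	 * Alternative sketch (assumption): bind the FileSystem to the NameNode URI
    	 * instead of setting fs.defaultFS in the Configuration.
    	 */
    	@Test
    	public void readFileByApiUri() throws IOException{
    		Configuration con = new Configuration();
    		// The hdfs:// scheme of the URI selects the DistributedFileSystem implementation.
    		FileSystem fs = FileSystem.get(URI.create("hdfs://192.168.181.201:8020"), con);
    		FSDataInputStream fis = fs.open(new Path("/user/sanglp/hadoop/copyFromLocal"));
    		ByteArrayOutputStream baos = new ByteArrayOutputStream();
    		byte[] buf = new byte[1024];
    		int len = -1;
    		while((len = fis.read(buf)) != -1){
    			baos.write(buf, 0, len);
    		}
    		fis.close();
    		baos.close();
    		System.out.println(new String(baos.toByteArray(),"UTF-8"));
    	}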

    Version 4

    	/**
    	 * Read an HDFS file through the API with a conventional stream loop.
    	 * @throws IOException
    	 */
    	@Test
    	public void readFileByApi() throws IOException{
    		Configuration con = new Configuration();
    		con.set("fs.defaultFS", "hdfs://192.168.181.201:8020");
    		FileSystem fs = FileSystem.get(con);
    		// Either way of constructing the Path works:
    		//Path p = new Path("hdfs://192.168.181.201:8020/user/sanglp/hadoop/copyFromLocal");
    		Path p = new Path("/user/sanglp/hadoop/copyFromLocal");
    
    		FSDataInputStream fis = fs.open(p);
    		ByteArrayOutputStream baos = new ByteArrayOutputStream();
    		byte[] buf = new byte[1024];
    		int len = -1;
    		while((len = fis.read(buf)) != -1){
    			baos.write(buf, 0, len);
    		}
    		fis.close();
    		baos.close();
    
    		System.out.println(new String(baos.toByteArray(),"UTF-8"));
    	}
    

      Version 5

    	/**
    	 * Read an HDFS file through the API using Hadoop's IOUtils helper.
    	 * @throws IOException
    	 */
    	@Test
    	public void readFileByApiUsUtils() throws IOException{
    		Configuration con = new Configuration();
    		con.set("fs.defaultFS", "hdfs://192.168.181.201:8020");
    		FileSystem fs = FileSystem.get(con);
    		// Either way of constructing the Path works:
    		//Path p = new Path("hdfs://192.168.181.201:8020/user/sanglp/hadoop/copyFromLocal");
    		Path p = new Path("/user/sanglp/hadoop/copyFromLocal");
    
    		FSDataInputStream fis = fs.open(p);
    		ByteArrayOutputStream baos = new ByteArrayOutputStream();
    		int buf = 1024;   // copy buffer size in bytes
    		IOUtils.copyBytes(fis, baos, buf);
    		System.out.println(new String(baos.toByteArray(),"UTF-8"));
    	}
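
  A small variant (a sketch, not part of the original post): the four-argument overload of IOUtils.copyBytes also closes both streams once the copy finishes, which the version above does not do.

    	/**
    	 * Variant sketch (assumption): same read, but let IOUtils close the streams.
    	 */
    	@Test
    	public void readFileByApiUsUtilsAutoClose() throws IOException{
    		Configuration con = new Configuration();
    		con.set("fs.defaultFS", "hdfs://192.168.181.201:8020");
    		FileSystem fs = FileSystem.get(con);
    		FSDataInputStream fis = fs.open(new Path("/user/sanglp/hadoop/copyFromLocal"));
    		ByteArrayOutputStream baos = new ByteArrayOutputStream();
    		// true: close fis and baos after copying
    		IOUtils.copyBytes(fis, baos, 1024, true);
    		System.out.println(new String(baos.toByteArray(),"UTF-8"));
    	}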
    

      Version 6

    	/**
    	 * Create a directory with the API.
    	 * @throws IOException
    	 * Without extra permissions this throws:
    	 * org.apache.hadoop.security.AccessControlException: Permission denied: user=hadoop, access=WRITE, inode="/user/sanglp":sanglp:supergroup:drwxr-xr-x
    	 * Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=hadoop, access=WRITE, inode="/user/sanglp":sanglp:supergroup:drwxr-xr-x
    	 */
    	@Test
    	public void makeDir() throws IOException{
    		Configuration con = new Configuration();
    		con.set("fs.defaultFS", "hdfs://192.168.181.201:8020");
    		FileSystem fs = FileSystem.get(con);
    		fs.mkdirs(new Path("/user/sanglp/myhadoop"));
    	}
    

  Calling the API above as-is fails with the permission error shown in the Javadoc: the client connects as user hadoop, which has no write access to /user/sanglp (owned by sanglp, mode drwxr-xr-x). One fix is to relax the directory's permissions:

    hadoop fs -chmod 777 /user/sanglp
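
  Alternatively (a sketch based on an assumption, not part of the original post), the client can connect as the directory's owner by passing a user name to the three-argument FileSystem.get overload, so the permissions do not have to be opened up; this also needs an import of java.net.URI.

    	/**
    	 * Alternative sketch (assumption): perform the mkdir as the HDFS user "sanglp"
    	 * that owns /user/sanglp, instead of running chmod 777 on the directory.
    	 */
    	@Test
    	public void makeDirAsOwner() throws IOException, InterruptedException {
    		Configuration con = new Configuration();
    		// The third argument is the user the RPC calls are issued as.
    		FileSystem fs = FileSystem.get(
    				URI.create("hdfs://192.168.181.201:8020"), con, "sanglp");
    		fs.mkdirs(new Path("/user/sanglp/myhadoop"));
    	}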
    

      Version 7

    	/**
    	 * Create a file with the API and write to it.
    	 * @throws IOException
    	 */
    	@Test
    	public void putFile() throws IOException{
    		Configuration con = new Configuration();
    		con.set("fs.defaultFS", "hdfs://192.168.181.201:8020");
    		FileSystem fs = FileSystem.get(con);
    		FSDataOutputStream out = fs.create(new Path("/user/sanglp/myhadoop/a.txt"));
    		out.write("test put file on myhadoop ".getBytes());
    		out.close();
    	}
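
  To confirm the write, the file can be read back with the same approach as Version 4 or Version 5 (a sketch, not part of the original post; it assumes a.txt was created as above):

    	/**
    	 * Verification sketch (assumption): read back the file written by putFile().
    	 */
    	@Test
    	public void readBackPutFile() throws IOException{
    		Configuration con = new Configuration();
    		con.set("fs.defaultFS", "hdfs://192.168.181.201:8020");
    		FileSystem fs = FileSystem.get(con);
    		FSDataInputStream fis = fs.open(new Path("/user/sanglp/myhadoop/a.txt"));
    		ByteArrayOutputStream baos = new ByteArrayOutputStream();
    		// true: close the streams after copying
    		IOUtils.copyBytes(fis, baos, 1024, true);
    		// Expected output: test put file on myhadoop
    		System.out.println(new String(baos.toByteArray(),"UTF-8"));
    	}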
    

      

  • Original article: https://www.cnblogs.com/dream-to-pku/p/7930241.html