Installing the Hadoop Plugin for Eclipse on Ubuntu 14.10


    Preparing the environment

      1 Hadoop must already be installed. I installed Hadoop 2.5.0 earlier; see http://www.cnblogs.com/liuchangchun/p/4097286.html for the installation steps.

      2 Install Eclipse, which can simply be downloaded from its official website.

    Installation steps

      1 Get the Eclipse plugin. The one I found targets Hadoop 2.2, but it works fine under Hadoop 2.5. There are two ways to obtain it:

        1.1 The first is to download the source and build the plugin yourself, as follows.

        First, download the eclipse-hadoop plugin source from https://github.com/winghc/hadoop2x-eclipse-plugin; you can click Download ZIP at the lower right of the page. After downloading, unpack the archive.

        Then change into the hadoop2x-eclipse-plugin-master/src/contrib/eclipse-plugin directory and run the following command (Apache Ant must be installed):

        ant jar -Declipse.home=/usr/local/eclipse -Dhadoop.home=~/Downloads/hadoop-2.2.0 -Dversion=2.5.0

        If the build completes successfully, the generated plugin jar is placed in hadoop2x-eclipse-plugin-master/build/contrib/eclipse-plugin.

        1.2 The second is to download a pre-built plugin directly from http://pan.baidu.com/s/1mgiHFok

      2 Copy the downloaded plugin jar into the eclipse/plugins directory, then restart Eclipse.
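
        For example, assuming Eclipse lives in /usr/local/eclipse (as in the ant command above) and the jar built in step 1.1 ended up named hadoop-eclipse-plugin-2.5.0.jar, the copy looks something like this; adjust both paths to your own setup:

            cp hadoop2x-eclipse-plugin-master/build/contrib/eclipse-plugin/hadoop-eclipse-plugin-2.5.0.jar /usr/local/eclipse/plugins/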

      3 Configure the Hadoop installation directory

        3.1 If the plugin is installed correctly, opening Window -> Preferences shows a Hadoop Map/Reduce entry on the left side of the window; click it and set the Hadoop installation path on the right.

        3.2 Configure the Map/Reduce Locations: open Window -> Open Perspective -> Other, select Map/Reduce, and click OK.

        3.3 Click the Map/Reduce Locations tab, then click the small elephant icon on the right to open the Hadoop Location configuration window. Enter any Location Name you like, then configure the Map/Reduce Master and DFS Master: their Host and Port must match the settings in core-site.xml (a sample entry is shown after step 3.4). If you have not changed the defaults, the ports are 9001 and 9000 respectively.

        3.4 In the left pane expand DFS Locations -> <the Location Name configured in the previous step>; if you can see the files stored in Hadoop, the plugin is installed successfully.
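
        For reference, in a typical pseudo-distributed Hadoop 2.x setup the relevant core-site.xml entry looks roughly like the sketch below; the hostname and port here are only the common defaults, so use whatever your own core-site.xml actually contains.

            <configuration>
              <property>
                <name>fs.defaultFS</name>
                <value>hdfs://localhost:9000</value>
              </property>
            </configuration>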

      4 Test MapReduce. In Eclipse, go to File -> New -> Project, select Map/Reduce Project, and enter a project name such as WordCount. Then create a new class and copy in the code below.

    import java.io.IOException;
    import java.util.StringTokenizer;
    
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.GenericOptionsParser;
    
    public class WordCount {
    
        public static class TokenizerMapper extends
                Mapper<Object, Text, Text, IntWritable> {
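            // Emits (word, 1) for every whitespace-separated token of the input.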
            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();
    
            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, one);
                }
            }
        }
    
        public static class IntSumReducer extends
                Reducer<Text, IntWritable, Text, IntWritable> {
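            // Sums the counts for each word; also reused as the combiner in main().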
            private IntWritable result = new IntWritable();
    
            public void reduce(Text key, Iterable<IntWritable> values,
                    Context context) throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }
    
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            String[] otherArgs = new GenericOptionsParser(conf, args)
                    .getRemainingArgs();
            if (otherArgs.length != 2) {
                System.err.println("Usage: wordcount <in> <out>");
                System.exit(2);
            }
            Job job = Job.getInstance(conf, "word count"); // getInstance replaces the Job constructor deprecated in Hadoop 2.x
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
            FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

      5 Run the project. A few preparation steps are needed first.

      5.1 Create an input directory on HDFS:

            hadoop fs -mkdir input

      5.2 Copy any local file, for example the README.txt shipped with Hadoop, into the HDFS input directory:

             hadoop fs -copyFromLocal /usr/local/hadoop/README.txt input

      5.3 Right-click WordCount.java, choose Run As -> Run Configurations, and set the program arguments, i.e. the input and output directories:

      hdfs://localhost:9000/user/hadoop/input hdfs://localhost:9000/user/hadoop/output

      5.4 Note: do not create the output directory in HDFS beforehand; if it already exists, the job will fail with an error.
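
      If a previous run has left an output directory behind, it can be removed before running again (the path below matches the arguments above; adjust it if yours differ):

            hadoop fs -rm -r output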

      6 View the results. Simply refresh DFS Locations and the new output directory appears, containing the result files.
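
      The output can also be inspected from the command line; assuming the output path used above, the counts land in a file usually named part-r-00000:

            hadoop fs -cat output/part-r-00000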

    ----------------------------------------------------------------------------------------------------------------------------------------

      The WordCount program above is written in a single class. A cleaner, lower-coupling approach is to create the Map class, the Reduce class, and the MapReduce driver separately; a sketch of the resulting driver is shown after the steps below.

      1 Create a new Map/Reduce project named wordcount.

      2 Create Mapper.java: choose File -> New -> Mapper and enter the package and class names.

      3 Create Reducer.java: choose File -> New -> Reducer and enter the package and class names.

      4 Create the MapReduce driver: choose File -> New -> MapReduce Driver and enter the package and class names.

      5 Run it the same way as described above.
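
      For reference, here is a minimal sketch of what the driver class can look like under this layout. It assumes the Mapper and Reducer were created as separate classes named WordCountMapper and WordCountReducer, with the same map()/reduce() bodies as TokenizerMapper and IntSumReducer above; those class names are placeholders chosen in the wizard dialogs, not fixed names.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // WordCountDriver.java: wires the separate Mapper and Reducer classes into one job.
    public class WordCountDriver {

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "word count");

            job.setJarByClass(WordCountDriver.class);
            job.setMapperClass(WordCountMapper.class);    // Mapper lives in its own .java file
            job.setCombinerClass(WordCountReducer.class); // the Reducer doubles as the combiner
            job.setReducerClass(WordCountReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);

            FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory (must not exist yet)

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }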

Original article: https://www.cnblogs.com/liuchangchun/p/4121817.html