• Running word count in Eclipse on a Mac


    1. Open Eclipse and create a wordcount project containing the class below.

    package wordcount;
    import java.io.IOException;  
    import java.util.StringTokenizer;  
    import org.apache.hadoop.conf.Configuration;  
    import org.apache.hadoop.fs.Path;  
    import org.apache.hadoop.io.IntWritable;  
    import org.apache.hadoop.io.LongWritable;  
    import org.apache.hadoop.io.Text;  
    import org.apache.hadoop.mapreduce.Job;  
    import org.apache.hadoop.mapreduce.Mapper;  
    import org.apache.hadoop.mapreduce.Reducer;  
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;  
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;  
    public class WordCount {
        // Mapper: splits each input line into tokens and emits (word, 1) for every token.
        public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable>{
            private final static IntWritable one = new IntWritable(1);  
            private Text word = new Text();  
            public void map(LongWritable key, Text value, Context context)  
                    throws IOException, InterruptedException {  
                StringTokenizer itr = new StringTokenizer(value.toString());  
                while (itr.hasMoreTokens()) {  
                    word.set(itr.nextToken());  
                    context.write(word, one);  
                }  
            }  
        }  
        // Reducer: sums all the 1s emitted for a word and writes (word, total).
        public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            private IntWritable result = new IntWritable();  
            public void reduce(Text key, Iterable<IntWritable> values, Context context)  
                    throws IOException, InterruptedException {  
                int sum = 0;  
                for (IntWritable val : values) {  
                    sum += val.get();  
                }  
                result.set(sum);  
                context.write(key, result);  
            }  
        }  
        public static void main(String[] args) throws Exception {  
            Configuration conf = new Configuration();  
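            if (args.length != 2) {
                // Exactly two arguments are expected: the input path and the output path.
                System.err.println("Usage: wordcount <in> <out>");
                System.exit(2);
            }
            // Job.getInstance replaces the Job constructor, which is deprecated in Hadoop 2.x.
            Job job = Job.getInstance(conf, "word count");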
            job.setJarByClass(WordCount.class);  
            job.setMapperClass(TokenizerMapper.class);  
            job.setReducerClass(IntSumReducer.class);  
            job.setMapOutputKeyClass(Text.class);  
            job.setMapOutputValueClass(IntWritable.class);  
            job.setOutputKeyClass(Text.class);  
            job.setOutputValueClass(IntWritable.class);  
            FileInputFormat.addInputPath(job, new Path(args[0]));  
            FileOutputFormat.setOutputPath(job, new Path(args[1]));  
            System.exit(job.waitForCompletion(true) ? 0 : 1);  
        }  
    }  
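
    To see what the mapper and reducer compute without starting a Hadoop job, the same logic can be sketched in plain Java. This is only an illustration: the class name LocalWordCount and the sample text are assumptions, not part of the original project. The sample text was chosen so that it yields the same counts as the terminal output in step 3.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical helper, not part of the Hadoop project above.
public class LocalWordCount {

    // Tokenizes the text like TokenizerMapper and sums per word like IntSumReducer.
    // A TreeMap keeps the words sorted, similar to the sorted keys in the job output.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            // "Emit (word, 1)" and "sum the values" collapse into one merge call here.
            counts.merge(itr.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed sample input; the original post does not show its input file.
        String sample = "hello how do you do\nfine thank you\nhello excuse me thank you";
        count(sample).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

    In the real job the two phases run separately and Hadoop groups the values by key between them; the sketch merges both steps into a single loop.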

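    2. Configure the Hadoop paths.

    Put the files to be processed into the input folder. Then, in Eclipse's Run Configurations, enter the input path and the output path as program arguments, separated by a single space. Finally, click Apply and then Run to start the job.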
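
    The job fails if the output directory already exists, because FileOutputFormat refuses to overwrite it, so it can help to prepare both folders before each run. Below is a minimal sketch of such a preparation step. The class name PrepareJobDirs, the file name sample.txt, and the folder names input and output are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper; folder and file names are assumptions.
public class PrepareJobDirs {

    // Creates the input folder with one sample file and removes a stale output folder.
    static void prepare(Path input, Path output, String sampleText) throws IOException {
        Files.createDirectories(input);
        Files.write(input.resolve("sample.txt"), sampleText.getBytes(StandardCharsets.UTF_8));
        if (Files.exists(output)) {
            // Delete children before parents by walking the tree in reverse order.
            try (Stream<Path> walk = Files.walk(output)) {
                walk.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        prepare(Paths.get("input"), Paths.get("output"), "hello how do you do\n");
        // These two values go into the Program arguments field, separated by a space.
        System.out.println("Program arguments: input output");
    }
}
```

    With relative paths like these, Eclipse resolves them against the run configuration's working directory, which defaults to the project folder.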

     

    3. View the result in the terminal.

    JIAS-MacBook-Pro:output jia$ cat part-r-00000 
    do    2
    excuse    1
    fine    1
    hello    2
    how    1
    me    1
    thank    2
    you    3
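
    Each line of part-r-00000 holds a word and its count separated by a tab, which is the default separator of Hadoop's text output. If the counts are needed in another program rather than in the terminal, they could be read back roughly as follows. The class name ReadWordCounts and the default path output/part-r-00000 are assumptions.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reader for the job's result file.
public class ReadWordCounts {

    // Parses "word<TAB>count" lines, keeping the order in which they appear in the file.
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.split("\t");
            counts.put(parts[0], Integer.parseInt(parts[1].trim()));
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass another path as the first argument if needed.
        String file = args.length > 0 ? args[0] : "output/part-r-00000";
        parse(Files.readAllLines(Paths.get(file), StandardCharsets.UTF_8))
                .forEach((word, n) -> System.out.println(word + " -> " + n));
    }
}
```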
  • Original article: https://www.cnblogs.com/aijianiula/p/3854372.html