• hadoop--mapreduce--custom key type


    Problem:

    A sample of input file A is shown below (note that the file is tab-delimited; check this after pasting):

    20170101     x

    20170102     y

    20170103     x

    20170104     y

    20170105     z

    20170106     x

    A sample of input file B is shown below:

    20170101      y

    20170102      y

    20170103      x

    20170104      z

    20170105      y

    Merging input files A and B, with duplicate lines removed, gives output file C. A sample is shown below:

    20170101      x

    20170101      y

    20170102      y

    20170103      x

    20170104      y

    20170104      z

    20170105      y

    20170105      z

    20170106      x
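The expected result can be checked outside Hadoop. The following is a minimal plain-Java sketch, not the MapReduce job itself: the class name `MergeDedup` and the method `merge` are hypothetical, and it assumes that each line is compared as a whole string, tab included.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class MergeDedup {
    // Union the lines of both inputs. A TreeSet drops exact duplicates
    // and keeps the remaining lines in ascending string order.
    public static List<String> merge(List<String> a, List<String> b) {
        TreeSet<String> distinct = new TreeSet<>(a);
        distinct.addAll(b);
        return new ArrayList<>(distinct);
    }

    public static void main(String[] args) {
        // Sample lines from input files A and B, tab-delimited.
        List<String> a = Arrays.asList(
                "20170101\tx", "20170102\ty", "20170103\tx",
                "20170104\ty", "20170105\tz", "20170106\tx");
        List<String> b = Arrays.asList(
                "20170101\ty", "20170102\ty", "20170103\tx",
                "20170104\tz", "20170105\ty");
        merge(a, b).forEach(System.out::println);
    }
}
```

Run on the two samples, this prints the nine lines of sample C in the same order. A line counts as a duplicate only when the date, the separator, and the letter all match exactly.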

    Code implementation:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class Task1 {
        // Emit each input line as the key so that identical lines are grouped in the shuffle.
        public static class MapClass extends Mapper<LongWritable, Text, Text, Text> {
            @Override
            public void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                context.write(value, new Text(""));
            }
        }

        // Write each distinct key once, which removes the duplicate lines.
        public static class ReduceClass extends Reducer<Text, Text, Text, Text> {
            @Override
            public void reduce(Text key, Iterable<Text> values, Context context)
                    throws IOException, InterruptedException {
                context.write(key, new Text(""));
            }
        }

        public static void main(String[] args)
                throws IOException, ClassNotFoundException, InterruptedException {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf);
            job.setJarByClass(Task1.class);
            job.setMapperClass(MapClass.class);
            job.setReducerClass(ReduceClass.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);

            // Backslashes in Windows paths must be escaped inside Java string literals.
            FileInputFormat.addInputPath(job, new Path("C:\\Users\\Administrator\\Desktop\\新建文件夹\\input2.txt"));
            FileInputFormat.addInputPath(job, new Path("C:\\Users\\Administrator\\Desktop\\新建文件夹\\input1.txt"));
            FileOutputFormat.setOutputPath(job, new Path("C:\\Users\\Administrator\\Desktop\\新建文件夹\\output"));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

     Result:

     

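     Pitfalls encountered:

      Reasons why reduce may not run:

        1. The program threw an exception; the logs can be used to debug it.

        2. The parameter types do not match.

        And so on.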
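The parameter-type pitfall in item 2 can be shown without Hadoop. In the sketch below, `BaseReducer`, `BrokenReducer`, `FixedReducer`, and `run` are hypothetical stand-ins rather than Hadoop classes. The assumption is that the framework calls `reduce` through the base-class signature, so a subclass method with a different parameter type is an overload that is never invoked. Hadoop's `Reducer` behaves similarly: its default `reduce` passes every value through unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class OverridePitfall {
    // Stand-in for a framework base class with a pass-through default reduce.
    public static class BaseReducer {
        public final List<String> output = new ArrayList<>();

        public void reduce(String key, Iterable<String> values) {
            for (String v : values) {
                output.add(key + "=" + v);
            }
        }
    }

    // Wrong parameter type (Iterator instead of Iterable): this is an overload,
    // not an override, so the framework never calls it.
    public static class BrokenReducer extends BaseReducer {
        public void reduce(String key, Iterator<String> values) {
            output.add(key + "=custom");
        }
    }

    // Matching signature. @Override makes the compiler reject a mismatch.
    public static class FixedReducer extends BaseReducer {
        @Override
        public void reduce(String key, Iterable<String> values) {
            output.add(key + "=custom");
        }
    }

    // The "framework" only knows the base-class signature.
    public static List<String> run(BaseReducer reducer) {
        reducer.reduce("k", Arrays.asList("a", "b"));
        return reducer.output;
    }

    public static void main(String[] args) {
        System.out.println(run(new BrokenReducer())); // prints [k=a, k=b]
        System.out.println(run(new FixedReducer()));  // prints [k=custom]
    }
}
```

Adding `@Override` to `map` and `reduce` turns this kind of silent mismatch into a compile error.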

  • Original article: https://www.cnblogs.com/z-bear/p/9846089.html