• MapReduce Example: Finding the Missing Playing Cards


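    Problem:

    Solution:

    The job has two phases. The Map phase discards cards numbered 10 or lower and keys the remaining cards (those above 10) by suit. The Reduce phase counts the key-value pairs it receives from Map for each suit and outputs the suits that have fewer than 3 such cards.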
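    The two phases can be tried out locally before involving Hadoop at all. The sketch below is plain Java with no Hadoop dependency. It assumes each input line has the form suit-number (as the split("-") in the mapper suggests); the class name PokerLocalSketch, the method name suitsBelowThree, and the sample suit letters are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PokerLocalSketch {

    // Returns the suits that have fewer than 3 cards numbered above 10.
    // Like the MapReduce job, it only sees suits with at least one such card.
    static List<String> suitsBelowThree(List<String> lines) {
        // "Map" phase: keep only cards above 10, keyed by suit.
        Map<String, Integer> countBySuit = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split("-");
            if (parts.length != 2) {
                continue; // skip lines that are not "suit-number"
            }
            int number = Integer.parseInt(parts[1].trim());
            if (number > 10) {
                countBySuit.merge(parts[0], 1, Integer::sum);
            }
        }
        // "Reduce" phase: report suits with fewer than 3 such cards.
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : countBySuit.entrySet()) {
            if (e.getValue() < 3) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample: suit "a" is complete, "b" and "c" are not.
        List<String> lines = List.of(
                "a-11", "a-12", "a-13", "b-11", "b-5", "c-13", "c-12", "d-1");
        System.out.println(suitsBelowThree(lines)); // prints [b, c]
    }
}
```

    Suit d does not appear in the result because none of its cards pass the filter, which mirrors how the real job behaves.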

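    1. Code

    1) Map code

    // Body of map(): each input line is expected to look like "suit-number".
    String line = value.toString();
    String[] strs = line.split("-");
    if (strs.length == 2) {
        int number = Integer.parseInt(strs[1].trim());
        // Keep only cards above 10, keyed by suit.
        if (number > 10) {
            context.write(new Text(strs[0]), value);
        }
    }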
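    Parsing the number throws NumberFormatException when the text after the dash is not a plain integer (a blank field, or a letter such as J), and an exception that escapes map() fails the task attempt. One way to guard against that is to isolate the parsing in a small helper, sketched here in plain Java so it can be tried without Hadoop. The names CardLineParser and suitIfAboveTen are hypothetical, and silently skipping bad lines is an assumption about the desired behaviour.

```java
import java.util.Optional;

public class CardLineParser {

    // Returns the suit if the line is "suit-number" with number > 10,
    // otherwise an empty Optional (malformed lines are skipped, not fatal).
    static Optional<String> suitIfAboveTen(String line) {
        String[] parts = line.split("-");
        if (parts.length != 2) {
            return Optional.empty();
        }
        try {
            int number = Integer.parseInt(parts[1].trim());
            return number > 10 ? Optional.of(parts[0].trim()) : Optional.empty();
        } catch (NumberFormatException e) {
            return Optional.empty(); // e.g. "a-J" or "a- "
        }
    }

    public static void main(String[] args) {
        System.out.println(suitIfAboveTen("a-12").isPresent()); // true
        System.out.println(suitIfAboveTen("a-7").isPresent());  // false
        System.out.println(suitIfAboveTen("a-J").isPresent());  // false
    }
}
```

    In the mapper, context.write would then be called only when the returned Optional is present.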

    2) Reduce code

    // Body of reduce(): count how many cards above 10 arrived for this suit.
    int count = 0;
    for (Text ignored : values) {
        count++;
    }
    // Fewer than 3 means at least one card of this suit is missing.
    if (count < 3) {
        context.write(key, NullWritable.get());
    }
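    One consequence of counting in the reducer: reduce() is only called for keys the mappers actually emitted, so a suit with no card above 10 produces no group and is never reported, even though it is missing all three. If that case matters, the count has to start from a known list of suits. The plain-Java sketch below illustrates the idea; MissingSuitSketch, incompleteSuits, and the four letters in ALL_SUITS are placeholders, since the original does not show how suits are written in the input.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MissingSuitSketch {

    // Hypothetical suit names; replace with whatever the input file uses.
    static final List<String> ALL_SUITS = List.of("a", "b", "c", "d");

    // emittedSuits holds one entry per card above 10 (the mapper's output keys).
    static List<String> incompleteSuits(List<String> emittedSuits) {
        // Start every known suit at zero so absent suits are still counted.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String suit : ALL_SUITS) {
            counts.put(suit, 0);
        }
        for (String suit : emittedSuits) {
            counts.merge(suit, 1, Integer::sum);
        }
        List<String> incomplete = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() < 3) {
                incomplete.add(e.getKey());
            }
        }
        return incomplete;
    }

    public static void main(String[] args) {
        // Suit "d" has no cards above 10 at all but is still reported.
        System.out.println(incompleteSuits(List.of("a", "a", "a", "b", "c", "c"))); // prints [b, c, d]
    }
}
```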

    3) Runner code

    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf);
    job.setJobName("poker mr");
    job.setJarByClass(pokerRunner.class);

    job.setMapperClass(pakerMapper.class);
    job.setReducerClass(pakerRedue.class);

    // The mapper emits <Text suit, Text line>.
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);

    // The reducer emits <Text suit, NullWritable>.
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(NullWritable.class);

    // args[0] is the input path, args[1] is the output path.
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    // Exit with a non-zero status if the job fails.
    System.exit(job.waitForCompletion(true) ? 0 : 1);

    2. Run results

    File System Counters
        FILE: Number of bytes read=87
        FILE: Number of bytes written=211167
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=366
        HDFS: Number of bytes written=6
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=109577
        Total time spent by all reduces in occupied slots (ms)=42668
        Total time spent by all map tasks (ms)=109577
        Total time spent by all reduce tasks (ms)=42668
        Total vcore-seconds taken by all map tasks=109577
        Total vcore-seconds taken by all reduce tasks=42668
        Total megabyte-seconds taken by all map tasks=112206848
        Total megabyte-seconds taken by all reduce tasks=43692032
    Map-Reduce Framework
        Map input records=49
        Map output records=9
        Map output bytes=63
        Map output materialized bytes=87
        Input split bytes=110
        Combine input records=0
        Combine output records=0
        Reduce input groups=4
        Reduce shuffle bytes=87
        Reduce input records=9
        Reduce output records=3
        Spilled Records=18
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=992
        CPU time spent (ms)=3150
        Physical memory (bytes) snapshot=210063360
        Virtual memory (bytes) snapshot=652480512
        Total committed heap usage (bytes)=129871872
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=256
    File Output Format Counters
        Bytes Written=6

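    3. How to run

    Build the project in Eclipse and export it as a jar, upload the jar to a Linux machine, and run it on the cluster. The runner reads the input path from args[0] and the output path from args[1].

    Command: bin/hadoop jar **.jar <fully qualified main class> <input path> <output path>

    Example: bin/hadoop jar **.jar com.test.mr /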

  • Original article: https://www.cnblogs.com/langgj/p/6612566.html