• An Analysis of OutputFormat in Hadoop


    I. OutputFormat

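    OutputFormat describes the output specification of a MapReduce job. It has two main responsibilities:

      1. Validate the job's output specification, for example by checking whether the output directory already exists.

      2. Provide the RecordWriter implementation that writes the job's output to files in the file system.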
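    The first responsibility can be illustrated without Hadoop at all. The sketch below is not Hadoop code: the class OutputSpecCheckDemo and the helper checkOutputDir are hypothetical names, and java.nio stands in for Hadoop's file system layer. It only shows the idea of refusing to run when the output directory is already there.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative stand-in, NOT Hadoop code: mimics the "fail if the output
// directory already exists" check using only java.nio.
class OutputSpecCheckDemo {

    // Hypothetical helper: throws IOException if outputDir already exists,
    // so that earlier output cannot be overwritten by accident.
    static void checkOutputDir(Path outputDir) throws IOException {
        if (Files.exists(outputDir)) {
            throw new IOException("Output directory " + outputDir + " already exists");
        }
    }

    public static void main(String[] args) throws IOException {
        Path existing = Files.createTempDirectory("job-output");
        Path fresh = existing.resolve("not-created-yet");

        // Passes silently: nothing exists at this path yet.
        checkOutputDir(fresh);

        // Rejected: the directory was created above.
        try {
            checkOutputDir(existing);
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }

        Files.delete(existing); // clean up the temporary directory
    }
}
```

    Failing fast at submission time is the point of the check: the job is rejected before any task runs, rather than after it has partly overwritten earlier results.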

    OutputFormat consists mainly of three abstract methods. The annotated source code below explains what each one does:

    package org.apache.hadoop.mapreduce;

    import java.io.IOException;

    public abstract class OutputFormat<K, V> {

      /**
       * Get the {@link RecordWriter} for the given task.
       * Note: the returned RecordWriter is the object that writes the task's
       * output key-value pairs; the method does not return the pairs themselves.
       * @param context the information about the current task.
       * @return a {@link RecordWriter} to write the output for the job.
       * @throws IOException
       */
      public abstract RecordWriter<K, V> getRecordWriter(TaskAttemptContext context)
              throws IOException, InterruptedException;

      /**
       * Check for validity of the output-specification for the job.
       * <p>This is to validate the output specification for the job when it is
       * a job is submitted.  Typically checks that it does not already exist,
       * throwing an exception when it already exists, so that output is not
       * overwritten.</p>
       * Note: validation runs when the job is submitted. In practice it checks
       * whether the output directory already exists and throws an exception if
       * it does, so that the earlier output is not overwritten.
       * @param context information about the job
       * @throws IOException when output should not be attempted
       */
      public abstract void checkOutputSpecs(JobContext context) throws IOException, InterruptedException;

      /**
       * Get the output committer for this output format. This is responsible
       * for ensuring the output is committed correctly.
       * Note: returns an OutputCommitter object, which makes sure the output
       * is committed correctly.
       * @param context the task context
       * @return an output committer
       * @throws IOException
       * @throws InterruptedException
       */
      public abstract OutputCommitter getOutputCommitter(TaskAttemptContext context)
              throws IOException, InterruptedException;
    }
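    To show how these methods cooperate, here is a minimal sketch of the call order: validate the output location first, then obtain a writer and emit key-value pairs. It deliberately avoids the Hadoop classes so that it runs on its own. SketchOutputFormat, SketchRecordWriter, TabSeparatedOutputFormat, the taskId parameter, and the part-00000 file name are all hypothetical stand-ins rather than Hadoop's API, and the committer step is left out.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Stand-in types only (NOT the Hadoop API): they mirror the shape of
// checkOutputSpecs and getRecordWriter with plain java.nio paths.
class OutputFormatSketchDemo {

    // Hypothetical counterpart of RecordWriter: writes one key-value pair per call.
    interface SketchRecordWriter<K, V> extends AutoCloseable {
        void write(K key, V value) throws IOException;

        @Override
        void close() throws IOException;
    }

    // Hypothetical counterpart of OutputFormat, without the committer.
    abstract static class SketchOutputFormat<K, V> {
        abstract void checkOutputSpecs(Path outputDir) throws IOException;

        abstract SketchRecordWriter<K, V> getRecordWriter(Path outputDir, int taskId)
                throws IOException;
    }

    // Writes "key<TAB>value" lines into one file per task.
    static class TabSeparatedOutputFormat<K, V> extends SketchOutputFormat<K, V> {
        @Override
        void checkOutputSpecs(Path outputDir) throws IOException {
            if (Files.exists(outputDir)) {
                throw new IOException("Output directory " + outputDir + " already exists");
            }
        }

        @Override
        SketchRecordWriter<K, V> getRecordWriter(Path outputDir, int taskId) throws IOException {
            Files.createDirectories(outputDir);
            // File naming is an assumption made for this sketch.
            Path file = outputDir.resolve(String.format("part-%05d", taskId));
            BufferedWriter out = Files.newBufferedWriter(file);
            return new SketchRecordWriter<K, V>() {
                @Override
                public void write(K key, V value) throws IOException {
                    out.write(key + "\t" + value);
                    out.newLine();
                }

                @Override
                public void close() throws IOException {
                    out.close();
                }
            };
        }
    }

    // Runs the two steps in order and returns the file that was written.
    static Path runJob(Path outputDir) throws IOException {
        SketchOutputFormat<String, Integer> format = new TabSeparatedOutputFormat<>();
        format.checkOutputSpecs(outputDir); // step 1: validate before any work
        try (SketchRecordWriter<String, Integer> writer = format.getRecordWriter(outputDir, 0)) {
            writer.write("hadoop", 2);      // step 2: write key-value pairs
            writer.write("mapreduce", 1);
        }
        return outputDir.resolve("part-00000");
    }

    public static void main(String[] args) throws IOException {
        Path outputDir = Files.createTempDirectory("sketch").resolve("out");
        Path part = runJob(outputDir);
        for (String line : Files.readAllLines(part)) {
            System.out.println(line);
        }
    }
}
```

    Calling runJob a second time with the same directory fails in checkOutputSpecs, which matches the overwrite protection described in the Javadoc above. In real Hadoop code the equivalent step would be to subclass OutputFormat (or an existing implementation) and override the three abstract methods.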
  • Original article: https://www.cnblogs.com/rolly-yan/p/3704060.html