• 6.3 Writing unit tests for the Mapper and Reducer with MRUnit


    1.1  Writing unit tests with MRUnit

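     Purpose: once a MapReduce project has been submitted to the cluster, problems are hard to locate and fix, and the only option is to sift through printed logs. When the data set and the project are large, making changes is even more troublesome. So before submitting a MapReduce project to the cluster, we should unit-test it first. Unit testing uses the mrunit library, which contains MapDriver, ReduceDriver and MapReduceDriver. With these three classes you can feed in small inputs and check whether the map and reduce logic is correct.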

    1.1.1         Mapper unit test

    (1) Add the mrunit test-driver library

    Add the mrunit dependency to pom.xml; once the file is saved, the mrunit library is downloaded automatically.

    <dependency>
        <groupId>org.apache.mrunit</groupId>
        <artifactId>mrunit</artifactId>
        <version>1.1.0</version>
        <!--<scope>test</scope>-->
        <!-- Without the classifier below, the import may fail -->
        <classifier>hadoop2</classifier>
    </dependency>

    (2)TemperatureMapper

    package Temperature;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    import java.io.IOException;

    // Old-API (org.apache.hadoop.mapred) mapper: emits (year, air temperature) for each valid record
    public class TemperatureMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {

        // Value that marks a missing temperature reading
        private static final int MISSING = 9999;

        @Override
        public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            String line = value.toString();
            String year = line.substring(15, 19);
            int airTemperature;
            // Skip a leading '+' so that only the digits (or a leading '-') are parsed
            if (line.charAt(87) == '+') {
                airTemperature = Integer.parseInt(line.substring(88, 92));
            } else {
                airTemperature = Integer.parseInt(line.substring(87, 92));
            }
            String quality = line.substring(92, 93);
            // Emit only readings that are present and carry an accepted quality code
            if (airTemperature != MISSING && quality.matches("[01459]")) {
                output.collect(new Text(year), new IntWritable(airTemperature));
            }
        }
    }
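
    The fixed column offsets are the part of the mapper that is easiest to get wrong, so here is a minimal plain-Java sketch of the same field extraction with no Hadoop types. The class name RecordFields and its method names are made up for illustration and are not part of the original project; the offsets and the quality rule are copied from the mapper above.

```java
// Illustrative helper (hypothetical name), mirroring the offsets used in TemperatureMapper.
final class RecordFields {
    static final int MISSING = 9999;

    // Characters 15..18 hold the four-digit year.
    static String year(String line) {
        return line.substring(15, 19);
    }

    // Characters 87..91 hold a signed temperature such as "-0128" or "+0022".
    static int airTemperature(String line) {
        int start = line.charAt(87) == '+' ? 88 : 87; // skip a leading '+'
        return Integer.parseInt(line.substring(start, 92));
    }

    // Character 92 holds the one-character quality code.
    static String quality(String line) {
        return line.substring(92, 93);
    }

    // Same acceptance rule as the mapper: reading present and quality code accepted.
    static boolean isValid(String line) {
        return airTemperature(line) != MISSING && quality(line).matches("[01459]");
    }
}
```

    Applied to the sample record used in the mapper test, year returns "1950", airTemperature returns -128 and quality returns "1", which is why the test expects the pair (1950, -128).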

    (3) Mapper test class

    package Temperature;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mrunit.MapDriver;
    import org.junit.Test;

    import java.io.IOException;

    public class TemperatureMapperTest {

        @Test // marks this method as a test method
        public void testMapper() throws IOException, InterruptedException {
            // One line of test data
            Text value = new Text("0057332130999991950010103004+51317+028783FM-12+017199999V0203201N00721004501CN0100001N9-01281-01391102681");
            new MapDriver<LongWritable, Text, Text, IntWritable>()
                    .withMapper(new TemperatureMapper()) // the mapper under test
                    .withInput(new LongWritable(0), value) // input key and value
                    .withOutput(new Text("1950"), new IntWritable(-128)) // expected output; the test fails if the actual output differs
                    .runTest(); // run the test
        }
    }

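    (4) Run the test

    Right-click TemperatureMapperTest.java and choose Run 'TemperatureMapperTest'. If there is no Run option, click the folder and then click the Create run configuration button to create a run configuration for the test. Right-clicking TemperatureMapperTest.java again will then show the Run option.

    Clicking Run executes the test program; on success it shows tests passed.

    If you change -128 to -118 and run the test again, it shows test failed: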

     

    java.lang.AssertionError: 1 Error(s): (Missing expected output (1950, -118) at position 0, got (1950, -128).)
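
    To make the pass/fail behaviour concrete, the following is a miniature driver in plain Java that follows the same withInput / withOutput / runTest pattern. It is a hypothetical stand-in written for this article, not MRUnit's implementation: the class MiniMapDriver, its String-based records and its error message are all assumptions, and the real MapDriver works with Hadoop Writable types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical miniature driver: NOT MRUnit's code, only the same idea in plain Java.
final class MiniMapDriver {
    private final BiConsumer<String, List<String>> mapper; // (input line, output sink)
    private final List<String> expected = new ArrayList<>();
    private String input;

    MiniMapDriver(BiConsumer<String, List<String>> mapper) {
        this.mapper = mapper;
    }

    MiniMapDriver withInput(String line) {
        this.input = line;
        return this;
    }

    MiniMapDriver withOutput(String key, int value) {
        expected.add("(" + key + ", " + value + ")");
        return this;
    }

    // Runs the mapper once and compares what it emitted with what was expected.
    void runTest() {
        List<String> actual = new ArrayList<>();
        mapper.accept(input, actual);
        if (!actual.equals(expected)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }
}
```

    With a mapper that emits (year, temperature) for the sample record, withOutput("1950", -128) passes silently, while withOutput("1950", -118) makes runTest throw an AssertionError, which is the same kind of failure the IDE reported above.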

    (5) Old and new Mapper APIs

    The Mapper API (old or new) and the driver import used in the test class must match, otherwise errors occur.

    Old mapper

    import org.apache.hadoop.mapred.Mapper;
    public class TemperatureMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {

    Old test driver

    import org.apache.hadoop.mrunit.MapDriver;

    New mapper

    import org.apache.hadoop.mapreduce.Mapper;
    public class TemperatureMapperNew extends Mapper<LongWritable, Text, Text, IntWritable> {

    New test driver

    import org.apache.hadoop.mrunit.mapreduce.MapDriver;

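    (6) What @Test does

    @Test marks a method as a test method, so it can be run and report its result without being called from a main method; an ordinary method runs only when something reachable from main calls it. Note that the test method itself must be declared public.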
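
    How a runner can execute methods without a main method in the test class can be sketched with reflection. Everything below is a hypothetical illustration: @MyTest stands in for org.junit.Test so that the sketch needs no JUnit on the classpath, and MiniRunner and SampleTests are made-up names. Real JUnit 4 is stricter than this sketch, since it reports an error for a non-public @Test method instead of silently skipping it.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for org.junit.Test; RUNTIME retention makes it visible to reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface MyTest {}

// Hypothetical test class: note that it has no main method.
class SampleTests {
    @MyTest public void isFound() {}
    @MyTest void isSkippedBecauseNotPublic() {}
    public void isSkippedBecauseNotAnnotated() {}
}

final class MiniRunner {
    // Creates an instance and invokes every public method that carries @MyTest.
    static List<String> run(Class<?> testClass) {
        List<String> executed = new ArrayList<>();
        try {
            Object instance = testClass.getDeclaredConstructor().newInstance();
            for (Method m : testClass.getMethods()) { // getMethods() returns public methods only
                if (m.isAnnotationPresent(MyTest.class)) {
                    m.invoke(instance);
                    executed.add(m.getName());
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not run " + testClass.getName(), e);
        }
        return executed;
    }
}
```

    Calling MiniRunner.run(SampleTests.class) executes only isFound: the runner supplies the entry point, and a method that is not public is never seen by getMethods().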

    1.1.2         Reducer unit test

    The Reducer test also depends on the mrunit library.

    (1) Reducer class

    package Temperature;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;

    import java.io.IOException;
    import java.util.Iterator;

    // Old-API reducer: emits the maximum temperature seen for each year
    public class MaxTempertureReduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            int maxValue = Integer.MIN_VALUE;
            // Keep the largest value among all temperatures for this key
            while (values.hasNext()) {
                maxValue = Math.max(maxValue, values.next().get());
            }
            output.collect(key, new IntWritable(maxValue));
        }
    }
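
    The reducer's loop is an ordinary "running maximum" over an iterator. As a minimal sketch, here is the same loop in plain Java with Integer standing in for IntWritable; the class name MaxFold is hypothetical and not part of the original project.

```java
import java.util.Iterator;

// Hypothetical plain-Java version of the reducer's loop.
final class MaxFold {
    static int max(Iterator<Integer> values) {
        int maxValue = Integer.MIN_VALUE; // starting point; also the result for empty input
        while (values.hasNext()) {
            maxValue = Math.max(maxValue, values.next());
        }
        return maxValue;
    }
}
```

    Starting from Integer.MIN_VALUE matters because the temperatures can be negative: starting from 0 would give a wrong maximum for a year whose readings are all below zero.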

    (2) Reducer test class

    package Temperature;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mrunit.ReduceDriver;
    import org.junit.Test;

    import java.io.IOException;
    import java.util.Arrays;

    public class MaxtemperatureReduceTest {

        @Test
        public void reduceTest() throws IOException {
            new ReduceDriver<Text, IntWritable, Text, IntWritable>()
                    .withReducer(new MaxTempertureReduce()) // the reducer under test
                    .withInput(new Text("1950"), Arrays.asList(new IntWritable(10), new IntWritable(5))) // one key with its list of values
                    .withOutput(new Text("1950"), new IntWritable(10)) // expected output: the maximum of the values
                    .runTest();
        }
    }

  • Original article: https://www.cnblogs.com/bclshuai/p/11905653.html