6.3 Writing unit tests for Mapper and Reducer with MRUnit

1.1 Writing unit tests with MRUnit

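Purpose: once a MapReduce project has been submitted to a cluster, problems are hard to locate and fix; often the only option is to print logs and sift through them. The larger the data and the project, the more painful each fix becomes. So before submitting a MapReduce project to the cluster, unit-test it first. The tests use the mrunit library, which provides MapDriver, ReduceDriver and MapReduceDriver. With these three classes you can feed in a small amount of data and check whether the map and reduce logic is correct.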

1.1.1 Mapper unit test

(1) Add the mrunit test driver library

Add the mrunit dependency to pom.xml. When the file is saved, the mrunit library is downloaded automatically.

<dependency>
    <groupId>org.apache.mrunit</groupId>
    <artifactId>mrunit</artifactId>
    <version>1.1.0</version>
    <!--<scope>test</scope>-->
    <!-- without the classifier below, the import may fail -->
    <classifier>hadoop2</classifier>
</dependency>

(2) The TemperatureMapper class

package Temperature;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

import java.io.IOException;

// Old-API mapper (org.apache.hadoop.mapred): emits (year, air temperature) for each valid record.
public class TemperatureMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {

    private static final int MISSING = 9999; // value that marks a missing temperature reading

    @Override
    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        String line = value.toString();
        String year = line.substring(15, 19);
        int airTemperature;
        if (line.charAt(87) == '+') {
            // skip the leading plus sign before parsing
            airTemperature = Integer.parseInt(line.substring(88, 92));
        } else {
            airTemperature = Integer.parseInt(line.substring(87, 92));
        }
        String quality = line.substring(92, 93);
        // emit only readings that are present and carry an accepted quality code
        if (airTemperature != MISSING && quality.matches("[01459]")) {
            output.collect(new Text(year), new IntWritable(airTemperature));
        }
    }
}
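The fixed offsets used by the mapper are easier to check in isolation. The following is a minimal plain-Java sketch, with no Hadoop dependency, that applies the same substring logic to the sample record used by the mapper test in this section. The class name RecordParsingSketch and its method names are hypothetical and exist only for this illustration.

```java
// Plain-Java sketch of the fixed-offset parsing done in TemperatureMapper.map().
// RecordParsingSketch and its methods are illustrative names, not part of Hadoop or MRUnit.
class RecordParsingSketch {
    static final int MISSING = 9999;

    // The sample record from the mapper test, split in two only to keep lines short.
    static final String SAMPLE = "0057332130999991950010103004+51317+028783FM-12+017199999"
            + "V0203201N00721004501CN0100001N9-01281-01391102681";

    // Characters 15..18 hold the four-digit year.
    static String parseYear(String line) {
        return line.substring(15, 19);
    }

    // Characters 87..91 hold the signed temperature; a leading '+' is skipped before parsing.
    static int parseTemperature(String line) {
        int start = (line.charAt(87) == '+') ? 88 : 87;
        return Integer.parseInt(line.substring(start, 92));
    }

    // Character 92 is the quality code; the mapper accepts 0, 1, 4, 5 and 9.
    static boolean isValid(String line) {
        String quality = line.substring(92, 93);
        return parseTemperature(line) != MISSING && quality.matches("[01459]");
    }

    public static void main(String[] args) {
        // prints: 1950 -128 valid=true
        System.out.println(parseYear(SAMPLE) + " " + parseTemperature(SAMPLE)
                + " valid=" + isValid(SAMPLE));
    }
}
```

The year 1950 and the temperature -128 are the pair that the mapper test expects as output for this record.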

(3) Mapper test class

package Temperature;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.MapDriver;
import org.junit.Test;

import java.io.IOException;

public class TemperatureMapperTest {

    @Test // marks this method as a test method
    public void testMapper() throws IOException, InterruptedException {
        // one line of test data
        Text value = new Text("0057332130999991950010103004+51317+028783FM-12+017199999V0203201N00721004501CN0100001N9-01281-01391102681");
        new MapDriver<LongWritable, Text, Text, IntWritable>()
                .withMapper(new TemperatureMapper())                   // the mapper under test
                .withInput(new LongWritable(0), value)                 // input key and value
                .withOutput(new Text("1950"), new IntWritable(-128))   // expected output; the test fails if the actual output differs
                .runTest();                                            // run the test
    }
}

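(4) Run the test

Right-click TemperatureMapperTest.java and choose Run 'TemperatureMapperTest'. If there is no Run option, click the folder, click the Create run configuration button, and create a run configuration for the test. Right-click TemperatureMapperTest.java again and the Run option appears.

Clicking Run executes the test. On success it shows tests passed.

If you change -128 to -118 and run the test again, it shows test failed: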

java.lang.AssertionError: 1 Error(s): (Missing expected output (1950, -118) at position 0, got (1950, -128).)

(5) Old and new Mapper APIs

The Mapper API and the MapDriver imported by the test must match: an old-API mapper goes with the old-API driver, and a new-API mapper goes with the new-API driver. Mixing them causes errors.

Old-API mapper

import org.apache.hadoop.mapred.Mapper;

public class TemperatureMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {

Old-API test driver

import org.apache.hadoop.mrunit.MapDriver;

New-API mapper

import org.apache.hadoop.mapreduce.Mapper;

public class TemperatureMapperNew extends Mapper<LongWritable, Text, Text, IntWritable> {

New-API test driver

import org.apache.hadoop.mrunit.mapreduce.MapDriver;

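(6) What @Test does

@Test marks a method as a test method, so the test runner can execute it and report the result without the method being called from main. An ordinary method only runs when something reachable from main calls it. Note that in JUnit 4 the method annotated with @Test must be declared public.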
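To see how an annotated method can run without being called from application code, here is a minimal sketch of annotation-driven discovery through reflection. It is not JUnit's actual implementation. SketchTest is a hypothetical stand-in for org.junit.Test that keeps the example dependency-free, and TinyRunner and SampleTests are likewise made-up names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical miniature runner: finds annotated public methods and invokes them.
class TinyRunner {
    // Returns the number of annotated methods that ran without throwing.
    static int runAll(Class<?> testClass) {
        int executed = 0;
        try {
            Object instance = testClass.getDeclaredConstructor().newInstance();
            // getMethods() returns public methods only, so a non-public method is never found.
            for (Method m : testClass.getMethods()) {
                if (m.isAnnotationPresent(SketchTest.class)) {
                    m.invoke(instance);
                    executed++;
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("a test failed or could not be invoked", e);
        }
        return executed;
    }

    public static void main(String[] args) {
        // This main belongs to the runner; the test class itself has no main.
        System.out.println("tests run: " + runAll(SampleTests.class)); // prints: tests run: 1
    }
}

// Hypothetical stand-in for org.junit.Test.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface SketchTest {}

class SampleTests {
    @SketchTest
    public void addition() {
        if (1 + 1 != 2) throw new AssertionError("1 + 1 should be 2");
    }

    // Not annotated, so the runner never calls it.
    public void helper() {}
}
```

Because this sketch discovers methods with getMethods(), a method that is not public would be skipped, which mirrors the rule that the test method must be public.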

1.1.2 Reducer unit test

The Reducer test also depends on the mrunit library.

(1) Reducer class

package Temperature;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

import java.io.IOException;
import java.util.Iterator;

// Old-API reducer: emits the maximum temperature seen for each key (year).
public class MaxTempertureReduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        int maxValue = Integer.MIN_VALUE;
        // keep the largest value among all values for this key
        while (values.hasNext()) {
            maxValue = Math.max(maxValue, values.next().get());
        }
        output.collect(key, new IntWritable(maxValue));
    }
}
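The reducer's loop is a plain maximum fold over an iterator. As a minimal sketch with no Hadoop dependency, the same fold can be run on ordinary integers. MaxFoldSketch is a hypothetical name, and the input values 10 and 5 are the ones used in the reducer test in this section.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;

// Plain-Java sketch of the fold inside MaxTempertureReduce.reduce(); the class name is illustrative.
class MaxFoldSketch {
    // Start from Integer.MIN_VALUE and keep the larger of the running maximum and each value.
    static int max(Iterator<Integer> values) {
        int maxValue = Integer.MIN_VALUE;
        while (values.hasNext()) {
            maxValue = Math.max(maxValue, values.next());
        }
        return maxValue;
    }

    public static void main(String[] args) {
        System.out.println(max(Arrays.asList(10, 5).iterator()));      // prints 10
        // With no values at all, the starting value is returned unchanged.
        System.out.println(max(Collections.<Integer>emptyIterator())); // prints -2147483648
    }
}
```

Starting from Integer.MIN_VALUE means negative temperatures are handled correctly, since any real reading is at least as large as the starting value.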

(2) Reducer test class

package Temperature;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.ReduceDriver;
import org.junit.Test;

import java.io.IOException;
import java.util.Arrays;

public class MaxtemperatureReduceTest {

    @Test
    public void reduceTest() throws IOException {
        new ReduceDriver<Text, IntWritable, Text, IntWritable>()
                .withReducer(new MaxTempertureReduce())                     // the reducer under test
                .withInput(new Text("1950"),
                        Arrays.asList(new IntWritable(10), new IntWritable(5))) // one key with its list of values
                .withOutput(new Text("1950"), new IntWritable(10))          // expected output: the maximum value
                .runTest();
    }
}
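The introduction also names MapReduceDriver, which is not demonstrated in this section. What a combined map and reduce test checks can be sketched in plain Java: mapper output pairs are grouped by key, then the reducer's maximum fold runs once per key. This is only an illustration of the data flow, not MRUnit code. ShuffleAndReduceSketch and its methods are hypothetical names, and the 1949 values are made up for the example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the grouping and reduce steps that sit between mapper output and final output.
class ShuffleAndReduceSketch {
    // Convenience factory for a (year, temperature) pair.
    static Map.Entry<String, Integer> pair(String year, int temperature) {
        return new SimpleEntry<>(year, temperature);
    }

    // Grouping step: collect all values that share a key, with keys kept in sorted order.
    static Map<String, List<Integer>> groupByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Reduce step: the same maximum fold as the reducer, applied once per key.
    static Map<String, Integer> maxPerKey(Map<String, List<Integer>> grouped) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            int maxValue = Integer.MIN_VALUE;
            for (int v : group.getValue()) {
                maxValue = Math.max(maxValue, v);
            }
            result.put(group.getKey(), maxValue);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapperOutput = Arrays.asList(
                pair("1950", 10),
                pair("1949", 7),    // made-up value
                pair("1950", 5),
                pair("1949", 12));  // made-up value
        System.out.println(maxPerKey(groupByKey(mapperOutput))); // prints {1949=12, 1950=10}
    }
}
```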

Original article: https://www.cnblogs.com/bclshuai/p/11905653.html