MRUnit Usage Tips

Source: 动视网 | Editor: 小采 | Date: 2020-11-09 13:17:07

Introduction

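There are generally three ways to test the Hadoop components and MapReduce programs you write:

1. Use the hadoop-eclipse plugin to debug MapReduce programs. Newer Hadoop releases no longer ship this plugin, however.

2. Configure JVM parameters to debug Hadoop components remotely. This works well for reading through the Hadoop source code, but it is somewhat cumbersome for remotely debugging MapReduce jobs.

For details, see: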

http://blog.javachen.com/hadoop/2013/08/01/remote-debug-hadoop/

http://zhangjie.me/eclipse-debug-hadoop/

3. In the end I chose MRUnit as my main tool for developing and debugging MapReduce applications.

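About MRUnit

MRUnit is a Java library for unit-testing MapReduce code. It is published by Apache and can be downloaded from: http://mrunit.apache.org/general/downloads.html

The MRUnit testing framework is built on JUnit, which makes it convenient to test MapReduce programs. It works with Hadoop versions 0.20, 0.23.x, 1.0.x, and 2.x.

The rest of this article walks through the official MRUnit example, SMS CDR (call detail record) analysis:

The sample records look like this: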

CDRID;CDRType;Phone1;Phone2;SMS Status Code
655209;1;7967372490213;8044229381158;6
353415;0;356857119806206;287572231184798;4
835699;1;252280313968413;8717902341635;0

The task is to find every record whose CDRType is 1, together with its status code (SMS Status Code).
For the sample records above, the Map output should be:
6, 1
0, 1
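
As a quick sanity check on the expected output above, the filtering rule can be sketched in plain Java with no Hadoop types. The class name `CdrMapSketch` and method `mapRecords` are hypothetical names chosen for this illustration, and skipping the header row is an assumption about how the sample file would be fed in. It is not part of the official example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only: mirrors the mapper's rule
// (emit "status, 1" when CDRType == 1) using plain strings.
class CdrMapSketch {
    static List<String> mapRecords(List<String> records) {
        List<String> out = new ArrayList<>();
        for (String record : records) {
            String[] fields = record.split(";");
            // Assumption: skip the header row and anything that is not a 5-field record.
            if (fields.length != 5 || fields[0].equals("CDRID")) {
                continue;
            }
            // CDRType is the second field; the status code is the fifth.
            if (Integer.parseInt(fields[1]) == 1) {
                out.add(fields[4] + ", 1");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "CDRID;CDRType;Phone1;Phone2;SMS Status Code",
            "655209;1;7967372490213;8044229381158;6",
            "353415;0;356857119806206;287572231184798;4",
            "835699;1;252280313968413;8717902341635;0");
        for (String line : mapRecords(sample)) {
            System.out.println(line); // prints "6, 1" then "0, 1"
        }
    }
}
```

The second sample record is dropped because its CDRType is 0, which is why only two pairs appear.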

The mapper code is as follows:

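import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SMSCDRMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    // Reused across calls to avoid allocating a new Text per record
    private Text status = new Text();
    private final static IntWritable addOne = new IntWritable(1);

    /**
     * Returns the SMS status code and its count
     */
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws java.io.IOException, InterruptedException {
        // 655209;1;7967372490213;8044229381158;6 is the sample record format
        String[] line = value.toString().split(";");
        // If record is of SMS CDR (CDRType == 1), emit (status code, 1)
        if (Integer.parseInt(line[1]) == 1) {
            status.set(line[4]);
            context.write(status, addOne);
        }
    }
}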

The reducer sums the counts for each status code:

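import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SMSCDRReducer extends
        Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws java.io.IOException, InterruptedException {
        // Add up every count emitted for this status code
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        context.write(key, new IntWritable(sum));
    }
}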
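
To make the grouping explicit, here is a minimal plain-Java sketch of what the shuffle and reduce steps compute together: pairs are grouped by status code and their counts summed. The class `CdrReduceSketch` and its `reduce` method are hypothetical names for this illustration only; in a real job Hadoop does the grouping, and the reducer sees one key at a time.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for shuffle + reduce, for illustration only.
class CdrReduceSketch {
    // Each pair is {statusCode, count}; the result maps status code -> total.
    static Map<String, Integer> reduce(List<String[]> pairs) {
        // TreeMap keeps keys sorted, loosely mimicking the sorted shuffle output.
        Map<String, Integer> totals = new TreeMap<>();
        for (String[] pair : pairs) {
            totals.merge(pair[0], Integer.parseInt(pair[1]), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String[]> pairs = Arrays.asList(
            new String[] {"6", "1"},
            new String[] {"0", "1"},
            new String[] {"6", "1"});
        System.out.println(reduce(pairs)); // prints {0=1, 6=2}
    }
}
```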

The MRUnit test program is as follows:

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;
import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;
import org.junit.Before;
import org.junit.Test;

public class SMSCDRMapperReducerTest {
    // Type parameters: input key/value, then output key/value
    MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;
    ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver;
    // Mapper input, intermediate, and reducer output key/value types
    MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable> mapReduceDriver;

    @Before
    public void setUp() {
        SMSCDRMapper mapper = new SMSCDRMapper();
        SMSCDRReducer reducer = new SMSCDRReducer();
        mapDriver = MapDriver.newMapDriver(mapper);
        reduceDriver = ReduceDriver.newReduceDriver(reducer);
        mapReduceDriver = MapReduceDriver.newMapReduceDriver(mapper, reducer);
    }

    @Test
    public void testMapper() {
        // A record with CDRType 1 should yield (status code, 1)
        mapDriver.withInput(new LongWritable(), new Text(
                "655209;1;7967372490213;8044229381158;6"));
        mapDriver.withOutput(new Text("6"), new IntWritable(1));
        mapDriver.runTest();
    }

    @Test
    public void testReducer() {
        // Two counts of 1 for status "6" should sum to 2
        List<IntWritable> values = new ArrayList<IntWritable>();
        values.add(new IntWritable(1));
        values.add(new IntWritable(1));
        reduceDriver.withInput(new Text("6"), values);
        reduceDriver.withOutput(new Text("6"), new IntWritable(2));
        reduceDriver.runTest();
    }
}

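Anyone who has used JUnit will know how to run the code above, so the steps are not repeated here.

MRUnit can test a single Map, a single Reduce, one complete MapReduce job, or a chain of several MapReduce jobs.
For details, see the official documentation: MRUnit Tutorial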

Reference: http://www.cnblogs.com/gpcuster/archive/2009/10/04/1577921.html

Original article: MRUnit Usage Tips. Thanks to the original author for sharing.