MRUnit for WordCount demo

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.apache.hadoop.mrunit.types.Pair;
import org.junit.Test;

public class wordcountest {

    @Test
    public void test() throws IOException {
        // A single input line fed to the mapper.
        Text value = new Text("hello world hello hadoop");

        // Expected output: one (word, 1) pair per token, in input order.
        List<Pair<Text, IntWritable>> outputs = new ArrayList<Pair<Text, IntWritable>>();
        outputs.add(new Pair<Text, IntWritable>(new Text("hello"), new IntWritable(1)));
        outputs.add(new Pair<Text, IntWritable>(new Text("world"), new IntWritable(1)));
        outputs.add(new Pair<Text, IntWritable>(new Text("hello"), new IntWritable(1)));
        outputs.add(new Pair<Text, IntWritable>(new Text("hadoop"), new IntWritable(1)));

        // Run the mapper on the input and compare its output with the expected pairs.
        new MapDriver<Object, Text, Text, IntWritable>()
                .withMapper(new WordCount.TokenizerMapper())
                .withInput(new LongWritable(0), value)
                .withAllOutput(outputs)
                .runTest();
    }
}
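
The expected list in the test contains two separate (hello, 1) pairs rather than a single (hello, 2), because summing happens in the reducer, not the mapper. The sketch below reproduces that map-side output in plain Java with no Hadoop dependency. It assumes WordCount.TokenizerMapper behaves like the stock Hadoop WordCount example (split each line on whitespace with StringTokenizer, emit a count of 1 per token). The names ExpectedMapOutput and mapLine are hypothetical and exist only for this illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

class ExpectedMapOutput {

    // Hypothetical helper: mimics a tokenizing mapper by emitting one
    // (word, 1) pair per whitespace-separated token, in input order.
    // Assumption: the real TokenizerMapper follows the stock WordCount example.
    static List<Map.Entry<String, Integer>> mapLine(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            pairs.add(new SimpleEntry<>(tokens.nextToken(), 1));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Print each pair as "word<TAB>count", one per line.
        for (Map.Entry<String, Integer> pair : mapLine("hello world hello hadoop")) {
            System.out.println(pair.getKey() + "\t" + pair.getValue());
        }
    }
}
```

Running main prints four lines (hello, world, hello, hadoop, each with a count of 1), which is the same sequence the MRUnit test lists as its expected output.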

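Chapter 6 of Hadoop: The Definitive Guide walks through examples of this kind in detail, and you can adapt them to your own needs. For a deeper understanding, read through the TestDriver class and modify or extend it as your requirements dictate.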

To test against a simulated mini cluster, extend the ClusterMapReduceTestCase class; the example in Hadoop: The Definitive Guide is a good reference.

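When developing MapReduce jobs, write unit tests for them. This gives further assurance of data quality, and writing MRUnit tests should become a standard part of writing MapReduce programs.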
