
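While modifying the sorting code from Section 5.3 of 陆喜恒, Hadoop实战 (2nd edition), I ran into an IOException.
Environment: Mac OS X 10.9.5, IntelliJ IDEA 13.1.5, Hadoop 1.2.1
The exception details are as follows: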
14/10/06 03:08:51 INFO mapred.JobClient: Task Id : attempt_201410021756_0043_m_000000_0, Status : FAILED
java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.io.IntWritable, recieved org.apache.hadoop.io.Text
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1024)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:690)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at main.ch5.ReSort$Map.map(ReSort.java:51)
at main.ch5.ReSort$Map.map(ReSort.java:43)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
The relevant code is as follows:
public static class Map extends Mapper<LongWritable, Text, IntWritable, Text> {
    // ...
}

public static class Reduce extends Reducer<IntWritable, Text, IntWritable, IntWritable> {
    // ...
}

public static void main(String[] args) {
    // ...
    job.setOutputFormatClass(TextOutputFormat.class);
    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(IntWritable.class);
    // ...
}
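The Map output types (IntWritable, Text) match the Reduce input types. Yet according to the error message, the map output value was expected to be IntWritable while Text was received, so the two types do not match. That expectation also contradicts the code, because the Mapper is declared to emit <IntWritable, Text> in the map phase.
I will leave the root-cause analysis until I read the Hadoop source code and deal with the problem first. The fix is to set the job's MapOutputKeyClass and MapOutputValueClass parameters to the Map output types: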
job.setMapOutputKeyClass(/*K2*/IntWritable.class);
job.setMapOutputValueClass(/*V2*/Text.class);
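As a hedged aside on the likely cause: in Hadoop 1.x, when the map output value class is not set, the job configuration appears to fall back to the job's final output value class, which here is IntWritable. The sketch below is a simplified model of that lookup, not Hadoop's own code. The class name `MapOutputTypeFallbackSketch` and its helper methods are hypothetical, and a plain `java.util.Map` stands in for the job configuration. The two property names are the ones I believe Hadoop 1.x uses.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Simplified model (not Hadoop code) of how a Hadoop 1.x job resolves
 * the map output value class when it has not been set explicitly.
 */
public class MapOutputTypeFallbackSketch {

    // Property names assumed to match Hadoop 1.x's JobConf.
    static final String OUTPUT_VALUE = "mapred.output.value.class";
    static final String MAP_OUTPUT_VALUE = "mapred.mapoutput.value.class";

    /** Final (reduce) output value class; modeled as defaulting to Text when unset. */
    static String outputValueClass(Map<String, String> conf) {
        String v = conf.get(OUTPUT_VALUE);
        return v != null ? v : "org.apache.hadoop.io.Text";
    }

    /** Map output value class; falls back to the final output value class when unset. */
    static String mapOutputValueClass(Map<String, String> conf) {
        String v = conf.get(MAP_OUTPUT_VALUE);
        return v != null ? v : outputValueClass(conf);
    }

    /** Models the check behind "Type mismatch in value from map". */
    static boolean mapValueAccepted(Map<String, String> conf, String emittedClass) {
        return mapOutputValueClass(conf).equals(emittedClass);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        // Stands in for job.setOutputValueClass(IntWritable.class) from the post.
        conf.put(OUTPUT_VALUE, "org.apache.hadoop.io.IntWritable");
        System.out.println(mapOutputValueClass(conf));
        System.out.println(mapValueAccepted(conf, "org.apache.hadoop.io.Text"));

        // Stands in for job.setMapOutputValueClass(Text.class), the fix above.
        conf.put(MAP_OUTPUT_VALUE, "org.apache.hadoop.io.Text");
        System.out.println(mapValueAccepted(conf, "org.apache.hadoop.io.Text"));
    }
}
```

Under this model, the Text value emitted by the mapper is rejected until the map output value class is set explicitly, which is consistent with the error message and with the fix. The same reasoning would apply to the key class, although in this job the map output key and the final output key are both IntWritable, so only the value triggers the exception.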