
Posted: 2022-05-01 04:08:22

Windows 7: Eclipse + JDK 1.8.0

Ubuntu 14.04: Hadoop 2.9.0

II. Preparation: Importing the JAR Packages

1. Create a workspace dedicated to Hadoop

2. Under the workspace directory, create a folder dedicated to holding the Hadoop dependency JARs needed for developing MapReduce programs

The required JARs live under $HADOOP_HOME/share/hadoop on the Ubuntu machine; copy them into the folder you just created.

The JARs needed come from the following directories (some may be duplicated between them):

$HADOOP_HOME/share/hadoop/common

$HADOOP_HOME/share/hadoop/hdfs

$HADOOP_HOME/share/hadoop/mapreduce

$HADOOP_HOME/share/hadoop/tools/lib


$HADOOP_HOME/share/hadoop/yarn


3. Create a user library

Window → Preferences → Java → Build Path → User Libraries → New...

A dialog appears prompting you to name the new user library; enter a name.

After clicking OK, you return to the User Libraries page with the new (still empty) library listed.

Click Add External JARs... → select all the JARs in the folder you just created → Open → OK

The user library has been created successfully!

III. Create a Java Project

File → New → Java Project

Enter a project name and leave the other options at their defaults.

Right-click the project name → Build Path → Add Libraries... → User Library → select the user library you created earlier

IV. Implementing the MapReduce Code

1. The WordMapper class


package wordCount;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Tokenizes each input line and emits a (word, 1) pair for every token.
public class WordMapper extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one); // emit (word, 1)
        }
    }
}
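Note that word and one are reused across map() calls rather than allocated anew for every token; reusing Writable objects like this is a common Hadoop idiom that reduces garbage-collection pressure on large inputs.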

2. The WordReducer class


package wordCount;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sums all the 1s emitted for a given word and writes (word, total count).
public class WordReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}
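Because the reduction here is plain addition, which is associative and commutative, the same class can also be registered as a combiner to pre-aggregate counts on the map side; the driver sketch below does exactly that.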

3. The WordMain driver class

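The driver code is cut off in the original post at this point. Below is a minimal sketch of a typical driver that wires WordMapper and WordReducer together, modeled on the standard Hadoop WordCount example; the class name WordMain comes from the heading above, but the exact body (job name, argument handling) is an assumption, not the original author's code.

package wordCount;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Sketch of a WordCount driver (assumed, based on the standard Hadoop example).
public class WordMain {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        if (args.length != 2) {
            System.err.println("Usage: WordMain <input path> <output path>");
            System.exit(2);
        }

        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordMain.class);
        job.setMapperClass(WordMapper.class);
        // The reducer doubles as a combiner, since summing is associative and commutative.
        job.setCombinerClass(WordReducer.class);
        job.setReducerClass(WordReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Assuming the project is exported as wordcount.jar (jar name hypothetical), the job would be launched with: hadoop jar wordcount.jar wordCount.WordMain <input path> <output path>. Note that the output directory must not already exist, or the job will fail at startup.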