Compiling WordCount from the Command Line

Date: 2023-12-24 16:36:37

1. Edit the WordCount.java file. The downloaded Hadoop distribution includes a WordCount example:

http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz

2. Compile WordCount into a jar file

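mkdir wordcount_classes
javac -cp classpath -d wordcount_classes WordCount.java
jar -cvf wordcount.jar -C wordcount_classes/ .
Here "classpath" is a placeholder for the actual jar list. It differs from earlier Hadoop versions: Hadoop 2.x no longer ships a single hadoop-core jar, so the classpath has to name the new jars individually. This is rarely mentioned online!
The new classpath is: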
$HADOOP_HOME/share/hadoop/common/hadoop-common-2.2.0.jar:$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar:$HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar
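
The compile step can be sketched as a short script that assembles the classpath once and reuses it. This is only a sketch under stated assumptions: the fallback value of HADOOP_HOME and the variable names CP and HADOOP_VERSION are illustrative and not part of the original instructions, and the jar names must match the files actually present in your installation.

```shell
# Assumption: HADOOP_HOME points at an unpacked hadoop-2.2.0 directory.
# The fallback path below is hypothetical; adjust it to your installation.
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop-2.2.0}"
HADOOP_VERSION=2.2.0

# Join the three jars listed above with ':' (the separator javac expects on Linux/macOS).
CP="$HADOOP_HOME/share/hadoop/common/hadoop-common-$HADOOP_VERSION.jar"
CP="$CP:$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-$HADOOP_VERSION.jar"
CP="$CP:$HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar"
echo "$CP"

# Compile and package only when the source file is present in the current directory.
if [ -f WordCount.java ]; then
    mkdir -p wordcount_classes
    javac -cp "$CP" -d wordcount_classes WordCount.java
    jar -cvf wordcount.jar -C wordcount_classes/ .
fi
```

If the hadoop command is on your PATH, running "hadoop classpath" prints the full runtime classpath, which may be a simpler value to pass to javac -cp than a hand-built list.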

3. Create the HDFS directories and upload the input files

hadoop fs -mkdir wordCount
hadoop fs -mkdir wordCount/input
echo "Hello World Bye World" > file0
echo "Hello Hadoop Goodbye Hadoop" > file1
hadoop fs -put file* wordCount/input

4. Run the job

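hadoop jar wordcount.jar org.myorg.WordCount wordCount/input wordCount/output
hadoop fs -cat wordCount/output/part-00000
The class name org.myorg.WordCount must match the package and class declared in WordCount.java. The output path must also be the one passed to the job (HDFS paths are case-sensitive). If the job is written against the newer org.apache.hadoop.mapreduce API, the reducer output file is named part-r-00000 instead of part-00000.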
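
As a sanity check, the same counts can be computed locally with standard Unix tools and compared against the job output. This is a sketch rather than part of the Hadoop workflow: it recreates the two sample files from step 3 in a temporary directory and assumes the job splits on whitespace and writes tab-separated "word count" lines; the names workdir and local_counts are illustrative.

```shell
# Recreate the two sample inputs from step 3 in a scratch directory.
workdir="$(mktemp -d)"
echo "Hello World Bye World" > "$workdir/file0"
echo "Hello Hadoop Goodbye Hadoop" > "$workdir/file1"

# Put one token per line, sort so identical tokens are adjacent, count them,
# then print "word<TAB>count". LC_ALL=C makes the sort order byte-based.
local_counts="$(cat "$workdir"/file* \
    | tr -s ' \t' '\n' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{print $2 "\t" $1}')"
printf '%s\n' "$local_counts"

# Clean up the scratch directory.
rm -rf "$workdir"
```

For these two files the pipeline reports Bye 1, Goodbye 1, Hadoop 2, Hello 2, World 2, which is what the WordCount job should also produce for the same input.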