1. Edit the WordCount.java file. The Hadoop distribution you download already includes a WordCount example:
http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
2. Compile WordCount into a jar file
mkdir wordcount_classes
javac -cp classpath -d wordcount_classes WordCount.java
jar -cvf wordcount.jar -C wordcount_classes/ .
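Here classpath is a placeholder for the actual jar list. It differs from the one used with earlier Hadoop versions and has to be set the new way, which is rarely mentioned online!
The new classpath is: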
$HADOOP_HOME/share/hadoop/common/hadoop-common-2.2.0.jar:$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar:$HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar
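The compile step above can be sketched as one small script that assembles the three jars into a variable and passes it to javac. This is only a sketch under assumptions: the fallback path /opt/hadoop-2.2.0 and the names HADOOP_VERSION and build_classpath are hypothetical and not part of the original steps, and the compile is skipped if WordCount.java or javac is missing.

```shell
#!/bin/sh
# Assumption: HADOOP_HOME points at the unpacked hadoop-2.2.0 directory.
# The fallback path below is hypothetical; adjust it to your install.
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop-2.2.0}"
HADOOP_VERSION=2.2.0

# Print the three jars from the classpath above, joined with ':'.
build_classpath() {
    printf '%s:%s:%s\n' \
        "$HADOOP_HOME/share/hadoop/common/hadoop-common-$HADOOP_VERSION.jar" \
        "$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-$HADOOP_VERSION.jar" \
        "$HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar"
}

CP="$(build_classpath)"
echo "$CP"

# Compile and package only when the source file and javac are available.
if [ -f WordCount.java ] && command -v javac >/dev/null 2>&1; then
    mkdir -p wordcount_classes
    javac -cp "$CP" -d wordcount_classes WordCount.java
    jar -cvf wordcount.jar -C wordcount_classes/ .
fi
```

If the hadoop command is already on your PATH, running `hadoop classpath` should print the full classpath of the installation, which may be a simpler alternative to listing the jars by hand.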
3. Create the HDFS directories and upload the input files
hadoop fs -mkdir wordCount
hadoop fs -mkdir wordCount/input
echo "Hello World Bye World" > file0
echo "Hello Hadoop Goodbye Hadoop" > file1
hadoop fs -put file* wordCount/input
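The upload step can also be written as a small script, sketched below under a few assumptions. The local directory name wc_input is hypothetical; it is there only so the file* glob cannot pick up unrelated files from the current directory. The script also assumes the Hadoop 2.x FS shell, where -mkdir accepts -p to create missing parent directories, and it skips the HDFS commands when hadoop is not on PATH.

```shell
#!/bin/sh
# Keep the sample inputs in their own directory (hypothetical name).
INPUT_DIR=wc_input
mkdir -p "$INPUT_DIR"
echo "Hello World Bye World" > "$INPUT_DIR/file0"
echo "Hello Hadoop Goodbye Hadoop" > "$INPUT_DIR/file1"

# Talk to HDFS only when the hadoop client is available.
if command -v hadoop >/dev/null 2>&1; then
    # -p creates wordCount and wordCount/input in one call
    hadoop fs -mkdir -p wordCount/input
    # Note: -put reports an error for files that already exist in HDFS
    hadoop fs -put "$INPUT_DIR"/file* wordCount/input
    # List the directory to confirm both files arrived
    hadoop fs -ls wordCount/input
else
    echo "hadoop not found on PATH; created local files only" >&2
fi
```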
4. Run the job
hadoop jar wordcount.jar org.myorg.WordCount wordCount/input wordCount/output
hadoop fs -cat wordCount/output/part-00000
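To check the job output, the same counts can be computed locally with standard Unix tools. This is a rough sketch, not part of the Hadoop workflow: the function name local_wordcount is hypothetical, and it assumes your WordCount splits lines on whitespace, as the stock example does.

```shell
#!/bin/sh
# Count whitespace-separated words from the given files (or stdin)
# and print one "word<TAB>count" line per word, sorted by word.
local_wordcount() {
    cat "$@" \
        | tr -s '[:space:]' '\n' \
        | grep -v '^$' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{ print $2 "\t" $1 }'
}

# Recreate the two sample inputs in a temporary directory.
tmpdir="$(mktemp -d)"
echo "Hello World Bye World" > "$tmpdir/file0"
echo "Hello Hadoop Goodbye Hadoop" > "$tmpdir/file1"

local_wordcount "$tmpdir/file0" "$tmpdir/file1"
rm -rf "$tmpdir"
```

For the two sample files this prints Bye 1, Goodbye 1, Hadoop 2, Hello 2, World 2, one tab-separated pair per line, which you can compare against what hadoop fs -cat shows.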