Hadoop Study Notes 1: The jar Commands Encountered When Running WordCount from the Command Line

Date: 2021-04-02 17:47:16

1. jar cf WordCount.jar WordCount*.class

Usage:

First compile WordCount.java. Java source files are compiled with the javac command, as in the screenshot below:

[Screenshot: compiling WordCount.java]

After compilation, three class files appear in the workspace folder:

[Screenshot: class files generated by compilation]

Once compilation succeeds, the three class files can be packaged into a jar file:

[Screenshot: packaging the class files into a jar]

When the command succeeds, WordCount.jar has been created in the workspace folder:

[Screenshot: jar packaging complete]


jar cf WordCount.jar WordCount*.class

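-c: create a new jar archive;

-f: specify the jar file name;

WordCount.jar: the [jar-file] argument, i.e. the JAR archive to create, list, update, or extract; it is the value that goes with the -f option;

WordCount*.class: every .class file whose name begins with WordCount; the shell expands this wildcard before jar runs.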
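To make the wildcard concrete, here is a minimal sketch of which file names the pattern WordCount*.class selects. It assumes the three class files are WordCount.class plus one file per nested class, which is how javac names nested classes. It uses Java's own glob matcher as a stand-in for the shell's expansion (the two agree for a simple * pattern), and the class name ClassFileGlobDemo and the helper matching are hypothetical names chosen for this illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ClassFileGlobDemo {

    // Returns, in order, the candidate file names that match the glob pattern.
    static List<String> matching(String glob, List<String> candidates) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        List<String> matched = new ArrayList<>();
        for (String name : candidates) {
            if (matcher.matches(Paths.get(name))) {
                matched.add(name);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Assumed contents of the workspace folder after running javac.
        List<String> files = Arrays.asList(
                "WordCount.class",
                "WordCount$TokenizerMapper.class",
                "WordCount$IntSumReducer.class",
                "WordCount.java");

        // Prints the three .class files; WordCount.java is not matched.
        System.out.println(matching("WordCount*.class", files));
    }
}
```

The * also matches an empty string, which is why WordCount.class itself is included alongside the two nested-class files.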

2. bin/hadoop jar workspace/WordCount.jar WordCount input output

Usage:

Create an input folder under /usr/local/hadoop to hold the data:

[Screenshot: creating the input folder]

Then cd into the input folder and run the commands in the screenshot below, which write 'Hello World Bye World' to file01 and 'Hello Hadoop Goodbye Hadoop' to file02:

[Screenshot: creating the input data]

Finally, run the program:

[Screenshot: running the program]


bin/hadoop jar workspace/WordCount.jar WordCount input output

Similar invocations:

hadoop jar WordCount.jar WordCount input output

hadoop jar WordCount.jar WordCount /tmp/input /tmp/output

/usr/local/hadoop/bin/hadoop jar WordCount.jar WordCount input output

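/usr/local/hadoop/bin/hadoop jar WordCount.jar org.apache.hadoop.examples.WordCount input output

(Some programs declare a package, in which case the command must give the fully qualified class name, here with the org.apache.hadoop.examples prefix. The first line of code in such a program is: package org.apache.hadoop.examples)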
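The difference between a simple class name and the fully qualified name a launcher needs can be seen with a small sketch. JDK classes are used here as stand-ins so that it runs without Hadoop on the classpath; the class name ClassNameDemo and the helper binaryName are hypothetical names for this illustration. The same rule is what turns WordCount into org.apache.hadoop.examples.WordCount once the package line is present.

```java
import java.util.Map;
import java.util.StringTokenizer;

class ClassNameDemo {

    // The name the JVM uses to locate and load a class: package prefix
    // included, with nested classes joined to their outer class by '$'.
    static String binaryName(Class<?> type) {
        return type.getName();
    }

    public static void main(String[] args) {
        // Simple name: what appears after "class" in the source file.
        System.out.println(StringTokenizer.class.getSimpleName()); // StringTokenizer

        // Fully qualified name: the form required on the command line.
        System.out.println(binaryName(StringTokenizer.class));     // java.util.StringTokenizer

        // Nested types get a '$' in their binary name, which is also why
        // javac produces files such as WordCount$TokenizerMapper.class.
        System.out.println(binaryName(Map.Entry.class));           // java.util.Map$Entry
    }
}
```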


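bin/hadoop: that is, /usr/local/hadoop/bin/hadoop. This is the path of a file named hadoop, not a folder. It is a shell script that wraps the java command and serves as Hadoop's command-line entry point;

jar: the hadoop subcommand that runs a job whose code is packaged in a jar file;

workspace/WordCount.jar: the location of WordCount.jar. Relative to the working directory /usr/local/hadoop used above, the full path is /usr/local/hadoop/workspace/WordCount.jar;

WordCount: the name of the main class to run; if the class declares a package, the fully qualified name is required, as noted above;

input: the input data directory, resolved against Hadoop's default file system: HDFS on a cluster, or the local file system in standalone mode;

output: the output data directory, resolved the same way; it must not exist before the job runs;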


Code example:

    /**
     * Licensed to the Apache Software Foundation (ASF) under one
     * or more contributor license agreements. See the NOTICE file
     * distributed with this work for additional information
     * regarding copyright ownership. The ASF licenses this file
     * to you under the Apache License, Version 2.0 (the
     * "License"); you may not use this file except in compliance
     * with the License. You may obtain a copy of the License at
     *
     * http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    // With this package declaration the class must be launched as
    // org.apache.hadoop.examples.WordCount rather than plain WordCount.
    package org.apache.hadoop.examples;

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.GenericOptionsParser;

    public class WordCount {

      // Mapper: splits each input line into tokens and emits (word, 1).
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context
                        ) throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
          }
        }
      }

      // Reducer (also used as the combiner): sums the counts for each word.
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values,
                           Context context
                           ) throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Strip generic Hadoop options; what remains is <in> <out>.
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
          System.err.println("Usage: wordcount <in> <out>");
          System.exit(2);
        }
        // This constructor is deprecated in Hadoop 2 and later, where
        // Job.getInstance(conf, "word count") is preferred.
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }
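To check what the job should compute for the two sample files created earlier, the map and reduce logic can be imitated in plain Java without Hadoop. This is only a sketch of the counting, not of how Hadoop distributes the work: the class name LocalWordCount and the method countWords are hypothetical, and a sorted in-memory map stands in for the shuffle and reduce phases.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

class LocalWordCount {

    // Tokenizes each line the same way TokenizerMapper does (whitespace
    // split via StringTokenizer) and sums the occurrences per word,
    // which is the job IntSumReducer performs.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>(); // keys kept in sorted order
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                counts.merge(itr.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Contents of file01 and file02 from the walkthrough.
        List<String> lines = Arrays.asList(
                "Hello World Bye World",
                "Hello Hadoop Goodbye Hadoop");

        // Prints one "word<TAB>count" pair per line:
        // Bye 1, Goodbye 1, Hadoop 2, Hello 2, World 2
        countWords(lines).forEach((word, count) ->
                System.out.println(word + "\t" + count));
    }
}
```

If the real job ran correctly, the result file in the output directory should list the same five words with the same counts.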


References:

http://blog.csdn.net/wang_zhenwei/article/details/47403825

http://dblab.xmu.edu.cn/blog/hadoop-build-project-by-shell/