Hadoop: Yarn Tool Interface Example, Yarn Hands-On Cases (Part 4)

Date: 2022-10-18 15:54:47

17. Yarn Hands-On Cases

17.4 Yarn Tool Interface Example

17.4.1 Review

[summer@hadoop102 hadoop-3.1.3]$ hadoop jar wc.jar com.summer.mapreduce.wordcount2.WordCountDriver /testinput /testoutput/output1

We would like to pass parameters dynamically at submission time, but the following command fails: the -D option is mistaken for the first input argument.

[summer@hadoop102 hadoop-3.1.3]$ hadoop jar wc.jar com.summer.mapreduce.wordcount1.WordCountDriver -Dmapreduce.job.queuename=root.test /testinput /testoutput/output1
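
To see why this command fails, here is a minimal plain-Java sketch with no Hadoop dependency. It assumes the earlier driver reads args[0] and args[1] directly as the input and output paths; the class PositionalArgsDemo and the method pickPaths are hypothetical names used only for illustration.

```java
import java.util.Arrays;

// Hypothetical stand-in for a driver that reads its paths by position.
class PositionalArgsDemo {

    // Assumption: the driver treats args[0] as input and args[1] as output.
    static String[] pickPaths(String[] args) {
        return new String[] { args[0], args[1] };
    }

    public static void main(String[] args) {
        String[] plain = { "/testinput", "/testoutput/output1" };
        String[] withOption = {
            "-Dmapreduce.job.queuename=root.test", "/testinput", "/testoutput/output1"
        };

        // prints [/testinput, /testoutput/output1]
        System.out.println(Arrays.toString(pickPaths(plain)));

        // prints [-Dmapreduce.job.queuename=root.test, /testinput]
        System.out.println(Arrays.toString(pickPaths(withOption)));
    }
}
```

In the second call the option string lands in the input-path slot and the real input path becomes the output path, which matches the failure described above.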


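17.4.2 Requirement

Programs we write ourselves should also accept parameters that can be changed dynamically at submission time. To achieve this, implement the Tool interface (org.apache.hadoop.util.Tool).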
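
The idea behind this approach can be sketched in plain Java: generic -Dkey=value options are separated from the remaining arguments before the job logic sees them. The class below is a simplified stand-in written for illustration only. It is not Hadoop's GenericOptionsParser, the names GenericOptionSketch, conf and remaining are hypothetical, and it handles only the -Dkey=value form used in this article.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified illustration of splitting -Dkey=value options from other arguments.
class GenericOptionSketch {
    final Map<String, String> conf = new LinkedHashMap<>();
    final List<String> remaining = new ArrayList<>();

    GenericOptionSketch(String[] args) {
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (arg.startsWith("-D") && eq > 2) {
                // "-Dkey=value": store the pair as a configuration override
                conf.put(arg.substring(2, eq), arg.substring(eq + 1));
            } else {
                // anything else is passed through to the job unchanged
                remaining.add(arg);
            }
        }
    }

    public static void main(String[] args) {
        GenericOptionSketch parsed = new GenericOptionSketch(new String[] {
            "-Dmapreduce.job.queuename=root.test", "/testinput", "/testoutput/output1"
        });
        System.out.println(parsed.conf);      // prints {mapreduce.job.queuename=root.test}
        System.out.println(parsed.remaining); // prints [/testinput, /testoutput/output1]
    }
}
```

With the options stripped out first, the input and output paths are back at positions 0 and 1, which is what the run method of a Tool expects.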

17.4.3 Steps

17.4.3.1 Create a Maven project named YarnDemo with the following pom

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.summer</groupId>
    <artifactId>YarnDemo</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <maven.compiler.source>8</maven.compiler.source>
        <maven.compiler.target>8</maven.compiler.target>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>3.1.3</version>
        </dependency>
    </dependencies>
</project>

After adding the dependency, remember to reload the Maven project.

17.4.3.2 Create the package com.summer.yarn

17.4.3.3 Create the class WordCount and implement the Tool interface

package com.summer.yarn;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;

import java.io.IOException;

/**
 * @author Redamancy
 * @create 2022-10-15 15:00
 */
public class WordCount implements Tool {
    private Configuration conf;

    @Override
    public int run(String[] args) throws Exception {
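        // 1. Create the Job from the Configuration set by ToolRunner,
        //    which already holds any -D options given on the command line
        Job job = Job.getInstance(conf);

        // 2. Locate the jar through the driver class
        job.setJarByClass(WordCountDriver.class);

        // 3. Set the Mapper and Reducer classes
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);

        // 4. Set the map output types and the final output types
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // 5. Set the input and output paths; args holds only the
        //    arguments left after the generic options are removed
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // 6. Submit the job and wait; return 0 on success, 1 on failure
        return job.waitForCompletion(true) ? 0 : 1;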
    }

    @Override
    public void setConf(Configuration configuration) {
        this.conf = configuration;
    }

    @Override
    public Configuration getConf() {
        return conf;
    }

    // Mapper: emits (word, 1) for every space-separated word in a line
    public static class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable>{
        private IntWritable outV = new IntWritable(1);
        private Text outK = new Text();

        @Override
        protected void map(LongWritable key, Text value, Mapper<LongWritable, Text, Text, IntWritable>.Context context) throws IOException, InterruptedException {
            String line = value.toString();
            String[] words = line.split(" ");
            for (String word : words) {
                outK.set(word);

                context.write(outK, outV);
            }
        }
    }

    // Reducer: sums the counts emitted for each word
    public static class WordCountReducer extends Reducer<Text, IntWritable,Text, IntWritable>{
        private IntWritable outV = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Reducer<Text, IntWritable, Text, IntWritable>.Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            outV.set(sum);

            context.write(key, outV);
        }
    }
}
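
The counting logic of the Mapper and Reducer above can be tried locally without a cluster. The following is a minimal sketch in plain Java that imitates the two phases in a single loop; the class WordCountLocalSketch and the sample lines are made up for illustration and are not part of the project.

```java
import java.util.Map;
import java.util.TreeMap;

// Local imitation of the word count: split each line on a space (map step)
// and add up the 1s per word (reduce step).
class WordCountLocalSketch {

    static Map<String, Integer> count(String... lines) {
        Map<String, Integer> totals = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                // same effect as emitting (word, 1) and summing per key
                totals.merge(word, 1, Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // prints {summer=1, tool=1, yarn=2}
        System.out.println(count("summer yarn", "yarn tool"));
    }
}
```

One detail worth knowing: because the split is on a single space, a line with two consecutive spaces such as "a  b" produces an empty token between them, so the empty string is counted as a word. Splitting on the regular expression "\\s+" would be one way to avoid that.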

17.4.3.4 Create WordCountDriver

package com.summer.yarn;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

import java.util.Arrays;

/**
 * @author Redamancy
 * @create 2022-10-15 15:25
 */
public class WordCountDriver {

    private static Tool tool;

    public static void main(String[] args) throws Exception {
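        // 1. Create the configuration
        Configuration conf = new Configuration();

        // 2. Choose the Tool implementation named by the first argument
        if (args.length == 0) {
            throw new RuntimeException("missing tool name, e.g. wordcount");
        }
        switch (args[0]){
            case "wordcount":
                tool = new WordCount();
                break;
            default:
                throw new RuntimeException("no such tool: " + args[0]);
        }
        // 3. Run the program through the Tool
        // Arrays.copyOfRange copies args[1] onward into a new array, dropping the tool name
        int run = ToolRunner.run(conf, tool, Arrays.copyOfRange(args, 1, args.length));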

        System.exit(run);
    }
}
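
The driver's handling of the argument array can be checked in isolation. This is a small sketch in plain Java, assuming the same command-line layout as above (tool name first, then everything else); the class DriverArgsSketch and its method names are hypothetical.

```java
import java.util.Arrays;

// Shows how the driver separates the tool name from the rest of the arguments.
class DriverArgsSketch {

    // The first argument selects the tool.
    static String toolName(String[] args) {
        return args[0];
    }

    // Everything from index 1 (inclusive) to args.length (exclusive)
    // is copied into a new array and handed on.
    static String[] toolArgs(String[] args) {
        return Arrays.copyOfRange(args, 1, args.length);
    }

    public static void main(String[] args) {
        String[] cli = {
            "wordcount", "-Dmapreduce.job.queuename=root.test",
            "/testinput", "/testoutput/output1"
        };
        System.out.println(toolName(cli)); // prints wordcount
        // prints [-Dmapreduce.job.queuename=root.test, /testinput, /testoutput/output1]
        System.out.println(Arrays.toString(toolArgs(cli)));
    }
}
```

The copied array still contains the -D option at this point. In the real driver it is ToolRunner.run that applies such generic options to the Configuration before calling the run method of the Tool.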

17.4.3.5 Package the project

Click Reload from Disk to refresh.

17.4.3.6 Upload the jar to Linux

Copy the generated jar to the desktop.
Rename it to YarnDemo, then upload it to Linux.

Drag the jar into the XShell window.

17.4.4 Prepare the input files on HDFS (assumed here to be the /testinput directory) and submit the jar to the cluster

[summer@hadoop102 hadoop-3.1.3]$ yarn jar YarnDemo.jar com.summer.yarn.WordCountDriver wordcount /testinput /testoutputoutput1

Note that three arguments are submitted here: the first selects which Tool to create, and the second and third are the input and output directories. To set configuration parameters as well, add them right after wordcount, for example:

[summer@hadoop102 hadoop-3.1.3]$ yarn jar YarnDemo.jar com.summer.yarn.WordCountDriver wordcount -Dmapreduce.job.queuename=root.test /testinput /testoutput/output1

Note: once all of the steps above are finished, revert to a snapshot or manually restore the configuration files to their previous state. The cluster's resources are already limited, and leaving them divided up this way makes later testing inconvenient.