hadoop[5] - Basic Java client operations

Date: 2021-08-31 08:31:08

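1. Preparation

Make sure the Windows machine can ping the hosts configured in the virtual machines (hadoop-server-00/01/02). Then create a Maven project and add the following dependency to the pom: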

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.4.1</version>
</dependency>
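A successful ping only shows that the host is up; the client also needs to reach the NameNode port. The following is a minimal sketch of such a check using only the JDK. The class name `NameNodeReachability` and the method `canConnect` are made up for illustration, and the address `hdfs://hadoop-server-00:9000` is the one used in the client code of this post; adjust it if your cluster differs.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.URI;

public class NameNodeReachability {

    /**
     * Tries to open a TCP connection to the host and port named in the URI.
     * Returns false if the URI has no host or port, the host cannot be
     * resolved, the connection is refused, or the timeout expires.
     */
    public static boolean canConnect(URI uri, int timeoutMillis) {
        if (uri.getHost() == null || uri.getPort() == -1) {
            return false; // nothing to connect to
        }
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(uri.getHost(), uri.getPort()), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed address: the same URI that is passed to FileSystem.get in this post.
        URI nameNode = URI.create("hdfs://hadoop-server-00:9000");
        System.out.println(nameNode + " reachable: " + canConnect(nameNode, 2000));
    }
}
```

If this prints `false` while ping works, the usual suspects are a firewall on the virtual machine or a NameNode that is not running.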

The following code uploads a file to HDFS:

package com.wange;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.Test;
import java.net.URI;

public class HdfsClientTest {
    @Test
    public void test1() throws Exception {
        // Avoids the exception: HADOOP_HOME or hadoop.home.dir are not set.
        System.setProperty("hadoop.home.dir", "E:/soft/hadoop-2.4.1");
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(new URI("hdfs://hadoop-server-00:9000"), conf, "root");
        fs.copyFromLocalFile(new Path("E:/soft/apache-maven-3.5.4-bin.zip"), new Path("/"));
        fs.close();
    }
}

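The first run may fail with "HADOOP_HOME or hadoop.home.dir are not set." To fix this, unpack the Hadoop archive on the local machine and check whether its bin directory contains winutils.exe. If the file is missing, download it from GitHub (https://github.com/srccodes/hadoop-common-2.2.0-bin); mirror: (https://pan.baidu.com/s/1qAd7VRE3MqGgxAInWdwiZQ, extraction code: z5u0). The "root" argument means the upload is performed as the user root. If it is omitted, the client uses the current local user name, which causes an exception (typically a permission error, since that user usually has no write access to the HDFS root directory).

Uploading and downloading files: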
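The winutils.exe check described above can also be done in code before the property is set. This is a small sketch using only the JDK; the class `WinutilsCheck` and the method `hasWinutils` are hypothetical names, and the path `E:/soft/hadoop-2.4.1` is simply the one from the example in this post.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WinutilsCheck {

    /** Returns true if <hadoopHome>/bin/winutils.exe exists as a regular file. */
    public static boolean hasWinutils(Path hadoopHome) {
        return Files.isRegularFile(hadoopHome.resolve("bin").resolve("winutils.exe"));
    }

    public static void main(String[] args) {
        // Assumed location: the directory the Hadoop archive was unpacked to.
        Path hadoopHome = Paths.get("E:/soft/hadoop-2.4.1");
        if (hasWinutils(hadoopHome)) {
            // Same property the test code in this post sets by hand.
            System.setProperty("hadoop.home.dir", hadoopHome.toString());
            System.out.println("hadoop.home.dir set to " + hadoopHome);
        } else {
            System.out.println("winutils.exe not found under " + hadoopHome.resolve("bin"));
        }
    }
}
```

Failing early with a clear message like this is easier to diagnose than the stack trace Hadoop produces when the file is missing.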

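import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.Before;
import org.junit.Test;
import java.net.URI;

public class HdfsClientTest {
    FileSystem fs = null;

    // Runs before each test: connect to the NameNode as user root.
    @Before
    public void init() throws Exception {
        System.setProperty("hadoop.home.dir", "E:/soft/hadoop-2.4.1");
        fs = FileSystem.get(new URI("hdfs://hadoop-server-00:9000"), new Configuration(), "root");
    }

    // Download: delSrc=false keeps the HDFS copy; useRawLocalFileSystem=true
    // writes through the raw local file system.
    @Test
    public void testGet() throws Exception {
        fs.copyToLocalFile(false, new Path("/apache-maven-3.5.4-bin.zip"), new Path("E:/"), true);
        fs.close();
    }

    // Upload: copy a local file into the HDFS root directory.
    @Test
    public void testPut() throws Exception {
        fs.copyFromLocalFile(new Path("E:/soft/apache-maven-3.5.4-bin.zip"), new Path("/"));
        fs.close();
    }
}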

2. Directory operations

@Test
public void testDir() throws Exception {
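    // Create /home (including any missing parent directories).
    boolean flag1 = fs.mkdirs(new Path("/home"));
    System.out.println("Created directory /home: " + flag1);

    boolean flag2 = fs.exists(new Path("/home"));
    System.out.println("Directory /home exists: " + flag2);

    // The second argument (true) deletes the directory recursively.
    boolean delete = fs.delete(new Path("/test"), true);
    System.out.println("Deleted directory /test: " + delete);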
    fs.close();
}

3. Viewing file information

@Test
public void testFileInfo() throws Exception {
    RemoteIterator<LocatedFileStatus> listFiles = fs.listFiles(new Path("/"), true);
    while (listFiles.hasNext()){
        LocatedFileStatus file = listFiles.next();
        System.out.println(file.getPath().getName());
    }
    fs.close();
}

4. Operating on files with IO streams

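// Open the HDFS file as an input stream and copy it to a local file.
FSDataInputStream in = fs.open(new Path("/apache-maven-3.5.4-bin.zip"));
FileOutputStream out = new FileOutputStream("E:/apache-maven-3.5.4-bin.zip");
// This overload takes the buffer size from the Configuration and closes both streams when done.
IOUtils.copyBytes(in, out, new Configuration());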
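Because FSDataInputStream is an ordinary java.io.InputStream, the copy can also be written as a plain read/write loop. The sketch below shows the general shape of such a buffered copy using in-memory streams so it runs without a cluster. It is an illustration of the idea, not Hadoop's actual implementation; the class `StreamCopySketch` and the method `copy` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class StreamCopySketch {

    /**
     * Copies all bytes from in to out through a fixed-size buffer and
     * returns the number of bytes copied. Does not close either stream.
     */
    public static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // read() returns -1 at end of stream
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins; with Hadoop, "in" would come from fs.open(...)
        // and "out" would be a FileOutputStream.
        byte[] data = "hello hdfs".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), out, 4);
        System.out.println("copied " + copied + " bytes");
    }
}
```

Writing the loop by hand is mainly useful when only part of a file is needed, for example after calling seek on the input stream; for whole-file copies the IOUtils call above is shorter.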

 

These operations resemble the corresponding Linux commands. For more methods, see: http://hadoop.apache.org/docs/current/api/index.html