Basic HDFS Commands and Running a Hadoop MapReduce Program

　　I. Basic HDFS Commands

　　1. Create a directory: -mkdir

[jun@master ~]$ hadoop fs -mkdir /test

[jun@master ~]$ hadoop fs -mkdir /test/input

　　2. List files: -ls

[jun@master ~]$ hadoop fs -ls /

Found  items

drwxr-xr-x   - jun supergroup           -- : /test

[jun@master ~]$ hadoop fs -ls /test

Found  items

drwxr-xr-x   - jun supergroup           -- : /test/input

　　3. Upload files to HDFS

　　Create two files, jun.dat and jun.txt, under /home/jun.
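The article does not show how the two files were created. A minimal sketch follows. The content of jun.txt is taken from the `hadoop fs -cat` output shown in section 5; the content of jun.dat is a placeholder, because the original never displays it. `TARGET_DIR` is a hypothetical override variable, not something the article uses.

```shell
# Directory that will hold the sample files. The article uses jun's home
# directory (/home/jun); TARGET_DIR is a hypothetical override for trying
# the snippet somewhere else.
target_dir="${TARGET_DIR:-$HOME}"

# jun.txt: this single line matches what `hadoop fs -cat` prints in section 5.
printf 'This is the txt file.\n' > "$target_dir/jun.txt"

# jun.dat: the article never shows its content, so this is only a placeholder.
printf 'This is the dat file.\n' > "$target_dir/jun.dat"

# Confirm that both files now exist locally.
ls -l "$target_dir/jun.txt" "$target_dir/jun.dat"
```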

　　(1) Use -put to copy a file from the local file system to the HDFS cluster

[jun@master ~]$ hadoop fs -put /home/jun/jun.dat /test/input/jun.dat

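　　(2) Use -copyFromLocal to copy a file from the local file system to the HDFS cluster (-f overwrites the destination if it already exists)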

[jun@master ~]$ hadoop fs -copyFromLocal -f /home/jun/jun.txt  /test/input/jun.txt

　　(3) Check that the copy succeeded

[jun@master ~]$ hadoop fs -ls /test/input

Found  items

-rw-r--r--    jun supergroup          -- : /test/input/jun.dat

-rw-r--r--    jun supergroup          -- : /test/input/jun.txt
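The steps in this section (make sure the target directory exists, copy the file, confirm it arrived) can be wrapped in one shell function. This is only a sketch: `hdfs_upload` is a hypothetical helper name, not a Hadoop command. It relies on the `-mkdir -p`, `-put -f`, and `-test -e` options of the `hadoop fs` shell, and it assumes `hadoop` is on the PATH as in the sessions above.

```shell
# Hypothetical helper: upload one local file into an HDFS directory.
#   $1 = local source file, $2 = HDFS destination directory
hdfs_upload() {
  # -p creates missing parent directories and does not fail if the directory exists.
  hadoop fs -mkdir -p "$2" &&
  # -f overwrites an existing file of the same name in the destination.
  hadoop fs -put -f "$1" "$2/" &&
  # -test -e exits with status 0 only if the uploaded path exists in HDFS.
  hadoop fs -test -e "$2/$(basename "$1")"
}
```

For example, `hdfs_upload /home/jun/jun.dat /test/input` would run the same upload as step (1) and return a non-zero status if any stage fails.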

　　4. Download files to the local file system

　　(1) Use -get to copy a file from the HDFS cluster to the local file system

[jun@master ~]$ hadoop fs -get /test/input/jun.dat /home/jun/jun1.dat

　　(2) Use -copyToLocal to copy a file from the HDFS cluster to the local file system

[jun@master ~]$ hadoop fs -copyToLocal /test/input/jun.txt /home/jun/jun1.txt

　　(3) Check that the copy succeeded

[jun@master ~]$ ls -l /home/jun/

total

drwxr-xr-x.   jun jun    Jul  : Desktop

drwxr-xr-x.   jun jun    Jul  : Documents

drwxr-xr-x.   jun jun    Jul  : Downloads

drwxr-xr-x.  jun jun  Jul  : hadoop

drwxrwxr-x.   jun jun   Jul  : hadoopdata

-rw-r--r--.   jun jun   Jul  : jun1.dat

-rw-r--r--.   jun jun   Jul  : jun1.txt

-rw-rw-r--.   jun jun   Jul  : jun.dat

-rw-rw-r--.   jun jun   Jul  : jun.txt

drwxr-xr-x.   jun jun    Jul  : Music

drwxr-xr-x.   jun jun    Jul  : Pictures

drwxr-xr-x.   jun jun    Jul  : Public

drwxr-xr-x.   jun jun    Jul  : Resources

drwxr-xr-x.   jun jun    Jul  : Templates

drwxr-xr-x.   jun jun    Jul  : Videos
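Listing the directory only shows that the downloaded files exist. To confirm that a downloaded copy is byte-for-byte identical to the file that was uploaded, the two local files can be compared with `cmp`. A small sketch, in which `verify_copy` is a hypothetical helper name:

```shell
# Hypothetical helper: succeed only if both local files exist and are identical.
#   $1 = original file, $2 = copy downloaded from HDFS
verify_copy() {
  # cmp -s prints nothing and reports the result through its exit status.
  if cmp -s "$1" "$2"; then
    echo "identical: $1 and $2"
  else
    echo "different or missing: $1 and $2" >&2
    return 1
  fi
}
```

With the file names used above, the check would be `verify_copy /home/jun/jun.dat /home/jun/jun1.dat`.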

　　5. View the contents of a file in the HDFS cluster

[jun@master ~]$ hadoop fs -cat /test/input/jun.txt

This is the txt file.

[jun@master ~]$ hadoop fs -text /test/input/jun.txt

This is the txt file.

[jun@master ~]$ hadoop fs -tail /test/input/jun.txt

This is the txt file.

　　6. Delete a file from HDFS

[jun@master ~]$ hadoop fs -rm /test/input/jun.txt

Deleted /test/input/jun.txt

[jun@master ~]$ hadoop fs -ls /test/input

Found  items

-rw-r--r--    jun supergroup          -- : /test/input/jun.dat
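Plain `-rm` removes a single file. To remove a directory together with its contents, `hadoop fs -rm` accepts `-r`, and `-f` suppresses the error when the path does not exist. The sketch below wraps this in a hypothetical helper, `hdfs_remove_tree`, that also confirms the path is gone afterwards.

```shell
# Hypothetical helper: recursively delete an HDFS path, then confirm it is gone.
#   $1 = HDFS path (file or directory)
hdfs_remove_tree() {
  # -r recurses into directories; -f ignores a path that does not exist.
  hadoop fs -rm -r -f "$1" || return 1
  # -test -e succeeds while the path still exists, so success here means
  # the delete did not take effect.
  if hadoop fs -test -e "$1"; then
    echo "still present: $1" >&2
    return 1
  fi
}
```

For example, `hdfs_remove_tree /test` would delete the whole directory tree created in this article. If the HDFS trash feature is enabled, `-rm` moves files to the trash instead of deleting them immediately; adding `-skipTrash` bypasses it.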

　　7. The same commands can also be run on a slave node

[jun@slave0 ~]$ hadoop fs -ls /test/input

Found  items

-rw-r--r--    jun supergroup          -- : /test/input/jun.dat

　　II. Running a Program on the Hadoop Cluster

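　　The Hadoop distribution includes a JAR of MapReduce example programs, one of which is a Java program that estimates the value of pi.

　　Arguments: pi (the name of the example program to run), 10 (the number of map tasks), 10 (the number of sample points generated by each map task)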

[jun@master ~]$ hadoop jar /home/jun/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8..jar pi 10 10

Number of Maps  = 10

Samples per Map = 10

Wrote input for Map #0

Wrote input for Map #1

Wrote input for Map #2

Wrote input for Map #3

Wrote input for Map #4

Wrote input for Map #5

Wrote input for Map #6

Wrote input for Map #7

Wrote input for Map #8

Wrote input for Map #9

Starting Job

// :: INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.100:

// :: INFO input.FileInputFormat: Total input files to process :

// :: INFO mapreduce.JobSubmitter: number of splits:

// :: INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1532226440522_0001

// :: INFO impl.YarnClientImpl: Submitted application application_1532226440522_0001

// :: INFO mapreduce.Job: The url to track the job: http://master:18088/proxy/application_1532226440522_0001/

// :: INFO mapreduce.Job: Running job: job_1532226440522_0001

// :: INFO mapreduce.Job: Job job_1532226440522_0001 running in uber mode : false

// :: INFO mapreduce.Job:  map % reduce %

// :: INFO mapreduce.Job:  map % reduce %

// :: INFO mapreduce.Job:  map % reduce %

// :: INFO mapreduce.Job:  map % reduce %

// :: INFO mapreduce.Job:  map % reduce %

// :: INFO mapreduce.Job:  map % reduce %

// :: INFO mapreduce.Job: Job job_1532226440522_0001 completed successfully

// :: INFO mapreduce.Job: Counters:

    File System Counters

        FILE: Number of bytes read=

        FILE: Number of bytes written=

        FILE: Number of read operations=

        FILE: Number of large read operations=

        FILE: Number of write operations=

        HDFS: Number of bytes read=

        HDFS: Number of bytes written=

        HDFS: Number of read operations=

        HDFS: Number of large read operations=

        HDFS: Number of write operations=

    Job Counters

        Launched map tasks=

        Launched reduce tasks=

        Data-local map tasks=

        Total time spent by all maps in occupied slots (ms)=

        Total time spent by all reduces in occupied slots (ms)=

        Total time spent by all map tasks (ms)=

        Total time spent by all reduce tasks (ms)=

        Total vcore-milliseconds taken by all map tasks=

        Total vcore-milliseconds taken by all reduce tasks=

        Total megabyte-milliseconds taken by all map tasks=

        Total megabyte-milliseconds taken by all reduce tasks=

    Map-Reduce Framework

        Map input records=

        Map output records=

        Map output bytes=

        Map output materialized bytes=

        Input split bytes=

        Combine input records=

        Combine output records=

        Reduce input groups=

        Reduce shuffle bytes=

        Reduce input records=

        Reduce output records=

        Spilled Records=

        Shuffled Maps =

        Failed Shuffles=

        Merged Map outputs=

        GC time elapsed (ms)=

        CPU time spent (ms)=

        Physical memory (bytes) snapshot=

        Virtual memory (bytes) snapshot=

        Total committed heap usage (bytes)=

    Shuffle Errors

        BAD_ID=

        CONNECTION=

        IO_ERROR=

        WRONG_LENGTH=

        WRONG_MAP=

        WRONG_REDUCE=

    File Input Format Counters

        Bytes Read=

    File Output Format Counters

        Bytes Written=

Job Finished in 88.689 seconds

Estimated value of Pi is 3.20000000000000000000

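　　In the end, the job reports an estimate of approximately 3.2. With only 10 × 10 = 100 sample points the estimate is coarse; larger arguments give a value closer to pi.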
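To see why so few samples give only a rough value, the idea behind the example can be tried in a single process. The sketch below is not the Hadoop implementation: it assumes the estimator places points from a Halton sequence (bases 2 and 3) in the unit square and counts how many fall inside the inscribed circle, and it ignores how the work is split across map tasks. `estimate_pi` is a hypothetical function name, and the value it prints is not guaranteed to match Hadoop's output digit for digit.

```shell
# Hypothetical single-process sketch of a quasi-Monte Carlo pi estimate.
#   $1 = number of maps, $2 = samples per map (only their product matters here)
estimate_pi() {
  awk -v maps="$1" -v samples="$2" '
    # Radical inverse of i in the given base: one coordinate of a Halton point.
    function radical_inverse(i, base,    f, r) {
      f = 1 / base
      r = 0
      while (i > 0) {
        r += f * (i % base)
        i = int(i / base)
        f /= base
      }
      return r
    }
    BEGIN {
      total = maps * samples
      inside = 0
      for (i = 1; i <= total; i++) {
        # Shift the point so the circle of radius 0.5 is centred at the origin.
        x = radical_inverse(i, 2) - 0.5
        y = radical_inverse(i, 3) - 0.5
        if (x * x + y * y <= 0.25) inside++
      }
      # Area of circle / area of square = pi / 4.
      printf "%.4f\n", 4 * inside / total
    }'
}

estimate_pi 10 10
```

Calling `estimate_pi 100 100` uses 10,000 points instead of 100 and should land noticeably closer to pi, which mirrors what happens when the two arguments of the Hadoop example are increased.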
