Hadoop: HDFS and MapReduce
I. Getting to know HDFS
1. The HDFS web UI:
http://192.168.0.205:50070/
Upload a file:
hadoop fs -put jdk-7u65-linux-i586.tar.gz hdfs://node1:9000/
Refresh the web page to see the uploaded file.
Clicking the file downloads it.
Download from the command line:
hadoop fs -get hdfs://node1:9000/jdk-7u65-linux-i586.tar.gz
II. Getting to know MapReduce
The share/hadoop directory under the installation directory ships with some default example jar packages.
Run the jar that estimates pi:
hadoop jar hadoop-mapreduce-examples-2.4.1.jar pi 5 5
Number of Maps = 5
Samples per Map = 5
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Starting Job
16/01/25 13:18:15 INFO client.RMProxy: Connecting to ResourceManager at node1/192.168.0.205:8032
16/01/25 13:18:16 INFO input.FileInputFormat: Total input paths to process : 5
16/01/25 13:18:16 INFO mapreduce.JobSubmitter: number of splits:5
16/01/25 13:18:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1453697466705_0001
16/01/25 13:18:18 INFO impl.YarnClientImpl: Submitted application application_1453697466705_0001
16/01/25 13:18:18 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1453697466705_0001/
16/01/25 13:18:18 INFO mapreduce.Job: Running job: job_1453697466705_0001
16/01/25 13:18:26 INFO mapreduce.Job: Job job_1453697466705_0001 running in uber mode : false
16/01/25 13:18:26 INFO mapreduce.Job:  map 0% reduce 0%
16/01/25 13:18:54 INFO mapreduce.Job:  map 100% reduce 0%
16/01/25 13:19:15 INFO mapreduce.Job:  map 100% reduce 100%
16/01/25 13:19:16 INFO mapreduce.Job: Job job_1453697466705_0001 completed successfully
16/01/25 13:19:17 INFO mapreduce.Job: Counters: 49
File System Counters
    FILE: Number of bytes read=116
    FILE: Number of bytes written=559761
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=1305
    HDFS: Number of bytes written=215
    HDFS: Number of read operations=23
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=3
Job Counters
    Launched map tasks=5
    Launched reduce tasks=1
    Data-local map tasks=5
    Total time spent by all maps in occupied slots (ms)=151444
    Total time spent by all reduces in occupied slots (ms)=12933
    Total time spent by all map tasks (ms)=151444
    Total time spent by all reduce tasks (ms)=12933
    Total vcore-seconds taken by all map tasks=151444
    Total vcore-seconds taken by all reduce tasks=12933
    Total megabyte-seconds taken by all map tasks=155078656
    Total megabyte-seconds taken by all reduce tasks=13243392
Map-Reduce Framework
    Map input records=5
    Map output records=10
    Map output bytes=90
    Map output materialized bytes=140
    Input split bytes=715
    Combine input records=0
    Combine output records=0
    Reduce input groups=2
    Reduce shuffle bytes=140
    Reduce input records=10
    Reduce output records=0
    Spilled Records=20
    Shuffled Maps =5
    Failed Shuffles=0
    Merged Map outputs=5
    GC time elapsed (ms)=321
    CPU time spent (ms)=7610
    Physical memory (bytes) snapshot=1077932032
    Virtual memory (bytes) snapshot=3148177408
    Total committed heap usage (bytes)=850395136
Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
File Input Format Counters
    Bytes Read=590
File Output Format Counters
    Bytes Written=97
Job Finished in 62.334 seconds
Estimated value of Pi is 3.68000000000000000000
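The pi example works by scattering sample points in the unit square and counting how many land inside the inscribed quarter circle; with only 5 maps of 5 samples each, the estimate (3.68) is very rough. The same idea can be sketched locally with plain awk (this is an illustration of the Monte Carlo principle, not the Hadoop implementation, which uses a quasi-random point sequence):

```shell
# Monte Carlo estimate of pi: throw n random points into the unit
# square and count those inside the quarter circle x^2 + y^2 <= 1.
# The area ratio (quarter circle : square) is pi/4.
awk 'BEGIN {
    srand(42); n = 200000; hits = 0
    for (i = 0; i < n; i++) {
        x = rand(); y = rand()
        if (x*x + y*y <= 1) hits++
    }
    printf "Estimated value of Pi is %.4f\n", 4 * hits / n
}'
```

More samples give a tighter estimate, which is why the Hadoop example takes the number of maps and the samples-per-map as its two arguments.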
III. Basic commands
Create a directory (the hdfs://node1:9000 prefix can be omitted):
hadoop fs -mkdir hdfs://node1:9000/wordcout
hadoop fs -mkdir /wordcout/input
Create a test.txt file:
vi test.txt
hello world
hello tom
hello jim
hello kitty
hello angelabay
hello wolf
hello laolan
Upload test.txt:
hadoop fs -put test.txt /wordcout/input
Run the wordcount job:
hadoop jar hadoop-mapreduce-examples-2.4.1.jar wordcount /wordcout/input /wordcunt/output
The job is done.
List the directories:
hadoop fs -ls /wordcout/input
hadoop fs -ls /wordcunt/output
View the result file on HDFS (the output of the job above):
hadoop fs -cat /wordcunt/output/part-r-00000
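For comparison, the same word count can be reproduced locally with a plain shell pipeline. This is a sketch of what the wordcount mapper and reducer do (split into words, shuffle by sorting, count duplicates), not the Hadoop implementation itself:

```shell
# Recreate the tutorial's input file locally.
printf 'hello world\nhello tom\nhello jim\nhello kitty\nhello angelabay\nhello wolf\nhello laolan\n' > test.txt

# map: split lines into one word per line; shuffle: sort brings
# identical words together; reduce: uniq -c counts each group.
tr -s ' ' '\n' < test.txt | sort | uniq -c
# "hello" should be counted 7 times, matching part-r-00000.
```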
Command reference
hadoop fs
For example:
hadoop fs -ls /
The 1 in the listing indicates the replication factor.
Change the owner and group:
hadoop fs -chown angelababy:mygirls /jdk-7u65-linux-i586.tar.gz
Check the result.
Note that Hadoop's permission checks are fairly weak.
Change permissions:
hadoop fs -chmod 777 /jdk-7u65-linux-i586.tar.gz
Copy a file from the local filesystem (same as -put):
hadoop fs -copyFromLocal ./hadoop-mapreduce-client-app-2.4.1.jar /
Copy within HDFS:
hadoop fs -cp /hadoop-mapreduce-client-app-2.4.1.jar /wordcout
Check free space:
hadoop fs -df -h /
Check file sizes:
hadoop fs -du -s -h /
hadoop fs -du -s -h hdfs://node1:9000/*
Create nested directories:
hadoop fs -mkdir -p /aa/bb
Delete recursively:
hadoop fs -rm -r /aa/bb
Note that deleted files are moved to the trash, and a message is printed.
hadoop jar: run a jar file on the cluster.
HDFS does not support modifying files in place, but appending is possible.
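For example, on Hadoop 2.x the appendToFile subcommand appends the contents of a local file to an existing HDFS file. The paths below are illustrative, and on some releases append support must first be enabled in the HDFS configuration:

```shell
# Append a local file to an existing file on HDFS, then view it.
# (Illustrative paths; run against your own cluster.)
hadoop fs -appendToFile more.txt /wordcout/input/test.txt
hadoop fs -cat /wordcout/input/test.txt
```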
IV. The HDFS read/write flow and design principles
1. The read path (multiple replicas)
2. Core design ideas