Through the preceding steps we have compiled and packaged a Hadoop distribution suited to this machine, located at /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0.
Log in as the root user.
The configuration files live under /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/etc/hadoop.
Edit hadoop-env.sh and set: export JAVA_HOME=/usr/local/jdk1.7.0_71
(1) Edit core-site.xml with the following content:
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-${user.name}</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://admin:9000</value>
</property>
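For reference, a minimal complete core-site.xml wraps these properties in the file's root element; this is a sketch assuming the hostname admin resolves to this machine. (In 2.x, fs.default.name is deprecated in favor of fs.defaultFS, though the old name still works.)
<?xml version="1.0"?>
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-${user.name}</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://admin:9000</value>
</property>
</configuration>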
Create the directory /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp.
Rename the template file: mv mapred-site.xml.template mapred-site.xml
(2) Edit mapred-site.xml with the following content:
<property>
<name>mapred.job.tracker</name>
<value>admin:9001</value>
</property>
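A note on this setting: mapred.job.tracker is a Hadoop 1.x parameter and is effectively ignored by Hadoop 2.2.0, so jobs fall back to the LocalJobRunner (the job ID job_local1742380566_0001 in the WordCount run below confirms this). To submit jobs to the YARN cluster instead, the standard property to set is shown here as a sketch; it is not part of the original configuration:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>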
(3) Edit hdfs-site.xml with the following content:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
(4) Edit yarn-site.xml with the following content:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
(5) Run the format command to format the NameNode; the output looks like this:
[root@admin hadoop-2.2.0]# pwd
/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0
[root@admin hadoop-2.2.0]# bin/hdfs namenode -format
14/12/23 15:04:06 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = admin.lan/192.168.199.118
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.2.0
STARTUP_MSG: classpath = /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/etc/hadoop:... (lengthy jar classpath omitted) ...:/contrib/capacity-scheduler/*.jar
STARTUP_MSG: build = Unknown -r Unknown; compiled by 'root' on 2014-12-18T09:20Z
STARTUP_MSG: java = 1.7.0_71
************************************************************/
14/12/23 15:04:06 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
Formatting using clusterid: CID-c7f77023-a884-4886-a313-bc9a671aaeb5
14/12/23 15:04:08 INFO namenode.HostFileManager: read includes:
HostSet(
)
14/12/23 15:04:08 INFO namenode.HostFileManager: read excludes:
HostSet(
)
14/12/23 15:04:08 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
14/12/23 15:04:08 INFO util.GSet: Computing capacity for map BlocksMap
14/12/23 15:04:08 INFO util.GSet: VM type = 32-bit
14/12/23 15:04:08 INFO util.GSet: 2.0% max memory = 966.7 MB
14/12/23 15:04:08 INFO util.GSet: capacity = 2^22 = 4194304 entries
14/12/23 15:04:08 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
14/12/23 15:04:08 INFO blockmanagement.BlockManager: defaultReplication = 1
14/12/23 15:04:08 INFO blockmanagement.BlockManager: maxReplication = 512
14/12/23 15:04:08 INFO blockmanagement.BlockManager: minReplication = 1
14/12/23 15:04:08 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
14/12/23 15:04:08 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks = false
14/12/23 15:04:08 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
14/12/23 15:04:08 INFO blockmanagement.BlockManager: encryptDataTransfer = false
14/12/23 15:04:08 INFO namenode.FSNamesystem: fsOwner = root (auth:SIMPLE)
14/12/23 15:04:08 INFO namenode.FSNamesystem: supergroup = supergroup
14/12/23 15:04:08 INFO namenode.FSNamesystem: isPermissionEnabled = true
14/12/23 15:04:08 INFO namenode.FSNamesystem: HA Enabled: false
14/12/23 15:04:08 INFO namenode.FSNamesystem: Append Enabled: true
14/12/23 15:04:09 INFO util.GSet: Computing capacity for map INodeMap
14/12/23 15:04:09 INFO util.GSet: VM type = 32-bit
14/12/23 15:04:09 INFO util.GSet: 1.0% max memory = 966.7 MB
14/12/23 15:04:09 INFO util.GSet: capacity = 2^21 = 2097152 entries
14/12/23 15:04:09 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/12/23 15:04:09 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
14/12/23 15:04:09 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
14/12/23 15:04:09 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
14/12/23 15:04:09 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
14/12/23 15:04:09 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
14/12/23 15:04:09 INFO util.GSet: Computing capacity for map Namenode Retry Cache
14/12/23 15:04:09 INFO util.GSet: VM type = 32-bit
14/12/23 15:04:09 INFO util.GSet: 0.029999999329447746% max memory = 966.7 MB
14/12/23 15:04:09 INFO util.GSet: capacity = 2^16 = 65536 entries
14/12/23 15:04:09 INFO common.Storage: Storage directory /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/dfs/name has been successfully formatted.
14/12/23 15:04:09 INFO namenode.FSImage: Saving image file /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
14/12/23 15:04:09 INFO namenode.FSImage: Image file /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 196 bytes saved in 0 seconds.
14/12/23 15:04:09 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
14/12/23 15:04:09 INFO util.ExitUtil: Exiting with status 0
14/12/23 15:04:09 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at admin.lan/192.168.199.118
************************************************************/
[root@hadoop10 hadoop-2.2.0]#
(6) Start HDFS; the command and its output are as follows:
[root@hadoop10 hadoop-2.2.0]# sbin/start-dfs.sh
Starting namenodes on [hadoop10]
hadoop10: starting namenode, logging to /usr/local/hadoop-dist/target/hadoop-2.2.0/logs/hadoop-root-namenode-hadoop10.out
localhost: starting datanode, logging to /usr/local/hadoop-dist/target/hadoop-2.2.0/logs/hadoop-root-datanode-hadoop10.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
RSA key fingerprint is 3d:56:ae:31:73:66:9c:21:02:02:bc:5a:6b:bd:bf:75.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-dist/target/hadoop-2.2.0/logs/hadoop-root-secondarynamenode-hadoop10.out
[root@hadoop10 hadoop-2.2.0]# jps
5256 SecondaryNameNode
5015 NameNode
5123 DataNode
5352 Jps
[root@hadoop10 hadoop-2.2.0]#
(7) Start YARN; the command and its output are as follows:
[root@hadoop10 hadoop-2.2.0]# sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop-dist/target/hadoop-2.2.0/logs/yarn-root-resourcemanager-hadoop10.out
localhost: starting nodemanager, logging to /usr/local/hadoop-dist/target/hadoop-2.2.0/logs/yarn-root-nodemanager-hadoop10.out
Once formatting completes, you can start the Hadoop daemons.
There are three ways to start Hadoop.
First, start everything at once:
Run start-all.sh to start Hadoop. Watching the console output, you can see the daemons being launched: the NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager, five in all. When the script finishes, it does not mean those five daemons started successfully; it only means the system attempted to launch them. Use the JDK's jps command to check whether they actually came up. If jps shows all five processes, Hadoop really did start; if one or more is missing, turn to the "Common Hadoop Startup Errors" chapter to track down the cause. The command to shut Hadoop down is stop-all.sh.
[root@hadoop10 hadoop-2.2.0]# jps
5496 NodeManager
5524 Jps
5256 SecondaryNameNode
5015 NameNode
5123 DataNode
5410 ResourceManager
[root@hadoop10 hadoop-2.2.0]#
The commands above are the simplest, starting or stopping every daemon in one shot. Besides these, there are commands that start the subsystems separately.
Second, start HDFS and YARN separately:
Running start-dfs.sh starts HDFS alone; afterwards jps shows the NameNode, DataNode, and SecondaryNameNode processes. This command suits scenarios that only use HDFS for storage and do no MapReduce computation. The matching stop command is stop-dfs.sh.
Running start-yarn.sh starts the two YARN daemons (ResourceManager and NodeManager) on their own; the matching stop command is stop-yarn.sh. You can equally well start YARN first and HDFS second, which shows that the HDFS and YARN daemons are mutually independent, with no dependency between them. A sketch of the full sequence follows.
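Put together, the second method is the following sequence, run from the installation directory (a sketch; the order of the two start scripts is interchangeable):
sbin/start-dfs.sh
sbin/start-yarn.sh
jps
sbin/stop-yarn.sh
sbin/stop-dfs.sh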
Third, start each daemon individually, which also lets you add or remove a single node:
hadoop-daemon.sh start namenode
hadoop-daemon.sh start datanode
hadoop-daemon.sh start secondarynamenode
yarn-daemon.sh start resourcemanager
yarn-daemon.sh start nodemanager
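The same scripts accept stop, so removing a single node's worker daemons is symmetric; for example:
hadoop-daemon.sh stop datanode
yarn-daemon.sh stop nodemanager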
(8) Seeing these five Java processes means startup succeeded. Check through the browser as well:
NameNode status: http://192.168.199.118:50070
Port 8042 serves the NodeManager's resource status.
Check cluster status: ./bin/hdfs dfsadmin -report
Check file-block composition: ./bin/hdfs fsck / -files -blocks
Deployment is now complete: we have a pseudo-distributed, single-machine Hadoop development environment. A follow-up post will dig into the Hadoop Distributed File System and give a feel for Google's file system, since HDFS is closely modeled on Google's GFS.
Hadoop's three run modes:
Standalone mode: the default. When you first unpack the Hadoop package, Hadoop knows nothing about the hardware environment and conservatively chooses a minimal configuration; in this default mode all three XML configuration files are empty. With empty configuration files, Hadoop runs entirely locally: since it needs no interaction with other nodes, standalone mode uses neither HDFS nor any Hadoop daemons. This mode is mainly for developing and debugging the application logic of MapReduce programs.
Pseudo-distributed mode (Pseudo-Distributed Mode): runs Hadoop on a "single-node cluster" where all the daemons run on the same machine. On top of standalone mode it adds the ability to debug against the real daemons, letting you inspect memory usage, HDFS input and output, and interactions with the other daemons.
Fully distributed mode (Fully Distributed Mode): runs the daemons across a cluster of machines.
(9) Add the HADOOP_HOME environment variable:
export HADOOP_HOME=/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/
export PATH=.:/usr/local/protoc/bin:$FINDBUGS_HOME/bin:$MAVEN_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/bin:$PATH
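To make these variables survive a new shell, one option is appending them to /etc/profile and re-sourcing it; a sketch (assuming FINDBUGS_HOME and MAVEN_HOME were already exported during the earlier build steps):
cat >> /etc/profile <<'EOF'
export HADOOP_HOME=/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
EOF
source /etc/profile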
(10) Cluster verification:
We verify with Hadoop's bundled WordCount example, which counts how often each word occurs in the input files. First create a few data directories in HDFS:
hadoop fs -mkdir -p /data/wordcount
hadoop fs -mkdir -p /output/
The /data/wordcount directory holds the input files for the bundled WordCount example; the MapReduce job writes its results to /output/wordcount.
Upload local files into HDFS:
hadoop fs -put /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/etc/hadoop/*.xml /data/wordcount
To check the uploaded files, run:
hadoop fs -ls /data/wordcount
You will see the files now stored in HDFS.
Next, run the WordCount example:
hadoop jar /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /data/wordcount /output/wordcount
The console prints the job's progress:
[root@admin hadoop-2.2.0]# hadoop jar /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /data/wordcount /output/wordcount
14/12/23 16:59:26 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
14/12/23 16:59:26 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
14/12/23 16:59:27 INFO input.FileInputFormat: Total input paths to process : 7
14/12/23 16:59:27 INFO mapreduce.JobSubmitter: number of splits:7
14/12/23 16:59:27 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/12/23 16:59:27 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/12/23 16:59:27 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/12/23 16:59:27 INFO Configuration.deprecation: mapreduce.combine.class is deprecated. Instead, use mapreduce.job.combine.class
14/12/23 16:59:27 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
14/12/23 16:59:27 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/12/23 16:59:27 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
14/12/23 16:59:27 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/12/23 16:59:27 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/12/23 16:59:27 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/12/23 16:59:27 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/12/23 16:59:27 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
14/12/23 16:59:27 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1742380566_0001
14/12/23 16:59:27 WARN conf.Configuration: file:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/mapred/staging/root1742380566/.staging/job_local1742380566_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
14/12/23 16:59:27 WARN conf.Configuration: file:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/mapred/staging/root1742380566/.staging/job_local1742380566_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
14/12/23 16:59:27 WARN conf.Configuration: file:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/mapred/local/localRunner/root/job_local1742380566_0001/job_local1742380566_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
14/12/23 16:59:27 WARN conf.Configuration: file:/usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/tmp/hadoop-root/mapred/local/localRunner/root/job_local1742380566_0001/job_local1742380566_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
14/12/23 16:59:28 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
14/12/23 16:59:28 INFO mapreduce.Job: Running job: job_local1742380566_0001
14/12/23 16:59:28 INFO mapred.LocalJobRunner: OutputCommitter set in config null
14/12/23 16:59:28 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
14/12/23 16:59:28 INFO mapred.LocalJobRunner: Waiting for map tasks
14/12/23 16:59:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1742380566_0001_m_000000_0
14/12/23 16:59:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
14/12/23 16:59:28 INFO mapred.MapTask: Processing split: hdfs://admin:9000/data/wordcount/hadoop-policy.xml:0+9257
14/12/23 16:59:28 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/12/23 16:59:28 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/12/23 16:59:28 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/12/23 16:59:28 INFO mapred.MapTask: soft limit at 83886080
14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/12/23 16:59:28 INFO mapred.LocalJobRunner:
14/12/23 16:59:28 INFO mapred.MapTask: Starting flush of map output
14/12/23 16:59:28 INFO mapred.MapTask: Spilling map output
14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufend = 12916; bufvoid = 104857600
14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26210084(104840336); length = 4313/6553600
14/12/23 16:59:28 INFO mapred.MapTask: Finished spill 0
14/12/23 16:59:28 INFO mapred.Task: Task:attempt_local1742380566_0001_m_000000_0 is done. And is in the process of committing
14/12/23 16:59:28 INFO mapred.LocalJobRunner: map
14/12/23 16:59:28 INFO mapred.Task: Task 'attempt_local1742380566_0001_m_000000_0' done.
14/12/23 16:59:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local1742380566_0001_m_000000_0
14/12/23 16:59:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1742380566_0001_m_000001_0
14/12/23 16:59:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
14/12/23 16:59:28 INFO mapred.MapTask: Processing split: hdfs://admin:9000/data/wordcount/capacity-scheduler.xml:0+3560
14/12/23 16:59:28 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/12/23 16:59:28 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/12/23 16:59:28 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/12/23 16:59:28 INFO mapred.MapTask: soft limit at 83886080
14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/12/23 16:59:28 INFO mapred.LocalJobRunner:
14/12/23 16:59:28 INFO mapred.MapTask: Starting flush of map output
14/12/23 16:59:28 INFO mapred.MapTask: Spilling map output
14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufend = 4457; bufvoid = 104857600
14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213132(104852528); length = 1265/6553600
14/12/23 16:59:28 INFO mapred.MapTask: Finished spill 0
14/12/23 16:59:28 INFO mapred.Task: Task:attempt_local1742380566_0001_m_000001_0 is done. And is in the process of committing
14/12/23 16:59:28 INFO mapred.LocalJobRunner: map
14/12/23 16:59:28 INFO mapred.Task: Task 'attempt_local1742380566_0001_m_000001_0' done.
14/12/23 16:59:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local1742380566_0001_m_000001_0
14/12/23 16:59:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1742380566_0001_m_000002_0
14/12/23 16:59:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
14/12/23 16:59:28 INFO mapred.MapTask: Processing split: hdfs://admin:9000/data/wordcount/yarn-site.xml:0+1000
14/12/23 16:59:28 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/12/23 16:59:28 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/12/23 16:59:28 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/12/23 16:59:28 INFO mapred.MapTask: soft limit at 83886080
14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/12/23 16:59:28 INFO mapred.LocalJobRunner:
14/12/23 16:59:28 INFO mapred.MapTask: Starting flush of map output
14/12/23 16:59:28 INFO mapred.MapTask: Spilling map output
14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufend = 1322; bufvoid = 104857600
14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213988(104855952); length = 409/6553600
14/12/23 16:59:28 INFO mapred.MapTask: Finished spill 0
14/12/23 16:59:28 INFO mapred.Task: Task:attempt_local1742380566_0001_m_000002_0 is done. And is in the process of committing
14/12/23 16:59:28 INFO mapred.LocalJobRunner: map
14/12/23 16:59:28 INFO mapred.Task: Task 'attempt_local1742380566_0001_m_000002_0' done.
14/12/23 16:59:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local1742380566_0001_m_000002_0
14/12/23 16:59:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1742380566_0001_m_000003_0
14/12/23 16:59:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
14/12/23 16:59:28 INFO mapred.MapTask: Processing split: hdfs://admin:9000/data/wordcount/core-site.xml:0+910
14/12/23 16:59:28 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/12/23 16:59:28 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/12/23 16:59:28 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/12/23 16:59:28 INFO mapred.MapTask: soft limit at 83886080
14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/12/23 16:59:28 INFO mapred.LocalJobRunner:
14/12/23 16:59:28 INFO mapred.MapTask: Starting flush of map output
14/12/23 16:59:28 INFO mapred.MapTask: Spilling map output
14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufend = 1298; bufvoid = 104857600
14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213988(104855952); length = 409/6553600
14/12/23 16:59:28 INFO mapred.MapTask: Finished spill 0
14/12/23 16:59:28 INFO mapred.Task: Task:attempt_local1742380566_0001_m_000003_0 is done. And is in the process of committing
14/12/23 16:59:28 INFO mapred.LocalJobRunner: map
14/12/23 16:59:28 INFO mapred.Task: Task 'attempt_local1742380566_0001_m_000003_0' done.
14/12/23 16:59:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local1742380566_0001_m_000003_0
14/12/23 16:59:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1742380566_0001_m_000004_0
14/12/23 16:59:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
14/12/23 16:59:28 INFO mapred.MapTask: Processing split: hdfs://admin:9000/data/wordcount/hdfs-site.xml:0+843
14/12/23 16:59:28 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/12/23 16:59:28 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/12/23 16:59:28 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/12/23 16:59:28 INFO mapred.MapTask: soft limit at 83886080
14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/12/23 16:59:28 INFO mapred.LocalJobRunner:
14/12/23 16:59:28 INFO mapred.MapTask: Starting flush of map output
14/12/23 16:59:28 INFO mapred.MapTask: Spilling map output
14/12/23 16:59:28 INFO mapred.MapTask: bufstart = 0; bufend = 1239; bufvoid = 104857600
14/12/23 16:59:28 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213980(104855920); length = 417/6553600
14/12/23 16:59:28 INFO mapred.MapTask: Finished spill 0
14/12/23 16:59:28 INFO mapred.Task: Task:attempt_local1742380566_0001_m_000004_0 is done. And is in the process of committing
14/12/23 16:59:28 INFO mapred.LocalJobRunner: map
14/12/23 16:59:28 INFO mapred.Task: Task 'attempt_local1742380566_0001_m_000004_0' done.
14/12/23 16:59:28 INFO mapred.LocalJobRunner: Finishing task: attempt_local1742380566_0001_m_000004_0
14/12/23 16:59:28 INFO mapred.LocalJobRunner: Starting task: attempt_local1742380566_0001_m_000005_0
14/12/23 16:59:28 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
14/12/23 16:59:28 INFO mapred.MapTask: Processing split: hdfs://admin:9000/data/wordcount/mapred-site.xml:0+838
14/12/23 16:59:28 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/12/23 16:59:29 INFO mapreduce.Job: Job job_local1742380566_0001 running in uber mode : false
14/12/23 16:59:29 INFO mapreduce.Job: map 100% reduce 0%
14/12/23 16:59:29 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/12/23 16:59:29 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/12/23 16:59:29 INFO mapred.MapTask: soft limit at 83886080
14/12/23 16:59:29 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/12/23 16:59:29 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/12/23 16:59:29 INFO mapred.LocalJobRunner:
14/12/23 16:59:29 INFO mapred.MapTask: Starting flush of map output
14/12/23 16:59:29 INFO mapred.MapTask: Spilling map output
14/12/23 16:59:29 INFO mapred.MapTask: bufstart = 0; bufend = 1230; bufvoid = 104857600
14/12/23 16:59:29 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26213984(104855936); length = 413/6553600
14/12/23 16:59:29 INFO mapred.MapTask: Finished spill 0
14/12/23 16:59:29 INFO mapred.Task: Task:attempt_local1742380566_0001_m_000005_0 is done. And is in the process of committing
14/12/23 16:59:29 INFO mapred.LocalJobRunner: map
14/12/23 16:59:29 INFO mapred.Task: Task 'attempt_local1742380566_0001_m_000005_0' done.
14/12/23 16:59:29 INFO mapred.LocalJobRunner: Finishing task: attempt_local1742380566_0001_m_000005_0
14/12/23 16:59:29 INFO mapred.LocalJobRunner: Starting task: attempt_local1742380566_0001_m_000006_0
14/12/23 16:59:29 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
14/12/23 16:59:29 INFO mapred.MapTask: Processing split: hdfs://admin:9000/data/wordcount/httpfs-site.xml:0+620
14/12/23 16:59:29 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/12/23 16:59:29 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/12/23 16:59:29 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/12/23 16:59:29 INFO mapred.MapTask: soft limit at 83886080
14/12/23 16:59:29 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/12/23 16:59:29 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/12/23 16:59:29 INFO mapred.LocalJobRunner:
14/12/23 16:59:29 INFO mapred.MapTask: Starting flush of map output
14/12/23 16:59:29 INFO mapred.MapTask: Spilling map output
14/12/23 16:59:29 INFO mapred.MapTask: bufstart = 0; bufend = 939; bufvoid = 104857600
14/12/23 16:59:29 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214060(104856240); length = 337/6553600
14/12/23 16:59:29 INFO mapred.MapTask: Finished spill 0
14/12/23 16:59:29 INFO mapred.Task: Task:attempt_local1742380566_0001_m_000006_0 is done. And is in the process of committing
14/12/23 16:59:29 INFO mapred.LocalJobRunner: map
14/12/23 16:59:29 INFO mapred.Task: Task 'attempt_local1742380566_0001_m_000006_0' done.
14/12/23 16:59:29 INFO mapred.LocalJobRunner: Finishing task: attempt_local1742380566_0001_m_000006_0
14/12/23 16:59:29 INFO mapred.LocalJobRunner: Map task executor complete.
14/12/23 16:59:29 INFO mapred.Task: Using ResourceCalculatorProcessTree : [ ]
14/12/23 16:59:29 INFO mapred.Merger: Merging 7 sorted segments
14/12/23 16:59:29 INFO mapred.Merger: Down to the last merge-pass, with 7 segments left of total size: 13662 bytes
14/12/23 16:59:29 INFO mapred.LocalJobRunner:
14/12/23 16:59:29 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
14/12/23 16:59:29 INFO mapred.Task: Task:attempt_local1742380566_0001_r_000000_0 is done. And is in the process of committing
14/12/23 16:59:29 INFO mapred.LocalJobRunner:
14/12/23 16:59:29 INFO mapred.Task: Task attempt_local1742380566_0001_r_000000_0 is allowed to commit now
14/12/23 16:59:29 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1742380566_0001_r_000000_0' to hdfs://admin:9000/output/wordcount/_temporary/0/task_local1742380566_0001_r_000000
14/12/23 16:59:29 INFO mapred.LocalJobRunner: reduce > reduce
14/12/23 16:59:29 INFO mapred.Task: Task 'attempt_local1742380566_0001_r_000000_0' done.
14/12/23 16:59:30 INFO mapreduce.Job: map 100% reduce 100%
14/12/23 16:59:30 INFO mapreduce.Job: Job job_local1742380566_0001 completed successfully
14/12/23 16:59:30 INFO mapreduce.Job: Counters: 32
File System Counters
FILE: Number of bytes read=2203023
FILE: Number of bytes written=4000234
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=116652
HDFS: Number of bytes written=6042
HDFS: Number of read operations=105
HDFS: Number of large read operations=0
HDFS: Number of write operations=10
Map-Reduce Framework
Map input records=448
Map output records=1896
Map output bytes=23401
Map output materialized bytes=13732
Input split bytes=794
Combine input records=1896
Combine output records=815
Reduce input groups=352
Reduce shuffle bytes=0
Reduce input records=815
Reduce output records=352
Spilled Records=1630
Shuffled Maps =0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=237
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=1231712256
File Input Format Counters
Bytes Read=17028
File Output Format Counters
Bytes Written=6042
To view the result, run:
hadoop fs -cat /output/wordcount/part-r-00000 | head
[root@admin hadoop-2.2.0]# hadoop fs -cat /output/wordcount/part-r-00000 | head
[root@admin hadoop-2.2.0]# hadoop fs -text /output/wordcount/part-r-00000
"*" 17
"AS 7
"License"); 7
"alice,bob 17
(ASF) 1
(root 1
(the 7
--> 13
-1. 1
0.0 1
cat: Unable to write to output stream.
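The closing "cat: Unable to write to output stream." message is harmless: head exits after printing ten lines and closes the pipe. To copy the full result file out of HDFS for offline inspection, hadoop fs -get works as well; the local destination path here is arbitrary:
hadoop fs -get /output/wordcount/part-r-00000 /tmp/wordcount-result.txt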
Open the web console at http://admin:8088/ to view job records.
This shows that HDFS can store data and that MapReduce jobs complete successfully. Note, though, that the job ID job_local1742380566_0001 means this run used the LocalJobRunner rather than the YARN cluster; submitting to YARN requires mapreduce.framework.name=yarn, as discussed under step (2).
(11) Run a simple MapReduce computation
Under /usr/local/hadoop-2.2.0-src/hadoop-dist/target/hadoop-2.2.0/share/hadoop/mapreduce there is a jar named hadoop-mapreduce-examples-2.2.0.jar, which contains many example programs provided by the framework. Let's see how to run them.
Run the command:
[root@admin mapreduce]# hadoop jar ./hadoop-mapreduce-examples-2.2.0.jar
With no arguments, the jar prints its usage: a listing of the 18 built-in example programs.
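For example, the bundled pi estimator makes a quick smoke test (2 map tasks with 10 samples each; these small values are illustrative, not from the original run):
hadoop jar ./hadoop-mapreduce-examples-2.2.0.jar pi 2 10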
(12) Problems and summary
- Defaults you should know
The YARN framework in Hadoop 2.2.0 ships with many default parameter values. On a machine with limited resources, you need to lower these defaults so that tasks can be scheduled at all (see the sketch after the table below).
The NodeManager and ResourceManager are configured in yarn-site.xml, while the settings used when running MapReduce jobs live in mapred-site.xml.
The relevant parameters and their defaults:
| Parameter | Default | Daemon | Config file | Meaning |
| --- | --- | --- | --- | --- |
| yarn.nodemanager.resource.memory-mb | 8192 | NodeManager | yarn-site.xml | Total physical memory (MB) available on the node's host |
| yarn.nodemanager.resource.cpu-vcores | 8 | NodeManager | yarn-site.xml | Total virtual CPU cores available on the node's host |
| yarn.nodemanager.vmem-pmem-ratio | 2.1 | NodeManager | yarn-site.xml | Maximum virtual memory usable per 1 MB of physical memory |
| yarn.scheduler.minimum-allocation-mb | 1024 | ResourceManager | yarn-site.xml | Minimum memory (MB) granted per allocation request |
| yarn.scheduler.maximum-allocation-mb | 8192 | ResourceManager | yarn-site.xml | Maximum memory (MB) granted per allocation request |
| yarn.scheduler.minimum-allocation-vcores | 1 | ResourceManager | yarn-site.xml | Minimum virtual CPU cores granted per allocation request |
| yarn.scheduler.maximum-allocation-vcores | 8 | ResourceManager | yarn-site.xml | Maximum virtual CPU cores granted per allocation request |
| mapreduce.framework.name | local | MapReduce | mapred-site.xml | One of local, classic, or yarn; unless set to yarn, the YARN cluster is not used for resource allocation |
| mapreduce.map.memory.mb | 1024 | MapReduce | mapred-site.xml | Memory (MB) each map task of a MapReduce job may request |
| mapreduce.map.cpu.vcores | 1 | MapReduce | mapred-site.xml | Virtual CPU cores each map task of a MapReduce job may request |
| mapreduce.reduce.memory.mb | 1024 | MapReduce | mapred-site.xml | Memory (MB) each reduce task of a MapReduce job may request |
| mapreduce.reduce.cpu.vcores | 1 | MapReduce | mapred-site.xml | Virtual CPU cores each reduce task of a MapReduce job may request |
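As a sketch of such an override (the values are illustrative for a small machine and are not from the original setup), the defaults could be lowered in yarn-site.xml like this:
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>256</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
</property>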
Reference links:
- http://hadoop.apache.org/docs/current/
- http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/ClusterSetup.html
- http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CommandsManual.html
- http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnCommands.html
- http://dongxicheng.org/mapreduce-nextgen/hadoop-yarn-problems-vs-solutions/
- http://dongxicheng.org/mapreduce-nextgen/hadoop-yarn-configurations-resourcemanager-nodemanager/
- http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
- http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml