Overview
Four virtual machines on a single physical host make up a Hadoop cluster of 4 nodes: 1 Master and 3 Slaves. The IP addresses are:
10.10.96.33 hadoop1 (Master)
10.10.96.59 hadoop2 (Slave)
10.10.96.65 hadoop3 (Slave)
10.10.96.64 hadoop4 (Slave)
The operating system is Red Hat Enterprise Linux Server release 6.4 (GNU/Linux 2.6.32).
The Master machine runs the NameNode and ResourceManager, managing the distributed file system metadata and scheduling tasks; the 3 Slave machines run the DataNode and NodeManager roles, storing data blocks and executing tasks.
Environment Preparation
Create the hadoop Account
Log in to every machine as root and create the hadoop user on all nodes:
useradd hadoop
passwd hadoop
A home directory is created at /home/hadoop.
Create Working Directories
Log in to all nodes as the hadoop user and create the required directories.
Create a directory to hold the downloaded packages:
mkdir -p /home/hadoop/source
Then log in as root.
Create the data directories under /hadoop at the filesystem root; this is where HDFS data will be stored, so make sure there is enough free space:
mkdir -p /hadoop/hdfs
mkdir -p /hadoop/tmp
Make the directories writable:
chmod -R 777 /hadoop
Install the JDK
Install the JDK on every node and set the environment variables.
Upload the downloaded jdk-7u40-linux-x64.rpm to /root via SSH.
Run: rpm -ivh jdk-7u40-linux-x64.rpm
Configure the environment variables: open vi /etc/profile and append at the end of the file:
export JAVA_HOME=/usr/java/jdk1.7.0_40
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
export PATH=$JAVA_HOME/bin:$PATH
Apply the changes immediately:
source /etc/profile
Run java -version to check that the installation succeeded.
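If the JDK is installed correctly, the first line of the output should look similar to the following (exact build details may vary):
java version "1.7.0_40"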
Change the Hostnames
Change the hostname on every node as follows:
Connect to the master node 10.10.96.33, edit the network file with vi /etc/sysconfig/network, and set HOSTNAME=hadoop1
Edit the hosts file with vi /etc/hosts and append:
10.10.96.33 hadoop1
10.10.96.59 hadoop2
10.10.96.65 hadoop3
10.10.96.64 hadoop4
Run: hostname hadoop1
Run exit and reconnect; the prompt now shows the new hostname.
Change the hostnames on the other nodes and add the host entries in the same way, or simply overwrite their hosts files later with scp:
scp /etc/hosts root@hadoop2:/etc/hosts
scp /etc/hosts root@hadoop3:/etc/hosts
scp /etc/hosts root@hadoop4:/etc/hosts
Configure Passwordless SSH Login
How passwordless SSH works:
First, generate a key pair (a public key and a private key) on hadoop1 and copy the public key to all slave machines (hadoop2-hadoop4).
Then, when the master connects to a slave over SSH, the slave generates a random number, encrypts it with the master's public key, and sends it to the master.
Finally, the master decrypts it with its private key and sends the result back to the slave; once the slave confirms the value is correct, it lets the master connect without a password.
Steps (as the root user):
Run ssh-keygen -t rsa and press Enter at every prompt, then check the newly generated key pair (it has no passphrase).
Append id_rsa.pub to the authorized keys:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh localhost (be sure to run this step)
If it connects without asking for a password, the setup works.
Copy authorized_keys to all slave machines:
scp ~/.ssh/authorized_keys hadoop2:~/.ssh/authorized_keys
Type yes when prompted,
then enter the slave machine's password.
Verification
On the master, run ssh hadoop2; if the prompt's hostname changes from hadoop1 to hadoop2, the setup succeeded.
Repeat the steps above for hadoop3 and hadoop4 so that each slave can be reached without a password; a quick check across all slaves is shown below.
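A minimal check (a sketch; it assumes the host entries above are in place and the keys have been copied) is to run the following loop on the master, which should print each slave's hostname without asking for a password:
for h in hadoop2 hadoop3 hadoop4; do ssh $h hostname; done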
Hadoop Installation
Hadoop Version
Download the hadoop-2.2.0 release package hadoop-2.2.0.tar.gz into the /home/hadoop/source directory.
Extract the Package
Log in as the hadoop user:
tar zxvf hadoop-2.2.0.tar.gz
Create a symbolic link:
cd /home/hadoop
ln -s /home/hadoop/source/hadoop-2.2.0/ ./hadoop
Configuration Changes
/etc/profile
Configure the environment variables:
vi /etc/profile
Append:
export HADOOP_HOME=/home/hadoop/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
Apply the configuration:
source /etc/profile
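As an optional quick check on the master (where the package is already unpacked), the variables can be verified with:
echo $HADOOP_HOME
hadoop version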
Change into the etc/hadoop configuration directory:
cd /home/hadoop/hadoop/etc/hadoop
Edit core-site.xml
vi core-site.xml
Add the following properties inside the configuration element:
<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://10.10.96.33:9000</value>
</property>
Add the httpfs proxy-user options:
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>10.10.96.33</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
Configure masters:
vi masters
10.10.96.33
Configure slaves:
vi slaves
Add the slave IP addresses:
10.10.96.59
10.10.96.65
10.10.96.64
Edit hdfs-site.xml
vi hdfs-site.xml
Add the following properties:
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/hadoop/hdfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.federation.nameservice.id</name>
<value>ns1</value>
</property>
<property>
<name>dfs.namenode.backup.address.ns1</name>
<value>10.10.96.33:50100</value>
</property>
<property>
<name>dfs.namenode.backup.http-address.ns1</name>
<value>10.10.96.33:50105</value>
</property>
<property>
<name>dfs.federation.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1</name>
<value>10.10.96.33:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1</name>
<value>10.10.96.33:23001</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/hadoop/hdfs/data</value>
<final>true</final>
</property>
<property>
<name>dfs.namenode.secondary.http-address.ns1</name>
<value>10.10.96.33:23002</value>
</property>
Edit yarn-site.xml
vi yarn-site.xml
Add the following properties:
<property>
<name>yarn.resourcemanager.address</name>
<value>10.10.96.33:18040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>10.10.96.33:18030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>10.10.96.33:18088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>10.10.96.33:18025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>10.10.96.33:18141</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
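Note that this guide does not configure mapred-site.xml, so MapReduce jobs will default to the local runner. If jobs should run on the YARN cluster instead, a commonly added minimal setting (an assumption, not part of the original setup) is the following property in etc/hadoop/mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>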
Sync to the Other Machines
Sync the Code and Configuration
First create the /home/hadoop/source directory on the slave machines as well.
On the master, run:
[root@hadoop1 ~]# ssh hadoop2 "mkdir -p /home/hadoop/source"
[root@hadoop1 ~]# ssh hadoop3 "mkdir -p /home/hadoop/source"
[root@hadoop1 ~]# ssh hadoop4 "mkdir -p /home/hadoop/source"
Deploy the Hadoop package and create the symbolic link on each slave; after that, only the modified configuration files under etc/hadoop need to be synced.
On hadoop2, hadoop3, and hadoop4, run the following (the tarball must first be copied to each slave; shown for hadoop2):
[root@hadoop2 ~]# cp hadoop-2.2.0.tar.gz /home/hadoop/source
[root@hadoop2 ~]# cd /home/hadoop/source
[root@hadoop2 hadoop]# tar zxvf hadoop-2.2.0.tar.gz
[root@hadoop2 hadoop]# cd /home/hadoop
[root@hadoop2 hadoop]# ln -s /home/hadoop/source/hadoop-2.2.0/ ./hadoop
[root@hadoop1 source]# scp /home/hadoop/hadoop/etc/hadoop/core-site.xml root@hadoop2:/home/hadoop/hadoop/etc/hadoop/core-site.xml
[root@hadoop1 source]# scp /home/hadoop/hadoop/etc/hadoop/hdfs-site.xml root@hadoop2:/home/hadoop/hadoop/etc/hadoop/hdfs-site.xml
[root@hadoop1 source]# scp /home/hadoop/hadoop/etc/hadoop/yarn-site.xml root@hadoop2:/home/hadoop/hadoop/etc/hadoop/yarn-site.xml
[root@hadoop1 source]# scp /home/hadoop/hadoop/etc/hadoop/core-site.xml root@hadoop3:/home/hadoop/hadoop/etc/hadoop/core-site.xml
[root@hadoop1 source]# scp /home/hadoop/hadoop/etc/hadoop/hdfs-site.xml root@hadoop3:/home/hadoop/hadoop/etc/hadoop/hdfs-site.xml
[root@hadoop1 source]# scp /home/hadoop/hadoop/etc/hadoop/yarn-site.xml root@hadoop3:/home/hadoop/hadoop/etc/hadoop/yarn-site.xml
[root@hadoop1 source]# scp /home/hadoop/hadoop/etc/hadoop/core-site.xml root@hadoop4:/home/hadoop/hadoop/etc/hadoop/core-site.xml
[root@hadoop1 source]# scp /home/hadoop/hadoop/etc/hadoop/hdfs-site.xml root@hadoop4:/home/hadoop/hadoop/etc/hadoop/hdfs-site.xml
[root@hadoop1 source]# scp /home/hadoop/hadoop/etc/hadoop/yarn-site.xml root@hadoop4:/home/hadoop/hadoop/etc/hadoop/yarn-site.xml
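Equivalently, the per-file scp commands above can be written as a short loop on the master (a sketch; it assumes the same directory layout on every node):
for h in hadoop2 hadoop3 hadoop4; do
  for f in core-site.xml hdfs-site.xml yarn-site.xml; do
    scp /home/hadoop/hadoop/etc/hadoop/$f root@$h:/home/hadoop/hadoop/etc/hadoop/$f
  done
done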
Sync /etc/profile (the /etc/hosts sync was already shown in the hostname step):
scp -r /etc/profile root@hadoop2:/etc/profile
scp -r /etc/profile root@hadoop3:/etc/profile
scp -r /etc/profile root@hadoop4:/etc/profile
Starting Hadoop
Format the Cluster
Run the following as the hadoop user:
hadoop namenode -format -clusterid clustername
The log output is as follows:
[hadoop@hadoop1 bin]$ hadoop namenode -format -clusterid clustername
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
13/12/26 11:09:13 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = hadoop1/10.10.96.33
STARTUP_MSG: args = [-format, -clusterid, clustername]
STARTUP_MSG: version = 2.2.0
STARTUP_MSG: classpath = /home/hadoop/hadoop/etc/hadoop:/home/hadoop/hadoop/share/hadoop/common/lib/hadoop-auth-2.2.0.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jersey-core-1.9.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jackson-core-asl-1.8.8.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-compress-1.4.1.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-beanutils-core-1.8.0.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jettison-1.1.jar:/home/hadoop/hadoop/share/hadoop/common/lib/protobuf-java-2.5.0.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-cli-1.2.jar:/home/hadoop/hadoop/share/hadoop/common/lib/guava-11.0.2.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jaxb-api-2.2.2.jar:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-api-1.7.5.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jasper-compiler-5.5.23.jar:/home/hadoop/hadoop/share/hadoop/common/lib/junit-4.8.2.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jetty-util-6.1.26.jar:/home/hadoop/hadoop/share/hadoop/common/lib/avro-1.7.4.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-httpclient-3.1.jar:/home/hadoop/hadoop/share/hadoop/common/lib/servlet-api-2.5.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jersey-json-1.9.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jsr305-1.3.9.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jasper-runtime-5.5.23.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-math-2.1.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jetty-6.1.26.jar:/home/hadoop/hadoop/share/hadoop/common/lib/zookeeper-3.4.5.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-net-3.1.jar:/home/hadoop/hadoop/share/hadoop/common/lib/paranamer-2.3.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jackson-jaxrs-1.8.8.jar:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar:/home/hadoop/hadoop/share/hadoop/common/lib/netty-3.6.2.Final.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jets3t-0.6.1.jar:/home/hadoop/hadoop/share/hadoop/common/lib/log4j-1.2.17.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-lang-2.5.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-el-1.0.jar:/home/hadoop/hadoop/share/hadoop/common/lib/hadoop-annotations-2.2.0.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jaxb-impl-2.2.3-1.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-io-2.1.jar:/home/hadoop/hadoop/share/hadoop/common/lib/stax-api-1.0.1.jar:/home/hadoop/hadoop/share/hadoop/common/lib/xmlenc-0.52.jar:/home/hadoop/hadoop/share/hadoop/common/lib/mockito-all-1.8.5.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-beanutils-1.7.0.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-logging-1.1.1.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-configuration-1.6.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-codec-1.4.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jackson-xc-1.8.8.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/hadoop/share/hadoop/common/lib/activation-1.1.jar:/home/hadoop/hadoop/share/hadoop/common/lib/snappy-java-1.0.4.1.jar:/home/hadoop/hadoop/share/hadoop/common/lib/xz-1.0.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-digester-1.8.jar:/home/hadoop/hadoop/share/hadoop/common/lib/asm-3.2.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jsp-api-2.1.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jersey-server-1.9.jar:/home/hadoop/hadoop/share/hadoop/common/lib/jsch-0.1.42.jar:/home/hadoop/hadoop/share/hadoop/common/lib/commons-collections-3.2.1.jar:/home/hadoop/hadoop/share/hadoop/
common/hadoop-common-2.2.0-tests.jar:/home/hadoop/hadoop/share/hadoop/common/hadoop-common-2.2.0.jar:/home/hadoop/hadoop/share/hadoop/common/hadoop-nfs-2.2.0.jar:/home/hadoop/hadoop/share/hadoop/hdfs:/home/hadoop/hadoop/share/hadoop/hdfs/lib/jersey-core-1.9.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/jackson-core-asl-1.8.8.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/commons-cli-1.2.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/guava-11.0.2.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/servlet-api-2.5.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/jsr305-1.3.9.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/jasper-runtime-5.5.23.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/jetty-6.1.26.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/log4j-1.2.17.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/commons-lang-2.5.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/commons-el-1.0.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/commons-io-2.1.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/xmlenc-0.52.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/commons-logging-1.1.1.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/commons-codec-1.4.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/asm-3.2.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/jsp-api-2.1.jar:/home/hadoop/hadoop/share/hadoop/hdfs/lib/jersey-server-1.9.jar:/home/hadoop/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.2.0.jar:/home/hadoop/hadoop/share/hadoop/hdfs/hadoop-hdfs-nfs-2.2.0.jar:/home/hadoop/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.2.0-tests.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/guice-3.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/junit-4.10.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/jersey-core-1.9.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/jackson-core-asl-1.8.8.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/jersey-guice-1.9.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/guice-servlet-3.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/commons-compress-1.4.1.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/aopalliance-1.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/protobuf-java-2.5.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/avro-1.7.4.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/javax.inject-1.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/paranamer-2.3.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/netty-3.6.2.Final.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/log4j-1.2.17.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/hadoop-annotations-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/hamcrest-core-1.1.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/commons-io-2.1.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/snappy-java-1.0.4.1.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/xz-1.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/asm-3.2.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/lib/jersey-server-1.9.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-
common-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-tests-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-client-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-applications-unmanaged-am-launcher-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-api-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-site-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-common-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/guice-3.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/junit-4.10.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-core-1.9.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/jackson-core-asl-1.8.8.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-guice-1.9.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/guice-servlet-3.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/commons-compress-1.4.1.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/aopalliance-1.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/protobuf-java-2.5.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/avro-1.7.4.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/javax.inject-1.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/paranamer-2.3.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/netty-3.6.2.Final.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/log4j-1.2.17.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/hadoop-annotations-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/hamcrest-core-1.1.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/commons-io-2.1.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/snappy-java-1.0.4.1.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/xz-1.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/asm-3.2.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/lib/jersey-server-1.9.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.2.0.jar:/home/hadoop/source/hadoop-2.2.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.
jar:/home/hadoop/hadoop/contrib/capacity-scheduler/*.jar:/home/hadoop/hadoop/contrib/capacity-scheduler/*.jar
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common -r 1529768; compiled by 'hortonmu' on 2013-10-07T06:28Z
STARTUP_MSG: java = 1.7.0_40
************************************************************/
13/12/26 11:09:13 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/hadoop/source/hadoop-2.2.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
13/12/26 11:09:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Formatting using clusterid: clustername
13/12/26 11:09:13 INFO namenode.HostFileManager: read includes:
HostSet(
)
13/12/26 11:09:13 INFO namenode.HostFileManager: read excludes:
HostSet(
)
13/12/26 11:09:13 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
13/12/26 11:09:13 INFO util.GSet: Computing capacity for map BlocksMap
13/12/26 11:09:13 INFO util.GSet: VM type = 64-bit
13/12/26 11:09:13 INFO util.GSet: 2.0% max memory = 889 MB
13/12/26 11:09:13 INFO util.GSet: capacity = 2^21 = 2097152 entries
13/12/26 11:09:13 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
13/12/26 11:09:13 INFO blockmanagement.BlockManager: defaultReplication = 3
13/12/26 11:09:13 INFO blockmanagement.BlockManager: maxReplication = 512
13/12/26 11:09:13 INFO blockmanagement.BlockManager: minReplication = 1
13/12/26 11:09:13 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
13/12/26 11:09:13 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks = false
13/12/26 11:09:13 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
13/12/26 11:09:13 INFO blockmanagement.BlockManager: encryptDataTransfer = false
13/12/26 11:09:13 INFO namenode.FSNamesystem: fsOwner = hadoop (auth:SIMPLE)
13/12/26 11:09:13 INFO namenode.FSNamesystem: supergroup = supergroup
13/12/26 11:09:13 INFO namenode.FSNamesystem: isPermissionEnabled = true
13/12/26 11:09:13 INFO namenode.FSNamesystem: Determined nameservice ID: ns1
13/12/26 11:09:13 INFO namenode.FSNamesystem: HA Enabled: false
13/12/26 11:09:13 INFO namenode.FSNamesystem: Append Enabled: true
13/12/26 11:09:13 INFO util.GSet: Computing capacity for map INodeMap
13/12/26 11:09:13 INFO util.GSet: VM type = 64-bit
13/12/26 11:09:13 INFO util.GSet: 1.0% max memory = 889 MB
13/12/26 11:09:13 INFO util.GSet: capacity = 2^20 = 1048576 entries
13/12/26 11:09:13 INFO namenode.NameNode: Caching file names occuring more than
13/12/26 11:09:13 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pc
13/12/26 11:09:13 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanode
13/12/26 11:09:13 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension
13/12/26 11:09:13 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
13/12/26 11:09:13 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total
13/12/26 11:09:13 INFO util.GSet: Computing capacity for map Namenode Retry Cach
13/12/26 11:09:13 INFO util.GSet: VM type = 64-bit
13/12/26 11:09:13 INFO util.GSet: 0.029999999329447746% max memory = 889 MB
13/12/26 11:09:13 INFO util.GSet: capacity = 2^15 = 32768 entries
13/12/26 11:09:13 INFO common.Storage: Storage directory /hadoop/hdfs/name has b
13/12/26 11:09:14 INFO namenode.FSImage: Saving image file /hadoop/hdfs/name/cur
13/12/26 11:09:14 INFO namenode.FSImage: Image file /hadoop/hdfs/name/current/fs
13/12/26 11:09:14 INFO namenode.NNStorageRetentionManager: Going to retain 1 ima
13/12/26 11:09:14 INFO util.ExitUtil: Exiting with status 0
13/12/26 11:09:14 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop1/10.10.96.33
************************************************************/
Start HDFS
Run the following to start the HDFS service:
start-dfs.sh
Start YARN
Run the following to start the YARN resource management service:
start-yarn.sh
Start HttpFS
Run the following to start the HttpFS service:
httpfs.sh start
This exposes a RESTful HTTP interface to external clients.
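As a quick smoke test (a sketch; it assumes HttpFS listens on its default port 14000 and that the hadoop user has access), the root directory listing can be fetched over the REST API:
curl "http://hadoop1:14000/webhdfs/v1/?op=LISTSTATUS&user.name=hadoop"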
Verifying the Installation
Verify HDFS
Run jps on each machine to check that all the expected processes have started:
[root@hadoop1 hadoop]# jps
9993 NameNode
10322 ResourceManager
10180 SecondaryNameNode
31918 Bootstrap
15754 Jps
[root@hadoop2 log]# jps
1867 Jps
30889 NodeManager
30794 DataNode
[root@hadoop3 log]# jps
1083 Jps
30040 DataNode
30136 NodeManager
[root@hadoop4 logs]# jps
30556 NodeManager
30459 DataNode
1530 Jps
All processes have started normally.
Verify that the filesystem is accessible:
hadoop fs -ls hdfs://hadoop1:9000/
hadoop fs -mkdir hdfs://hadoop1:9000/testfolder
hadoop fs -copyFromLocal /testfolder hdfs://hadoop1:9000/testfolder (this assumes the local directory /testfolder already exists)
hadoop fs -ls hdfs://hadoop1:9000/testfolder
Check that the commands above complete without errors.
Verify MapReduce
On hadoop1, create the input directory:
hadoop fs -mkdir hdfs://hadoop1:9000/input
Copy some text files (for example, the NameNode log hadoop-root-namenode-hadoop1.log) into the HDFS input directory:
hadoop fs -put hadoop-root-namenode-hadoop1.log hdfs://hadoop1:9000/input
On hadoop1, run the wordcount example that ships with Hadoop:
cd $HADOOP_HOME/share/hadoop/mapreduce
hadoop jar hadoop-mapreduce-examples-2.2.0.jar wordcount hdfs://hadoop1:9000/input hdfs://hadoop1:9000/output
On hadoop1, view the result:
[root@hadoop1 hadoop]# hadoop fs -ls hdfs://hadoop1:9000/output
Found 2 items
-rw-r--r-- 3 root supergroup 0 2013-12-26 16:02 /output/_SUCCESS
-rw-r--r-- 3 root supergroup 160597 2013-12-26 16:02 /output/part-r-00000
Running hadoop fs -cat hdfs://hadoop1:9000/output/part-r-00000 shows the count for each word.
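To keep a local copy of the result (optional; the /tmp path is just an example):
hadoop fs -get hdfs://hadoop1:9000/output/part-r-00000 /tmp/wordcount-result.txt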
Verify the Web UIs
Open the HDFS web UI (note: because the nodes are virtual machines, a proxy server may need to be configured in the browser).
Open the NodeManager web UI.
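Based on the addresses configured above, the NameNode UI is at http://10.10.96.33:23001 (dfs.namenode.http-address.ns1) and the ResourceManager UI is at http://10.10.96.33:18088 (yarn.resourcemanager.webapp.address); each NodeManager UI uses its default port 8042 unless configured otherwise.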
Appendix
Installing Software with yum on RHEL
1. Mount the installation DVD:
cd /media
mkdir iso
mount /dev/hdc iso
2. Find (or create) a *.repo file under /etc/yum.repos.d/ and put the following content in it:
[base]
name=Base RPM Repository for RHEL6.4
baseurl=file:///media/iso/Server/
enabled=1
gpgcheck=0
3. Edit the yumRepo.py file under /usr/lib/python2.6/site-packages/yum/ (yum itself is written in Python) and change the line remote = url + '/' + relative to remote = "file:///media/iso/Server/" + '/' + relative.
(In vi, :/remote = url finds the line.)
Packages can then be installed with yum install; for example, yum install gcc installs gcc.
As a rule, avoid using yum remove.
Also, if the installation DVD is older than the running system, packages usually cannot be installed; use media whose version matches the system.
Enabling and Disabling Hadoop Debug Output
Edit $HADOOP_CONF_DIR/log4j.properties and set hadoop.root.logger=ALL,console
or:
Enable: export HADOOP_ROOT_LOGGER=DEBUG,console
Disable: export HADOOP_ROOT_LOGGER=INFO,console
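For example, to see debug output from a single client command (any HDFS command will do):
export HADOOP_ROOT_LOGGER=DEBUG,console
hadoop fs -ls hdfs://hadoop1:9000/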
Viewing and Changing Hadoop Log Levels at Runtime
hadoop daemonlog -getlevel <host:port> <name>
hadoop daemonlog -setlevel <host:port> <name> <level>
<name> is the class (logger) name, e.g. a daemon class such as DataNode
<level> is the log level, e.g. DEBUG or INFO
The log levels can also be viewed and changed from the daemon's web interface.
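For example, to query and raise the DataNode log level on hadoop2 (a sketch; 50075 is the DataNode's default HTTP port, and the logger name is the fully qualified class name):
hadoop daemonlog -getlevel hadoop2:50075 org.apache.hadoop.hdfs.server.datanode.DataNode
hadoop daemonlog -setlevel hadoop2:50075 org.apache.hadoop.hdfs.server.datanode.DataNode DEBUG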