1. Installation Environment
1.1 JDK Version
jdk-6u31-linux-x64
1.2 Host Configuration
Linux host IP address | Original hostname | New hostname (role)
10.137.169.148        | bigdata-01        | master (namenode)
10.137.169.149        | bigdata-02        | slave1 (datanode)
10.137.169.150        | bigdata-03        | slave2 (datanode)
2. Pre-installation Configuration
2.1 Change the Hostnames
Run the following as root on each node:
bigdata-01:~ # hostname master
bigdata-02:~ # hostname slave1
bigdata-03:~ # hostname slave2
Note: the hostname command renames the host for the current session only; make sure the change also persists across a reboot (see the sketch below).
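On the SUSE-style hosts used in this guide the persistent name is normally read from /etc/HOSTNAME at boot. The following is a minimal sketch of making the change survive a reboot, assuming your distribution uses that file; adjust it to your distribution's own hostname configuration otherwise:

# Run as root on each node, substituting slave1 / slave2 for master as appropriate.
# Assumes a SUSE-style /etc/HOSTNAME file; other distributions store the name elsewhere.
master:~ # echo master > /etc/HOSTNAME
master:~ # hostname          # shows the name currently in effect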
2.2 Give the System /tmp Directory Read/Write Permissions
Run as root on every node:
master:~ # chmod a=rwx /tmp/
master:~ # chmod a=rwx /tmp/*
2.3 Map IP Addresses to Hostnames
Run as root on each of the three machines:
master:~ # vi /etc/hosts
Add the hostname-to-IP mappings to the hosts file (the three 10.137.169.x lines below are the additions; the rest is the file's existing content):
#
# hosts         This file describes a number of hostname-to-address
#               mappings for the TCP/IP subsystem. It is mostly
#               used at boot time, when no name servers are running.
#               On small systems, this file can be used instead of a
#               "named" name server.
# Syntax:
#
# IP-Address   Full-Qualified-Hostname   Short-Hostname
127.0.0.1       localhost

10.137.169.148  master
10.137.169.149  slave1
10.137.169.150  slave2

# special IPv6 addresses
::1             bigdata-01 ipv6-localhost ipv6-loopback
fe00::0         ipv6-localnet
ff00::0         ipv6-mcastprefix
ff02::1         ipv6-allnodes
ff02::2         ipv6-allrouters
ff02::3         ipv6-allhosts
Then make the same changes on the other two slave nodes:
slave1:~ # vi /etc/hosts
slave2:~ # vi /etc/hosts
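After /etc/hosts has been updated on all three machines, it is worth confirming that each short name resolves to the intended address before moving on. A minimal check, run from any node (the loop below is just one convenient way to do it):

# Each hostname should answer from its 10.137.169.x address.
for h in master slave1 slave2; do
    ping -c 1 $h
done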
2.4 Create the hadoop User
As root, create an ordinary user named hadoop on each node:
master:~ # useradd -m -d /home/hadoop -s /bin/bash hadoop
master:~ # chmod -R a+rwx /home/hadoop
Set the hadoop user's password:
master:~ # passwd hadoop
Changing password for hadoop.
New Password:
Password changed.
Repeat the steps above on slave1 and slave2.
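To confirm the account exists with the expected home directory on every node, it can be inspected with id; the uid/gid values shown below are only illustrative and will differ on your systems:

master:~ # id hadoop
uid=1001(hadoop) gid=100(users) groups=100(users)
master:~ # ls -ld /home/hadoop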
3. Passwordless SSH Configuration
Carry out the following steps across nodes 10.137.169.148, 10.137.169.149 and 10.137.169.150.
1. On master, run the following as the hadoop user (the text after each prompt is what you type; the rest is program output):
master:~ # su hadoop
hadoop@master:/> ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_dsa):    // press Enter
Enter passphrase (empty for no passphrase):                        // press Enter
Enter same passphrase again:                                       // press Enter
Your identification has been saved in /home/hadoop/.ssh/id_dsa.
Your public key has been saved in /home/hadoop/.ssh/id_dsa.pub.
The key fingerprint is:
a9:4d:a6:2b:bf:09:8c:b2:30:aa:c1:05:be:0a:27:09 hadoop@bigdata-01
The key's randomart image is:
+--[ DSA 1024]----+
|                 |
|                 |
|  .              |
|.  . .           |
|E.  . S          |
|o.oo *           |
|Ooo o o .        |
|=B .. o          |
|* o=.            |
+-----------------+
// Append id_dsa.pub to authorized_keys
hadoop@master:/> cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
2. Run the same two commands on each slave node:
ssh-keygen -t dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
3. On the master node, run the following (the text after each prompt is what you type):
// Append each slave node's public key to authorized_keys on master.
// Run the command once per slave node; slave1 and slave2 are the node hostnames.
hadoop@master:/> ssh slave1 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
The authenticity of host 'slave1 (10.137.169.149)' can't be established.
RSA key fingerprint is 0f:5d:31:ba:dc:7a:84:15:6a:aa:20:a1:85:ec:c8:60.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave1,10.137.169.149' (RSA) to the list of known hosts.
Password:    // enter the hadoop user's password set earlier
hadoop@master:/> ssh slave2 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
The authenticity of host 'slave2 (10.137.169.150)' can't be established.
RSA key fingerprint is 0f:5d:31:ba:dc:7a:84:15:6a:aa:20:a1:85:ec:c8:60.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slave2,10.137.169.150' (RSA) to the list of known hosts.
Password:    // enter the hadoop user's password set earlier
// Copy the merged authorized_keys file back to every node; slave1 and slave2 are the node hostnames.
hadoop@master:/> scp ~/.ssh/authorized_keys slave1:~/.ssh/authorized_keys
hadoop@master:/> scp ~/.ssh/authorized_keys slave2:~/.ssh/authorized_keys
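If passwordless login still asks for a password after the keys have been exchanged, the most common cause is file permissions: sshd ignores authorized_keys when the .ssh directory or the key files are accessible to other users. A sketch of the usual fix, run as hadoop on every node:

# sshd refuses to use keys whose files are group- or world-writable.
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys ~/.ssh/id_dsa
chmod 644 ~/.ssh/id_dsa.pub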
4. Permission Settings
Change the permissions of the hadoop home directory on master and on each slave node.
hadoop@master:/> cd /home
hadoop@master:/home> chmod 755 hadoop
On slave1:
hadoop@slave1:/home> chmod 755 hadoop
On slave2:
hadoop@slave2:/home> chmod 755 hadoop
5. Verify that passwordless access works: if you can log in without being prompted for a password, the configuration succeeded.
hadoop@master:/> ssh slave1
Last login: Wed Jul 31 00:13:58 2013 from bigdata-01
hadoop@slave1:~>
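To test every slave in one pass instead of logging in one at a time, a short loop such as the one below (run as hadoop on master) should print each remote hostname without any password prompt:

# Each ssh call should return the remote hostname immediately, with no Password: prompt.
for h in slave1 slave2; do
    ssh $h hostname
done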
4. JDK Installation
4.1 Uploading the File
As the hadoop user, upload jdk-6u31-linux-x64.bin to the /home/hadoop directory of each node.
// This step applies only when uploading from Windows with SecureCRT.
In SecureCRT, click the button at the top left and choose the Connect SFTP Tab option, then run:
sftp> lcd
sftp> put D:/jdk-6u31-linux-x64.bin
// Keep the local path short and free of Chinese characters; the root of the D: drive is recommended.
Uploading jdk-6u31-linux-x64.bin to /root/jdk-6u31-linux-x64.bin
  100%   83576KB   6964KB/s   00:00:12
The file lands in /root; as root, move the JDK installer to /home/hadoop:
master:~ # mv jdk-6u31-linux-x64.bin /home/hadoop
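If the installer is being copied from another Linux machine rather than from Windows with SecureCRT, scp is a simpler alternative to the SFTP tab. A sketch, where /path/to/jdk-6u31-linux-x64.bin is a placeholder for wherever the file sits locally:

# Copy the installer straight into the hadoop user's home directory on master.
scp /path/to/jdk-6u31-linux-x64.bin hadoop@10.137.169.148:/home/hadoop/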
4.2 Installing the JDK
Install the JDK on nodes 10.137.169.148, 10.137.169.149 and 10.137.169.150.
As root, make "jdk-6u31-linux-x64.bin" executable and then run it:
master:/home/hadoop # chmod u+x jdk-6u31-linux-x64.bin
master:/home/hadoop # ./jdk-6u31-linux-x64.bin
# This creates the directory "/home/hadoop/jdk1.6.0_31"
Install the JDK on slave1 and slave2 in the same way.
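A quick sanity check on each node is to run the freshly unpacked JVM by its full path (once the environment variables in section 5.2 are in place, a plain java -version works too):

# Run from /home/hadoop on each node; the output should name java version "1.6.0_31".
master:/home/hadoop # ./jdk1.6.0_31/bin/java -version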
5. Hadoop Installation
Note: Hadoop is configured on the master host first, and the configured package is then copied to every slave node,
so sections 5.1, 5.2 and 5.3 are carried out only on the master host.
5.1 Unpacking the Files
Upload hadoop-2.0.1.tar.gz to the /home/hadoop directory on master (see section 4.1 for the upload procedure).
Unpack the Hadoop archive on the master node (10.137.169.148).
As the hadoop user, grant execute permission on "hadoop-2.0.1.tar.gz" and unpack it:
hadoop@master:~> chmod u+x hadoop-2.0.1.tar.gz
hadoop@master:~> tar -xvf hadoop-2.0.1.tar.gz
# This creates the directory "/home/hadoop/hadoop-2.0.1"
5.2 Environment Variable Configuration
1. On the master node, edit /home/hadoop/.profile as the hadoop user:
hadoop@master:~> vi /home/hadoop/.profile
# Append the following settings at the end of .profile
export JAVA_HOME=/home/hadoop/jdk1.6.0_31/
export HADOOP_HOME=/home/hadoop/hadoop-2.0.1
export PATH=${PATH}:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${JAVA_HOME}/bin:${JAVA_HOME}/jre/bin
(When copying the PATH setting, do not let it wrap: it must be a single line in .profile.)
hadoop@master:~> source /home/hadoop/.profile    // make the environment variables take effect
2. Configure the Hadoop environment variables on nodes 10.137.169.148, 10.137.169.149 and 10.137.169.150, as root, in /etc/profile:
master:/ # vi /etc/profile
# Append the following settings at the end of profile
export JAVA_HOME=/home/hadoop/jdk1.6.0_31/
export HADOOP_HOME=/home/hadoop/hadoop-2.0.1
export PATH=${PATH}:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${JAVA_HOME}/bin:${JAVA_HOME}/jre/bin
(When copying the PATH setting, do not let it wrap: it must be a single line in /etc/profile.)
3. Check that the environment variables took effect:
echo $JAVA_HOME      // should print /home/hadoop/jdk1.6.0_31/
echo $HADOOP_HOME    // should print /home/hadoop/hadoop-2.0.1
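Because the PATH additions include both the JDK and the Hadoop bin/sbin directories, the commands themselves should now resolve as well. The checks below assume a fresh login shell (or that the profile was sourced as shown) and that the archive from section 5.1 has already been unpacked:

java -version        // should report the 1.6.0_31 JDK under /home/hadoop
hadoop version       // should report Hadoop 2.0.1
which hadoop         // should point into ${HADOOP_HOME}/bin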
5.3 Editing the Configuration Files
Performed only on master.
1. Edit the configuration files core-site.xml, mapred-site.xml and yarn-site.xml under /home/hadoop/hadoop-2.0.1/etc/hadoop, and check that the chosen ports do not conflict with services already running on the hosts (see the sketch after the three files). Reference configurations follow.
core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop-2.0.1/hadoop_tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://10.137.169.148:9001</value>
  </property>
</configuration>
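core-site.xml points hadoop.tmp.dir at /home/hadoop/hadoop-2.0.1/hadoop_tmp. Hadoop will normally create this directory on first use, but creating it up front as the hadoop user avoids any permission surprises; a minimal sketch:

# Pre-create the directory referenced by hadoop.tmp.dir on the master node.
hadoop@master:~> mkdir -p /home/hadoop/hadoop-2.0.1/hadoop_tmp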
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.shuffle.port</name>
    <value>8082</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>10.137.169.148:8050</value>
  </property>
  <property>
    <description>The address of the scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>10.137.169.148:8051</value>
  </property>
  <property>
    <description>The address of the applications manager interface in the RM.</description>
    <name>yarn.resourcemanager.address</name>
    <value>10.137.169.148:8052</value>
  </property>
  <property>
    <description>The address of the RM admin interface.</description>
    <name>yarn.resourcemanager.admin.address</name>
    <value>10.137.169.148:8053</value>
  </property>
  <property>
    <description>Address where the localizer IPC is.</description>
    <name>yarn.nodemanager.localizer.address</name>
    <value>0.0.0.0:8054</value>
  </property>
  <!-- required to run MapReduce jobs -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <!-- the default directory /tmp/logs is not writable -->
  <property>
    <description>
      Where to store container logs. An application's localized log directory
      will be found in ${yarn.nodemanager.log-dirs}/application_${appid}.
      Individual containers' log directories will be below this, in directories
      named container_{$contid}. Each container directory will contain the files
      stderr, stdin, and syslog generated by that container.
    </description>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/home/hadoop/hadoop-2.0.1/logs</value>
  </property>
  <property>
    <description>Where to aggregate logs to.</description>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/home/hadoop/hadoop-2.0.1/logs</value>
  </property>
  <property>
    <description>
      List of directories to store localized files in. An application's
      localized file directory will be found in:
      ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}.
      Individual containers' work directories, called container_${contid},
      will be subdirectories of this.
    </description>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/home/hadoop/hadoop-2.0.1/logs</value>
  </property>
</configuration>
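Item 1 above asks you to check for port conflicts. One way to do that, sketched below, is to look for existing listeners on the ports configured in these files before starting Hadoop; no output for a port means it is free:

# Check the configured ports (9001, 8082, 8050-8054) for existing listeners on each node.
for p in 9001 8082 8050 8051 8052 8053 8054; do
    netstat -an | grep ":$p "
done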
2. Edit hadoop-env.sh
hadoop@master:/> cd /home/hadoop
hadoop@master:~> vi hadoop-2.0.1/etc/hadoop/hadoop-env.sh
# Add the following lines to the file
export JAVA_HOME=/home/hadoop/jdk1.6.0_31/
export HADOOP_HOME=/home/hadoop/hadoop-2.0.1
3. Edit slaves
hadoop@master:/> cd /home/hadoop
hadoop@master:~> vi hadoop-2.0.1/etc/hadoop/slaves
# Replace the file contents with the following
10.137.169.149
10.137.169.150
The slaves file contains localhost by default. With the configuration above, the namenode is 10.137.169.148
and the datanodes are 10.137.169.149 and 10.137.169.150.
5.4 Installing Hadoop on the Slave Nodes
1. Copy the "hadoop-2.0.1" directory from the master machine to the slave machines slave1 and slave2.
hadoop@master:/> cd /home/hadoop
// Package master's hadoop-2.0.1 directory
hadoop@master:~> tar -zcvf hadoop.tar.gz hadoop-2.0.1
// Copy the packaged hadoop.tar.gz to each slave node
hadoop@master:~> scp /home/hadoop/hadoop.tar.gz hadoop@slave1:/home/hadoop
hadoop@master:~> scp /home/hadoop/hadoop.tar.gz hadoop@slave2:/home/hadoop
2. Unpack "hadoop.tar.gz" on the slave1 node:
hadoop@master:/> ssh hadoop@slave1
hadoop@slave1:/> cd /home/hadoop
hadoop@slave1:~> tar -zxvf hadoop.tar.gz
# This creates the "hadoop-2.0.1" directory under "/home/hadoop"; installation on this node is complete.
3. Unpack "hadoop.tar.gz" on the slave2 node in the same way.
6. Starting and Verifying Hadoop
6.1 Format the NameNode
hadoop@master:/> cd /home/hadoop/hadoop-2.0.1/bin
hadoop@master:~/hadoop-2.0.1/bin> ./hadoop namenode -format
// Use the explicit ./ prefix so the hadoop script in this directory is the one that runs
6.2 Start the Hadoop Services
hadoop@master:/> start-all.sh
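start-all.sh is kept in Hadoop 2.x for convenience but is flagged as deprecated; if you prefer, HDFS and YARN can be started separately with the scripts in ${HADOOP_HOME}/sbin, which the PATH from section 5.2 already covers. A sketch:

# Start HDFS first (NameNode on master, DataNodes on the hosts listed in slaves) ...
hadoop@master:/> start-dfs.sh
# ... then start YARN (ResourceManager on master, NodeManagers on the slaves).
hadoop@master:/> start-yarn.sh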
6.3 Verify That Startup Succeeded
On master:
hadoop@master:/> jps
1443 ResourceManager
21112 NameNode
8569 Jps
On slave1 (and likewise slave2):
hadoop@slave1:/> jps
4709 DataNode
4851 NodeManager
24923 Jps
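Beyond jps, a simple functional check is to ask HDFS for a cluster report and to list the (initially empty) root directory; both datanodes should show up as live. The commands below are a sketch run as hadoop on master; the web interfaces, by default http://10.137.169.148:50070 for the NameNode and http://10.137.169.148:8088 for the ResourceManager, show the same information if those default ports have not been changed:

# The report should list 2 live datanodes (10.137.169.149 and 10.137.169.150).
hadoop@master:/> hdfs dfsadmin -report
# Listing the HDFS root confirms the filesystem is reachable from the client side.
hadoop@master:/> hadoop fs -ls /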