Hadoop 2.0.1 Installation Guide

Date: 2022-09-19 09:15:59

1. Installation Environment

1.1 JDK Version

jdk-6u31-linux-x64

 

1.2 Host Configuration

Linux host IP address    Original hostname    New hostname (role)
10.137.169.148           bigdata-01           master (namenode)
10.137.169.149           bigdata-02           slave1 (datanode)
10.137.169.150           bigdata-03           slave2 (datanode)

 

2. Pre-installation Configuration

 

2.1 Change the Hostnames

Run as root on each node:

bigdata-01:~ # hostname master

 

bigdata-02:~ # hostname slave1

 

bigdata-03:~ # hostname slave2

Note: the hostname set with the hostname command takes effect immediately but is not kept across a reboot; a sketch for making it permanent follows.
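
To make the new hostnames survive a reboot, also update the system's hostname file. A minimal sketch, assuming a SUSE-style system (which the default /etc/hosts header shown in 2.3 suggests); on other distributions the file is typically /etc/hostname or /etc/sysconfig/network:

master:~ # echo master > /etc/HOSTNAME
slave1:~ # echo slave1 > /etc/HOSTNAME
slave2:~ # echo slave2 > /etc/HOSTNAME
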

2.2 Give All Users Read/Write Access to the System /tmp Directory

Run as root on each node:

master:~ # chmod a=rwx /tmp/

master:~ # chmod a=rwx /tmp/*

2.3 Map IP Addresses to Hostnames

Run as root on each of the three machines:

master:~ # vi /etc/hosts

Add the hostname-to-IP mappings to the hosts file (the added entries are the master, slave1, and slave2 lines below):

 

 

#

# hosts         This file describes a number of hostname-to-address

#               mappings for the TCP/IP subsystem.  It is mostly

#               used at boot time, when no name servers are running.

#               On small systems, this file can be used instead of a

#               "named" name server.

# Syntax:

#   

# IP-Address  Full-Qualified-Hostname  Short-Hostname

127.0.0.1            localhost

10.137.169.148       master

10.137.169.149       slave1

10.137.169.150       slave2

# special IPv6 addresses

::1             bigdata-01  ipv6-localhost ipv6-loopback

 

fe00::0         ipv6-localnet

 

ff00::0         ipv6-mcastprefix

ff02::1         ipv6-allnodes

ff02::2         ipv6-allrouters

ff02::3         ipv6-allhosts

 

Then make the same change on the other two slave nodes, or push the file from master as sketched after these commands:

slave1:~ # vi /etc/hosts

 

slave2:~ # vi /etc/hosts
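
Because the slaves cannot resolve the new hostnames until their own /etc/hosts is updated, a convenient alternative (an optional sketch, run as root on master; each scp prompts for the remote root password) is to push master's file to them by IP address:

master:~ # scp /etc/hosts root@10.137.169.149:/etc/hosts
master:~ # scp /etc/hosts root@10.137.169.150:/etc/hosts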

 

2.4 Create a User

Run as root on each node to create a regular user named hadoop:

master:~ # useradd -m -d /home/hadoop -s /bin/bash hadoop

master:~ # chmod -R a+rwx /home/hadoop

Set the hadoop user's password:

master:~ # passwd hadoop

Changing password for hadoop.

New Password:

Password changed.

Repeat the steps above on slave1 and slave2; the slave1 commands are shown below for reference.
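
The same sequence on slave1 (and likewise on slave2), exactly as run on master above:

slave1:~ # useradd -m -d /home/hadoop -s /bin/bash hadoop
slave1:~ # chmod -R a+rwx /home/hadoop
slave1:~ # passwd hadoop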

3. Passwordless SSH Configuration

Perform the following steps on nodes 10.137.169.148, 10.137.169.149, and 10.137.169.150.

1. On master, run the following as the hadoop user (type the responses indicated in the // comments):

master:~ # su hadoop

hadoop@master:/> ssh-keygen -t dsa

Generating public/private dsa key pair.

Enter file in which to save the key (/home/hadoop/.ssh/id_dsa):    // press Enter

Enter passphrase (empty for no passphrase):    // press Enter

Enter same passphrase again:    // press Enter

Your identification has been saved in /home/hadoop/.ssh/id_dsa.

Your public key has been saved in /home/hadoop/.ssh/id_dsa.pub.

The key fingerprint is:

a9:4d:a6:2b:bf:09:8c:b2:30:aa:c1:05:be:0a:27:09 hadoop@bigdata-01

The key's randomart image is:

+--[ DSA 1024]----+

|                 |

|                 |

| .               |

|. .      .       |

|E. .    S        |

|o.oo   *         |

|Ooo o o .        |

|=B  .. o         |

|*    o=.         |

+-----------------+

// append id_dsa.pub to authorized_keys

hadoop@master:/> cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

2. Run the same steps on each slave:

ssh-keygen -t dsa

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

3. On the master node, run the following (type the responses indicated in the // comments):

// append each slave node's public key to authorized_keys on master;

// run the command once per slave node; slave1 and slave2 are the node hostnames

hadoop@master:/> ssh slave1 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys 

The authenticity of host 'slave1 (10.137.169.149)' can't be established.

RSA key fingerprint is 0f:5d:31:ba:dc:7a:84:15:6a:aa:20:a1:85:ec:c8:60.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'slave1,10.137.169.149' (RSA) to the list of known hosts.

Password:    // enter the hadoop user's password set earlier

hadoop@master:/> ssh slave2 cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys 

The authenticity of host 'slave2 (10.137.169.150)' can't be established.

RSA key fingerprint is 0f:5d:31:ba:dc:7a:84:15:6a:aa:20:a1:85:ec:c8:60.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'slave2,10.137.169.150' (RSA) to the list of known hosts.

Password:    // enter the hadoop user's password set earlier

// copy the authorized_keys file back to every node; slave1 and slave2 are the node hostnames

hadoop@master:/> scp ~/.ssh/authorized_keys slave1:~/.ssh/authorized_keys 

hadoop@master:/> scp ~/.ssh/authorized_keys slave2:~/.ssh/authorized_keys 

4. Permission settings

Change the permissions of the hadoop home directory on master and on each slave node.

hadoop@master:/> cd /home

hadoop@master:/home> chmod 755 hadoop    

On slave1:

hadoop@slave1:/home> chmod 755 hadoop   

On slave2:

hadoop@slave2:/home> chmod 755 hadoop   
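
If ssh still prompts for a password after this, it is usually a permissions problem: with sshd's default StrictModes setting, authorized_keys is ignored when ~/.ssh or the file itself is group- or world-writable. A precautionary sketch (not part of the original procedure), run as the hadoop user on every node:

hadoop@master:~> chmod 700 ~/.ssh
hadoop@master:~> chmod 600 ~/.ssh/authorized_keys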

 

5. Verify that passwordless access works: if you can log in without being prompted for a password, the configuration succeeded.

hadoop@master:/> ssh slave1

Last login: Wed Jul 31 00:13:58 2013 from bigdata-01

hadoop@slave1:~>    

 

4. JDK Installation

4.1 File Upload

As the hadoop user, upload jdk-6u31-linux-x64.bin over FTP/SFTP to the /home/hadoop directory on each node.

// this step applies only when uploading from Windows with SecureCRT

In SecureCRT, click the toolbar button at the top left and open an SFTP tab (Connect SFTP Tab).

sftp> lcd

sftp> put D:/jdk-6u31-linux-x64.bin 

// keep the local path short and free of non-ASCII characters; placing the file in the root of the D: drive is recommended

Uploading jdk-6u31-linux-x64.bin to /root/jdk-6u31-linux-x64.bin

  100% 83576KB   6964KB/s 00:00:12      

 

The file lands in /root; as root, move the JDK installer to /home/hadoop:

master:~ # mv jdk-6u31-linux-x64.bin  /home/hadoop

 

4.2 Install the JDK

Install the JDK on each node (10.137.169.148, 10.137.169.149, 10.137.169.150).

As root, make "jdk-6u31-linux-x64.bin" executable and then run it:

master:/home/hadoop # chmod u+x jdk-6u31-linux-x64.bin

master:/home/hadoop # ./jdk-6u31-linux-x64.bin

# this creates the directory /home/hadoop/jdk1.6.0_31

Install the JDK on slave1 and slave2 in the same way; a copy-and-install sketch follows.
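
Rather than uploading the installer to each node separately, you can push it from master over SSH (an optional sketch; the copies run as the hadoop user, and the installer is then run on each slave exactly as in 4.2):

hadoop@master:~> scp /home/hadoop/jdk-6u31-linux-x64.bin hadoop@slave1:/home/hadoop
hadoop@master:~> scp /home/hadoop/jdk-6u31-linux-x64.bin hadoop@slave2:/home/hadoop

// then on each slave, as root:
slave1:/home/hadoop # chmod u+x jdk-6u31-linux-x64.bin
slave1:/home/hadoop # ./jdk-6u31-linux-x64.bin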

 

5. Hadoop Installation

Note: Hadoop is configured on the master host first, and the configured package is then copied to each slave node, so sections 5.1, 5.2, and 5.3 are performed only on master.

5.1 Extract the Archive

Upload hadoop-2.0.1.tar.gz to the /home/hadoop directory on master (see 4.1 for the upload method).

Extract the Hadoop archive on the master node (10.137.169.148).

As the hadoop user, grant "hadoop-2.0.1.tar.gz" execute permission and extract it:

hadoop@master:~> chmod u+x hadoop-2.0.1.tar.gz

hadoop@master:~> tar -xvf hadoop-2.0.1.tar.gz

# this creates the directory /home/hadoop/hadoop-2.0.1

5.2 Configure Environment Variables

1. On the master node, edit /home/hadoop/.profile as the hadoop user:

hadoop@master:~> vi /home/hadoop/.profile

# append the following settings to the end of .profile

export JAVA_HOME=/home/hadoop/jdk1.6.0_31/

export HADOOP_HOME=/home/hadoop/hadoop-2.0.1

export PATH=${PATH}:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${JAVA_HOME}/bin:${JAVA_HOME}/jre/bin

(Note: the PATH export must stay on a single line when copied into .profile.)

 

hadoop@master:~> source /home/hadoop/.profile    // apply the environment variables

 

2. Configure the Hadoop environment variables system-wide on each node (10.137.169.148, 10.137.169.149, 10.137.169.150):

master:/ # vi /etc/profile

# append the following settings to the end of /etc/profile

export JAVA_HOME=/home/hadoop/jdk1.6.0_31/

export HADOOP_HOME=/home/hadoop/hadoop-2.0.1

export PATH=${PATH}:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${JAVA_HOME}/bin:${JAVA_HOME}/jre/bin

(Note: keep the PATH export on a single line when adding it to /etc/profile.)

 

 

3. Check that the environment variables are set correctly:

echo $JAVA_HOME      // prints /home/hadoop/jdk1.6.0_31/

echo $HADOOP_HOME    // prints /home/hadoop/hadoop-2.0.1
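
As a further sanity check (assuming the updated PATH has been sourced), both tools should now resolve from the new locations:

java -version        // should report version 1.6.0_31
hadoop version       // should report Hadoop 2.0.1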

5.3 Edit the Configuration Files

Performed only on master.

1. Edit the configuration files core-site.xml, mapred-site.xml, and yarn-site.xml under /home/hadoop/hadoop-2.0.1/etc/hadoop, and check that the chosen ports do not conflict with anything already running on the hosts. Reference configurations follow.

 

core-site.xml

<configuration>

<property>

<name>hadoop.tmp.dir</name>

<value>/home/hadoop/hadoop-2.0.1/hadoop_tmp</value>

<description>A base for other temporary directories.</description>

</property>

<property>

<name>fs.default.name</name>

<value>hdfs://10.137.169.148:9001</value>

</property>

</configuration>
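
hadoop.tmp.dir above points inside the Hadoop install directory. Hadoop will normally create it on first use, but creating it up front on master (before the package is copied to the slaves in 5.4) avoids surprises; a minimal optional sketch, run as the hadoop user:

hadoop@master:~> mkdir -p /home/hadoop/hadoop-2.0.1/hadoop_tmp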

 

mapred-site.xml

<configuration>

    <property>

      <name>mapreduce.framework.name</name>

      <value>yarn</value>

    </property>

    <property>
      <name>mapreduce.shuffle.port</name>
      <value>8082</value>
    </property>

</configuration>

 

yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->

    <property>

    <name>yarn.resourcemanager.resource-tracker.address</name>

    <value>10.137.169.148:8050</value>

  </property>

  <property>

    <description>The address of the scheduler interface.</description>

    <name>yarn.resourcemanager.scheduler.address</name>

    <value>10.137.169.148:8051</value>

  </property>

  <property>

    <description>The address of the applications manager interface in the RM.</description>

    <name>yarn.resourcemanager.address</name>

    <value>10.137.169.148:8052</value>

  </property>

  <property>

    <description>The address of the RM admin interface.</description>

    <name>yarn.resourcemanager.admin.address</name>

    <value>10.137.169.148:8053</value>

  </property>

    <property>

    <description>Address where the localizer IPC is.</description>

    <name>yarn.nodemanager.localizer.address</name>

    <value>0.0.0.0:8054</value>

  </property>

  <!-- run mapreduce job need config -->

  <property>

        <name>yarn.nodemanager.aux-services</name>

        <value>mapreduce.shuffle</value>

    </property>

    <property>

        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>

        <value>org.apache.hadoop.mapred.ShuffleHandler</value>

    </property>

  <!-- default Directory: /tmp/logs is not writable-->

  <property>

    <description>

      Where to store container logs. An application's localized log directory

      will be found in ${yarn.nodemanager.log-dirs}/application_${appid}.

      Individual containers' log directories will be below this, in directories

      named container_{$contid}. Each container directory will contain the files

      stderr, stdin, and syslog generated by that container.

    </description>

    <name>yarn.nodemanager.log-dirs</name>

    <value>/home/hadoop/hadoop-2.0.1/logs</value>

  </property>

  <property>

    <description>Where to aggregate logs to.</description>

    <name>yarn.nodemanager.remote-app-log-dir</name>

    <value>/home/hadoop/hadoop-2.0.1/logs</value>

  </property>

  <property>

    <description>List of directories to store localized files in. An

      application's localized file directory will be found in:

      ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}.

      Individual containers' work directories, called container_${contid}, will

      be subdirectories of this.

   </description>

    <name>yarn.nodemanager.local-dirs</name>

    <value>/home/hadoop/hadoop-2.0.1/logs</value>

  </property>

</configuration>

 

2. Edit hadoop-env.sh

hadoop@master:/> cd /home/hadoop

hadoop@master:~> vi hadoop-2.0.1/etc/hadoop/hadoop-env.sh

# add the following lines to this file

export JAVA_HOME=/home/hadoop/jdk1.6.0_31/

export HADOOP_HOME=/home/hadoop/hadoop-2.0.1

 

3. Edit slaves

hadoop@master:/> cd /home/hadoop

hadoop@master:~> vi hadoop-2.0.1/etc/hadoop/slaves

# change the file contents to the following

10.137.169.149

10.137.169.150

The slaves file contains localhost by default. With the configuration above, the namenode runs on 10.137.169.148 and the datanodes on 10.137.169.149 and 10.137.169.150.

5.4 Install Hadoop on the Slave Nodes

1. Copy the "hadoop-2.0.1" directory from the master node to the slave nodes slave1 and slave2.

hadoop@master:/> cd /home/hadoop

// package master's hadoop-2.0.1 directory

hadoop@master:~> tar -zcvf hadoop.tar.gz hadoop-2.0.1

// copy the packaged hadoop.tar.gz to each slave node

hadoop@master:~> scp /home/hadoop/hadoop.tar.gz hadoop@slave1:/home/hadoop

hadoop@master:~> scp /home/hadoop/hadoop.tar.gz hadoop@slave2:/home/hadoop

 

2. Extract the "hadoop.tar.gz" package on the slave1 node:

hadoop@master:/> ssh hadoop@slave1

hadoop@slave1:~> cd /home/hadoop

hadoop@slave1:~> tar -zxvf hadoop.tar.gz

# this creates the "hadoop-2.0.1" directory under /home/hadoop; the installation on slave1 is complete

 

3. Extract the "hadoop.tar.gz" package on the slave2 node in the same way, as shown below.
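
For reference, the same steps on slave2:

hadoop@master:/> ssh hadoop@slave2
hadoop@slave2:~> cd /home/hadoop
hadoop@slave2:~> tar -zxvf hadoop.tar.gz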

6. Start and Verify Hadoop

6.1 Format the NameNode

hadoop@master:/> cd /home/hadoop/hadoop-2.0.1/bin

hadoop@master:~/hadoop-2.0.1/bin> ./hadoop namenode -format    // note: prefix the command with ./

 

6.2 Start the Hadoop Services

hadoop@master:/> start-all.sh
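
In Hadoop 2.x, start-all.sh is deprecated in favour of the separate HDFS and YARN start scripts; if you prefer, start them individually (both live under $HADOOP_HOME/sbin, which is on the PATH configured in 5.2):

hadoop@master:/> start-dfs.sh
hadoop@master:/> start-yarn.sh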

 

6.3 Verify the Startup

hadoop@master:/> jps

1443 ResourceManager

21112 NameNode

8569 Jps

 

hadoop@slave1:/> jps

4709 DataNode

4851 NodeManager

24923 Jps
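
Beyond jps, you can also query the cluster itself (a hedged sketch: dfsadmin lists the live datanodes, and the web UI address assumes the Hadoop 2.0.1 default NameNode HTTP port of 50070, since no custom value is set in the configuration above):

hadoop@master:/> hdfs dfsadmin -report    // should list 10.137.169.149 and 10.137.169.150 as live datanodes
// the NameNode web UI is then reachable at http://10.137.169.148:50070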

This article is from the personal blog: http://poeticliving.com/archives/155