Hadoop (YARN) Cluster Installation

Date: 2024-11-16 08:11:19

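Notes

Note 1: This is the second post in the big-data series. For the first post, see the link /focuson_/article/details/80153371; machine preparation and the ZooKeeper installation are covered there.

Note 2: This post covers the Hadoop installation. The cluster is laid out as follows: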

| Machine  | Installed software                                | Processes |
|----------|---------------------------------------------------|-----------|
| focuson1 | zookeeper; hadoop namenode; hadoop DataNode       | JournalNode; DataNode; QuorumPeerMain; NameNode; DFSZKFailoverController; NodeManager |
| focuson2 | zookeeper; hadoop namenode; hadoop DataNode; yarn | JournalNode; DataNode; QuorumPeerMain; NameNode; DFSZKFailoverController; NodeManager; ResourceManager |
| focuson3 | zookeeper; hadoop DataNode; yarn                  | JournalNode; DataNode; QuorumPeerMain; NodeManager; ResourceManager |

 

Installation steps:

 

1. Upload the archive to the home directory on focuson1

cd /usr/local/src/
mkdir hadoop
cd hadoop
mv ~/hadoop-2.6.0. .
tar -xvf hadoop-2.6.0.
rm -f hadoop-2.6.0.

2. Edit the configuration files

1》

export JAVA_HOME=/usr/local/src/java/jdk1.7.0_51   # required
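The source does not name the file that holds this line. In a stock Hadoop 2.x layout it would normally be `etc/hadoop/hadoop-env.sh`, which is an assumption here. Below is a minimal sketch of a sanity check to run before starting any daemon; the helper name `check_java_home` is hypothetical.

```shell
#!/bin/sh
# Sketch: verify that a directory looks like a usable Java home.
# check_java_home is a hypothetical helper, not part of Hadoop.
check_java_home() {
    # Succeeds only when the argument is non-empty and contains an executable bin/java.
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if check_java_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME looks usable: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or has no executable bin/java" >&2
fi
```

Running this in the same shell that starts the daemons shows early whether the exported path is wrong.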

2》 Integrate YARN with Hadoop

2.1

<configuration>
    <!-- Use YARN as the MapReduce framework -->
    <property>
        <name></name>
        <value>yarn</value>
    </property>
</configuration>
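The property name in the listing above is missing from the source. As a hedged sketch, assuming the file is `mapred-site.xml` and the property is the standard Hadoop 2.x `mapreduce.framework.name`, the fragment could be generated like this. `OUT_DIR` is a hypothetical scratch directory, used so that nothing under the real installation is overwritten.

```shell
#!/bin/sh
# Sketch: write an assumed mapred-site.xml into a scratch directory.
# Assumption: the blank <name> above is mapreduce.framework.name.
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"

cat > "$OUT_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
    <!-- Run MapReduce jobs on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
EOF

echo "wrote $OUT_DIR/mapred-site.xml"
```

After review, the generated file can be copied into the Hadoop configuration directory by hand.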

2.2

<configuration>
    <!-- Site specific YARN configuration properties -->

    <!-- Enable ResourceManager high availability -->
    <property>
        <name></name>
        <value>true</value>
    </property>

    <!-- Cluster ID of the ResourceManagers -->
    <property>
        <name>-id</name>
        <value>yrc</value>
    </property>

    <!-- Logical names of the ResourceManagers -->
    <property>
        <name>-ids</name>
        <value>rm1,rm2</value>
    </property>

    <!-- Host of each ResourceManager -->
    <property>
        <name>1</name>
        <value>focuson2</value>
    </property>
    <property>
        <name>2</name>
        <value>focuson3</value>
    </property>

    <!-- Address of the ZooKeeper cluster -->
    <property>
        <name>-address</name>
        <value>focuson1:2181,focuson2:2181,focuson3:2181</value>
    </property>

    <property>
        <name>-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
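Most property names in the listing above were truncated in the source. The sketch below assumes the file is `yarn-site.xml` and fills in the standard Hadoop 2.x ResourceManager HA property names. Those names are assumptions, not recovered text; the values are the ones shown above. The helper `prop` and the scratch directory `OUT_DIR` are hypothetical.

```shell
#!/bin/sh
# Sketch: generate an assumed yarn-site.xml in a scratch directory.
# All yarn.* property names below are assumptions (standard Hadoop 2.x names).
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"

# Emit one <property> element; values here need no XML escaping.
prop() {
    printf '    <property>\n        <name>%s</name>\n        <value>%s</value>\n    </property>\n' "$1" "$2"
}

{
    echo '<?xml version="1.0"?>'
    echo '<configuration>'
    prop yarn.resourcemanager.ha.enabled true
    prop yarn.resourcemanager.cluster-id yrc
    prop yarn.resourcemanager.ha.rm-ids rm1,rm2
    prop yarn.resourcemanager.hostname.rm1 focuson2
    prop yarn.resourcemanager.hostname.rm2 focuson3
    prop yarn.resourcemanager.zk-address focuson1:2181,focuson2:2181,focuson3:2181
    prop yarn.nodemanager.aux-services mapreduce_shuffle
    echo '</configuration>'
} > "$OUT_DIR/yarn-site.xml"

echo "wrote $OUT_DIR/yarn-site.xml"
```

The logical IDs `rm1` and `rm2` in `yarn.resourcemanager.ha.rm-ids` must match the suffixes of the two hostname properties.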

3》 (ports: RPC 9000; HTTP 50070)

<configuration>
    <!-- The nameservice is ns1; it must match the nameservice used in 4》 -->
    <property>
        <name></name>
        <value>ns1</value>
    </property>
    <!-- ns1 has two NameNodes: nn1 and nn2 -->
    <property>
        <name>.ns1</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC address of nn1 -->
    <property>
        <name>-address.ns1.nn1</name>
        <value>focuson1:9000</value>
    </property>
    <!-- HTTP address of nn1 -->
    <property>
        <name>-address.ns1.nn1</name>
        <value>focuson1:50070</value>
    </property>
    <!-- RPC address of nn2 -->
    <property>
        <name>-address.ns1.nn2</name>
        <value>focuson2:9000</value>
    </property>
    <!-- HTTP address of nn2 -->
    <property>
        <name>-address.ns1.nn2</name>
        <value>focuson2:50070</value>
    </property>
    <!-- Where the NameNode metadata is stored on the JournalNodes -->
    <property>
        <name></name>
        <value>qjournal://focuson1:8485;focuson2:8485;focuson3:8485/ns1</value>
    </property>
    <!-- Local disk directory where each JournalNode stores its data -->
    <property>
        <name></name>
        <value>/usr/local/src/hadoop/hadoop-2.6.0/journal</value>
    </property>
    <!-- Enable automatic failover when a NameNode fails -->
    <property>
        <name></name>
        <value>true</value>
    </property>
    <!-- Implementation used for automatic failover -->
    <property>
        <name>.ns1</name>
        <value></value>
    </property>
    <!-- Fencing methods; list multiple methods one per line -->
    <property>
        <name></name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>
    <!-- sshfence requires passwordless SSH login -->
    <property>
        <name>-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <!-- Timeout for the sshfence fencing method -->
    <property>
        <name>-timeout</name>
        <value>30000</value>
    </property>
</configuration>
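The property names and the failover proxy provider class in the listing above are truncated or missing in the source. The sketch below assumes the file is `hdfs-site.xml` and uses the standard Hadoop 2.x HDFS HA property names together with the stock `ConfiguredFailoverProxyProvider` class. All of these are assumptions; the values come from the listing. The helper `prop` and the scratch directory `OUT_DIR` are hypothetical.

```shell
#!/bin/sh
# Sketch: generate an assumed hdfs-site.xml for nameservice ns1 in a scratch directory.
# All dfs.* names and the proxy provider class are assumptions (standard Hadoop 2.x values).
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"

# Emit one <property> element; values here need no XML escaping.
prop() {
    printf '    <property>\n        <name>%s</name>\n        <value>%s</value>\n    </property>\n' "$1" "$2"
}

{
    echo '<?xml version="1.0"?>'
    echo '<configuration>'
    prop dfs.nameservices ns1
    prop dfs.ha.namenodes.ns1 nn1,nn2
    prop dfs.namenode.rpc-address.ns1.nn1 focuson1:9000
    prop dfs.namenode.http-address.ns1.nn1 focuson1:50070
    prop dfs.namenode.rpc-address.ns1.nn2 focuson2:9000
    prop dfs.namenode.http-address.ns1.nn2 focuson2:50070
    prop dfs.namenode.shared.edits.dir 'qjournal://focuson1:8485;focuson2:8485;focuson3:8485/ns1'
    prop dfs.journalnode.edits.dir /usr/local/src/hadoop/hadoop-2.6.0/journal
    prop dfs.ha.automatic-failover.enabled true
    prop dfs.client.failover.proxy.provider.ns1 \
        org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
    # Two fencing methods, one per line inside the value.
    prop dfs.ha.fencing.methods 'sshfence
shell(/bin/true)'
    prop dfs.ha.fencing.ssh.private-key-files /root/.ssh/id_rsa
    prop dfs.ha.fencing.ssh.connect-timeout 30000
    echo '</configuration>'
} > "$OUT_DIR/hdfs-site.xml"

echo "wrote $OUT_DIR/hdfs-site.xml"
```

The `qjournal://` value is quoted because it contains `;`, which the shell would otherwise treat as a command separator.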

4》

<configuration>
	<!-- Set the HDFS nameservice to ns1 -->
	<property>
	        <name></name>
	        <value>hdfs://ns1</value>
	</property>
	<!-- Hadoop temporary directory -->
	<property>
	        <name></name>
	        <value>/usr/local/src/hadoop/hadoop-2.6.0/tmp</value>
	</property>
	<!-- ZooKeeper addresses -->
	<property>
	        <name></name>
	        <value>focuson1:2181,focuson2:2181,focuson3:2181</value>
	</property>
</configuration>
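The three property names above are blank in the source. Assuming the file is `core-site.xml` and the names are the standard `fs.defaultFS`, `hadoop.tmp.dir` and `ha.zookeeper.quorum`, the sketch below writes the file to a scratch directory and reads the nameservice back, so it can be compared with the `ns1` used in 3》. The helper `get_prop` and the directory `OUT_DIR` are hypothetical.

```shell
#!/bin/sh
# Sketch: write an assumed core-site.xml and extract the nameservice from it.
# The three property names are assumptions (standard Hadoop 2.x names).
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"

cat > "$OUT_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns1</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/src/hadoop/hadoop-2.6.0/tmp</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>focuson1:2181,focuson2:2181,focuson3:2181</value>
    </property>
</configuration>
EOF

# Print the <value> that follows a given <name>.
# Works only when <name> and <value> sit on separate lines, as in the listings here.
get_prop() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && match($0, /<value>.*<\/value>/) {
            # Skip the 7-character <value> tag and drop the 8-character </value> tag.
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}

default_fs=$(get_prop "$OUT_DIR/core-site.xml" fs.defaultFS)
nameservice=${default_fs#hdfs://}
echo "default filesystem: $default_fs (nameservice: $nameservice)"
```

If the printed nameservice differs from the one configured for the NameNodes, clients cannot resolve the logical `hdfs://` URI.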

5》 slaves

focuson1
focuson2
focuson3

4. Copy the installation to the other two machines

scp -r /usr/local/src/hadoop focuson2:/usr/local/src/

scp -r /usr/local/src/hadoop focuson3:/usr/local/src/
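The two copies above can be expressed as a loop. This is a minimal sketch that prints the commands by default; the `run` and `distribute` helpers and the `DRY_RUN` switch are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch: copy the Hadoop directory to the other nodes.
# Paths and host names are taken from this post; run/distribute/DRY_RUN are hypothetical.
SRC_DIR=/usr/local/src/hadoop
DEST_PARENT=/usr/local/src/
TARGET_HOSTS="focuson2 focuson3"

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

distribute() {
    for host in $TARGET_HOSTS; do
        # Stop at the first failed copy so a half-distributed cluster is noticed.
        run scp -r "$SRC_DIR" "$host:$DEST_PARENT" || return 1
    done
}

distribute
```

Setting `DRY_RUN=0` before running the script performs the real copies, which requires the passwordless SSH access already set up between the machines.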

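5. Format the NameNode

On focuson1, run: hdfs namenode -format (the JournalNodes must be started before formatting: sbin/ start journalnode)

This creates a tmp folder under /usr/local/src/hadoop/hadoop-2.6.0 (the temporary directory configured in 4》). Copy that folder to the same path on focuson2.

* Skipping this step results in an error.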
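The order of operations in this step matters, so a sketch of the sequence may help. The script name after `sbin/` is missing from the source; the sketch assumes it is the stock Hadoop 2.x `hadoop-daemon.sh`. The `run` and `prepare_namenodes` helpers and the `DRY_RUN` switch are hypothetical. By default the commands are only printed.

```shell
#!/bin/sh
# Sketch: start JournalNodes, format the NameNode, copy tmp to the second NameNode.
# Assumption: the elided script is sbin/hadoop-daemon.sh. Intended to run on focuson1.
HADOOP_HOME=/usr/local/src/hadoop/hadoop-2.6.0
JOURNAL_HOSTS="focuson1 focuson2 focuson3"
STANDBY_HOST=focuson2

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_namenodes() {
    # 1. The JournalNodes must be running before the format.
    for host in $JOURNAL_HOSTS; do
        run ssh "$host" "$HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode" || return 1
    done
    # 2. Format the NameNode on the local machine.
    run "$HADOOP_HOME/bin/hdfs" namenode -format || return 1
    # 3. Copy the generated tmp directory to the second NameNode host.
    run scp -r "$HADOOP_HOME/tmp" "$STANDBY_HOST:$HADOOP_HOME/" || return 1
}

prepare_namenodes
```

Each step returns early on failure, so the format is never attempted while a JournalNode could not be started.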

5. Start-up, part one: DFS. Run it on focuson1 only; it starts the namenode, datanode, journalnode and zkfc processes automatically.

Go to /usr/local/src/hadoop and run sbin/

The output log is as follows:

18/04/28 19:02:36 WARN : Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [focuson1 focuson2]
focuson1: starting namenode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-namenode-focuson1.out
focuson2: starting namenode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-namenode-focuson2.out
focuson1: starting datanode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-focuson1.out
focuson2: starting datanode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-focuson2.out
focuson3: starting datanode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-focuson3.out
Starting journal nodes [focuson1 focuson2 focuson3]
focuson3: starting journalnode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-journalnode-focuson3.out
focuson1: starting journalnode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-journalnode-focuson1.out
focuson2: starting journalnode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-journalnode-focuson2.out
18/04/28 19:03:04 WARN : Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [focuson1 focuson2]
focuson2: starting zkfc, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-zkfc-focuson2.out
focuson1: starting zkfc, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-zkfc-focuson1.out

Start-up, part two: YARN, on focuson2:

Go to /usr/local/src/hadoop and run sbin/

[root@focuson2 hadoop-2.6.0]# ./sbin/ 
starting yarn daemons
starting resourcemanager, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/yarn-root-resourcemanager-focuson1.out
focuson2: starting nodemanager, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-focuson2.out
focuson3: starting nodemanager, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-focuson3.out
focuson1: starting nodemanager, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-focuson1.out

Run it on focuson3 as well (this only starts one more resourcemanager, for high availability):

[root@focuson3 hadoop-2.6.0]# ./sbin/
starting yarn daemons
resourcemanager running as process 4258. Stop it first.
focuson3: nodemanager running as process 4689. Stop it first.
focuson2: nodemanager running as process 5783. Stop it first.
focuson1: nodemanager running as process 7596. Stop it first.

 

6. Verification. On focuson1, run jps:

 

[root@focuson1 hadoop-2.6.0]# jps
6977 DataNode
7089 JournalNode
7177 DFSZKFailoverController
7596 NodeManager
7790 Jps
4255 QuorumPeerMain
6911 NameNode

On focuson2:

[root@focuson2 hadoop-2.6.0]# jps
6144 Jps
5505 DFSZKFailoverController
2963 QuorumPeerMain
5140 DataNode
5783 NodeManager
5047 NameNode
6056 ResourceManager
5321 JournalNode

On focuson3:

[root@focuson3 hadoop-2.6.0]# jps
5136 Jps
4689 NodeManager
4258 ResourceManager
4419 DataNode
3044 QuorumPeerMain
4504 JournalNode
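Comparing the `jps` listings with the layout table by eye is error-prone, so a small check can help. This is a minimal sketch: the expected process lists are copied from the table at the top of this post, and the helper names `expected_for` and `missing_daemons` are hypothetical.

```shell
#!/bin/sh
# Sketch: report daemons that the layout table expects but jps does not list.
# expected_for and missing_daemons are hypothetical helpers.

# Expected daemons per host, from the layout table in this post.
expected_for() {
    case "$1" in
        focuson1) echo "JournalNode DataNode QuorumPeerMain NameNode DFSZKFailoverController NodeManager" ;;
        focuson2) echo "JournalNode DataNode QuorumPeerMain NameNode DFSZKFailoverController NodeManager ResourceManager" ;;
        focuson3) echo "JournalNode DataNode QuorumPeerMain NodeManager ResourceManager" ;;
        *) return 1 ;;
    esac
}

# Read jps output on stdin and print each expected daemon that is absent.
missing_daemons() {
    jps_output=$(cat)
    for daemon in $(expected_for "$1"); do
        # The second jps column is the main class name; require an exact match.
        printf '%s\n' "$jps_output" \
            | awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }' \
            || echo "$daemon"
    done
}

# Example using the focuson3 listing shown above.
sample='5136 Jps
4689 NodeManager
4258 ResourceManager
4419 DataNode
3044 QuorumPeerMain
4504 JournalNode'
missing=$(printf '%s\n' "$sample" | missing_daemons focuson3)
echo "missing on focuson3: ${missing:-none}"
```

On a live node the same function can be fed directly, for example `jps | missing_daemons focuson1`.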

Log in to the web UI to check:


 

 

As shown, the NameNode on focuson2 is standby and the one on focuson1 is active.

Kill the NameNode process on focuson1, and the NameNode on focuson2 becomes active, as shown below:

 
[root@focuson1 hadoop-2.6.0]# jps
6977 DataNode
7089 JournalNode
7177 DFSZKFailoverController
7596 NodeManager
7790 Jps
4255 QuorumPeerMain
6911 NameNode
[root@focuson1 hadoop-2.6.0]# kill -9 6911
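Besides the web UI, the failover can be confirmed from the command line with `hdfs haadmin -getServiceState`. The sketch below assumes the NameNode IDs are `nn1` and `nn2`, as in the configuration in 3》. The helper name `report_ha_state` is hypothetical, and the check is skipped when `hdfs` is not on the `PATH`.

```shell
#!/bin/sh
# Sketch: print the HA state of each NameNode.
# report_ha_state is a hypothetical helper; nn1/nn2 are the IDs from this post.
NAMENODE_IDS="nn1 nn2"

report_ha_state() {
    if ! command -v hdfs >/dev/null 2>&1; then
        echo "hdfs not found on PATH; skipping HA state check"
        return 0
    fi
    for nn in $NAMENODE_IDS; do
        # The command fails when the NameNode cannot be reached, e.g. after kill -9.
        state=$(hdfs haadmin -getServiceState "$nn" 2>/dev/null) || state="unreachable"
        echo "$nn: $state"
    done
}

report_ha_state
```

After the `kill -9` above, `nn1` would be expected to show as unreachable and `nn2` as active, if automatic failover worked.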

 

 

 

 

7. Try it out:

Run a few HDFS commands on focuson1:

touch first.txt
hdfs dfs -put first.txt /
hdfs dfs -ls /
......

Success!