Notes
Note 1: This is the second post in the big-data series. The first post is at /focuson_/article/details/80153371; machine preparation and the ZooKeeper installation are covered there.
Note 2: This post covers the Hadoop installation. The cluster layout is designed as follows:
| Machine  | Installed software                                | Processes                                                                                              |
| -------- | ------------------------------------------------- | ------------------------------------------------------------------------------------------------------ |
| focuson1 | zookeeper, hadoop NameNode, hadoop DataNode       | JournalNode; DataNode; QuorumPeerMain; NameNode; DFSZKFailoverController; NodeManager                  |
| focuson2 | zookeeper, hadoop NameNode, hadoop DataNode, yarn | JournalNode; DataNode; QuorumPeerMain; NameNode; DFSZKFailoverController; NodeManager; ResourceManager |
| focuson3 | zookeeper, hadoop DataNode, yarn                  | JournalNode; DataNode; QuorumPeerMain; NodeManager; ResourceManager                                    |
Installation steps:
1. Upload the tarball to the home directory on focuson1, then unpack it:
cd /usr/local/src/
mkdir hadoop
cd hadoop
mv ~/hadoop-2.6.0.tar.gz .
tar -xvf hadoop-2.6.0.tar.gz
rm -f hadoop-2.6.0.tar.gz
2. Edit the configuration files
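All of the files edited below live under etc/hadoop in the unpacked distribution (the standard Hadoop 2.x layout). One gotcha worth noting: mapred-site.xml ships only as a template in Hadoop 2.6, so create it first:

cd /usr/local/src/hadoop/hadoop-2.6.0/etc/hadoop
# mapred-site.xml does not exist out of the box; copy it from the template
cp mapred-site.xml.template mapred-site.xml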
1》hadoop-env.sh
export JAVA_HOME=/usr/local/src/java/jdk1.7.0_51  # required: the daemons will not start without an explicit JAVA_HOME
2》Integrate YARN with Hadoop
2.1 mapred-site.xml
<configuration>
  <!-- Run MapReduce on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
2.2 yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <!-- Enable ResourceManager HA -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- Cluster id for the RM pair -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <!-- Logical names of the two RMs -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <!-- Host of each RM -->
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>focuson2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>focuson3</value>
  </property>
  <!-- ZooKeeper quorum address -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>focuson1:2181,focuson2:2181,focuson3:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
3》hdfs-site.xml (ports: RPC 9000; HTTP 50070)
<configuration>
  <!-- The HDFS nameservice is ns1; it must match fs.defaultFS in core-site.xml -->
  <property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
  </property>
  <!-- ns1 has two NameNodes, nn1 and nn2 -->
  <property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
  </property>
  <!-- RPC address of nn1 -->
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>focuson1:9000</value>
  </property>
  <!-- HTTP address of nn1 -->
  <property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>focuson1:50070</value>
  </property>
  <!-- RPC address of nn2 -->
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>focuson2:9000</value>
  </property>
  <!-- HTTP address of nn2 -->
  <property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>focuson2:50070</value>
  </property>
  <!-- Where the NameNode metadata (edit log) is stored on the JournalNodes -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://focuson1:8485;focuson2:8485;focuson3:8485/ns1</value>
  </property>
  <!-- Where each JournalNode stores its data on local disk -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/usr/local/src/hadoop/hadoop-2.6.0/journal</value>
  </property>
  <!-- Enable automatic NameNode failover -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- How clients find the active NameNode after a failover -->
  <property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- Fencing methods; multiple methods are separated by newlines, one per line -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
      sshfence
      shell(/bin/true)
    </value>
  </property>
  <!-- The sshfence method requires passwordless ssh -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <!-- Timeout for the sshfence method -->
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
</configuration>
4》core-site.xml
<configuration>
  <!-- The default filesystem is the ns1 nameservice on HDFS -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
  </property>
  <!-- Hadoop temp directory -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/src/hadoop/hadoop-2.6.0/tmp</value>
  </property>
  <!-- ZooKeeper quorum address -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>focuson1:2181,focuson2:2181,focuson3:2181</value>
  </property>
</configuration>
5》slaves
focuson1
focuson2
focuson3
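The steps below invoke hdfs and the sbin scripts directly. A minimal sketch of the environment setup that makes that work from any directory, assuming the install path above and that you apply it on every node:

# append to /etc/profile (or ~/.bashrc) on focuson1/2/3, then source it
export HADOOP_HOME=/usr/local/src/hadoop/hadoop-2.6.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin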
3. Copy the installation to the other two machines:
scp -r /usr/local/src/hadoop focuson2:/usr/local/src/
scp -r /usr/local/src/hadoop focuson3:/usr/local/src/
4. Format the NameNode
On focuson1, run: hdfs namenode -format (the JournalNodes must be started first: sbin/hadoop-daemon.sh start journalnode on each machine).
This generates a tmp folder under /usr/local/src/hadoop/hadoop-2.6.0 (the path configured as hadoop.tmp.dir); copy that folder to the same path on focuson2.
*If you skip the copy, the standby NameNode will fail with an error.
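A sketch of the whole sequence, assuming hadoop-daemon.sh is used on each node for the JournalNodes; the final zkfc format is the stock one-time HA initialization, which this walkthrough assumes has been done:

# on focuson1, focuson2 and focuson3: bring up the JournalNodes
sbin/hadoop-daemon.sh start journalnode

# on focuson1 only: format the NameNode metadata
hdfs namenode -format

# copy the fresh metadata to the standby NameNode on focuson2
scp -r /usr/local/src/hadoop/hadoop-2.6.0/tmp focuson2:/usr/local/src/hadoop/hadoop-2.6.0/

# one-time HA initialization: creates the failover znode in ZooKeeper
hdfs zkfc -formatZK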
5. Start the cluster. First DFS, on focuson1 only; start-dfs.sh brings up the NameNodes, DataNodes, JournalNodes, and ZKFCs automatically.
Go to /usr/local/src/hadoop/hadoop-2.6.0 and run sbin/start-dfs.sh
The log output:
18/04/28 19:02:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [focuson1 focuson2]
focuson1: starting namenode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-namenode-focuson1.out
focuson2: starting namenode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-namenode-focuson2.out
focuson1: starting datanode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-focuson1.out
focuson2: starting datanode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-focuson2.out
focuson3: starting datanode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-datanode-focuson3.out
Starting journal nodes [focuson1 focuson2 focuson3]
focuson3: starting journalnode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-journalnode-focuson3.out
focuson1: starting journalnode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-journalnode-focuson1.out
focuson2: starting journalnode, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-journalnode-focuson2.out
18/04/28 19:03:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [focuson1 focuson2]
focuson2: starting zkfc, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-zkfc-focuson2.out
focuson1: starting zkfc, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/hadoop-root-zkfc-focuson1.out
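With DFS up, the ZKFCs can be asked which NameNode is active (hdfs haadmin is the stock HDFS HA tool; nn1/nn2 are the ids configured in hdfs-site.xml above):

hdfs haadmin -getServiceState nn1   # expected: active (focuson1)
hdfs haadmin -getServiceState nn2   # expected: standby (focuson2)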
Second YARN, on focuson2:
Go to /usr/local/src/hadoop/hadoop-2.6.0 and run sbin/start-yarn.sh
[root@focuson2 hadoop-2.6.0]# ./sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/yarn-root-resourcemanager-focuson2.out
focuson2: starting nodemanager, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-focuson2.out
focuson3: starting nodemanager, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-focuson3.out
focuson1: starting nodemanager, logging to /usr/local/src/hadoop/hadoop-2.6.0/logs/yarn-root-nodemanager-focuson1.out
Run it on focuson3 as well (this only brings up the second ResourceManager, for high availability):
[root@focuson3 hadoop-2.6.0]# ./sbin/start-yarn.sh
starting yarn daemons
resourcemanager running as process 4258. Stop it first.
focuson3: nodemanager running as process 4689. Stop it first.
focuson2: nodemanager running as process 5783. Stop it first.
focuson1: nodemanager running as process 7596. Stop it first.
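The ResourceManager pair can be checked the same way (rm1/rm2 as configured in yarn-site.xml):

yarn rmadmin -getServiceState rm1   # focuson2
yarn rmadmin -getServiceState rm2   # focuson3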
6. Verify. On focuson1, jps:
[root@focuson1 hadoop-2.6.0]# jps
6977 DataNode
7089 JournalNode
7177 DFSZKFailoverController
7596 NodeManager
7790 Jps
4255 QuorumPeerMain
6911 NameNode
On focuson2:
[root@focuson2 hadoop-2.6.0]# jps
6144 Jps
5505 DFSZKFailoverController
2963 QuorumPeerMain
5140 DataNode
5783 NodeManager
5047 NameNode
6056 ResourceManager
5321 JournalNode
On focuson3:
[root@focuson3 hadoop-2.6.0]# jps
5136 Jps
4689 NodeManager
4258 ResourceManager
4419 DataNode
3044 QuorumPeerMain
4504 JournalNode
Check the NameNode web UIs (http://focuson1:50070 and http://focuson2:50070, per the http-address settings above):
focuson2's NameNode shows as standby and focuson1's as active.
Kill the NameNode process on focuson1, and focuson2's becomes active:
[root@focuson1 hadoop-2.6.0]# jps
6977 DataNode
7089 JournalNode
7177 DFSZKFailoverController
7596 NodeManager
7790 Jps
4255 QuorumPeerMain
6911 NameNode
[root@focuson1 hadoop-2.6.0]# kill -9 6911
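A quick way to confirm the failover and bring the killed NameNode back (hadoop-daemon.sh starts only the local daemon):

hdfs haadmin -getServiceState nn2    # should now report: active
sbin/hadoop-daemon.sh start namenode # restart nn1 on focuson1; it rejoins as standby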
7. Try it out:
Run a few HDFS commands on focuson1:
touch first.txt
hdfs dfs -put first.txt /
hdfs dfs -ls /
......
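For completeness, a quick round trip to confirm reads as well (copy.txt is just a hypothetical local file name):

hdfs dfs -cat /first.txt           # print the file back out of HDFS
hdfs dfs -get /first.txt copy.txt  # download it locally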
Success!