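Hadoop Deployment: High Availability (HA) Cluster

Date: 2021-11-08 15:12:19

Copyright notice: this is an original work. Reposting is not permitted, and violations will be pursued legally.

  The HA cluster in this post is built on top of the fully distributed deployment (see https://www.cnblogs.com/yinzhengjie/p/9065191.html for details). One additional Linux server is also needed to act as the standby NameNode.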

I. Prepare the lab environment
  Five Linux servers are needed, ideally with identical specs. My virtual machines were carried over from the earlier fully distributed deployment, so their environments already match.

1>. NameNode server (s101)

2>. DataNode server (s102)

3>. DataNode server (s103)

4>. DataNode server (s104)

5>. Standby NameNode server (s105)

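II. Edit the configuration files on s101 and distribute them to the other nodes

  For the Hadoop HA configuration parameters, see the official documentation: http://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html

1>. On s101, copy the configuration directory and repoint the symbolic link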

[yinzhengjie@s101 ~]$ ll /soft/hadoop/etc/
total 12
drwxr-xr-x. 2 yinzhengjie yinzhengjie 4096 Jun 8 05:36 full
lrwxrwxrwx. 1 yinzhengjie yinzhengjie 21 Jun 8 05:54 hadoop -> /soft/hadoop/etc/full
drwxr-xr-x. 2 yinzhengjie yinzhengjie 4096 May 25 09:15 local
drwxr-xr-x. 2 yinzhengjie yinzhengjie 4096 May 25 20:51 pseudo
[yinzhengjie@s101 ~]$ cp -r /soft/hadoop/etc/full /soft/hadoop/etc/ha
[yinzhengjie@s101 ~]$ ln -sfT /soft/hadoop/etc/ha /soft/hadoop/etc/hadoop
[yinzhengjie@s101 ~]$ ll /soft/hadoop/etc/
total 16
drwxr-xr-x. 2 yinzhengjie yinzhengjie 4096 Jun 8 05:36 full
drwxr-xr-x. 2 yinzhengjie yinzhengjie 4096 Jun 8 05:54 ha
lrwxrwxrwx. 1 yinzhengjie yinzhengjie 19 Jun 8 05:54 hadoop -> /soft/hadoop/etc/ha
drwxr-xr-x. 2 yinzhengjie yinzhengjie 4096 May 25 09:15 local
drwxr-xr-x. 2 yinzhengjie yinzhengjie 4096 May 25 20:51 pseudo
[yinzhengjie@s101 ~]$
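
  The copy-then-relink step above can be wrapped in a small helper so that switching between configuration sets (full, ha, and so on) is repeatable. This is only a sketch and not part of the original setup: the function name switch_conf and the throwaway demo directory are hypothetical, and it assumes GNU coreutils for `ln -T`.

```shell
#!/bin/bash
# Hypothetical helper: point <etc_dir>/hadoop at <etc_dir>/<profile>.
# Usage: switch_conf <etc_dir> <profile>
switch_conf() {
    local etc_dir=$1 profile=$2
    if [ ! -d "$etc_dir/$profile" ]; then
        echo "no such profile: $etc_dir/$profile" >&2
        return 1
    fi
    # -s symbolic link, -f replace an existing link, -T treat the link name
    # as a plain file. Without -T, ln would follow the existing link and
    # create the new link inside the directory it points to.
    ln -sfT "$etc_dir/$profile" "$etc_dir/hadoop"
}

# Demo in a temporary directory so nothing under /soft is touched.
demo=$(mktemp -d)
mkdir "$demo/full" "$demo/ha"
switch_conf "$demo" full
switch_conf "$demo" ha
readlink "$demo/hadoop"
```

  On the real machine the equivalent call would be `switch_conf /soft/hadoop/etc ha`, run after the `cp -r` shown above.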

2>. Configure passwordless SSH login to s105

[yinzhengjie@s101 ~]$ ssh-copy-id yinzhengjie@s105
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
yinzhengjie@s105's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'yinzhengjie@s105'"
and check to make sure that only the key(s) you wanted were added.

[yinzhengjie@s101 ~]$ who
yinzhengjie pts/0 2018-06-08 05:29 (172.16.30.1)
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ ssh s105
Last login: Fri Jun 8 05:37:20 2018 from 172.16.30.1
[yinzhengjie@s105 ~]$
[yinzhengjie@s105 ~]$ who
yinzhengjie pts/0 2018-06-08 05:37 (172.16.30.1)
yinzhengjie pts/1 2018-06-08 05:56 (s101)
[yinzhengjie@s105 ~]$ exit
logout
Connection to s105 closed.
[yinzhengjie@s101 ~]$
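
  Logging in by hand, as above, confirms the key works for one host. For several hosts a non-interactive check is handier. The sketch below is an assumption-laden example rather than part of the original procedure: check_passwordless is a hypothetical name, and it relies on the OpenSSH options BatchMode (fail instead of prompting for a password) and ConnectTimeout.

```shell
#!/bin/bash
# Hypothetical helper: report hosts that still ask for a password.
# BatchMode=yes makes ssh fail instead of prompting, so the check never hangs.
check_passwordless() {
    local host failed=0
    for host in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "$host: ok"
        else
            echo "$host: key login failed"
            failed=1
        fi
    done
    return $failed
}

# Example (run on s101): check_passwordless s102 s103 s104 s105
```

  The function returns non-zero if any host fails, so it can gate later steps such as distributing configuration files.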

3>. Edit the core-site.xml configuration file

[yinzhengjie@s101 ~]$ more /soft/hadoop/etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/yinzhengjie/ha</value>
    </property>
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>yinzhengjie</value>
    </property>
</configuration>

<!--
core-site.xml:
    Defines system-level parameters such as the HDFS URL, the Hadoop temporary directory, and settings for rack-aware clusters. Values set here override the defaults in core-default.xml.

fs.defaultFS:
    The default path prefix used by clients when connecting to HDFS. Because the nameservice ID is configured as mycluster, that ID is used here as the authority part of the URI.

hadoop.tmp.dir:
    The base working directory for Hadoop.

hadoop.http.staticuser.user:
    The user name used when browsing data through the web UI.
-->
[yinzhengjie@s101 ~]$
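
  After editing a site file it is worth confirming that the value you expect is the one actually written. On a node with Hadoop installed, `hdfs getconf -confKey fs.defaultFS` reports the effective value. The sketch below is a plain-shell alternative for checking a file directly; get_prop is a hypothetical helper and it assumes one tag per line, as in the listing above, so it is not a general XML parser.

```shell
#!/bin/bash
# Hypothetical helper: print the <value> that follows <name>KEY</name>
# in a Hadoop *-site.xml file. Assumes one tag per line.
get_prop() {
    local file=$1 key=$2
    awk -v key="$key" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, ""); sub(/<\/value>.*/, "")
            print; exit
        }
    ' "$file"
}

# Demo against a temporary file holding two of the properties shown above.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/yinzhengjie/ha</value>
    </property>
</configuration>
EOF
get_prop "$conf" fs.defaultFS
get_prop "$conf" hadoop.tmp.dir
```

  Against the real file the call would be `get_prop /soft/hadoop/etc/hadoop/core-site.xml fs.defaultFS`.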

4>. Edit the hdfs-site.xml configuration file

[yinzhengjie@s101 ~]$ more /soft/hadoop/etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/yinzhengjie/ha/dfs/name1,/home/yinzhengjie/ha/dfs/name2</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/yinzhengjie/ha/dfs/data1,/home/yinzhengjie/ha/dfs/data2</value>
    </property>

    <!-- High availability settings -->
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>

    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>

    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>s101:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>s105:8020</value>
    </property>

    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>s101:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>s105:50070</value>
    </property>

    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://s102:8485;s103:8485;s104:8485/mycluster</value>
    </property>

    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <!-- Fencing: isolate the previously active NameNode during a failover -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
            sshfence
            shell(/bin/true)
        </value>
    </property>

    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/yinzhengjie/.ssh/id_rsa</value>
    </property>
</configuration>

<!--
hdfs-site.xml:
    HDFS settings such as the number of replicas, the block size, and whether permission checking is enforced. Values set here override the defaults in hdfs-default.xml.

dfs.replication:
    For availability and redundancy, HDFS stores several replicas of each block on different nodes; the default is 3. A single-node pseudo-distributed setup only needs one replica, which this property lets you set. It is a software-level form of redundancy.

dfs.namenode.name.dir:
    Local directories where the NameNode stores the fsimage. This can be a comma-separated list, in which case the fsimage is written to every directory for redundancy. Ideally the directories sit on different disks, so that a single disk failure does not bring the NameNode down (the bad disk is skipped). With HA enabled, one directory is usually enough; configure two if you want extra safety.

dfs.datanode.data.dir:
    Local directories where the DataNode stores blocks. This can be a comma-separated list (typically one directory per disk). The directories are used in rotation: one block goes to one directory, the next block to the next, and so on. Each block is stored only once per machine. Directories that do not exist are ignored, so create them beforehand.

dfs.nameservices:
    Comma-separated list of nameservices.

dfs.ha.namenodes.[nameservice ID]:
    Unique identifiers for the NameNodes in the nameservice, comma-separated. DataNodes use this list to find all the NameNodes in the cluster. In Hadoop 2.x, at most two NameNodes can be configured per nameservice.

dfs.namenode.rpc-address.[nameservice ID].[name node ID]:
    The RPC address each NameNode listens on.

dfs.namenode.http-address.[nameservice ID].[name node ID]:
    The HTTP address each NameNode listens on.

dfs.namenode.shared.edits.dir:
    The URI of the JournalNode (JN) group through which the NameNodes read and write the edit log. The format is "qjournal://host1:port1;host2:port2;host3:port3/journalId", where host1, host2 and host3 are JournalNode addresses. Run an odd number of JournalNodes, at least 3. The journalId is a unique identifier for this nameservice, which lets one set of JournalNodes serve several federated namesystems; reusing the nameservice ID as the journalId is a common choice.

dfs.client.failover.proxy.provider.[nameservice ID]:
    The Java class that HDFS clients use to locate and connect to the active NameNode.

dfs.ha.fencing.methods:
    The methods used to fence the active NameNode during a failover, typically by making sure the old process is stopped. The built-in methods are sshfence and shell.

dfs.ha.fencing.ssh.private-key-files:
    The SSH private key file used by sshfence. Required whenever sshfence is used.
-->
[yinzhengjie@s101 ~]$
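
  The qjournal URI is easy to get subtly wrong (a comma instead of a semicolon, a missing journal ID, an even number of nodes). The following sketch splits the URI and applies the odd-count, at-least-3 rule described above. It is an illustration only: check_qjournal is a hypothetical helper, and the "tolerated failures" figure uses the (N - 1) / 2 rule from the official QJM documentation.

```shell
#!/bin/bash
# Hypothetical helper: split a qjournal:// URI into its JournalNode list and
# journal ID, then warn when the node count is below 3 or even.
check_qjournal() {
    local uri=$1
    case $uri in
        qjournal://*/*) ;;
        *) echo "not a qjournal URI: $uri" >&2; return 1 ;;
    esac
    local rest=${uri#qjournal://}
    local hosts=${rest%%/*}        # e.g. s102:8485;s103:8485;s104:8485
    local journal_id=${rest#*/}    # e.g. mycluster
    local nodes
    IFS=';' read -r -a nodes <<< "$hosts"
    local count=${#nodes[@]}
    echo "journalId=$journal_id nodes=$count tolerated_failures=$(( (count - 1) / 2 ))"
    if [ "$count" -lt 3 ] || [ $((count % 2)) -eq 0 ]; then
        echo "warning: use an odd number of JournalNodes, at least 3" >&2
        return 2
    fi
}

check_qjournal 'qjournal://s102:8485;s103:8485;s104:8485/mycluster'
```

  With the three JournalNodes used in this post, the edit log stays writable as long as any two of s102, s103 and s104 are up.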

5>. Distribute the configuration files

[yinzhengjie@s101 ~]$ more `which xrsync.sh`
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

# Make sure the user passed an argument
if [ $# -lt 1 ];then
    echo "请输入参数";    # "please supply an argument"
    exit
fi

# The file or directory to sync
file=$@

# Its base name
filename=`basename $file`

# Its parent directory
dirpath=`dirname $file`

# Resolve the parent directory to an absolute physical path
cd $dirpath
fullpath=`pwd -P`

# Sync to s102 through s105 (the DataNodes and the standby NameNode)
for (( i=102;i<=105;i++ ))
do
    # Switch the terminal text to green
    tput setaf 2
    echo =========== s$i $file ===========
    # Switch back to the default light-gray text
    tput setaf 7
    # Copy to the remote host
    rsync -lr $filename `whoami`@s$i:$fullpath
    # Report success
    if [ $? == 0 ];then
        echo "命令执行成功"    # "command succeeded"
    fi
done
[yinzhengjie@s101 ~]$ xrsync.sh /soft/hadoop/etc/
=========== s102 /soft/hadoop/etc/ ===========
命令执行成功
=========== s103 /soft/hadoop/etc/ ===========
命令执行成功
=========== s104 /soft/hadoop/etc/ ===========
命令执行成功
=========== s105 /soft/hadoop/etc/ ===========
命令执行成功
[yinzhengjie@s101 ~]$

III. Start the HDFS distributed system

1>. Start the JournalNode processes

[yinzhengjie@s101 ~]$ hadoop-daemons.sh start journalnode
s104: starting journalnode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-journalnode-s104.out
s103: starting journalnode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-journalnode-s103.out
s102: starting journalnode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-journalnode-s102.out
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
2855 Jps
命令执行成功
============= s102 jps ============
2568 Jps
2490 JournalNode
命令执行成功
============= s103 jps ============
2617 Jps
2539 JournalNode
命令执行成功
============= s104 jps ============
2611 Jps
2532 JournalNode
命令执行成功
============= s105 jps ============
2798 Jps
命令执行成功
[yinzhengjie@s101 ~]$ more `which xcall.sh`
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

# Make sure the user passed an argument
if [ $# -lt 1 ];then
    echo "请输入参数"    # "please supply an argument"
    exit
fi

# The command to run
cmd=$@

for (( i=101;i<=105;i++ ))
do
    # Switch the terminal text to green
    tput setaf 2
    echo ============= s$i $cmd ============
    # Switch back to the default light-gray text
    tput setaf 7
    # Run the command on the remote host
    ssh s$i $cmd
    # Report success
    if [ $? == 0 ];then
        echo "命令执行成功"    # "command succeeded"
    fi
done
[yinzhengjie@s101 ~]$
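
  Reading the jps listings by eye works for five nodes, but the same check can be scripted. The sketch below reads jps output on standard input and reports any expected daemon that is missing. It is an illustrative assumption, not part of the original scripts: expect_daemons is a hypothetical name, and the demo feeds it the s102 output captured above instead of calling jps.

```shell
#!/bin/bash
# Hypothetical helper: read `jps` output on stdin and succeed only if every
# daemon named on the command line appears in it.
expect_daemons() {
    local out missing=0 name
    out=$(cat)
    for name in "$@"; do
        # jps prints "<pid> <MainClass>"; compare the class name as a whole field
        if ! awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }' <<< "$out"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return $missing
}

# Demo with the s102 output shown above.
# On a live node the call would be: jps | expect_daemons JournalNode
printf '2568 Jps\n2490 JournalNode\n' | expect_daemons JournalNode && echo "s102 ok"
```

  Combined with xcall.sh, something like `ssh s102 jps | expect_daemons JournalNode DataNode` could be run per host once HDFS is started.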

2>. Format the NameNode

[yinzhengjie@s101 ~]$ hdfs namenode -format

3>. Copy the working directory from s101 to s105

[yinzhengjie@s101 ~]$ scp  -r /home/yinzhengjie/ha  yinzhengjie@s105:~
VERSION 100% 205 0.2KB/s 00:00
seen_txid 100% 2 0.0KB/s 00:00
fsimage_0000000000000000000.md5 100% 62 0.1KB/s 00:00
fsimage_0000000000000000000 100% 358 0.4KB/s 00:00
VERSION 100% 205 0.2KB/s 00:00
seen_txid 100% 2 0.0KB/s 00:00
fsimage_0000000000000000000.md5 100% 62 0.1KB/s 00:00
fsimage_0000000000000000000 100% 358 0.4KB/s 00:00
[yinzhengjie@s101 ~]$

4>. Start the HDFS processes

[yinzhengjie@s101 ~]$ start-dfs.sh
Starting namenodes on [s101 s105]
s105: starting namenode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-namenode-s105.out
s101: starting namenode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-namenode-s101.out
s104: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-datanode-s104.out
s103: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-datanode-s103.out
s102: starting datanode, logging to /soft/hadoop-2.7.3/logs/hadoop-yinzhengjie-datanode-s102.out
Starting journal nodes [s102 s103 s104]
s102: journalnode running as process 2490. Stop it first.
s104: journalnode running as process 2532. Stop it first.
s103: journalnode running as process 2539. Stop it first.
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
3377 Jps
3117 NameNode
命令执行成功
============= s102 jps ============
2649 DataNode
2490 JournalNode
2764 Jps
命令执行成功
============= s103 jps ============
2539 JournalNode
2700 DataNode
2815 Jps
命令执行成功
============= s104 jps ============
2532 JournalNode
2693 DataNode
2809 Jps
命令执行成功
============= s105 jps ============
3171 NameNode
3254 Jps
命令执行成功
[yinzhengjie@s101 ~]$

5>. Manually transition s101 to the active state

[yinzhengjie@s101 ~]$ hdfs haadmin -transitionToActive nn1        # manually transition s101 (nn1) to the active state
[yinzhengjie@s101 ~]$
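
  The transition command prints nothing on success, so it helps to confirm the result. `hdfs haadmin -getServiceState nn1` prints the state of a NameNode (active or standby). The sketch below wraps the "exactly one active" rule in a function; exactly_one_active is a hypothetical helper, and the demo passes the states expected after the step above rather than querying a live cluster.

```shell
#!/bin/bash
# Hypothetical helper: given the state of each NameNode (as printed by
# `hdfs haadmin -getServiceState <id>`), succeed only if exactly one is active.
exactly_one_active() {
    local state active=0
    for state in "$@"; do
        if [ "$state" = active ]; then
            active=$((active + 1))
        fi
    done
    echo "active NameNodes: $active"
    [ "$active" -eq 1 ]
}

# On the cluster this would be called as:
#   exactly_one_active "$(hdfs haadmin -getServiceState nn1)" \
#                      "$(hdfs haadmin -getServiceState nn2)"
# Demo with the states expected after the transition above:
exactly_one_active active standby
```

  Zero active NameNodes means clients cannot write, and two would indicate a split-brain situation, which is what the fencing settings are meant to prevent.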

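  With that, HA is essentially configured. The one shortcoming is that you have to switch the NameNode state by hand, which is tedious. Fortunately, the Hadoop ecosystem includes ZooKeeper, a dedicated coordination tool, and using it to manage the cluster makes things much more convenient. For details see: https://www.cnblogs.com/yinzhengjie/p/9154265.html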
