Hadoop: Installing ZooKeeper

Date: 2024-10-25 11:16:57

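Installing ZooKeeper 3.7.1 on Ubuntu 18.04

Environment and versions

The Ubuntu 18.04 environment used here is configured as follows:

The hostname is master
The user name is hadoop
The static IP is 192.168.100.3
The gateway is 192.168.100.2
The firewall is disabled
/etc/hosts is already configured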

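All releases can be downloaded from:
https://archive.apache.org/dist/zookeeper/
I chose version 3.7.1 here:
https://archive.apache.org/dist/zookeeper/zookeeper-3.7.1/

After downloading it locally, upload apache-zookeeper-3.7.1-bin.tar.gz to the /home/hadoop/opt/app directory on the Ubuntu 18.04 machine.
Once extracted, the package contains the following directories:

bin: scripts for starting and stopping the server and the client
conf: configuration files
docs: documentation
lib: the jar files that ZooKeeper depends on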
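
If the server itself has internet access, the tarball can also be fetched directly instead of being uploaded. The sketch below assumes that, and that wget is installed. The helper names zk_archive_url and zk_download are made up for illustration, and the file URL simply follows the layout of the archive directory linked above.

```shell
# Build the download URL for a given ZooKeeper version,
# following the directory layout of archive.apache.org.
zk_archive_url() {
  echo "https://archive.apache.org/dist/zookeeper/zookeeper-$1/apache-zookeeper-$1-bin.tar.gz"
}

# Download the tarball for a version into a target directory
# (defaults to the directory used in this article).
zk_download() {
  local version dest
  version="$1"
  dest="${2:-$HOME/opt/app}"
  mkdir -p "$dest"
  wget -P "$dest" "$(zk_archive_url "$version")"
}

# Only print the URL here; call `zk_download 3.7.1` to actually fetch the file.
zk_archive_url 3.7.1
```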

Standalone mode installation

Extract the archive and create a symbolic link
hadoop@master:~$ cd /home/hadoop/opt/app/
# Extract the archive
hadoop@master:~/opt/app$ tar -zxf apache-zookeeper-3.7.1-bin.tar.gz

View the current directory structure:

hadoop@master:~/opt/app$ pwd # show the current directory
/home/hadoop/opt/app
hadoop@master:~/opt/app$ ll
total 9120
drwxrwxr-x  6 hadoop hadoop    4096 Nov 12 10:42 ./
drwxrwxr-x  3 hadoop hadoop    4096 Nov  4 14:53 ../
drwxrwxr-x  6 hadoop hadoop    4096 Nov 12 10:41 apache-zookeeper-3.7.1-bin/
-rw-rw-r--  1 hadoop hadoop 9311744 Nov 12 10:39 apache-zookeeper-3.7.1-bin.tar.gz

Create a symbolic link for ZooKeeper

# Symbolic link
hadoop@master:~/opt/app$ ln -s apache-zookeeper-3.7.1-bin zookeeper

Create the data directory and edit zoo.cfg

Switch to ~/opt/app/zookeeper and create the data0 directory, which will hold ZooKeeper's data.

hadoop@master:~/opt/app$ cd zookeeper # switch to the zookeeper directory
hadoop@master:~/opt/app/zookeeper$ ls
bin  conf  docs  lib  LICENSE.txt  NOTICE.txt  README.md  README_packaging.txt
hadoop@master:~/opt/app/zookeeper$ mkdir data0 # create data0
hadoop@master:~/opt/app/zookeeper$ ls
bin  conf  data0  docs  lib  LICENSE.txt  NOTICE.txt  README.md  README_packaging.txt

Go to ~/opt/app/zookeeper/conf and copy zoo_sample.cfg to zoo.cfg:

hadoop@master:~/opt/app/zookeeper$ cd conf/
hadoop@master:~/opt/app/zookeeper/conf$ ls
configuration.xsl  log4j.properties  zoo_sample.cfg
hadoop@master:~/opt/app/zookeeper/conf$ cp zoo_sample.cfg zoo.cfg # copy the sample to zoo.cfg

Edit zoo.cfg

hadoop@master:~/opt/app/zookeeper/conf$ vi zoo.cfg # edit zoo.cfg as follows:
# dataDir=/tmp/zookeeper   (comment out this default setting, then add the line below)
dataDir=/home/hadoop/opt/app/zookeeper/data0


# Append at the end of the configuration file
server.1=master:2888:3888
# For a ZooKeeper ensemble deployed on a single node, several server entries can be configured
#server.2=master:2889:3889
#server.3=master:2890:3890
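
The same edit can be made without opening an editor. The following is a minimal sketch rather than the procedure used above: configure_standalone is a hypothetical helper, it replaces the dataDir line instead of commenting it out, and the demonstration runs on a temporary file so the real zoo.cfg is left alone.

```shell
# Apply the standalone edits to a zoo.cfg-style file.
# Arguments: config file, data directory, server entry.
configure_standalone() {
  local cfg data_dir server_entry
  cfg="$1"
  data_dir="$2"
  server_entry="$3"
  # Replace the active dataDir line (for example dataDir=/tmp/zookeeper).
  sed "s|^dataDir=.*|dataDir=${data_dir}|" "$cfg" > "${cfg}.tmp" && mv "${cfg}.tmp" "$cfg"
  # Append the server entry only if that exact line is not present yet.
  grep -qxF -- "$server_entry" "$cfg" || echo "$server_entry" >> "$cfg"
}

# Demonstration on a throwaway file that mimics the sample configuration.
demo_cfg="$(mktemp)"
printf 'tickTime=2000\ndataDir=/tmp/zookeeper\nclientPort=2181\n' > "$demo_cfg"
configure_standalone "$demo_cfg" /home/hadoop/opt/app/zookeeper/data0 server.1=master:2888:3888
cat "$demo_cfg"
```

Running the function a second time with the same arguments leaves the file unchanged, which makes it safe to re-run.
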

Configure the ZooKeeper environment variables
hadoop@master:~/opt/app/zookeeper/conf$ vi ~/.bashrc
# This opens the .bashrc file; add the following lines:
export ZOOKEEPER_HOME=/home/hadoop/opt/app/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH

hadoop@master:~/opt/app/zookeeper/conf$ source ~/.bashrc # apply the environment variables
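
If these steps are repeated, for example while rebuilding a test machine, the export lines can end up in .bashrc more than once. A small sketch of one way to avoid that is shown below; append_once is a hypothetical helper, not part of ZooKeeper or bash.

```shell
# Append a line to a file only if that exact line is not already in it.
append_once() {
  local line file
  line="$1"
  file="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Intended use (left commented out so that nothing is modified by accident):
# append_once 'export ZOOKEEPER_HOME=/home/hadoop/opt/app/zookeeper' "$HOME/.bashrc"
# append_once 'export PATH=$ZOOKEEPER_HOME/bin:$PATH' "$HOME/.bashrc"
```
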

Start the zk server and check its status (standalone)

Start the zk server (standalone)
Check the zk status

hadoop@master:~/opt/app/zookeeper/conf$ zkServer.sh status # check the zk status
ZooKeeper JMX enabled by default
Using config: /home/hadoop/opt/app/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Error contacting service. It is probably not running.

Start zkServer

hadoop@master:~/opt/app/zookeeper/conf$ zkServer.sh start # start zkServer
ZooKeeper JMX enabled by default
Using config: /home/hadoop/opt/app/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

Check the zk status

hadoop@master:~/opt/app/zookeeper/conf$ zkServer.sh status # check the zk status; a single server runs in standalone mode
ZooKeeper JMX enabled by default
Using config: /home/hadoop/opt/app/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: standalone

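Stop the zk server (standalone)

zkServer.sh start # start the zk server
zkServer.sh stop # stop the zk server
zkServer.sh status # check the zk server status

Start the zk client
hadoop@master:~/opt/app/zookeeper/conf$ zkCli.sh # start zkCli; it connects to port 2181 by default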
Connecting to localhost:2181
...
...
...
WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 1] help
ZooKeeper -server host:port cmd args
        addauth scheme auth
        close
        config [-c] [-w] [-s]
        connect host:port
        create [-s] [-e] [-c] [-t ttl] path [data] [acl]
        delete [-v version] path
        deleteall path
        delquota [-n|-b] path
        get [-s] [-w] path
        getAcl [-s] path
        history
        listquota path
        ls [-s] [-w] [-R] path
        ls2 path [watch]
        printwatches on|off
        quit
        reconfig [-s] [-v version] [[-file path] | [-members serverID=host:port1:port2;port3[,...]*]] | [-add serverId=host:port1:port2;port3[,...]]* [-remove serverId[,...]*]
        redo cmdno
        removewatches path [-c|-d|-a] [-l]
        rmr path
        set [-s] [-v version] path data
        setAcl [-s] [-v version] [-R] path acl
        setquota -n|-b val path
        stat [-w] path
        sync path
Command not found: Command not found help
[zk: localhost:2181(CONNECTED) 2] quit

WATCHER::

WatchedEvent state:Closed type:None path:null
2021-11-12 11:05:29,497 [myid:] - INFO  [main:ZooKeeper@1422] - Session: 0x1000029eae60000 closed
2021-11-12 11:05:29,497 [myid:] - INFO  [main-EventThread:ClientCnxn$EventThread@524] - EventThread shut down for session: 0x1000029eae60000
hadoop@master:~/opt/app/zookeeper/conf$

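Configuring a pseudo-distributed ZooKeeper

In this section, a pseudo-distributed cluster means running several ZooKeeper services on one machine (master), each with its own configuration file (zoo1.cfg, zoo2.cfg, zoo3.cfg). The three files set different clientPort and dataDir values, and each ZooKeeper service is started from its own file.
Note: this section builds on the completed standalone installation.

Create the configuration files:

Create one configuration file per zk node, as follows:

hadoop@master:~/opt/app/zookeeper/conf$ pwd
/home/hadoop/opt/app/zookeeper/conf
hadoop@master:~/opt/app/zookeeper/conf$ cp zoo.cfg zoo1.cfg
hadoop@master:~/opt/app/zookeeper/conf$ cp zoo.cfg zoo2.cfg
hadoop@master:~/opt/app/zookeeper/conf$ cp zoo.cfg zoo3.cfg

Edit each of the three configuration files: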

zoo1.cfg

# dataDir=/tmp/zookeeper   (comment out this default setting, then set the value below)
dataDir=/home/hadoop/opt/app/zookeeper/data1

# clientPort=2181   (comment out this line, line 15 of the file, and set the value at the end of the file)
clientPort=2181
# server.1=master:2888 (data synchronization port):3888 (leader election port)
server.1=master:2888:3888
server.2=master:2889:3889
server.3=master:2890:3890

zoo2.cfg

# dataDir=/tmp/zookeeper   (comment out this default setting, then set the value below)
dataDir=/home/hadoop/opt/app/zookeeper/data2

# clientPort=2181   (comment out this line, line 15 of the file, and set the value at the end of the file)
clientPort=2182
# server.1=master:2888 (data synchronization port):3888 (leader election port)
server.1=master:2888:3888
server.2=master:2889:3889
server.3=master:2890:3890

zoo3.cfg

# dataDir=/tmp/zookeeper   (comment out this default setting, then set the value below)
dataDir=/home/hadoop/opt/app/zookeeper/data3

# clientPort=2181   (comment out this line, line 15 of the file, and set the value at the end of the file)
clientPort=2183
# server.1=master:2888 (data synchronization port):3888 (leader election port)
server.1=master:2888:3888
server.2=master:2889:3889
server.3=master:2890:3890

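In server.A=B:C:D, A is a number that identifies the server, B is the server's IP address or host name, C is the port this server uses to exchange data with the ensemble's leader, and D is the port the servers use to talk to each other during leader election, for example when the current leader fails and a new one must be chosen. In a pseudo-distributed setup B is the same for every entry, so each ZooKeeper service must be given different port numbers.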
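
Because the three files differ only in dataDir and clientPort, they can also be generated from zoo.cfg in a loop. This is a sketch under the port scheme used above (client ports from 2181, data ports from 2888, election ports from 3888); generate_pseudo_cluster_cfgs is a hypothetical helper, and it overwrites any existing zooN.cfg files.

```shell
# Generate zoo1.cfg .. zooN.cfg from zoo.cfg for a pseudo-distributed ensemble.
# Arguments: conf directory, directory that holds data1..dataN, host name, node count.
generate_pseudo_cluster_cfgs() {
  local conf_dir base_dir host count i j out
  conf_dir="$1"
  base_dir="$2"
  host="$3"
  count="$4"
  i=1
  while [ "$i" -le "$count" ]; do
    out="${conf_dir}/zoo${i}.cfg"
    # Keep every setting except the ones that are rewritten per instance.
    grep -Ev '^(dataDir|clientPort|server\.[0-9]+)=' "${conf_dir}/zoo.cfg" > "$out"
    {
      echo "dataDir=${base_dir}/data${i}"
      echo "clientPort=$((2180 + i))"
      # Every file lists all servers; each server gets its own port pair.
      j=1
      while [ "$j" -le "$count" ]; do
        echo "server.${j}=${host}:$((2887 + j)):$((3887 + j))"
        j=$((j + 1))
      done
    } >> "$out"
    i=$((i + 1))
  done
}

# Intended use for the layout in this article:
# generate_pseudo_cluster_cfgs ~/opt/app/zookeeper/conf /home/hadoop/opt/app/zookeeper master 3
```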

Create the data1, data2, and data3 directories, then create a myid file in each one that contains that server's number:

hadoop@master:~/opt/app/zookeeper$ pwd
/home/hadoop/opt/app/zookeeper
hadoop@master:~/opt/app/zookeeper$ mkdir -p data1 data2 data3 # the directories must exist before myid can be written

hadoop@master:~/opt/app/zookeeper$ echo 1 > data1/myid
hadoop@master:~/opt/app/zookeeper$ echo 2 > data2/myid
hadoop@master:~/opt/app/zookeeper$ echo 3 > data3/myid

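Start and test the pseudo-distributed ZooKeeper cluster

Start the ZooKeeper services one after another. The election algorithm votes as the servers come up: a majority (2 of 3) first exists once the second server is running, and with no data on either server the one with the larger myid wins, so the service started from configuration file 2 becomes the leader. The other services are followers. When only the first server has been started, status reports that the service is not running, because a majority of the ensemble's nodes is not yet up. An ensemble of 2n nodes tolerates the same number of failures as one of 2n-1 nodes, so an odd number of ZooKeeper nodes is recommended.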
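
The majority rule behind that recommendation can be checked with a little arithmetic. The sketch below uses two illustrative helpers, zk_quorum and zk_tolerated_failures, which are not ZooKeeper commands: a majority of n servers is n/2 + 1 (integer division), and whatever is left over may fail.

```shell
# Smallest number of servers that forms a majority of an ensemble of size n.
zk_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of servers that may fail while a majority is still running.
zk_tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5 6; do
  echo "ensemble=$n quorum=$(zk_quorum "$n") tolerated_failures=$(zk_tolerated_failures "$n")"
done
```

For 3 and 4 nodes the loop prints the same tolerated_failures value (1), and likewise for 5 and 6 nodes (2), so the extra even node adds no fault tolerance.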

Start zkServer.sh with zoo1.cfg as the configuration file

hadoop@master:~/opt/app/zookeeper$ zkServer.sh start /home/hadoop/opt/app/zookeeper/conf/zoo1.cfg # start zkServer from zoo1.cfg
ZooKeeper JMX enabled by default
Using config: /home/hadoop/opt/app/zookeeper/conf/zoo1.cfg
Starting zookeeper ... STARTED
hadoop@master:~/opt/app/zookeeper$ zkServer.sh status /home/hadoop/opt/app/zookeeper/conf/zoo1.cfg # check the status of the zkServer started from zoo1.cfg
ZooKeeper JMX enabled by default
Using config: /home/hadoop/opt/app/zookeeper/conf/zoo1.cfg
Client port found: 2181. Client address: localhost.
Error contacting service. It is probably not running.

Start zkServer.sh with zoo2.cfg as the configuration file

hadoop@master:~/opt/app/zookeeper$ zkServer.sh start /home/hadoop/opt/app/zookeeper/conf/zoo2.cfg # start zkServer from zoo2.cfg
ZooKeeper JMX enabled by default
Using config: /home/hadoop/opt/app/zookeeper/conf/zoo2.cfg
Starting zookeeper ... STARTED
hadoop@master:~/opt/app/zookeeper$ zkServer.sh status /home/hadoop/opt/app/zookeeper/conf/zoo2.cfg # check the status of the zkServer started from zoo2.cfg; it is the leader
ZooKeeper JMX enabled by default
Using config: /home/hadoop/opt/app/zookeeper/conf/zoo2.cfg
Client port found: 2182. Client address: localhost.
Mode: leader

Start zkServer.sh with zoo3.cfg as the configuration file

hadoop@master:~/opt/app/zookeeper$ zkServer.sh start /home/hadoop/opt/app/zookeeper/conf/zoo3.cfg # start zkServer from zoo3.cfg
ZooKeeper JMX enabled by default
Using config: /home/hadoop/opt/app/zookeeper/conf/zoo3.cfg
Starting zookeeper ... STARTED
hadoop@master:~/opt/app/zookeeper$ zkServer.sh status /home/hadoop/opt/app/zookeeper/conf/zoo3.cfg # check the status of the zkServer started from zoo3.cfg
ZooKeeper JMX enabled by default
Using config: /home/hadoop/opt/app/zookeeper/conf/zoo3.cfg
Client port found: 2183. Client address: localhost.
Mode: follower

At this point the process started from zoo2.cfg is the leader node.

Now list all Java processes with jps:

hadoop@master:~/opt/app/zookeeper$ jps
4833 QuorumPeerMain
5060 Jps
4726 QuorumPeerMain
4607 QuorumPeerMain
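
With three instances running, checking each status by hand gets repetitive. The sketch below pulls the Mode: line out of the zkServer.sh status output shown earlier. zk_mode_from_status and report_modes are hypothetical helpers, and the parsing assumes the output keeps the "Mode: ..." format seen in the transcripts above.

```shell
# Read zkServer.sh status output on stdin and print the value of its "Mode:" line.
# Prints nothing if the server is not running (no Mode line in the output).
zk_mode_from_status() {
  sed -n 's/^Mode: *//p' | head -n 1
}

# Print the mode of every instance started from the given configuration files.
report_modes() {
  local cfg
  for cfg in "$@"; do
    echo "$cfg: $(zkServer.sh status "$cfg" 2>/dev/null | zk_mode_from_status)"
  done
}

# Intended use for the layout in this article:
# report_modes ~/opt/app/zookeeper/conf/zoo1.cfg ~/opt/app/zookeeper/conf/zoo2.cfg ~/opt/app/zookeeper/conf/zoo3.cfg
```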

To connect a client to one of the ZooKeeper service nodes, use the following command:

hadoop@master:~/opt/app/zookeeper$ zkCli.sh -server localhost:2181

Connecting to localhost:2181
# use quit to exit

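Installing ZooKeeper 3.7.1 on CentOS

The CentOS 7.6 environment used here is configured as follows:

The hostname is node1
The user name is root
The static IP is 192.168.100.3
The gateway is 192.168.100.2
The firewall is disabled
/etc/hosts is already configured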

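All releases can be downloaded from:
https://archive.apache.org/dist/zookeeper/
I chose version 3.7.1 here:
https://archive.apache.org/dist/zookeeper/zookeeper-3.7.1/

After downloading it locally, upload apache-zookeeper-3.7.1-bin.tar.gz to the /opt/software directory on the CentOS 7.6 machine.
Once extracted, the package contains the following directories:

bin: scripts for starting and stopping the server and the client
conf: configuration files
docs: documentation
lib: the jar files that ZooKeeper depends on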

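Pseudo-distributed installation on CentOS 7.6

Installing ZooKeeper on CentOS 7.6 follows the same process as on Ubuntu. The detailed steps are given below.

Upload, extract, and rename

Upload apache-zookeeper-3.7.1-bin.tar.gz to the /opt/software directory and extract the archive: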

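[root@node1 ~]# cd /opt/software/
# Extract the archive
[root@node1 software]# tar -zxvf apache-zookeeper-3.7.1-bin.tar.gz
# Rename the directory
[root@node1 software]# mv apache-zookeeper-3.7.1-bin zookeeper
# Create the zkdata directory
[root@node1 software]# cd /opt/software/zookeeper
[root@node1 zookeeper]# mkdir zkdata

Copy the ZooKeeper installation directory

Copy the zookeeper directory into /opt/module three times, naming the copies zookeeper001 to zookeeper003:

cd /opt/software
mkdir -p /opt/module # create the target directory in case it does not exist yet
cp -r zookeeper /opt/module/zookeeper001
cp -r zookeeper /opt/module/zookeeper002
cp -r zookeeper /opt/module/zookeeper003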

Create and configure the zoo.cfg files
Configure zookeeper001

First copy zoo_sample.cfg in zookeeper001/conf to zoo.cfg:

cd /opt/module/zookeeper001/conf
cp zoo_sample.cfg zoo.cfg

Edit zoo.cfg:

vim zoo.cfg

Change the configuration as follows:

# Step 1: set the data persistence directory
dataDir=/opt/module/zookeeper001/zkdata
# Port that clients use to connect to this ZooKeeper service
clientPort=2181
# Address and ports of every ZooKeeper service in the ensemble
server.1=node1:2888:3888
server.2=node1:2889:3889
server.3=node1:2890:3890

Create a myid file containing 1 in zookeeper001/zkdata:

echo 1 > /opt/module/zookeeper001/zkdata/myid

Configure zookeeper002

First copy zoo_sample.cfg in zookeeper002/conf to zoo.cfg:

cd /opt/module/zookeeper002/conf
cp zoo_sample.cfg zoo.cfg

Edit zoo.cfg:

vim zoo.cfg

Change the configuration as follows:

# Step 1: set the data persistence directory
dataDir=/opt/module/zookeeper002/zkdata
# Port that clients use to connect to this ZooKeeper service
clientPort=2182
# Address and ports of every ZooKeeper service in the ensemble
server.1=node1:2888:3888
server.2=node1:2889:3889
server.3=node1:2890:3890

Create a myid file containing 2 in zookeeper002/zkdata:

echo 2 > /opt/module/zookeeper002/zkdata/myid

Configure zookeeper003

First copy zoo_sample.cfg in zookeeper003/conf to zoo.cfg:

cd /opt/module/zookeeper003/conf
cp zoo_sample.cfg zoo.cfg

Edit zoo.cfg:

vim zoo.cfg

Change the configuration as follows:

# Step 1: set the data persistence directory
dataDir=/opt/module/zookeeper003/zkdata
# Port that clients use to connect to this ZooKeeper service
clientPort=2183
# Address and ports of every ZooKeeper service in the ensemble
server.1=node1:2888:3888
server.2=node1:2889:3889
server.3=node1:2890:3890

Create a myid file containing 3 in zookeeper003/zkdata:

echo 3 > /opt/module/zookeeper003/zkdata/myid
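
Repeating the same steps for three directories makes copy-and-paste slips easy, such as writing a myid into the wrong instance. A quick consistency check before starting the services might look like the sketch below. check_instances is a hypothetical helper that assumes the zookeeper001 to zookeeper00N layout and the zkdata directory name used in this section.

```shell
# Check that each instance directory zookeeper001..zookeeper00N is self-consistent:
# dataDir in its zoo.cfg points at its own zkdata directory, and myid equals its index.
# Returns 0 if everything matches, 1 otherwise.
check_instances() {
  local module_dir count i inst data_dir rc
  module_dir="$1"
  count="$2"
  rc=0
  i=1
  while [ "$i" -le "$count" ]; do
    inst="$(printf '%s/zookeeper%03d' "$module_dir" "$i")"
    # Take the last active dataDir line from this instance's configuration.
    data_dir="$(sed -n 's/^dataDir=//p' "$inst/conf/zoo.cfg" 2>/dev/null | tail -n 1)"
    if [ "$data_dir" != "$inst/zkdata" ]; then
      echo "instance $i: dataDir is '$data_dir', expected '$inst/zkdata'"
      rc=1
    fi
    if [ "$(cat "$inst/zkdata/myid" 2>/dev/null)" != "$i" ]; then
      echo "instance $i: myid does not contain $i"
      rc=1
    fi
    i=$((i + 1))
  done
  return "$rc"
}

# Intended use for the layout in this article:
# check_instances /opt/module 3 && echo "all instances look consistent"
```
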
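
Start and test the pseudo-distributed ZooKeeper cluster

Start the ZooKeeper services one after another. The election algorithm votes as the servers come up: a majority (2 of 3) first exists once the second server is running, and with no data on either server the one with the larger myid wins, so the service under zookeeper002 becomes the leader. The other services are followers. When only the first server has been started, status reports that the service is not running, because a majority of the ensemble's nodes is not yet up. An ensemble of 2n nodes tolerates the same number of failures as one of 2n-1 nodes, so an odd number of ZooKeeper nodes is recommended.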

Start zkServer.sh under zookeeper001
[root@node1 module]# cd /opt/module/
[root@node1 module]# ./zookeeper001/bin/zkServer.sh start # start zkServer

Output:

ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper001/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

Check the zkServer status

[root@node1 module]# ./zookeeper001/bin/zkServer.sh status # check the zkServer status

Output:

ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper001/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Error contacting service. It is probably not running.

Start zkServer.sh under zookeeper002
[root@node1 module]# cd /opt/module/
[root@node1 module]# ./zookeeper002/bin/zkServer.sh start # start zkServer

Output:

ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper002/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

Check the zkServer status

[root@node1 module]# ./zookeeper002/bin/zkServer.sh status # check the zkServer status

Output:

Using config: /opt/module/zookeeper002/bin/../conf/zoo.cfg
Client port found: 2182. Client address: localhost. Client SSL: false.
Mode: leader

Start zkServer.sh under zookeeper003
[root@node1 module]# cd /opt/module/
[root@node1 module]# ./zookeeper003/bin/zkServer.sh start # start zkServer

Output:

ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper003/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

Check the zkServer status

[root@node1 module]# ./zookeeper003/bin/zkServer.sh status # check the zkServer status

Output:

ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper003/bin/../conf/zoo.cfg
Client port found: 2183. Client address: localhost. Client SSL: false.
Mode: follower

After startup, the instance under zookeeper002, whose myid is 2, became the leader.