Note (2015-12-14): the requirement of adding new nodes without changing the parameters of nodes that are already running has since been met; it will be written up in follow-up posts.
Note: the current solution does not support adding a new node (a master or a regionserver) without changing the parameters of the already running nodes; see Part 6 for the discussion.
Part 1: Background
First, a quick look at the components of HBase:
Master: the Master manages the RegionServer cluster, handling load balancing, resource allocation and so on. It can itself run as a cluster, but only one master is active at any given time; when the active master goes down, ZooKeeper fails over to one of the standby masters.
RegionServer: handles the actual reads and writes of the data.
ZooKeeper: maintains cluster metadata and monitors cluster state to guard against single points of failure. HBase can be deployed with its bundled ZooKeeper or with a standalone cluster; either way, it is a prerequisite for running HBase.
HDFS: data written to HBase is ultimately persisted to HDFS, which is the other prerequisite for running HBase.
Part 2: Deploying skyDNS
skyDNS is not strictly required, but with it in place you can bind domain names to Kubernetes services and use those names instead of service IPs when configuring HBase. skyDNS in a Kubernetes environment consists of three parts: etcd, which stores the name-to-IP mappings; skyDNS itself, which answers DNS queries; and kube2sky, the bridge that feeds Kubernetes service information into skyDNS. A service name resolves as service_name.namespace.k8s_cluster_domain. The Kubernetes documentation has a brief guide to deploying skyDNS (see here); it uses images from Google's registry, which may be unreachable from mainland China, so you can substitute images from Docker Hub instead, for example:
skyDNS: docker pull shenshouer/skydns:2015-09-22
kube2sky: docker pull shenshouer/kube2sky:1.11
ETCD: any 2.x version of etcd will do; to stay consistent with the images above you can use docker pull shenshouer/etcd:2.0.9
After pulling the images, tag them and push them to your private registry.
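A minimal sketch of that tag-and-push step, using the private registry 10.11.150.76:5000 referenced by the yaml files below (the target repository names here are illustrative; make them match whatever your yaml actually pulls):
# tag the images pulled from Docker Hub for the private registry
docker tag shenshouer/skydns:2015-09-22 10.11.150.76:5000/openxxs/skydns:2015-09-22
docker tag shenshouer/kube2sky:1.11 10.11.150.76:5000/openxxs/kube2sky:1.11
docker tag shenshouer/etcd:2.0.9 10.11.150.76:5000/openxxs/etcd:2.0.9
# push them so every kubelet node can pull them
docker push 10.11.150.76:5000/openxxs/skydns:2015-09-22
docker push 10.11.150.76:5000/openxxs/kube2sky:1.11
docker push 10.11.150.76:5000/openxxs/etcd:2.0.9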
Below we create one service and one pod to deploy skyDNS. The skyDNS service address is set to 172.16.40.1 (53/UDP, 53/TCP); note that this IP must fall within the service subnet configured when kube-apiserver was started. The cluster domain suffix is domeos.sohu; it is best to choose a suffix made of at least two labels, otherwise kube2sky may run into problems (see the discussion here).
First, create the skydns.yaml file:
apiVersion: v1
kind: Service
metadata:
name: kube-dns
labels:
app: kube-dns
version: v8
spec:
selector:
app: kube-dns
version: v8
type: ClusterIP
clusterIP: 172.16.40.1
ports:
- name: dns
port: 53
protocol: UDP
- name: dns-tcp
port: 53
protocol: TCP
---
apiVersion: v1
kind: ReplicationController
metadata:
name: kube-dns-v8
labels:
app: kube-dns
version: v8
spec:
replicas: 1
selector:
app: kube-dns
version: v8
template:
metadata:
labels:
app: kube-dns
version: v8
spec:
containers:
- name: etcd
image: 10.11.150.76:5000/openxxs/etcd:2.0.3
command:
- "etcd"
args:
- "--data-dir=/var/etcd/data"
- "--listen-client-urls=http://127.0.0.1:2379,http://127.0.0.1:4001"
- "--advertise-client-urls=http://127.0.0.1:2379,http://127.0.0.1:4001"
- "--initial-cluster-token=skydns-etcd"
volumeMounts:
- name: etcd-storage
mountPath: /var/etcd/data
- name: kube2sky
image: 10.11.150.76:5000/openxxs/kube2sky:k8s-dns
args:
- "--domain=domeos.sohu"
- "--kube_master_url=http://10.16.42.200:8080"
- name: skydns
image: 10.11.150.76:5000/openxxs/skydns:2015-09-22
args:
- "--machines=http://localhost:4001"
- "--addr=0.0.0.0:53"
- "--domain=domeos.sohu"
ports:
- containerPort: 53
name: dns
protocol: UDP
- containerPort: 53
name: dns-tcp
protocol: TCP
volumes:
- name: etcd-storage
emptyDir: {}
dnsPolicy: Default
The --kube_master_url argument of kube2sky points at the kube-apiserver address; the --domain passed to kube2sky must match the --domain passed to skydns.
Then create the service and pod with kubectl create -f skydns.yaml:
$kubectl create -f skydns.yaml
service "kube-dns" created
replicationcontroller "kube-dns-v8" created
$kubectl get pods
NAME READY STATUS RESTARTS AGE
kube-dns-v8-61aie 3/3 Running 0 9s
$kubectl get service
NAME CLUSTER_IP EXTERNAL_IP PORT(S) SELECTOR AGE
kube-dns 172.16.40.1 <none> 53/UDP,53/TCP app=kube-dns,version=v8 6m
Finally, restart kubelet with the DNS flags --cluster_dns and --cluster_domain; their values must match what was written in the yaml file above, e.g.:
./kubelet --logtostderr=true --v=0 --api_servers=http://bx-42-200:8080 --address=0.0.0.0 --hostname_override=bx-42-198 --allow_privileged=false --pod-infra-container-image=10.11.150.76:5000/kubernetes/pause:latest --cluster_dns=172.16.40.1 --cluster_domain=domeos.sohu &
Note: only pods created after kubelet has been restarted with these DNS flags will use skyDNS.
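As a quick sanity check (a sketch, assuming the pod was created after the kubelet restart and its image ships nslookup, e.g. via bind-utils):
# from inside any pod created after the kubelet restart
cat /etc/resolv.conf    # should list nameserver 172.16.40.1 and a search path containing domeos.sohu
nslookup kube-dns.default.domeos.sohu
# the name should resolve to the skyDNS service address 172.16.40.1;
# if it does not, re-check the --cluster_dns / --cluster_domain kubelet flags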
At this point, entering the etcd container shows that the Kubernetes service domain records have been written into etcd:
$ docker exec -it 13e243510e3e sh
/ # etcdctl ls --recursive /
/skydns
/skydns/sohu
/skydns/sohu/domeos
/skydns/sohu/domeos/default
/skydns/sohu/domeos/default/kube-dns
/skydns/sohu/domeos/default/kubernetes
/skydns/sohu/domeos/default/zookeeper-
/skydns/sohu/domeos/default/zookeeper-
/skydns/sohu/domeos/default/zookeeper-
/skydns/sohu/domeos/svc
/skydns/sohu/domeos/svc/default
/skydns/sohu/domeos/svc/default/zookeeper-
/skydns/sohu/domeos/svc/default/zookeeper-/b8757496
/skydns/sohu/domeos/svc/default/zookeeper-
/skydns/sohu/domeos/svc/default/zookeeper-/8687b21f
/skydns/sohu/domeos/svc/default/kube-dns
/skydns/sohu/domeos/svc/default/kube-dns/a9f11e6f
/skydns/sohu/domeos/svc/default/kubernetes
/skydns/sohu/domeos/svc/default/kubernetes/cf07aead
/skydns/sohu/domeos/svc/default/zookeeper-
/skydns/sohu/domeos/svc/default/zookeeper-/
/ # etcdctl get /skydns/sohu/domeos/default/zookeeper-1
{"host":"172.16.11.1","priority":,"weight":,"ttl":,"targetstrip":}
Take the record /skydns/sohu/domeos/default/zookeeper-1 as an example: its domain name is zookeeper-1.default.domeos.sohu, the IP is 172.16.11.1, the service name is zookeeper-1, the Kubernetes namespace is default, and the configured cluster domain is domeos.sohu. Any pod created after the kubelet restart can reach the zookeeper-1 service as zookeeper-1.default.domeos.sohu, e.g.:
[@bx_42_199 ~]# docker exec -it 0662660e8708 /bin/bash
[root@test--2h0fx /]# curl zookeeper-1.default.domeos.sohu:2181
curl: (52) Empty reply from server
Part 3: Deploying the HDFS Cluster
HDFS consists of a namenode and datanodes. First pull suitable images from Docker Hub, then tag and push them to the private registry:
# pull the remote images
docker pull bioshrek/hadoop-hdfs-datanode:cdh5
docker pull bioshrek/hadoop-hdfs-namenode:cdh5
# tag them
# docker tag <image ID> <your private registry IP:PORT/name:TAG>
docker tag c89c3ebcccae 10.11.150.76:5000/hdfs-datanode:latest
docker tag ca19d4c7e359 10.11.150.76:5000/hdfs-namenode:latest
# push them to the registry
docker push 10.11.150.76:5000/hdfs-datanode:latest
docker push 10.11.150.76:5000/hdfs-namenode:latest
Then create the following hdfs.yaml file:
apiVersion: v1
kind: Service
metadata:
name: hdfs-namenode-service
spec:
selector:
app: hdfs-namenode
type: ClusterIP
clusterIP: "172.16.20.1"
ports:
- name: rpc
port:
targetPort:
- name: p1
port:
- name: p2
port:
- name: p3
port:
- name: p4
port:
- name: p5
port:
- name: p6
port:
- name: p7
port:
- name: p8
port:
- name: p9
port:
- name: p10
port:
- name: p11
port:
- name: p12
port:
- name: p13
port:
- name: p14
port:
---
apiVersion: v1
kind: ReplicationController
metadata:
name: hdfs-namenode-1
spec:
replicas:
template:
metadata:
labels:
app: hdfs-namenode
spec:
containers:
- name: hdfs-namenode
image: 10.11.150.76:5000/hdfs-namenode:latest
volumeMounts:
- name: data1
mountPath: /var/lib/hadoop-hdfs/cache/hdfs/dfs/name
- name: data2
mountPath: /home/chianyu/shared_with_docker_container/cdh5/nn
ports:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
nodeSelector:
kubernetes.io/hostname: bx--
volumes:
- hostPath:
path: /data1/kubernetes/hdfs-namenode/data1
name: data1
- hostPath:
path: /data1/kubernetes/hdfs-namenode/data2
name: data2
---
apiVersion: v1
kind: ReplicationController
metadata:
name: hdfs-datanode-1
spec:
replicas:
template:
metadata:
labels:
app: hdfs-datanode
server-id: ""
spec:
containers:
- name: hdfs-datanode-
image: 10.11.150.76:5000/hdfs-datanode:latest
volumeMounts:
- name: data1
mountPath: /var/lib/hadoop-hdfs/cache/hdfs/dfs/name
- name: data2
mountPath: /home/chianyu/shared_with_docker_container/cdh5/dn
env:
- name: HDFSNAMENODERPC_SERVICE_HOST
value: "172.16.20.1"
- name: HDFSNAMENODERPC_SERVICE_PORT
value: ""
ports:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
nodeSelector:
kubernetes.io/hostname: bx--
volumes:
- hostPath:
path: /data1/kubernetes/hdfs-datanode1/data1
name: data1
- hostPath:
path: /data1/kubernetes/hdfs-datanode1/data2
name: data2
---
apiVersion: v1
kind: ReplicationController
metadata:
name: hdfs-datanode-2
spec:
replicas:
template:
metadata:
labels:
app: hdfs-datanode
server-id: ""
spec:
containers:
- name: hdfs-datanode-
image: 10.11.150.76:5000/hdfs-datanode:latest
volumeMounts:
- name: data1
mountPath: /var/lib/hadoop-hdfs/cache/hdfs/dfs/name
- name: data2
mountPath: /home/chianyu/shared_with_docker_container/cdh5/dn
env:
- name: HDFSNAMENODERPC_SERVICE_HOST
value: "172.16.20.1"
- name: HDFSNAMENODERPC_SERVICE_PORT
value: ""
ports:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
nodeSelector:
kubernetes.io/hostname: bx--
volumes:
- name: data1
hostPath:
path: /data2/kubernetes/hdfs-datanode2/data1
- name: data2
hostPath:
path: /data2/kubernetes/hdfs-datanode2/data2
---
apiVersion: v1
kind: ReplicationController
metadata:
name: hdfs-datanode-3
spec:
replicas:
template:
metadata:
labels:
app: hdfs-datanode
server-id: ""
spec:
containers:
- name: hdfs-datanode-
image: 10.11.150.76:5000/hdfs-datanode:latest
volumeMounts:
- name: data1
mountPath: /var/lib/hadoop-hdfs/cache/hdfs/dfs/name
- name: data2
mountPath: /home/chianyu/shared_with_docker_container/cdh5/dn
env:
- name: HDFSNAMENODERPC_SERVICE_HOST
value: "172.16.20.1"
- name: HDFSNAMENODERPC_SERVICE_PORT
value: ""
ports:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
- containerPort:
nodeSelector:
kubernetes.io/hostname: bx--
volumes:
- name: data1
hostPath:
path: /data3/kubernetes/hdfs-datanode3/data1
- name: data2
hostPath:
path: /data3/kubernetes/hdfs-datanode3/data2
Running kubectl create -f hdfs.yaml creates one service named hdfs-namenode-service and four ReplicationControllers named hdfs-namenode-1, hdfs-datanode-1, hdfs-datanode-2 and hdfs-datanode-3. kubectl get services/rc/pods shows that the corresponding service and pods have all started normally.
Next, test whether HDFS is actually usable:
# list the HDFS pods
kubectl get pods
# use describe to see which k8s node a pod runs on
kubectl describe pod hdfs-datanode--h4jvt
# enter the container
docker ps | grep hdfs-datanode-
docker exec -it 2e2c4df0c0a9 /bin/bash
# switch to the hdfs user
su hdfs
# create a directory
hadoop fs -mkdir /test
# create a local file
echo "Hello" > hello
# copy the local file into HDFS
hadoop fs -put hello /test
# list the file in HDFS
hadoop fs -ls /test
# similarly, you can docker exec into any other datanode and see the same file, e.g.:
root@hdfs-datanode--nek2l:/# hadoop fs -ls /test
Found items
-rw-r--r-- hdfs hadoop -- : /test/hello
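As an extra health check you can also ask the namenode for a cluster report (a sketch; it assumes the image ships the standard hdfs CLI, which the CDH5 images normally do):
# run inside any HDFS container
su hdfs -c "hdfs dfsadmin -report"
# the report should list three live datanodes, one per hdfs-datanode RC;
# a datanode that cannot reach the namenode service will be missing from the list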
Part 4: Deploying the ZooKeeper Cluster
The image is based on fabric8/zookeeper with a few modifications; the modified Dockerfile looks like this:
FROM jboss/base-jdk:
MAINTAINER iocanel@gmail.com
USER root

ENV ZOOKEEPER_VERSION 3.4.6
EXPOSE 2181 2888 3888

RUN yum -y install wget bind-utils && yum clean all \
 && wget -q -O - http://apache.mirrors.pair.com/zookeeper/zookeeper-${ZOOKEEPER_VERSION}/zookeeper-${ZOOKEEPER_VERSION}.tar.gz | tar -xzf - -C /opt \
 && mv /opt/zookeeper-${ZOOKEEPER_VERSION} /opt/zookeeper \
 && cp /opt/zookeeper/conf/zoo_sample.cfg /opt/zookeeper/conf/zoo.cfg \
 && mkdir -p /opt/zookeeper/{data,log}

WORKDIR /opt/zookeeper
VOLUME ["/opt/zookeeper/conf", "/opt/zookeeper/data", "/opt/zookeeper/log"]

COPY config-and-run.sh ./bin/
COPY zoo.cfg ./conf/

CMD ["/opt/zookeeper/bin/config-and-run.sh"]
The zoo.cfg file:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/opt/zookeeper/data
# This option will direct the machine to write the transaction log to the dataLogDir rather than the dataDir. This allows a dedicated log device to be used, and helps avoid competition between logging and snapshots.
dataLogDir=/opt/zookeeper/log
# the port at which the clients will connect
clientPort=2181
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
The config-and-run.sh script:
#!/bin/bash

echo "$SERVER_ID / $MAX_SERVERS"
if [ ! -z "$SERVER_ID" ] && [ ! -z "$MAX_SERVERS" ]; then
  echo "Starting up in clustered mode"
  echo "" >> /opt/zookeeper/conf/zoo.cfg
  echo "#Server List" >> /opt/zookeeper/conf/zoo.cfg
  for i in $( eval echo {1..$MAX_SERVERS});do
    HostEnv="ZOOKEEPER_${i}_SERVICE_HOST"
    HOST=${!HostEnv}
    FollowerPortEnv="ZOOKEEPER_${i}_SERVICE_PORT_FOLLOWERS"
    FOLLOWERPORT=${!FollowerPortEnv}
    ElectionPortEnv="ZOOKEEPER_${i}_SERVICE_PORT_ELECTION"
    ELECTIONPORT=${!ElectionPortEnv}
    if [ "$SERVER_ID" = "$i" ];then
      echo "server.$i=0.0.0.0:$FOLLOWERPORT:$ELECTIONPORT" >> /opt/zookeeper/conf/zoo.cfg
    else
      echo "server.$i=$HOST:$FOLLOWERPORT:$ELECTIONPORT" >> /opt/zookeeper/conf/zoo.cfg
    fi
  done
  cat /opt/zookeeper/conf/zoo.cfg
  # Persists the ID of the current instance of Zookeeper
  echo ${SERVER_ID} > /opt/zookeeper/data/myid
else
  echo "Starting up in standalone mode"
fi

exec /opt/zookeeper/bin/zkServer.sh start-foreground
Once these files are ready, build the image and push it to the private registry (the image name used here is 10.11.150.76:5000/zookeeper-kb:3.4.6-1).
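A minimal sketch of that build-and-push step, run in the directory holding the Dockerfile, zoo.cfg and config-and-run.sh:
# build the modified zookeeper image and push it to the private registry
docker build -t 10.11.150.76:5000/zookeeper-kb:3.4.6-1 .
docker push 10.11.150.76:5000/zookeeper-kb:3.4.6-1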
Next, create the zookeeper.yaml file:
apiVersion: v1
kind: Service
metadata:
name: zookeeper-1
labels:
name: zookeeper-1
spec:
ports:
- name: client
port:
targetPort:
- name: followers
port:
targetPort:
- name: election
port:
targetPort:
selector:
name: zookeeper
server-id: ""
type: ClusterIP
clusterIP: 172.16.11.1
---
apiVersion: v1
kind: Service
metadata:
name: zookeeper-2
labels:
name: zookeeper-2
spec:
ports:
- name: client
port:
targetPort:
- name: followers
port:
targetPort:
- name: election
port:
targetPort:
selector:
name: zookeeper
server-id: ""
type: ClusterIP
clusterIP: 172.16.11.2
---
apiVersion: v1
kind: Service
metadata:
name: zookeeper-3
labels:
name: zookeeper-3
spec:
ports:
- name: client
port:
targetPort:
- name: followers
port:
targetPort:
- name: election
port:
targetPort:
selector:
name: zookeeper
server-id: ""
type: ClusterIP
clusterIP: 172.16.11.3
---
apiVersion: v1
kind: ReplicationController
metadata:
name: zookeeper-1
spec:
replicas:
template:
metadata:
labels:
name: zookeeper
server-id: ""
spec:
volumes:
- hostPath:
path: /data1/kubernetes/zookeeper/data1
name: data
- hostPath:
path: /data1/kubernetes/zookeeper/log1
name: log
containers:
- name: server
image: 10.11.150.76:5000/zookeeper-kb:3.4.6-1
env:
- name: SERVER_ID
value: "1"
- name: MAX_SERVERS
value: "3"
ports:
- containerPort:
- containerPort:
- containerPort:
volumeMounts:
- mountPath: /opt/zookeeper/data
name: data
- mountPath: /opt/zookeeper/log
name: log
nodeSelector:
kubernetes.io/hostname: bx--
---
apiVersion: v1
kind: ReplicationController
metadata:
name: zookeeper-2
spec:
replicas:
template:
metadata:
labels:
name: zookeeper
server-id: ""
spec:
volumes:
- hostPath:
path: /data1/kubernetes/zookeeper/data2
name: data
- hostPath:
path: /data1/kubernetes/zookeeper/log2
name: log
containers:
- name: server
image: 10.11.150.76:5000/zookeeper-kb:3.4.6-1
env:
- name: SERVER_ID
value: "2"
- name: MAX_SERVERS
value: "3"
ports:
- containerPort:
- containerPort:
- containerPort:
volumeMounts:
- mountPath: /opt/zookeeper/data
name: data
- mountPath: /opt/zookeeper/log
name: log
nodeSelector:
kubernetes.io/hostname: bx--
---
apiVersion: v1
kind: ReplicationController
metadata:
name: zookeeper-3
spec:
replicas:
template:
metadata:
labels:
name: zookeeper
server-id: ""
spec:
volumes:
- hostPath:
path: /data1/kubernetes/zookeeper/data3
name: data
- hostPath:
path: /data1/kubernetes/zookeeper/log3
name: log
containers:
- name: server
image: 10.11.150.76:5000/zookeeper-kb:3.4.6-1
env:
- name: SERVER_ID
value: "3"
- name: MAX_SERVERS
value: "3"
ports:
- containerPort:
- containerPort:
- containerPort:
volumeMounts:
- mountPath: /opt/zookeeper/data
name: data
- mountPath: /opt/zookeeper/log
name: log
nodeSelector:
kubernetes.io/hostname: bx--
Running kubectl create -f zookeeper.yaml creates the three services and the corresponding RCs. Note that the containers map ZooKeeper's data and log directories onto the corresponding host directories for persistent storage.
Once everything is created, test it:
# enter one of the zookeeper containers, find zkCli.sh, and use it as a test client
/opt/zookeeper/bin/zkCli.sh
# connect to one of the zookeeper services created by k8s (any of the three will do)
[zk: localhost:(CONNECTED) ] connect 172.16.11.2:
# inspect the znode tree
[zk: 172.16.11.2:(CONNECTED) ] ls /
[zookeeper]
[zk: 172.16.11.2:(CONNECTED) ] get /zookeeper
cZxid = 0x0
ctime = Thu Jan :: UTC
mZxid = 0x0
mtime = Thu Jan :: UTC
pZxid = 0x0
cversion = -
dataVersion =
aclVersion =
ephemeralOwner = 0x0
dataLength =
numChildren =
[zk: 172.16.11.2:(CONNECTED) ]
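You can also check which role each node ended up with (a quick sketch; zkServer.sh ships with the image and reads the zoo.cfg generated by config-and-run.sh):
# run inside each of the three zookeeper containers
/opt/zookeeper/bin/zkServer.sh status
# one container should report "Mode: leader" and the other two "Mode: follower";
# if all of them report standalone mode, the server.N entries were not generated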
Part 5: Deploying HBase
With the prerequisites in place, we can now deploy an HBase cluster with two masters and two regionservers; the two masters sit on two different nodes, as do the two regionservers, and the cluster uses the standalone HDFS and ZooKeeper services deployed above.
First we need an HBase image; the version used here is hbase-0.98.10.1-hadoop2. The Dockerfile is as follows:
FROM centos:6.6
MAINTAINER openxxs <xiaoshengxu@sohu-inc.com>

RUN yum install -y java-1.7.0-openjdk-devel.x86_64
ENV JAVA_HOME=/usr/lib/jvm/jre

RUN yum install -y nc \
 && yum install -y tar \
 && mkdir /hbase-setup

WORKDIR /hbase-setup

COPY hbase-0.98.10.1-hadoop2-bin.tar.gz /hbase-setup/hbase-0.98.10.1-hadoop2-bin.tar.gz
RUN tar zxf hbase-0.98.10.1-hadoop2-bin.tar.gz -C /opt/ \
 && ln -s /opt/hbase-0.98.10.1-hadoop2 /opt/hbase

ADD hbase-site.xml /opt/hbase/conf/hbase-site.xml
ADD start-k8s-hbase.sh /opt/hbase/bin/start-k8s-hbase.sh
RUN chmod +x /opt/hbase/bin/start-k8s-hbase.sh

WORKDIR /opt/hbase/bin
ENV PATH=$PATH:/opt/hbase/bin

CMD /opt/hbase/bin/start-k8s-hbase.sh
The hbase-site.xml configuration file:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master.port</name>
<value>@HBASE_MASTER_PORT@</value>
</property>
<property>
<name>hbase.master.info.port</name>
<value>@HBASE_MASTER_INFO_PORT@</value>
</property>
<property>
<name>hbase.regionserver.port</name>
<value>@HBASE_REGION_PORT@</value>
</property>
<property>
<name>hbase.regionserver.info.port</name>
<value>@HBASE_REGION_INFO_PORT@</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://@HDFS_PATH@/@ZNODE_PARENT@</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>@ZOOKEEPER_IP_LIST@</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>@ZOOKEEPER_PORT@</value>
</property>
<property>
<name>zookeeper.znode.parent</name>
<value>/@ZNODE_PARENT@</value>
</property>
</configuration>
The startup script start-k8s-hbase.sh substitutes the parameters into the config, writes host entries into /etc/hosts, and then starts HBase:
#!/bin/bash

export HBASE_CONF_FILE=/opt/hbase/conf/hbase-site.xml
export HADOOP_USER_NAME=hdfs
export HBASE_MANAGES_ZK=false

sed -i "s/@HBASE_MASTER_PORT@/$HBASE_MASTER_PORT/g" $HBASE_CONF_FILE
sed -i "s/@HBASE_MASTER_INFO_PORT@/$HBASE_MASTER_INFO_PORT/g" $HBASE_CONF_FILE
sed -i "s/@HBASE_REGION_PORT@/$HBASE_REGION_PORT/g" $HBASE_CONF_FILE
sed -i "s/@HBASE_REGION_INFO_PORT@/$HBASE_REGION_INFO_PORT/g" $HBASE_CONF_FILE
sed -i "s/@HDFS_PATH@/$HDFS_SERVICE:$HDFS_PORT\/$ZNODE_PARENT/g" $HBASE_CONF_FILE
sed -i "s/@ZOOKEEPER_IP_LIST@/$ZOOKEEPER_SERVICE_LIST/g" $HBASE_CONF_FILE
sed -i "s/@ZOOKEEPER_PORT@/$ZOOKEEPER_PORT/g" $HBASE_CONF_FILE
sed -i "s/@ZNODE_PARENT@/$ZNODE_PARENT/g" $HBASE_CONF_FILE

for i in ${HBASE_MASTER_LIST[@]}
do
  arr=(${i//:/ })
  echo "${arr[0]} ${arr[1]}" >> /etc/hosts
done

for i in ${HBASE_REGION_LIST[@]}
do
  arr=(${i//:/ })
  echo "${arr[0]} ${arr[1]}" >> /etc/hosts
done

if [ "$HBASE_SERVER_TYPE" = "master" ]; then
  /opt/hbase/bin/hbase master start > logmaster.log 2>&1
elif [ "$HBASE_SERVER_TYPE" = "regionserver" ]; then
  /opt/hbase/bin/hbase regionserver start > logregion.log 2>&1
fi
Exporting HADOOP_USER_NAME=hdfs makes HBase talk to HDFS as the hdfs user (otherwise you get Permission Denied errors), and HBASE_MANAGES_ZK=false tells HBase not to start its bundled ZooKeeper. HBASE_MASTER_LIST maps the service address to the pod name for every master in the cluster other than the current one, and HBASE_REGION_LIST does the same for the other regionservers. Finally, HBASE_SERVER_TYPE determines whether the container starts a master or a regionserver.
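To make the host-file mapping concrete: inside hbase-master-1 (whose environment is defined in hbase.yaml below), the two loops amount to the following sketch, one hosts entry per "<service IP>:<pod name>" pair:
# with HBASE_MASTER_LIST="172.16.30.2:hbase-master-2" and
# HBASE_REGION_LIST="172.16.30.3:hbase-region-1 172.16.30.4:hbase-region-2"
# the loops append these lines to /etc/hosts:
cat >> /etc/hosts <<'EOF'
172.16.30.2 hbase-master-2
172.16.30.3 hbase-region-1
172.16.30.4 hbase-region-2
EOF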
With these files in place, build the HBase image and push it:
docker build -t 10.11.150.76:5000/openxxs/hbase:1.0 .
docker push 10.11.150.76:5000/openxxs/hbase:1.0
Then create the hbase.yaml file with the following content:
apiVersion: v1
kind: Service
metadata:
name: hbase-master-1
spec:
selector:
app: hbase-master
server-id: ""
type: ClusterIP
clusterIP: "172.16.30.1"
ports:
- name: rpc
port:
targetPort:
- name: info
port:
targetPort:
---
apiVersion: v1
kind: Service
metadata:
name: hbase-master-2
spec:
selector:
app: hbase-master
server-id: ""
type: ClusterIP
clusterIP: "172.16.30.2"
ports:
- name: rpc
port:
targetPort:
- name: info
port:
targetPort:
---
apiVersion: v1
kind: Service
metadata:
name: hbase-region-1
spec:
selector:
app: hbase-region
server-id: ""
type: ClusterIP
clusterIP: "172.16.30.3"
ports:
- name: rpc
port:
targetPort:
- name: info
port:
targetPort:
---
apiVersion: v1
kind: Service
metadata:
name: hbase-region-2
spec:
selector:
app: hbase-region
server-id: ""
type: ClusterIP
clusterIP: "172.16.30.4"
ports:
- name: rpc
port:
targetPort:
- name: info
port:
targetPort:
---
apiVersion: v1
kind: Pod
metadata:
name: hbase-master-1
labels:
app: hbase-master
server-id: ""
spec:
containers:
- name: hbase-master-1
image: 10.11.150.76:5000/openxxs/hbase:1.0
ports:
- containerPort:
- containerPort:
env:
- name: HBASE_SERVER_TYPE
value: master
- name: HBASE_MASTER_PORT
value: ""
- name: HBASE_MASTER_INFO_PORT
value: ""
- name: HBASE_REGION_PORT
value: ""
- name: HBASE_REGION_INFO_PORT
value: ""
- name: HDFS_SERVICE
value: "hdfs-namenode-service.default.domeos.sohu"
- name: HDFS_PORT
value: ""
- name: ZOOKEEPER_SERVICE_LIST
value: "zookeeper-1.default.domeos.sohu,zookeeper-2.default.domeos.sohu,zookeeper-3.default.domeos.sohu"
- name: ZOOKEEPER_PORT
value: ""
- name: ZNODE_PARENT
value: hbase
- name: HBASE_MASTER_LIST
value: "172.16.30.2:hbase-master-2"
- name: HBASE_REGION_LIST
value: "172.16.30.3:hbase-region-1 172.16.30.4:hbase-region-2"
restartPolicy: Always
nodeSelector:
kubernetes.io/hostname: bx--
---
apiVersion: v1
kind: Pod
metadata:
name: hbase-master-2
labels:
app: hbase-master
server-id: ""
spec:
containers:
- name: hbase-master-2
image: 10.11.150.76:5000/openxxs/hbase:1.0
ports:
- containerPort:
- containerPort:
env:
- name: HBASE_SERVER_TYPE
value: master
- name: HBASE_MASTER_PORT
value: ""
- name: HBASE_MASTER_INFO_PORT
value: ""
- name: HBASE_REGION_PORT
value: ""
- name: HBASE_REGION_INFO_PORT
value: ""
- name: HDFS_SERVICE
value: "hdfs-namenode-service.default.domeos.sohu"
- name: HDFS_PORT
value: ""
- name: ZOOKEEPER_SERVICE_LIST
value: "zookeeper-1.default.domeos.sohu,zookeeper-2.default.domeos.sohu,zookeeper-3.default.domeos.sohu"
- name: ZOOKEEPER_PORT
value: ""
- name: ZNODE_PARENT
value: hbase
- name: HBASE_MASTER_LIST
value: "172.16.30.1:hbase-master-1"
- name: HBASE_REGION_LIST
value: "172.16.30.3:hbase-region-1 172.16.30.4:hbase-region-2"
restartPolicy: Always
nodeSelector:
kubernetes.io/hostname: bx--
---
apiVersion: v1
kind: Pod
metadata:
name: hbase-region-1
labels:
app: hbase-region-
server-id: ""
spec:
containers:
- name: hbase-region-1
image: 10.11.150.76:5000/openxxs/hbase:1.0
ports:
- containerPort:
- containerPort:
env:
- name: HBASE_SERVER_TYPE
value: regionserver
- name: HBASE_MASTER_PORT
value: ""
- name: HBASE_MASTER_INFO_PORT
value: ""
- name: HBASE_REGION_PORT
value: ""
- name: HBASE_REGION_INFO_PORT
value: ""
- name: HDFS_SERVICE
value: "hdfs-namenode-service.default.domeos.sohu"
- name: HDFS_PORT
value: ""
- name: ZOOKEEPER_SERVICE_LIST
value: "zookeeper-1.default.domeos.sohu,zookeeper-2.default.domeos.sohu,zookeeper-3.default.domeos.sohu"
- name: ZOOKEEPER_PORT
value: ""
- name: ZNODE_PARENT
value: hbase
- name: HBASE_MASTER_LIST
value: "172.16.30.1:hbase-master-1 172.16.30.2:hbase-master-2"
- name: HBASE_REGION_LIST
value: "172.16.30.4:hbase-region-2"
restartPolicy: Always
nodeSelector:
kubernetes.io/hostname: bx--
---
apiVersion: v1
kind: Pod
metadata:
name: hbase-region-2
labels:
app: hbase-region-
server-id: ""
spec:
containers:
- name: hbase-region-2
image: 10.11.150.76:5000/openxxs/hbase:1.0
ports:
- containerPort:
- containerPort:
env:
- name: HBASE_SERVER_TYPE
value: regionserver
- name: HBASE_MASTER_PORT
value: ""
- name: HBASE_MASTER_INFO_PORT
value: ""
- name: HBASE_REGION_PORT
value: ""
- name: HBASE_REGION_INFO_PORT
value: ""
- name: HDFS_SERVICE
value: "hdfs-namenode-service.default.domeos.sohu"
- name: HDFS_PORT
value: ""
- name: ZOOKEEPER_SERVICE_LIST
value: "zookeeper-1.default.domeos.sohu,zookeeper-2.default.domeos.sohu,zookeeper-3.default.domeos.sohu"
- name: ZOOKEEPER_PORT
value: ""
- name: ZNODE_PARENT
value: hbase
- name: HBASE_MASTER_LIST
value: "172.16.30.1:hbase-master-1 172.16.30.2:hbase-master-2"
- name: HBASE_REGION_LIST
value: "172.16.30.3:hbase-region-1"
restartPolicy: Always
nodeSelector:
kubernetes.io/hostname: bx--
Notes on this yaml: it creates two master services and two regionserver services, plus the corresponding two master pods and two regionserver pods. restartPolicy: Always means a pod that dies will keep being restarted. Parameters are passed into the pods as environment variables: HDFS_SERVICE is the skyDNS name of the HDFS service (use the service IP instead if skyDNS is not set up), and the same applies to ZOOKEEPER_SERVICE_LIST. Entries in HBASE_MASTER_LIST have the form <master service IP>:<master pod name>, separated by spaces; HBASE_REGION_LIST uses the same format.
Now create the HBase services and pods and check on them:
# create the resources
$kubectl create -f hbase.yaml
service "hbase-master-1" created
service "hbase-master-2" created
service "hbase-region-1" created
service "hbase-region-2" created
pod "hbase-master-1" created
pod "hbase-master-2" created
pod "hbase-region-1" created
pod "hbase-region-2" created # 查看pods
$kubectl get pods
NAME READY STATUS RESTARTS AGE
hbase-master- / Running 5s
hbase-master- / Pending 5s
hbase-region- / Running 5s
hbase-region- / Pending 5s
hdfs-datanode--nek2l / Running 7d
hdfs-datanode--vkbbt / Running 7d
hdfs-datanode--h4jvt / Running 7d
hdfs-namenode--cl0pj / Running 7d
kube-dns-v8-x8igc / Running 4h
zookeeper--ojhmy / Running 12h
zookeeper--cr73i / Running 12h
zookeeper--79ls0 / Running 12h
# list the services
$kubectl get service
NAME CLUSTER_IP EXTERNAL_IP PORT(S) SELECTOR AGE
hbase-master- 172.16.30.1 <none> /TCP,/TCP app=hbase-master,server-id= 17m
hbase-master- 172.16.30.2 <none> /TCP,/TCP app=hbase-master,server-id= 17m
hbase-region- 172.16.30.3 <none> /TCP,/TCP app=hbase-region,server-id= 17m
hbase-region- 172.16.30.4 <none> /TCP,/TCP app=hbase-region,server-id= 17m
hdfs-namenode-service 172.16.20.1 <none> /TCP,/TCP,/TCP,/TCP,/TCP,/TCP,/TCP,/TCP,/TCP,/TCP,/TCP,/TCP,/TCP,/TCP,/TCP app=hdfs-namenode 7d
kube-dns 172.16.40.1 <none> /UDP,/TCP app=kube-dns,version=v8 10h
kubernetes 172.16.0.1 <none> /TCP <none> 12d
zookeeper- 172.16.11.1 <none> /TCP,/TCP,/TCP name=zookeeper,server-id= 13h
zookeeper- 172.16.11.2 <none> /TCP,/TCP,/TCP name=zookeeper,server-id= 13h
zookeeper- 172.16.11.3 <none> /TCP,/TCP,/TCP name=zookeeper,server-id= 13h
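With the services and pods up, it can help to double-check from inside one of the HBase containers that the parameters actually landed (a sketch; the container ID is a placeholder to fill in from docker ps):
# on the node where the pod was scheduled
docker ps | grep hbase-master
docker exec -it <container-id> /bin/bash
# the environment variables passed in through hbase.yaml
env | grep -E 'HBASE|HDFS|ZOOKEEPER|ZNODE'
# the placeholders in hbase-site.xml should have been replaced by start-k8s-hbase.sh
grep -A 1 'hbase.rootdir' /opt/hbase/conf/hbase-site.xml
# the entries written for the other masters and regionservers
cat /etc/hosts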
Through ZooKeeper's zkCli.sh you can see the master and rs entries under /hbase (the garbled characters are just a display-encoding issue and are harmless):
[zk: localhost:(CONNECTED) ] ls /hbase
[meta-region-server, backup-masters, table, draining, region-in-transition, table-lock, running, master, namespace, hbaseid, online-snapshot, replication, splitWAL, recovering-regions, rs]
[zk: localhost:(CONNECTED) ] ls /hbase/rs
[172.27.0.0,,, 172.28.0.115,,]
[zk: localhost:(CONNECTED) ] get /hbase/master
?master:??E*?O=PBUF base-master-???????*
cZxid = 0x100000186
ctime = Mon Nov :: UTC
mZxid = 0x100000186
mtime = Mon Nov :: UTC
pZxid = 0x100000186
cversion =
dataVersion =
aclVersion =
ephemeralOwner = 0x151563a5e37001a
dataLength =
numChildren =
You can docker exec into one of the HBase containers and run a few table operations to verify that HBase works:
# enter the hbase-master-2 container on node 198
[@bx_42_198 /opt/scs/openxxs]# docker exec -it f131fcf15a72 /bin/bash
# operate on HBase through the hbase shell
[root@hbase-master- bin]# hbase shell
-- ::, INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.98.10.1-hadoop2, rd5014b47660a58485a6bdd0776dea52114c7041e, Tue Feb :: PST
# check the status; the "dead" entry shown here is left over from earlier tests and is harmless
hbase(main)::> status
-- ::, WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
servers, dead, 1.5000 average load
# create a table
hbase(main)::> create 'test','id','name'
row(s) in 0.8330 seconds
=> Hbase::Table - test
# list the tables
hbase(main)::> list
TABLE
member
test
row(s) in 0.0240 seconds
=> ["member", "test"]
# insert a row
hbase(main)::> put 'test','test1','id:5','addon'
row(s) in 0.1540 seconds
# read the row back
hbase(main)::> get 'test','test1'
COLUMN CELL
id: timestamp=, value=addon
row(s) in 0.0490 seconds
# enter the hbase-master-1 container on node 199 and read the row that was inserted on 198
hbase(main)::> get 'test','test1'
-- ::, WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
COLUMN CELL
id: timestamp=, value=addon
row(s) in 0.2430 seconds
From the results above, HBase is now running on Kubernetes, if in a somewhat imperfect way.
Part 6: Discussion
The reason we have to maintain /etc/hosts entries ourselves is that HBase masters and regionservers identify one another by hostname, while the Kubernetes DNS only resolves services, not pods. Putting a master and a regionserver into the same pod would in turn raise disk-sharing conflicts (not investigated in detail yet). The following passage from a GitHub discussion states HBase's reliance on hostnames very plainly:
1. Master uses ZooKeeper to run master election, the winner puts its
hostname:port in ZooKeeper.
2. RegionServers will register themselves to the winning Master through its
hostname registered in ZooKeeper.
3. We have a RegionServer with meta table/region (which stores Region ->
RegionServer map), it stores its hostname:port in ZooKeeper.
4. When a client starts a request to HBase, it gets the hostname of the
RegionServer with meta table from ZooKeeper, scan the meta table, and find
out the hostname of the RegionServer it interests, and finally talks to the
RegionServer through hostname.
The approaches tried so far to work around this:
1. Configure skyDNS: skyDNS only resolves services and cannot resolve pod names. Inserting hostname records into skyDNS and maintaining them dynamically might solve the problem; this is currently being tried.
2. Tweak the various settings (name, generateName, and so on) used when creating the ReplicationController, Service, Pod and Container: none of it helped.
3. Change the hostname via a script after the container starts but before the master starts: Docker only allows the hostname to be set when the container is created (docker run has a hostname option for exactly that) and refuses to change it once the container is running, failing with: docker Error: hostname: you must be root to change the hostname. The message is misleading; it is Docker's design that forbids changing the hostname at runtime, not a permission problem, and root cannot change it either (see the sketch after this list).
4. Make HBase report an IP address to ZooKeeper instead of a hostname: this once looked like a promising solution, but writing the hostname into ZooKeeper is hard-coded in HBase and there is no configuration option for it. Someone published a patch (see here), but it did not test well.
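As a side note on item 3, the hostname of a container can only be chosen when the container is created, e.g. via docker run (a sketch; the image and names below are illustrative):
# hostname can be set at creation time
docker run -d --hostname hbase-master-1 --name hostname-test centos:6.6 sleep 3600
# but not afterwards: this fails even as root, with the misleading permission error quoted above
docker exec -it hostname-test hostname some-new-name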
A few more notes on the Part 5 deployment: bare pods are used instead of ReplicationControllers because Kubernetes appends random characters to the hostnames of containers managed by an RC to tell them apart, whereas for a bare pod the pod name and the hostname are identical; setting restartPolicy to Always is a small compensation for the robustness lost by not using RCs. Could the pod simply be named after its service IP or domain name? No: a hostname may not contain dots. The IPs written into /etc/hosts are those of the services rather than the pods, because a pod's IP is unknown before it runs and changes after a restart, while the service IP stays fixed; hence the serviceIP:PodName mapping.
The fundamental fix is for Kubernetes to support DNS resolution of hostnames (that is, of pods). The same hostname problem came up earlier when configuring ZooKeeper (see here), and Kafka, which will be deployed later, has it as well. The Kubernetes developers have discussed this at length and plan to address it (see here); hopefully the next release will ship the relevant settings.