I. Deploy the Docker registry
In production, images are usually pulled from a local private image registry (docker registry).
1. Pull the registry image
[root@k8s-master ~]# docker pull docker.io/registry
Using default tag: latest
Trying to pull repository docker.io/library/registry ...
sha256:0e40793ad06ac099ba63b5a8fae7a83288e64b50fe2eafa2b59741de85fd3b97: Pulling from docker.io/library/registry
b7f33cc0b48e: Pull complete
46730e1e05c9: Pull complete
: Pull complete
0cf045fea0fd: Pull complete
b78a03aa98b7: Pull complete
Digest: sha256:0e40793ad06ac099ba63b5a8fae7a83288e64b50fe2eafa2b59741de85fd3b97
Status: Downloaded newer image for docker.io/registry:latest
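2. Start the registry
docker run -d -p 5000:5000 --name=registry --restart=always --privileged=true --log-driver=none -v /home/data/registrydata:/tmp/registry registry
Note: /home/data/registrydata sits on a fairly large partition; from now on all registry data is meant to be kept in this mounted directory. Registry 2.x images store their data under /var/lib/registry inside the container, so adjust the container-side path of the -v option if your image uses that location.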
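Once the container is running, it can help to confirm that the registry answers over HTTP before pushing anything. The following is a minimal sketch, not part of the original procedure: the function name registry_ready is hypothetical, and it assumes the registry serves plain HTTP on 10.0.0.211:5000, exposes the v2 API root at /v2/, and that curl is installed.

```shell
# Hypothetical helper: succeeds when the registry answers on its v2 API root.
# Assumes plain HTTP (no TLS) and that curl is available.
registry_ready() {
    addr="$1"    # host:port, e.g. 10.0.0.211:5000
    if curl -fsS -o /dev/null "http://${addr}/v2/"; then
        echo "registry at ${addr} is reachable"
    else
        echo "registry at ${addr} is NOT reachable" >&2
        return 1
    fi
}

# Usage (run on any host that can reach the registry):
#   registry_ready 10.0.0.211:5000
```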
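3. Retag the images on the Node and push them
① Take the images used to deploy the dashboard as an example; they are needed again later
Baidu Cloud download link: https://pan.baidu.com/s/1geKEADt#list/path=%2F Password: lbyp
② Upload them to the Node and push them to the registry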
docker load < dashboard.tar
docker load < podinfrastructure.tar
docker tag gcr.io/google_containers/kubernetes-dashboard-amd64:v1.7.1 10.0.0.211:5000/google_containers/kubernetes-dashboard-amd64:latest
docker tag registry.access.redhat.com/rhel7/pod-infrastructure:latest 10.0.0.211:5000/rhel7/pod-infrastructure:latest
docker push 10.0.0.211:5000/google_containers/kubernetes-dashboard-amd64:latest
docker push 10.0.0.211:5000/rhel7/pod-infrastructure:latest
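The tag-then-push pattern repeats for every image, so it can be wrapped in a small helper. This is a sketch under stated assumptions: local_ref and retag_and_push are hypothetical names, and the upstream reference is assumed to start with a registry host followed by "/" (as both images here do).

```shell
# Hypothetical helper: map an upstream image reference to a reference
# in the private registry, replacing the upstream host and the tag.
local_ref() {
    registry="$1"        # e.g. 10.0.0.211:5000
    image="$2"           # e.g. gcr.io/google_containers/kubernetes-dashboard-amd64:v1.7.1
    tag="${3:-latest}"   # tag to use in the private registry
    path="${image#*/}"   # drop the upstream registry host (text up to the first "/")
    path="${path%:*}"    # drop the upstream tag, if present
    echo "${registry}/${path}:${tag}"
}

# Retag and push one image; assumes the image is already loaded locally.
retag_and_push() {
    target="$(local_ref "$1" "$2" "${3:-latest}")"
    docker tag "$2" "$target" && docker push "$target"
}

# Usage:
#   retag_and_push 10.0.0.211:5000 gcr.io/google_containers/kubernetes-dashboard-amd64:v1.7.1
#   retag_and_push 10.0.0.211:5000 registry.access.redhat.com/rhel7/pod-infrastructure:latest
```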
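If the push fails with this error:
Get https://10.0.0.211:5000/v1/_ping: http: server gave HTTP response to HTTPS client
Fix (either method works; restart Docker afterwards)
Method ①: edit /etc/sysconfig/docker (vim /etc/sysconfig/docker) and add
OPTIONS='--insecure-registry 10.0.0.211:5000'
Method ②: write the setting to /etc/docker/daemon.json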
echo '{ "insecure-registries":["10.0.0.211:5000"] }' > /etc/docker/daemon.json
systemctl restart docker.service
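Redirecting echo into /etc/docker/daemon.json overwrites whatever the file already contains. A slightly more careful variant keeps a backup first. This is only a sketch: write_insecure_registry is a hypothetical name, and the second argument exists purely so the target path can be overridden.

```shell
# Hypothetical helper: write an insecure-registries entry to a Docker daemon
# config file, keeping a .bak copy of any file that is already there.
# Note: this replaces the whole file; merge by hand if it holds other settings.
write_insecure_registry() {
    registry="$1"                           # e.g. 10.0.0.211:5000
    target="${2:-/etc/docker/daemon.json}"  # path is overridable for testing
    if [ -f "$target" ]; then
        cp "$target" "${target}.bak"
    fi
    printf '{ "insecure-registries": ["%s"] }\n' "$registry" > "$target"
}

# Usage (as root), followed by a daemon restart:
#   write_insecure_registry 10.0.0.211:5000
#   systemctl restart docker.service
```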
4. Pull the images from the local registry on the Master node
docker pull 10.0.0.211:5000/google_containers/kubernetes-dashboard-amd64:latest
docker pull 10.0.0.211:5000/rhel7/pod-infrastructure
Check the result, for example with docker images.
II. Deploy the dashboard
1. Edit the dashboard.yaml file
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  # Keep the name in sync with image version and
  # gce/coreos/kube-manifests/addons/dashboard counterparts
  name: kubernetes-dashboard-latest
  namespace: kube-system
spec:
  replicas:
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
        version: latest
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - name: kubernetes-dashboard
        image: 10.0.0.211:5000/google_containers/kubernetes-dashboard-amd64:latest
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        ports:
        - containerPort: 9090
        args:
        - --apiserver-host=http://10.0.0.211:8080
        livenessProbe:
          httpGet:
            path: /
            port: 9090
          initialDelaySeconds:
          timeoutSeconds:
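Note: the Dashboard image is defined in the YAML. In dashboard.yaml, change "image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.5.1" to "image: 10.0.0.211:5000/google_containers/kubernetes-dashboard-amd64:latest", which is the path that was pushed to the local registry.
2. Edit the dashboardsvc.yaml file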
apiVersion: v1
kind: Service
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    kubernetes.io/cluster-service: "true"
spec:
  selector:
    k8s-app: kubernetes-dashboard
  ports:
  - port:
    targetPort: 9090
3. Create the resources on the Master node
kubectl create -f dashboard.yaml
kubectl create -f dashboardsvc.yaml
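The two create commands return as soon as the objects are accepted, not when the pod is actually running. One way to wait for the Deployment is sketched below; deploy_dashboard is a hypothetical name, and the sketch assumes your kubectl version provides the rollout status subcommand.

```shell
# Hypothetical helper: create both objects, then block until the Deployment
# reports that its rollout has finished. Stops at the first failing command.
deploy_dashboard() {
    kubectl create -f dashboard.yaml &&
    kubectl create -f dashboardsvc.yaml &&
    kubectl rollout status deployment/kubernetes-dashboard-latest --namespace=kube-system
}

# Usage (on the Master node, in the directory holding both YAML files):
#   deploy_dashboard
```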
4. Verify on the Master node
[root@k8s-master ~]# kubectl get deployment --all-namespaces
NAMESPACE NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
kube-system kubernetes-dashboard-latest 3h
[root@k8s-master ~]# kubectl get svc --all-namespaces
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes 10.254.0.1 <none> /TCP 2d
kube-system kubernetes-dashboard 10.254.233.11 <none> /TCP 3h
[root@k8s-master ~]# kubectl get pod -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system kubernetes-dashboard-latest--09p76 / Running 3h 172.16.6.2 k8s-node-
5. Open it in a browser
http://10.0.0.211:8080/ui
6. Remove the application
Run on the Master node:
kubectl delete deployment kubernetes-dashboard-latest --namespace=kube-system
kubectl delete svc kubernetes-dashboard --namespace=kube-system
III. Problems encountered during deployment
Problem 1:
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "no endpoints available for service \"kubernetes-dashboard\"",
"reason": "ServiceUnavailable",
"code": 503
}
Fix: the pause-amd64 image is required
docker pull googlecontainer/pause-amd64:3.0
docker tag googlecontainer/pause-amd64:3.0 gcr.io/google_containers/pause-amd64:3.0
kubectl delete -f dashboard.yaml
kubectl delete -f dashboardsvc.yaml
kubectl create -f dashboard.yaml
kubectl create -f dashboardsvc.yaml
To see the error in detail:
kubectl describe pod kubernetes-dashboard-latest-bf59c4df4-xcblq --namespace kube-system
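The pod name changes every time the Deployment is recreated, so selecting by label avoids looking it up by hand. A minimal sketch, assuming the k8s-app=kubernetes-dashboard label from dashboard.yaml; dashboard_diagnose is a hypothetical name.

```shell
# Hypothetical helper: show why the Service has no endpoints by listing the
# endpoints object and the pods that the Service selector should match.
dashboard_diagnose() {
    ns="kube-system"
    selector="k8s-app=kubernetes-dashboard"
    kubectl get endpoints kubernetes-dashboard --namespace="$ns"
    kubectl get pods --namespace="$ns" -l "$selector" -o wide
    kubectl describe pods --namespace="$ns" -l "$selector"
}

# Usage (on the Master node):
#   dashboard_diagnose
```

The Events section at the end of the describe output is usually where image pull failures show up.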
Problem 2: after deployment, the browser shows this error
Error: 'dial tcp 172.16.6.2:9090: getsockopt: connection refused'
Trying to reach: 'http://172.16.6.2:9090/'
Cause: iptables is blocking forwarded traffic. Fix:
iptables -P FORWARD ACCEPT
or
echo "net.ipv4.ip_forward = 1" >>/usr/lib/sysctl.d/50-default.conf
To make the change permanent, modify the Docker service unit file
vim /etc/systemd/system/docker.service # add one line
[Service]
ExecStartPost=/sbin/iptables -I FORWARD -s 0.0.0.0/0 -j ACCEPT
Troubleshooting summary:
① Check that the apiserver address is set correctly, then that flannel is configured and running; make sure docker0 and flannel0 are in the same network segment
② Check that the flannel configuration is identical on the master and the nodes
# Flanneld configuration options
# etcd url location. Point this to the server where etcd runs
FLANNEL_ETCD_ENDPOINTS="http://10.0.0.211:2379"
# etcd config key. This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_PREFIX="/atomic.io/network"
# Any additional options that you want to pass
#FLANNEL_OPTIONS=""
③ Run iptables -L on the node and check the FORWARD chain; if its policy is DROP, allow forwarding:
iptables -P FORWARD ACCEPT
or
echo "net.ipv4.ip_forward = 1" >>/usr/lib/sysctl.d/50-default.conf
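The FORWARD check in step ③ can be scripted so that the policy is changed only when it is actually DROP. This is a sketch rather than a definitive fix: forward_policy and ensure_forward_accept are hypothetical names, and it assumes iptables -S FORWARD prints the chain policy as a line of the form "-P FORWARD <policy>".

```shell
# Hypothetical helper: print the current default policy of the FORWARD chain.
forward_policy() {
    iptables -S FORWARD | awk '$1 == "-P" && $2 == "FORWARD" { print $3 }'
}

# Hypothetical helper: switch the FORWARD policy to ACCEPT only if it is DROP.
ensure_forward_accept() {
    policy="$(forward_policy)"
    if [ "$policy" = "DROP" ]; then
        echo "FORWARD policy is DROP, switching to ACCEPT"
        iptables -P FORWARD ACCEPT
    else
        echo "FORWARD policy is ${policy}, nothing to do"
    fi
}

# Usage (as root, on each node):
#   ensure_forward_accept
```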