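The previous article covered "Kubernetes container cluster management environment - complete deployment (part 2)" in detail. This article continues with the deployment of the Kubernetes cluster add-ons.
11. Kubernetes cluster add-ons
Add-ons are auxiliary components of a Kubernetes cluster that extend and round out its functionality. The add-ons covered here are coredns, Dashboard and Metrics Server. Note that the manifest YAML files of the add-ons bundled with Kubernetes pull from the gcr.io docker registry, which is blocked in mainland China. Either replace the image address manually with another registry, or download the images in advance on a server that can reach gcr.io and then sync them to the k8s machines.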
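As a rough illustration of replacing the registry address by hand, the sketch below rewrites the `image:` lines of a manifest to point at a mirror. The mirror prefix `registry.example.com/google_containers` is a hypothetical placeholder, not an address taken from this article; substitute whatever registry your nodes can actually reach.

```python
# Hypothetical mirror prefix (an assumption for illustration only).
MIRROR_PREFIX = "registry.example.com/google_containers"


def rewrite_image(image: str, mirror_prefix: str = MIRROR_PREFIX) -> str:
    """Point a k8s.gcr.io image reference at a mirror; leave other registries untouched."""
    blocked = "k8s.gcr.io/"
    if image.startswith(blocked):
        return f"{mirror_prefix}/{image[len(blocked):]}"
    return image


def rewrite_manifest(text: str, mirror_prefix: str = MIRROR_PREFIX) -> str:
    """Rewrite every `image:` line of a manifest passed in as a string."""
    out = []
    for line in text.splitlines():
        key, sep, value = line.partition("image:")
        # Only touch lines whose key is exactly `image:` (optionally a list item),
        # so `imagePullPolicy:` and comments are left alone.
        if sep and key.strip(" -") == "":
            line = f"{key}image: {rewrite_image(value.strip(), mirror_prefix)}"
        out.append(line)
    return "\n".join(out)


sample = "        image: k8s.gcr.io/coredns:1.3.1\n        imagePullPolicy: IfNotPresent"
print(rewrite_manifest(sample))
```

Editing the yaml by hand or with sed achieves the same thing; the point is only that the image name and tag stay the same while the registry prefix changes.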
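11.1 - Kubernetes cluster add-ons - coredns
Blocked images can be downloaded through the free gcr.io proxy provided by Microsoft China. All deployment commands below are run on the k8s-master01 node.
1) Modify the configuration file
After extracting the downloaded kubernetes-server-linux-amd64.tar.gz, extract the kubernetes-src.tar.gz archive inside it.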
[root@k8s-master01 ~]# cd /opt/k8s/work/kubernetes
[root@k8s-master01 kubernetes]# tar -xzvf kubernetes-src.tar.gz

After extraction, the coredns files are under cluster/addons/dns (in the coredns subdirectory).
[root@k8s-master01 kubernetes]# cd /opt/k8s/work/kubernetes/cluster/addons/dns/coredns
[root@k8s-master01 coredns]# cp coredns.yaml.base coredns.yaml
[root@k8s-master01 coredns]# source /opt/k8s/bin/environment.sh
[root@k8s-master01 coredns]# sed -i -e "s/__PILLAR__DNS__DOMAIN__/${CLUSTER_DNS_DOMAIN}/" -e "s/__PILLAR__DNS__SERVER__/${CLUSTER_DNS_SVC_IP}/" coredns.yaml

2) Create coredns
[root@k8s-master01 coredns]# fgrep "image" ./*
./coredns.yaml: image: k8s.gcr.io/coredns:1.3.1
./coredns.yaml: imagePullPolicy: IfNotPresent
./coredns.yaml.base: image: k8s.gcr.io/coredns:1.3.1
./coredns.yaml.base: imagePullPolicy: IfNotPresent
./coredns.yaml.in: image: k8s.gcr.io/coredns:1.3.1
./coredns.yaml.in: imagePullPolicy: IfNotPresent
./coredns.yaml.sed: image: k8s.gcr.io/coredns:1.3.1
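./coredns.yaml.sed: imagePullPolicy: IfNotPresent

Download the "k8s.gcr.io/coredns:1.3.1" image in advance on a machine that can reach the registry, upload it to the node machines, and run "docker load ..." there to import it into each node's local images.
Alternatively, download the blocked image through the free gcr.io proxy provided by Microsoft China and update the coredns image address in the yaml file. In either case, make sure the image pull policy in the yaml file is IfNotPresent, so that a local image is used when present instead of being pulled. Then create coredns:
[root@k8s-master01 coredns]# kubectl create -f coredns.yaml

3) Check that coredns works (after running the command below, wait a moment until everything reports READY)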
[root@k8s-master01 coredns]# kubectl get all -n kube-system
NAME READY STATUS RESTARTS AGE
pod/coredns-5b969f4c88-pd5js 1/1 Running 0 55s

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kube-dns ClusterIP 10.254.0.2 <none> 53/UDP,53/TCP,9153/TCP 56s

NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/coredns 1/1 1 1 57s

NAME DESIRED CURRENT READY AGE
replicaset.apps/coredns-5b969f4c88 1 1 1 56s

Check the status of the coredns pod and make sure there are no errors:
[root@k8s-master01 coredns]# kubectl describe pod/coredns-5b969f4c88-pd5js -n kube-system
.............
.............
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m12s default-scheduler Successfully assigned kube-system/coredns-5b969f4c88-pd5js to k8s-node03
Normal Pulled 2m11s kubelet, k8s-node03 Container image "k8s.gcr.io/coredns:1.3.1" already present on machine
Normal Created 2m10s kubelet, k8s-node03 Created container coredns
Normal Started 2m10s kubelet, k8s-node03 Started container coredns

4) Create a new Deployment
[root@k8s-master01 coredns]# cd /opt/k8s/work
[root@k8s-master01 work]# cat > my-nginx.yaml <<EOF
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
EOF

Create the Deployment:
[root@k8s-master01 work]# kubectl create -f my-nginx.yaml

Expose the Deployment to generate the my-nginx service:
[root@k8s-master01 work]# kubectl expose deploy my-nginx
[root@k8s-master01 work]# kubectl get services --all-namespaces |grep my-nginx
default my-nginx ClusterIP 10.254.170.246 <none> 80/TCP 19s

Create another Pod and check whether its /etc/resolv.conf contains the --cluster-dns and --cluster-domain values configured for kubelet, and whether the my-nginx service resolves to the Cluster IP 10.254.170.246 shown above.
[root@k8s-master01 work]# cd /opt/k8s/work
[root@k8s-master01 work]# cat > dnsutils-ds.yml <<EOF
apiVersion: v1
kind: Service
metadata:
  name: dnsutils-ds
  labels:
    app: dnsutils-ds
spec:
  type: NodePort
  selector:
    app: dnsutils-ds
  ports:
  - name: http
    port: 80
    targetPort: 80
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: dnsutils-ds
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  template:
    metadata:
      labels:
        app: dnsutils-ds
    spec:
      containers:
      - name: my-dnsutils
        image: tutum/dnsutils:latest
        command:
        - sleep
        - "3600"
        ports:
        - containerPort: 80
EOF

Create the pods:
[root@k8s-master01 work]# kubectl create -f dnsutils-ds.yml

Check the status of the pods created above (wait a moment until STATUS is "Running"; if a pod fails, run "kubectl describe pod ...." to find the cause):
[root@k8s-master01 work]# kubectl get pods -lapp=dnsutils-ds
NAME READY STATUS RESTARTS AGE
dnsutils-ds-5sc4z 1/1 Running 0 52s
dnsutils-ds-h546r 1/1 Running 0 52s
dnsutils-ds-jx5kx 1/1 Running 0 52s

[root@k8s-master01 work]# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
dnsutils-ds NodePort 10.254.185.211 <none> 80:32767/TCP 7m14s
kubernetes ClusterIP 10.254.0.1 <none> 443/TCP 7d13h
my-nginx ClusterIP 10.254.170.246 <none> 80/TCP 9m11s
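nginx-ds NodePort 10.254.41.83 <none> 80:30876/TCP 27h

Now verify that coredns works.
First log in to each of the dnsutils pods created above and confirm that the nameserver in the container's /etc/resolv.conf is the value of the CLUSTER_DNS_SVC_IP variable (defined in the environment.sh script):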
[root@k8s-master01 work]# kubectl -it exec dnsutils-ds-5sc4z bash
root@dnsutils-ds-5sc4z:/# cat /etc/resolv.conf
nameserver 10.254.0.2
search default.svc.cluster.local svc.cluster.local cluster.local localdomain
options ndots:5

[root@k8s-master01 work]# kubectl exec dnsutils-ds-5sc4z nslookup kubernetes
Server: 10.254.0.2
Address: 10.254.0.2#53

Name: kubernetes.default.svc.cluster.local
Address: 10.254.0.1

[root@k8s-master01 work]# kubectl exec dnsutils-ds-5sc4z nslookup www.baidu.com
Server: 10.254.0.2
Address: 10.254.0.2#53

Non-authoritative answer:
www.baidu.com canonical name = www.a.shifen.com.
www.a.shifen.com canonical name = www.wshifen.com.
Name: www.wshifen.com
Address: 103.235.46.39

The my-nginx service resolves to its Cluster IP 10.254.170.246:
[root@k8s-master01 work]# kubectl exec dnsutils-ds-5sc4z nslookup my-nginx
Server: 10.254.0.2
Address: 10.254.0.2#53

Non-authoritative answer:
Name: my-nginx.default.svc.cluster.local
Address: 10.254.170.246

[root@k8s-master01 work]# kubectl exec dnsutils-ds-5sc4z nslookup kube-dns.kube-system.svc.cluster
Server: 10.254.0.2
Address: 10.254.0.2#53

** server can't find kube-dns.kube-system.svc.cluster: NXDOMAIN
command terminated with exit code 1

[root@k8s-master01 work]# kubectl exec dnsutils-ds-5sc4z nslookup kube-dns.kube-system.svc
Server: 10.254.0.2
Address: 10.254.0.2#53

Name: kube-dns.kube-system.svc.cluster.local
Address: 10.254.0.2

[root@k8s-master01 work]# kubectl exec dnsutils-ds-5sc4z nslookup kube-dns.kube-system.svc.cluster.local
Server: 10.254.0.2
Address: 10.254.0.2#53

Name: kube-dns.kube-system.svc.cluster.local
Address: 10.254.0.2

[root@k8s-master01 work]# kubectl exec dnsutils-ds-5sc4z nslookup kube-dns.kube-system.svc.cluster.local.
Server: 10.254.0.2
Address: 10.254.0.2#53

Name: kube-dns.kube-system.svc.cluster.local
Address: 10.254.0.2
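The lookups above can be explained by how the resolver combines the search list and ndots:5 from /etc/resolv.conf. The following is a simplified model of that expansion, written as a sketch rather than a reimplementation of any real resolver; the function name `candidate_names` is made up for illustration.

```python
def candidate_names(name, search, ndots):
    """Return the fully qualified names a resolver would try, in order (simplified)."""
    if name.endswith("."):
        return [name]  # absolute name: the search list is skipped
    expanded = [f"{name}.{domain}." for domain in search]
    as_is = [f"{name}."]
    # With fewer than `ndots` dots, the search list is tried first, then the name itself.
    return expanded + as_is if name.count(".") < ndots else as_is + expanded


# Search list copied from the resolv.conf shown in the pod.
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local", "localdomain"]
TARGET = "kube-dns.kube-system.svc.cluster.local."

for query in ("my-nginx", "kube-dns.kube-system.svc", "kube-dns.kube-system.svc.cluster"):
    tried = candidate_names(query, SEARCH, ndots=5)
    print(query, "reaches target:", TARGET in tried)
```

In this model, kube-dns.kube-system.svc plus the search domain cluster.local yields the real service name, whereas no search domain completes kube-dns.kube-system.svc.cluster into an existing name, which matches the NXDOMAIN seen in the transcript.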
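11.2 - Kubernetes cluster add-ons - dashboard
Blocked images can be downloaded through the free gcr.io proxy provided by Microsoft China. All deployment commands below are run on the k8s-master01 node.
1) Modify the configuration file
After extracting the downloaded kubernetes-server-linux-amd64.tar.gz, extract the kubernetes-src.tar.gz archive inside it (this was already done during the coredns deployment above).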
[root@k8s-master01 ~]# cd /opt/k8s/work/kubernetes/
[root@k8s-master01 kubernetes]# ls -d cluster/addons/dashboard
cluster/addons/dashboard

The dashboard manifests are in cluster/addons/dashboard.
[root@k8s-master01 kubernetes]# cd /opt/k8s/work/kubernetes/cluster/addons/dashboard

Modify the service definition and set its type to NodePort, so that the dashboard can be reached from outside at NodeIP:NodePort:
[root@k8s-master01 dashboard]# vim dashboard-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  type: NodePort # add this line
  selector:
    k8s-app: kubernetes-dashboard
  ports:
  - port: 443
    targetPort: 8443

2) Apply all the definition files
Download the k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1 image in advance on a machine that can reach the registry, upload it to the node machines, and run "docker load ......" there to import it into each node's local images.
Alternatively, download the blocked image through the free gcr.io proxy provided by Microsoft China and update the dashboard image address in the yaml file.
[root@k8s-master01 dashboard]# fgrep "image" ./*
./dashboard-controller.yaml: image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1

[root@k8s-master01 dashboard]# ls *.yaml
dashboard-configmap.yaml dashboard-controller.yaml dashboard-rbac.yaml dashboard-secret.yaml dashboard-service.yaml

[root@k8s-master01 dashboard]# kubectl apply -f .

3) Check the assigned NodePort
[root@k8s-master01 dashboard]# kubectl get deployment kubernetes-dashboard -n kube-system
NAME READY UP-TO-DATE AVAILABLE AGE
kubernetes-dashboard 1/1 1 1 48s

[root@k8s-master01 dashboard]# kubectl --namespace kube-system get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-5b969f4c88-pd5js 1/1 Running 0 33m 172.30.72.3 k8s-node03 <none> <none>
kubernetes-dashboard-85bcf5dbf8-8s7hm 1/1 Running 0 63s 172.30.72.6 k8s-node03 <none> <none>

[root@k8s-master01 dashboard]# kubectl get services kubernetes-dashboard -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard NodePort 10.254.164.208 <none> 443:30284/TCP 104s

As shown, NodePort 30284 maps to port 443 of the dashboard service, which in turn forwards to port 8443 in the dashboard pod.

4) View the command-line flags supported by dashboard
[root@k8s-master01 dashboard]# kubectl exec --namespace kube-system -it kubernetes-dashboard-85bcf5dbf8-8s7hm -- /dashboard --help
2019/06/25 16:54:04 Starting overwatch
Usage of /dashboard:
--alsologtostderr log to standard error as well as files
--api-log-level string Level of API request logging. Should be one of 'INFO|NONE|DEBUG'. Default: 'INFO'. (default "INFO")
--apiserver-host string The address of the Kubernetes Apiserver to connect to in the format of protocol://address:port, e.g., http://localhost:8080. If not specified, the assumption is that the binary runs inside a Kubernetes cluster and local discovery is attempted.
--authentication-mode strings Enables authentication options that will be reflected on login screen. Supported values: token, basic. Default: token.Note that basic option should only be used if apiserver has '--authorization-mode=ABAC' and '--basic-auth-file' flags set. (default [token])
--auto-generate-certificates When set to true, Dashboard will automatically generate certificates used to serve HTTPS. Default: false.
--bind-address ip The IP address on which to serve the --secure-port (set to 0.0.0.0 for all interfaces). (default 0.0.0.0)
--default-cert-dir string Directory path containing '--tls-cert-file' and '--tls-key-file' files. Used also when auto-generating certificates flag is set. (default "/certs")
--disable-settings-authorizer When enabled, Dashboard settings page will not require user to be logged in and authorized to access settings page.
--enable-insecure-login When enabled, Dashboard login view will also be shown when Dashboard is not served over HTTPS. Default: false.
--enable-skip-login When enabled, the skip button on the login page will be shown. Default: false.
--heapster-host string The address of the Heapster Apiserver to connect to in the format of protocol://address:port, e.g., http://localhost:8082. If not specified, the assumption is that the binary runs inside a Kubernetes cluster and service proxy will be used.
--insecure-bind-address ip The IP address on which to serve the --port (set to 0.0.0.0 for all interfaces). (default 127.0.0.1)
--insecure-port int The port to listen to for incoming HTTP requests. (default 9090)
--kubeconfig string Path to kubeconfig file with authorization and master location information.
--log_backtrace_at traceLocation when logging hits line file:N, emit a stack trace (default :0)
--log_dir string If non-empty, write log files in this directory
--logtostderr log to standard error instead of files
--metric-client-check-period int Time in seconds that defines how often configured metric client health check should be run. Default: 30 seconds. (default 30)
--port int The secure port to listen to for incoming HTTPS requests. (default 8443)
--stderrthreshold severity logs at or above this threshold go to stderr (default 2)
--system-banner string When non-empty displays message to Dashboard users. Accepts simple HTML tags. Default: ''.
--system-banner-severity string Severity of system banner. Should be one of 'INFO|WARNING|ERROR'. Default: 'INFO'. (default "INFO")
--tls-cert-file string File containing the default x509 Certificate for HTTPS.
--tls-key-file string File containing the default x509 private key matching --tls-cert-file.
--token-ttl int Expiration time (in seconds) of JWE tokens generated by dashboard. Default: 15 min. 0 - never expires (default 900)
-v, --v Level log level for V logs
--vmodule moduleSpec comma-separated list of pattern=N settings for file-filtered logging
pflag: help requested
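command terminated with exit code 2

5) Access the dashboard
Since version 1.7, the dashboard only allows access over https. When kubectl proxy is used, it must listen on localhost or 127.0.0.1.
NodePort has no such restriction, but it is recommended for development environments only.
If a login does not meet these conditions, the browser does not redirect after a successful login and stays on the login page.

There are three ways to access the dashboard:
-> the kubernetes-dashboard service exposes a NodePort, so the dashboard can be reached at https://NodeIP:NodePort;
-> through kubectl proxy;
-> through kube-apiserver.

Method 1: NodePort
The kubernetes-dashboard service exposes a NodePort, so the dashboard can be reached at https://NodeIP:NodePort.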
[root@k8s-master01 dashboard]# kubectl get services kubernetes-dashboard -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard NodePort 10.254.164.208 <none> 443:30284/TCP 14m

The dashboard can then be opened at https://172.16.60.244:30284, https://172.16.60.245:30284 or https://172.16.60.246:30284.

Method 2: through kubectl proxy
Start the proxy (the command keeps running in the foreground; consider running it inside a tmux session):
[root@k8s-master01 dashboard]# kubectl proxy --address='localhost' --port=8086 --accept-hosts='^*$'
Starting to serve on 127.0.0.1:8086

Notes:
--address must be localhost or 127.0.0.1;
--accept-hosts must be specified, otherwise the browser shows "Unauthorized" when opening the dashboard page.
The dashboard can then be opened in a browser on this server at: http://127.0.0.1:8086/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy

Method 3: through kube-apiserver
Get the list of cluster service addresses:
[root@k8s-master01 dashboard]# kubectl cluster-info
Kubernetes master is running at https://172.16.60.250:8443
CoreDNS is running at https://172.16.60.250:8443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
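kubernetes-dashboard is running at https://172.16.60.250:8443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

Notes:
The dashboard must be accessed through the secure (https) port of kube-apiserver, and the browser must present a custom client certificate; otherwise kube-apiserver rejects the request.
Creating and importing the custom certificate was covered earlier in the "deploying the node worker nodes" part, so it is skipped here.

Open https://172.16.60.250:8443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy in the browser to reach the dashboard.

6) Create a token and a kubeconfig file for logging in to the Dashboard
By default the dashboard supports only token authentication (not client certificate authentication), so when a kubeconfig file is used, the token must be written into that file.

Method 1: create a login token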
[root@k8s-master01 ~]# kubectl create sa dashboard-admin -n kube-system
serviceaccount/dashboard-admin created

[root@k8s-master01 ~]# kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
clusterrolebinding.rbac.authorization.k8s.io/dashboard-admin created

[root@k8s-master01 ~]# ADMIN_SECRET=$(kubectl get secrets -n kube-system | grep dashboard-admin | awk '{print $1}')
[root@k8s-master01 ~]# DASHBOARD_LOGIN_TOKEN=$(kubectl describe secret -n kube-system ${ADMIN_SECRET} | grep -E '^token' | awk '{print $2}')
[root@k8s-master01 ~]# echo ${DASHBOARD_LOGIN_TOKEN}
eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tcmNicnMiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiZGQ1Njg0OGUtOTc2Yi0xMWU5LTkwZDQtMDA1MDU2YWM3YzgxIiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.Kwh_zhI-dA8kIfs7DRmNecS_pCXQ3B2ujS_eooR-Gvoaz29cJTzD_Z67bRDS1qlJ8oyIQjW2_m837EkUCpJ8LRiOnTMjwBPMeBPHHomDGdSmdj37UEc7YQa5AmkvVWIYiUKgTHJjgLaKlk6eH7Ihvcez3IBHWTFXlULu24mlMt9XP4J7M5fIg7I5-ctfLIbV2NsvWLwiv6JAECocbGX1w0fJTmn9LlheiDQP1ByxU_WavsFYWOYPEqdUQbqcZ7iovT1ZUVyFuGS5rxzSHm86tcK_ptEinYO1dGLjMrLRZ3tB1OAOW8_u-VnHqsNwKjbZJNUljfzCGy1YoI2xUB7V4w

The token printed above can be used to log in to the Dashboard.

Method 2: create a KubeConfig file that uses the token (recommended)
[root@k8s-master01 ~]# source /opt/k8s/bin/environment.sh

Set the cluster parameters:
[root@k8s-master01 ~]# kubectl config set-cluster kubernetes \
--certificate-authority=/etc/kubernetes/cert/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=dashboard.kubeconfig

Set the client credentials, using the token created above:
[root@k8s-master01 ~]# kubectl config set-credentials dashboard_user \
--token=${DASHBOARD_LOGIN_TOKEN} \
--kubeconfig=dashboard.kubeconfig

Set the context parameters:
[root@k8s-master01 ~]# kubectl config set-context default \
--cluster=kubernetes \
--user=dashboard_user \
--kubeconfig=dashboard.kubeconfig

Set the default context:
[root@k8s-master01 ~]# kubectl config use-context default --kubeconfig=dashboard.kubeconfig

Copy the generated dashboard.kubeconfig file to the local machine and use it to log in to the Dashboard.
[root@k8s-master01 ~]# ll dashboard.kubeconfig
-rw------- 1 root root 3025 Jun 26 01:14 dashboard.kubeconfig
Because neither Heapster nor metrics-server is installed yet, the dashboard cannot yet show CPU and memory statistics or charts for Pods and Nodes.
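The login token created in step 6 is a JWT, so its claims can be inspected to confirm which service account it belongs to. The sketch below decodes the payload segment without verifying the signature. It runs on an unsigned sample token built inside the example (not a real credential), and the helper names are hypothetical.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode one base64url JWT segment, restoring the stripped '=' padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def jwt_claims(token: str) -> dict:
    """Return the claims of a JWT. The signature is NOT verified here."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


def make_sample_token(claims: dict) -> str:
    """Build an unsigned sample token for this demo only."""
    def enc(obj):
        raw = json.dumps(obj, separators=(",", ":")).encode()
        return base64.urlsafe_b64encode(raw).decode().rstrip("=")
    return ".".join([enc({"alg": "RS256", "kid": ""}), enc(claims), "signature-placeholder"])


sample = make_sample_token({
    "iss": "kubernetes/serviceaccount",
    "kubernetes.io/serviceaccount/namespace": "kube-system",
    "kubernetes.io/serviceaccount/service-account.name": "dashboard-admin",
    "sub": "system:serviceaccount:kube-system:dashboard-admin",
})
claims = jwt_claims(sample)
print(claims["sub"])
```

Passing the value of DASHBOARD_LOGIN_TOKEN to `jwt_claims` in the same way should show the kube-system namespace and the dashboard-admin service account, which is a quick sanity check before pasting the token into a kubeconfig.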
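11.3 - Deploying the metrics-server add-on
metrics-server discovers all nodes through kube-apiserver and then calls the kubelet APIs (over https) to obtain CPU, memory and other resource usage for each Node and Pod. Starting with Kubernetes 1.12 the kubernetes installation scripts no longer include Heapster, and starting with 1.13 support for Heapster was removed entirely; Heapster is no longer maintained. The replacements are:
-> CPU/memory HPA metrics for autoscaling: metrics-server;
-> general-purpose monitoring: a third-party monitoring system that can collect metrics in Prometheus format, such as Prometheus Operator;
-> event transport: third-party tools that ship and archive kubernetes events.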
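Starting with Kubernetes 1.8, resource usage metrics (such as container CPU and memory usage) are obtained in Kubernetes through the Metrics API, and metrics-server replaces Heapster. Metrics Server implements the Resource Metrics API and is a cluster-wide aggregator of resource usage data. It collects metrics from the Summary API exposed by the Kubelet on each node.
Before looking at Metrics Server, it helps to understand the Metrics API. Compared with the previous collection approach (Heapster), the Metrics API is a new idea: upstream wants monitoring of core metrics to be stable and versioned, and directly accessible to users (for example through the kubectl top command) or to controllers in the cluster (such as HPA), just like any other Kubernetes API. Heapster was deprecated precisely so that core resource monitoring could be treated as a first-class citizen, accessed directly through the api-server or a client in the same way as pods and services, rather than by installing a Heapster that aggregates the data and manages it separately.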
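Suppose 10 metrics are collected for every pod and node. Since 1.6, k8s supports 5000 nodes with 30 pods per node. With a collection interval of one minute, that gives 10 x 5000 x 30 / 60 = 25000, i.e. on average about 25,000 samples per second (1.5 million per minute). Because the api-server persists all of its data to etcd, k8s itself clearly cannot handle collection at this rate. Monitoring data also changes quickly and is transient, so a separate component is needed to handle it, one that keeps only part of the data in memory; this is how the metrics-server concept came about. Heapster did already expose an API, but users and other Kubernetes components had to go through the master proxy to reach it, and Heapster's interface, unlike the api-server's, lacked complete authentication/authorization and client integration.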
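The estimate in the paragraph above can be checked with a few lines of arithmetic. The figures are the ones quoted in the text; the variable names are only for illustration.

```python
metrics_per_object = 10   # metrics collected per pod (figure from the text)
nodes = 5000              # supported cluster size since k8s 1.6
pods_per_node = 30
interval_seconds = 60     # one collection round per minute

samples_per_interval = metrics_per_object * nodes * pods_per_node
samples_per_second = samples_per_interval / interval_seconds

print(samples_per_interval)  # samples per one-minute round
print(samples_per_second)    # average samples per second
```

So the 25000 in the text is a per-second average; each one-minute round produces 1.5 million samples.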
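With the Metrics Server component in place, the data is collected and an API is exposed. But since the API has to be unified, how are requests sent to the api-server under /apis/metrics forwarded to Metrics Server?
The answer is kube-aggregator, which was completed in k8s 1.7. Metrics Server had not been released earlier precisely because it was waiting on kube-aggregator. kube-aggregator (the aggregated API layer) mainly provides: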
-> Provide an API for registering API servers;
-> Summarize discovery information from all the servers;
-> Proxy client requests to individual servers;
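The three responsibilities listed above can be pictured with a toy routing table. This is only a conceptual sketch: the class and method names are hypothetical and do not correspond to the real kube-aggregator code, and the backend string "kube-system/metrics-server" is an assumed namespace/service label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class APIService:
    group: str
    version: str
    backend: str  # "namespace/service" that serves this group/version (assumed format)


class Aggregator:
    LOCAL = "kube-apiserver"  # label for requests the core API server handles itself

    def __init__(self):
        self._registry = {}

    def register(self, svc: APIService) -> None:
        """Register an extension API server for one group/version."""
        self._registry[(svc.group, svc.version)] = svc

    def discovery(self):
        """Summarize which group/versions are served by registered servers."""
        return sorted(f"{g}/{v}" for g, v in self._registry)

    def route(self, path: str) -> str:
        """Pick the backend that should receive a request for `path`."""
        parts = path.strip("/").split("/")
        if len(parts) >= 3 and parts[0] == "apis":
            svc = self._registry.get((parts[1], parts[2]))
            if svc is not None:
                return svc.backend
        return self.LOCAL


agg = Aggregator()
agg.register(APIService("metrics.k8s.io", "v1beta1", "kube-system/metrics-server"))
print(agg.route("/apis/metrics.k8s.io/v1beta1/nodes"))
print(agg.route("/api/v1/pods"))
```

In the real cluster, the registration step corresponds to the metrics-apiservice.yaml manifest applied later in this section.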
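Using the Metrics API:
-> The Metrics API can only query current metric values; it does not store history
-> The Metrics API URI is /apis/metrics.k8s.io/, maintained in k8s.io/metrics
-> metrics-server must be deployed for the API to be available; metrics-server obtains the data by calling the Kubelet Summary API
Metrics server periodically collects metrics from the Kubelet Summary API (for example /api/v1/nodes/<nodename>/proxy/stats/summary when reached through kube-apiserver). The aggregated data is kept in memory and exposed in the form of the metrics API. Metrics server reuses the api-server libraries for features such as authentication/authorization and versioning. To keep the data in memory, it drops the default etcd storage and introduces an in-memory store (that is, an implementation of the Storage interface). Because the data lives in memory, it is not persisted; a third-party store can be used to extend this, which is the same as with Heapster.
Kubernetes Dashboard does not yet support metrics-server. If metrics-server replaces Heapster, the dashboard cannot chart Pod memory and CPU usage, and a monitoring stack such as Prometheus and Grafana is needed to fill the gap. The manifest YAML files of the add-ons bundled with Kubernetes pull from the gcr.io docker registry, which is blocked in mainland China, so the address has to be replaced manually with another registry (not done in this document); blocked images can also be downloaded through the free gcr.io proxy provided by Microsoft China. All deployment commands below are run on the k8s-master01 node.
[Figure: monitoring architecture]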
1) Install metrics-server
Clone the source from github:
[root@k8s-master01 ~]# cd /opt/k8s/work/
[root@k8s-master01 work]# git clone https://github.com/kubernetes-incubator/metrics-server.git
[root@k8s-master01 work]# cd metrics-server/deploy/1.8+/
[root@k8s-master01 1.8+]# ls
aggregated-metrics-reader.yaml auth-reader.yaml metrics-server-deployment.yaml resource-reader.yaml
auth-delegator.yaml metrics-apiservice.yaml metrics-server-service.yaml

Edit metrics-server-deployment.yaml and add two command-line arguments for metrics-server (below the "imagePullPolicy" line):
[root@k8s-master01 1.8+]# cp metrics-server-deployment.yaml metrics-server-deployment.yaml.bak
[root@k8s-master01 1.8+]# vim metrics-server-deployment.yaml
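.........
        args:
        - --metric-resolution=30s
        - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP

Notes:
--metric-resolution=30s: the interval at which metrics are collected from kubelet;
--kubelet-preferred-address-types: prefer InternalIP when connecting to kubelet. This avoids failures when a node's kubelet API is called by node name while that name has no DNS record (which is what happens by default when the flag is not set).

In addition:
Download the k8s.gcr.io/metrics-server-amd64:v0.3.3 image in advance on a machine that can reach the registry, upload it to the node machines, and run "docker load ......" there to import it into each node's local images.
Alternatively, download the blocked image through the free gcr.io proxy provided by Microsoft China and update the metrics-server image address in the yaml file.
[root@k8s-master01 1.8+]# fgrep "image" metrics-server-deployment.yaml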
# mount in tmp so we can safely use from-scratch images and/or read-only containers
image: k8s.gcr.io/metrics-server-amd64:v0.3.3
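imagePullPolicy: Always

Because the image has already been imported into each node's local images, change the image pull policy in metrics-server-deployment.yaml to "IfNotPresent", i.e. use the local image when present instead of pulling:
[root@k8s-master01 1.8+]# fgrep "image" metrics-server-deployment.yaml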
# mount in tmp so we can safely use from-scratch images and/or read-only containers
image: k8s.gcr.io/metrics-server-amd64:v0.3.3
imagePullPolicy: IfNotPresent

Deploy metrics-server:
[root@k8s-master01 1.8+]# kubectl create -f .

2) Check that it is running
[root@k8s-master01 1.8+]# kubectl -n kube-system get pods -l k8s-app=metrics-server
NAME READY STATUS RESTARTS AGE
metrics-server-54997795d9-4cv6h 1/1 Running 0 50s

[root@k8s-master01 1.8+]# kubectl get svc -n kube-system metrics-server
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
metrics-server ClusterIP 10.254.238.208 <none> 443/TCP 65s

3) metrics-server command-line arguments (run the command below on any node)
[root@k8s-node01 ~]# docker run -it --rm k8s.gcr.io/metrics-server-amd64:v0.3.3 --help

4) View the metrics exposed by metrics-server
-> Through kube-apiserver or kubectl proxy:
https://172.16.60.250:8443/apis/metrics.k8s.io/v1beta1/nodes
https://172.16.60.250:8443/apis/metrics.k8s.io/v1beta1/nodes/<node-name>
https://172.16.60.250:8443/apis/metrics.k8s.io/v1beta1/pods
https://172.16.60.250:8443/apis/metrics.k8s.io/v1beta1/namespaces/<namespace-name>/pods/<pod-name>

-> Directly with kubectl:
# kubectl get --raw apis/metrics.k8s.io/v1beta1/nodes
# kubectl get --raw apis/metrics.k8s.io/v1beta1/pods
# kubectl get --raw apis/metrics.k8s.io/v1beta1/nodes/<node-name>
# kubectl get --raw apis/metrics.k8s.io/v1beta1/namespaces/<namespace-name>/pods/<pod-name>

[root@k8s-master01 1.8+]# kubectl get --raw "/apis/metrics.k8s.io/v1beta1" | jq .
{
"kind": "APIResourceList",
"apiVersion": "v1",
"groupVersion": "metrics.k8s.io/v1beta1",
"resources": [
{
"name": "nodes",
"singularName": "",
"namespaced": false,
"kind": "NodeMetrics",
"verbs": [
"get",
"list"
]
},
{
"name": "pods",
"singularName": "",
"namespaced": true,
"kind": "PodMetrics",
"verbs": [
"get",
"list"
]
}
]
}

[root@k8s-master01 1.8+]# kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .
{
"kind": "NodeMetricsList",
"apiVersion": "metrics.k8s.io/v1beta1",
"metadata": {
"selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
},
"items": [
{
"metadata": {
"name": "k8s-node01",
"selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/k8s-node01",
"creationTimestamp": "2019-06-27T17:11:43Z"
},
"timestamp": "2019-06-27T17:11:36Z",
"window": "30s",
"usage": {
"cpu": "47615396n",
"memory": "2413536Ki"
}
},
{
"metadata": {
"name": "k8s-node02",
"selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/k8s-node02",
"creationTimestamp": "2019-06-27T17:11:43Z"
},
"timestamp": "2019-06-27T17:11:38Z",
"window": "30s",
"usage": {
"cpu": "42000411n",
"memory": "2496152Ki"
}
},
{
"metadata": {
"name": "k8s-node03",
"selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/k8s-node03",
"creationTimestamp": "2019-06-27T17:11:43Z"
},
"timestamp": "2019-06-27T17:11:40Z",
"window": "30s",
"usage": {
"cpu": "54095172n",
"memory": "3837404Ki"
}
}
]
}

Note: the usage returned by /apis/metrics.k8s.io/v1beta1/nodes and /apis/metrics.k8s.io/v1beta1/pods contains both CPU and Memory.

5) Use the kubectl top command to view resource usage of the cluster nodes
[root@k8s-master01 1.8+]# kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-node01 45m 1% 2357Mi 61%
k8s-node02 44m 1% 2437Mi 63%
k8s-node03 54m 1% 3747Mi 47%
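The raw Metrics API values (for example "47615396n" for CPU and "2413536Ki" for memory) and the kubectl top columns are the same quantities in different units. The sketch below converts between them. It handles only integer quantities with the few suffixes seen here, and rounding CPU up to whole millicores is an assumption about how the values are displayed.

```python
# Multipliers for a small subset of Kubernetes quantity suffixes.
CPU_NANO = {"n": 1, "u": 1_000, "m": 1_000_000, "": 1_000_000_000}
MEM_BYTES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "": 1}


def split_quantity(quantity, suffixes):
    """Split '2413536Ki' into (2413536, 'Ki'); integer values only."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if suffix and quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]), suffix
    return int(quantity), ""


def cpu_millicores(quantity):
    """Convert a CPU quantity to millicores, rounding up."""
    value, suffix = split_quantity(quantity, CPU_NANO)
    nanocores = value * CPU_NANO[suffix]
    return -(-nanocores // 1_000_000)  # ceiling division


def memory_mib(quantity):
    """Convert a memory quantity to MiB as a float."""
    value, suffix = split_quantity(quantity, MEM_BYTES)
    return value * MEM_BYTES[suffix] / 1024**2


print(cpu_millicores("47615396n"), memory_mib("2413536Ki"))
```

The JSON sample and the kubectl top output were taken at slightly different moments, so the converted numbers are close to, but not identical with, the 45m and 2357Mi shown for k8s-node01.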
Troubleshooting:
[root@k8s-master01 1.8+]# kubectl top node
Error from server (Forbidden): nodes.metrics.k8s.io is forbidden: User "aggregator" cannot list resource "nodes" in API group "metrics.k8s.io" at the cluster scope

The error above occurs mainly because the user aggregator has not been granted RBAC authorization.
A quick workaround is to bind this user directly to cluster-admin, although this does not follow the principle of least privilege:
[root@k8s-master01 1.8+]# kubectl create clusterrolebinding custom-metric-with-cluster-admin --clusterrole=cluster-admin --user=aggregator
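11.4 - Deploying the kube-state-metrics add-on
With metrics-server deployed, most metrics about running containers can be collected, but it cannot answer questions like these:
-> How many replicas were scheduled? How many are available now?
-> How many Pods are in the running/stopped/terminated state?
-> How many times has a Pod restarted?
-> How many jobs are currently running?
This is what kube-state-metrics provides. It is an add-on service for K8S built on client-go. It polls the Kubernetes API and converts Kubernetes' structured information into metrics. kube-state-metrics can collect data for the vast majority of built-in k8s resources, such as pods, deployments and services. It also exposes metrics about itself, mainly the number of resources collected and the number of errors encountered during collection.
kube-state-metrics metric categories include: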
CronJob Metrics
DaemonSet Metrics
Deployment Metrics
Job Metrics
LimitRange Metrics
Node Metrics
PersistentVolume Metrics
PersistentVolumeClaim Metrics
Pod Metrics
Pod Disruption Budget Metrics
ReplicaSet Metrics
ReplicationController Metrics
ResourceQuota Metrics
Service Metrics
StatefulSet Metrics
Namespace Metrics
Horizontal Pod Autoscaler Metrics
Endpoint Metrics
Secret Metrics
ConfigMap Metrics
Taking pods as an example, the metrics include:
kube_pod_info
kube_pod_owner
kube_pod_status_running
kube_pod_status_ready
kube_pod_status_scheduled
kube_pod_container_status_waiting
kube_pod_container_status_terminated_reason
..............
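kube-state-metrics compared with metrics-server (or Heapster)
1) metrics-server (like Heapster before it) collects usage metrics such as CPU and memory. Heapster could forward them to a storage backend such as InfluxDB or a cloud provider, whereas metrics-server keeps only the latest values in memory. Its core role today is to supply decision metrics to components such as HPA.
2) kube-state-metrics focuses on the latest state of the various k8s resources, such as deployments or daemonsets. It was not folded into metrics-server because the two have fundamentally different concerns. metrics-server merely fetches and formats existing data and hands it on, which essentially makes it a monitoring pipeline. kube-state-metrics takes an in-memory snapshot of the running state of k8s and derives new metrics from it, but it has no ability to ship those metrics to a storage backend by itself; it only exposes them for scraping.
3) From another angle, kube-state-metrics could itself be a data source for metrics-server, although that is not how things are done today.
4) Monitoring systems such as Prometheus do not use the data in metrics-server; they do their own metric collection and integration (Prometheus covers what metrics-server can do). Prometheus can, however, monitor the health of the metrics-server component itself and alert when needed, and that monitoring can be implemented through kube-state-metrics, for example the running state of the metrics-server pod.
kube-state-metrics essentially polls the api-server continuously. Its performance optimizations:
Earlier versions of kube-state-metrics exposed two problems:
1) the /metrics endpoint responded slowly (10-20s);
2) memory consumption was too high, so the container exceeded its limit and was killed.
Solution to problem 1: implement a local cache based on client-go's cache tools, with a structure along the lines of: var cache = map[uuid][]byte{}
Solution to problem 2: the time-series strings contain many repeated substrings (such as namespace prefixes used for filtering), so these repeated strings can be shared through pointers or a structured representation.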
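kube-state-metrics: design points and open issues
1) Since kube-state-metrics listens for add, delete and update events on resources, is the data for resources that were already running before kube-state-metrics was deployed lost? No: through client-go, kube-state-metrics initializes all existing resource objects, so nothing is missed;
2) kube-state-metrics currently does not output metadata (such as help and description);
3) the cache is a golang map, and concurrent access is currently handled with a simple mutex, which should be sufficient; golang's concurrency-safe sync.Map may be considered later;
4) kube-state-metrics preserves event ordering by comparing resource versions;
5) kube-state-metrics does not guarantee that it covers all resources.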
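The cache design described above (a map keyed by object UID, guarded by a single mutex, with resource versions used to discard out-of-order events) can be sketched as follows. This is an illustrative model, not kube-state-metrics code: the class name is hypothetical, and treating the resource version as a comparable integer is a simplification made for the example.

```python
import threading


class MetricsCache:
    """Per-object cache of pre-rendered metric text, guarded by one mutex."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}  # uid -> (resource_version, rendered bytes)

    def upsert(self, uid, resource_version, rendered):
        """Store the rendering for an add/update event unless a newer one is cached."""
        with self._lock:
            current = self._entries.get(uid)
            if current is not None and current[0] >= resource_version:
                return False  # stale or duplicate event: keep the newer entry
            self._entries[uid] = (resource_version, rendered)
            return True

    def delete(self, uid):
        """Drop the entry for a delete event."""
        with self._lock:
            self._entries.pop(uid, None)

    def render(self):
        """Serve /metrics by concatenating cached bytes instead of re-rendering."""
        with self._lock:
            return b"".join(body for _, body in self._entries.values())


cache = MetricsCache()
cache.upsert("uid-1", 10, b'kube_pod_info{pod="a"} 1\n')
cache.upsert("uid-1", 9, b"stale\n")  # older event arrives late and is ignored
print(cache.render().decode(), end="")
```

Because each scrape only joins byte strings that were rendered when the event arrived, the response time no longer depends on re-serializing every object, which is the idea behind the fix for the slow /metrics endpoint.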
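All deployment commands below are run on the k8s-master01 node.
1) Modify the configuration file
Place the downloaded kube-state-metrics.tar.gz in the /opt/k8s/work directory and extract it: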
[root@k8s-master01 ~]# cd /opt/k8s/work/
[root@k8s-master01 work]# tar -zvxf kube-state-metrics.tar.gz
[root@k8s-master01 work]# cd kube-state-metrics

The kube-state-metrics directory contains the required files:
[root@k8s-master01 kube-state-metrics]# ll
total 32
-rw-rw-r-- 1 root root 362 May 6 17:31 kube-state-metrics-cluster-role-binding.yaml
-rw-rw-r-- 1 root root 1076 May 6 17:31 kube-state-metrics-cluster-role.yaml
-rw-rw-r-- 1 root root 1657 Jul 1 17:35 kube-state-metrics-deployment.yaml
-rw-rw-r-- 1 root root 381 May 6 17:31 kube-state-metrics-role-binding.yaml
-rw-rw-r-- 1 root root 508 May 6 17:31 kube-state-metrics-role.yaml
-rw-rw-r-- 1 root root 98 May 6 17:31 kube-state-metrics-service-account.yaml
-rw-rw-r-- 1 root root 404 May 6 17:31 kube-state-metrics-service.yaml

[root@k8s-master01 kube-state-metrics]# fgrep -R "image" ./*
./kube-state-metrics-deployment.yaml: image: quay.io/coreos/kube-state-metrics:v1.5.0
./kube-state-metrics-deployment.yaml: imagePullPolicy: IfNotPresent
./kube-state-metrics-deployment.yaml: image: k8s.gcr.io/addon-resizer:1.8.3
./kube-state-metrics-deployment.yaml: imagePullPolicy: IfNotPresent

[root@k8s-master01 kube-state-metrics]# cat kube-state-metrics-service.yaml
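apiVersion: v1
kind: Service
metadata:
  name: kube-state-metrics
  namespace: kube-system
  labels:
    k8s-app: kube-state-metrics
  annotations:
    prometheus.io/scrape: 'true'
spec:
  ports:
  - name: http-metrics
    port: 8080
    targetPort: http-metrics
    protocol: TCP
  - name: telemetry
    port: 8081
    targetPort: telemetry
    protocol: TCP
  type: NodePort # add this line
  selector:
    k8s-app: kube-state-metrics

Two things to note:
One of the images, "k8s.gcr.io/addon-resizer:1.8.3", cannot be pulled from within mainland China. Replacing it with "ist0ne/addon-resizer" works, or it can be downloaded in advance on a machine that can reach the registry.
If the service needs to be reachable from outside the cluster, change its type to NodePort.

2) Apply all the definition files
Download the quay.io/coreos/kube-state-metrics:v1.5.0 and k8s.gcr.io/addon-resizer:1.8.3 images in advance on a machine that can reach the registries, upload them to the node machines, and run "docker load ......" there to import them into each node's local images.
Alternatively, download the blocked images through the free gcr.io proxy provided by Microsoft China and update the kube-state-metrics image addresses in the yaml file. Because the images have already been imported into each node's local images, make sure the image pull policy in kube-state-metrics-deployment.yaml is "IfNotPresent", i.e. use the local image when present instead of pulling.
[root@k8s-master01 kube-state-metrics]# kubectl create -f .

Check the result: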
[root@k8s-master01 kube-state-metrics]# kubectl get pod -n kube-system|grep kube-state-metrics
kube-state-metrics-5dd55c764d-nnsdv 2/2 Running 0 9m3s

[root@k8s-master01 kube-state-metrics]# kubectl get svc -n kube-system|grep kube-state-metrics
kube-state-metrics NodePort 10.254.228.212 <none> 8080:30978/TCP,8081:30872/TCP 9m14s

[root@k8s-master01 kube-state-metrics]# kubectl get pod,svc -n kube-system|grep kube-state-metrics
pod/kube-state-metrics-5dd55c764d-nnsdv 2/2 Running 0 9m12s
service/kube-state-metrics NodePort 10.254.228.212 <none> 8080:30978/TCP,8081:30872/TCP 9m18s

3) Verify kube-state-metrics data collection
The checks above show that the NodePort exposed for external access is 30978. It can be verified through any of the worker nodes:
[root@k8s-master01 kube-state-metrics]# curl http://172.16.60.244:30978/metrics|head -10
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0# HELP kube_configmap_info Information about configmap.
# TYPE kube_configmap_info gauge
kube_configmap_info{namespace="kube-system",configmap="extension-apiserver-authentication"} 1
kube_configmap_info{namespace="kube-system",configmap="coredns"} 1
kube_configmap_info{namespace="kube-system",configmap="kubernetes-dashboard-settings"} 1
# HELP kube_configmap_created Unix creation timestamp
# TYPE kube_configmap_created gauge
kube_configmap_created{namespace="kube-system",configmap="extension-apiserver-authentication"} 1.560825764e+09
kube_configmap_created{namespace="kube-system",configmap="coredns"} 1.561479528e+09
kube_configmap_created{namespace="kube-system",configmap="kubernetes-dashboard-settings"} 1.56148146e+09
100 73353 0 73353 0 0 9.8M 0 --:--:-- --:--:-- --:--:-- 11.6M
curl: (23) Failed writing body (0 != 2048)
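The /metrics output above is in the Prometheus text exposition format: comment lines start with "#", and each sample line is a metric name, an optional label set in braces, and a value. As a rough sketch of how such output can be read programmatically, the parser below handles the simple lines shown in this transcript; it ignores optional timestamps and other corner cases of the format, and the function name is made up for illustration.

```python
import re

# metric_name{label="value",...} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (name, labels, value) for each sample line in `text`."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # not a plain sample line (for example curl progress output)
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


TEXT = '''# HELP kube_configmap_info Information about configmap.
# TYPE kube_configmap_info gauge
kube_configmap_info{namespace="kube-system",configmap="coredns"} 1
kube_configmap_created{namespace="kube-system",configmap="coredns"} 1.561479528e+09
'''
for name, labels, value in parse_samples(TEXT):
    print(name, labels["configmap"], value)
```

In practice a Prometheus server does this scraping and parsing itself; the sketch is only meant to make the structure of the lines in the curl output explicit.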
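11.5 - Deploying the harbor private registry
For installation, refer to "Docker private registry Harbor: introduction and deployment notes". The harbor private registry must be installed on both nodes, 172.16.60.247 and 172.16.60.248. On top of them, Nginx+Keepalived provides load balancing and high availability for Harbor, and the two Harbor instances replicate to each other (master-master replication).
Remote replication in harbor works as follows: 1) under "Registry management", create a target; after creating it, the connection to the target can be tested. 2) Under "Replication management", create a rule that refers to the target created above. 3) Replicate manually or on a schedule.
For example, images have already been stored in the library and kevin_img projects of the harbor private registry on node 172.16.60.247.
Now the images in these two projects are to be replicated from harbor on 172.16.60.247 to harbor on the other node, 172.16.60.248. Replication direction: 247 -> 248 or 247 <- 248.
The above is manual replication. Scheduled replication is also available; the schedule fields are "second minute hour day month weekday". With a schedule of once every two minutes, the images are replicated automatically after two minutes have passed.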
11.6 - Testing kubernetes cluster management
[root@k8s-master01 ~]# kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-2 Healthy {"health":"true"}
etcd-0 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}

[root@k8s-master01 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-node01 Ready <none> 20d v1.14.2
k8s-node02 Ready <none> 20d v1.14.2
k8s-node03 Ready <none> 20d v1.14.2

Deploy a test instance:
[root@k8s-master01 ~]# kubectl run kevin-nginx --image=nginx --replicas=3
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
deployment.apps/kevin-nginx created

[root@k8s-master01 ~]# kubectl run --generator=run-pod/v1 kevin-nginx --image=nginx --replicas=3
pod/kevin-nginx created

Wait a moment, then check the kevin-nginx pods (the nginx image is pulled automatically on creation, so this takes a while):
[root@k8s-master01 ~]# kubectl get pods --all-namespaces|grep "kevin-nginx"
default kevin-nginx 1/1 Running 0 98s
default kevin-nginx-569dcd559b-6h4nn 1/1 Running 0 106s
default kevin-nginx-569dcd559b-7f2b4 1/1 Running 0 106s
default kevin-nginx-569dcd559b-7tds2 1/1 Running 0 106s
View more detailed information (pod IP and node)
[root@k8s-master01 ~]# kubectl get pods --all-namespaces -o wide|grep "kevin-nginx"
default kevin-nginx 1/1 Running 0 2m13s 172.30.72.12 k8s-node03 <none> <none>
default kevin-nginx-569dcd559b-6h4nn 1/1 Running 0 2m21s 172.30.56.7 k8s-node02 <none> <none>
default kevin-nginx-569dcd559b-7f2b4 1/1 Running 0 2m21s 172.30.72.11 k8s-node03 <none> <none>
default kevin-nginx-569dcd559b-7tds2 1/1 Running 0 2m21s 172.30.88.8 k8s-node01 <none> <none>
[root@k8s-master01 ~]# kubectl get deployment|grep kevin-nginx
kevin-nginx 3/3 3 3 2m57s
Create the service
[root@k8s-master01 ~]# kubectl expose deployment kevin-nginx --port=8080 --target-port=80 --type=NodePort
[root@k8s-master01 ~]# kubectl get svc|grep kevin-nginx
kevin-nginx NodePort 10.254.111.50 <none> 8080:32177/TCP 33s
Inside the cluster, pods reach kevin-nginx through the service address
[root@k8s-master01 ~]# curl http://10.254.111.50:8080
From outside the cluster, kevin-nginx is reachable at http://node_ip:32177
http://172.16.60.244:32177
http://172.16.60.245:32177
http://172.16.60.246:32177
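The three external URLs above follow one pattern: node IP plus NodePort. A minimal sketch that generates them is shown here; the function name node_port_urls is hypothetical, and the IPs and port 32177 are the values from this cluster and will differ elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: print one URL per node for a given NodePort.
# Usage: node_port_urls PORT IP [IP...]
node_port_urls() {
    port="$1"
    shift
    # Every remaining argument is treated as a node IP
    for ip in "$@"; do
        printf 'http://%s:%s\n' "$ip" "$port"
    done
}

node_port_urls 32177 172.16.60.244 172.16.60.245 172.16.60.246
```

Each printed URL could then be handed to curl (for example by piping the output to `xargs -n1 curl -sI`) to check that every node answers on the NodePort.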
11.7 - Cleaning up the Kubernetes cluster
1) Clean up the worker nodes (repeat on every node)
Stop the related services:
[root@k8s-node01 ~]# systemctl stop kubelet kube-proxy flanneld docker kube-nginx
Clean up files:
[root@k8s-node01 ~]# source /opt/k8s/bin/environment.sh
Unmount the directories mounted by kubelet and docker
[root@k8s-node01 ~]# mount | grep "${K8S_DIR}" | awk '{print $3}'|xargs sudo umount
Delete the kubelet working directory
[root@k8s-node01 ~]# sudo rm -rf ${K8S_DIR}/kubelet
Delete the docker working directory
[root@k8s-node01 ~]# sudo rm -rf ${DOCKER_DIR}
Delete the network configuration written by flanneld
[root@k8s-node01 ~]# sudo rm -rf /var/run/flannel/
Delete docker's runtime files
[root@k8s-node01 ~]# sudo rm -rf /var/run/docker/
Delete the systemd unit files
[root@k8s-node01 ~]# sudo rm -rf /etc/systemd/system/{kubelet,kube-proxy,docker,flanneld,kube-nginx}.service
Delete the program files
[root@k8s-node01 ~]# sudo rm -rf /opt/k8s/bin/*
Delete the certificate files
[root@k8s-node01 ~]# sudo rm -rf /etc/flanneld/cert /etc/kubernetes/cert
Flush the iptables rules created by kube-proxy and docker
[root@k8s-node01 ~]# iptables -F && sudo iptables -X && sudo iptables -F -t nat && sudo iptables -X -t nat
Delete the bridges created by flanneld and docker:
[root@k8s-node01 ~]# ip link del flannel.1
[root@k8s-node01 ~]# ip link del docker0
2) Clean up the master nodes (repeat on every master)
Stop the related services:
[root@k8s-master01 ~]# systemctl stop kube-apiserver kube-controller-manager kube-scheduler kube-nginx
Clean up files:
Delete the systemd unit files
[root@k8s-master01 ~]# rm -rf /etc/systemd/system/{kube-apiserver,kube-controller-manager,kube-scheduler,kube-nginx}.service
Delete the program files
[root@k8s-master01 ~]# rm -rf /opt/k8s/bin/{kube-apiserver,kube-controller-manager,kube-scheduler}
Delete the certificate files
[root@k8s-master01 ~]# rm -rf /etc/flanneld/cert /etc/kubernetes/cert
Clean up the etcd cluster
[root@k8s-master01 ~]# systemctl stop etcd
Clean up files:
[root@k8s-master01 ~]# source /opt/k8s/bin/environment.sh
Delete the etcd working directory and data directory
[root@k8s-master01 ~]# rm -rf ${ETCD_DATA_DIR} ${ETCD_WAL_DIR}
Delete the systemd unit file
[root@k8s-master01 ~]# rm -rf /etc/systemd/system/etcd.service
Delete the program file
[root@k8s-master01 ~]# rm -rf /opt/k8s/bin/etcd
Delete the x509 certificate files
[root@k8s-master01 ~]# rm -rf /etc/etcd/cert/*
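The dashboard deployed above is served over HTTPS with certificates. To access the Kubernetes cluster web UI over plain HTTP instead, proceed as follows:
1) Configure kubernetes-dashboard.yaml (the "k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1" image it references has already been downloaded onto the worker nodes)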
[root@k8s-master01 ~]# cd /opt/k8s/work/
[root@k8s-master01 work]# cat kubernetes-dashboard.yaml
# ------------------- Dashboard Secret ------------------- #
apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kube-system
type: Opaque
---
# ------------------- Dashboard Service Account ------------------- #
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
---
# ------------------- Dashboard Role & Role Binding ------------------- #
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kubernetes-dashboard-minimal
  namespace: kube-system
rules:
  # Allow Dashboard to create 'kubernetes-dashboard-key-holder' secret.
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["create"]
  # Allow Dashboard to create 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["create"]
  # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs"]
  verbs: ["get", "update", "delete"]
  # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["kubernetes-dashboard-settings"]
  verbs: ["get", "update"]
  # Allow Dashboard to get metrics from heapster.
- apiGroups: [""]
  resources: ["services"]
  resourceNames: ["heapster"]
  verbs: ["proxy"]
- apiGroups: [""]
  resources: ["services/proxy"]
  resourceNames: ["heapster", "http:heapster:", "https:heapster:"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubernetes-dashboard-minimal
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubernetes-dashboard-minimal
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
---
# ------------------- Dashboard Deployment ------------------- #
kind: Deployment
apiVersion: apps/v1beta2
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      containers:
      - name: kubernetes-dashboard
        image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1
        ports:
        - containerPort: 9090
          protocol: TCP
        args:
          #- --auto-generate-certificates
          # Uncomment the following line to manually specify Kubernetes API server Host
          # If not specified, Dashboard will attempt to auto discover the API server and connect
          # to it. Uncomment only if the default does not work.
          #- --apiserver-host=http://10.0.1.168:8080
        volumeMounts:
        - name: kubernetes-dashboard-certs
          mountPath: /certs
          # Create on-disk volume to store exec logs
        - mountPath: /tmp
          name: tmp-volume
        livenessProbe:
          httpGet:
            scheme: HTTP
            path: /
            port: 9090
          initialDelaySeconds: 30
          timeoutSeconds: 30
      volumes:
      - name: kubernetes-dashboard-certs
        secret:
          secretName: kubernetes-dashboard-certs
      - name: tmp-volume
        emptyDir: {}
      serviceAccountName: kubernetes-dashboard
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
---
# ------------------- Dashboard Service ------------------- #
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
spec:
  ports:
    - port: 9090
      targetPort: 9090
  selector:
    k8s-app: kubernetes-dashboard
---
# ------------------------------------------------------------
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-external
  namespace: kube-system
spec:
  ports:
    - port: 9090
      targetPort: 9090
      nodePort: 30090
  type: NodePort
  selector:
    k8s-app: kubernetes-dashboard
Create the resources from this YAML file
[root@k8s-master01 work]# kubectl create -f kubernetes-dashboard.yaml
Wait a moment, then check the kubernetes-dashboard pod (as shown below, the pod was scheduled onto k8s-node03, i.e. 172.16.60.246)
[root@k8s-master01 work]# kubectl get pods -n kube-system -o wide|grep "kubernetes-dashboard"
kubernetes-dashboard-7976c5cb9c-q7z2w 1/1 Running 0 10m 172.30.72.6 k8s-node03 <none> <none>
[root@k8s-master01 work]# kubectl get svc -n kube-system|grep "kubernetes-dashboard"
kubernetes-dashboard-external NodePort 10.254.227.142 <none> 9090:30090/TCP 10m