Kubernetes (Part 4): Deploying a Cluster with kubeadm

Date: 2021-08-18 08:12:28

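kubeadm is the cluster-building tool that ships with the Kubernetes project. It performs the basic steps needed to build a minimal viable cluster and bring it up. Put simply, kubeadm is a lifecycle management tool for Kubernetes clusters and can be used to deploy, upgrade/downgrade, and tear down a cluster.
kubeadm bundles subcommands such as kubeadm init and kubeadm join. kubeadm init quickly initializes a cluster, and its core job is to deploy the components of the Master node; kubeadm join quickly adds a node to a specified cluster. Together they are the “fast path” for creating a Kubernetes cluster according to best practices. In addition, kubeadm token manages, after the cluster is built, the authentication tokens used when joining it, and kubeadm reset deletes the files generated during cluster construction to return the host to its initial state. kubeadm also manages bootstrap tokens (Bootstrap Token), which authenticate a new node (based on a shared secret) the first time it contacts the API Server. It also supports cluster version upgrades and downgrades.
Deploying a cluster with kubeadm has the following main advantages: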

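  • Easy to use: kubeadm handles cluster deployment, upgrade, and teardown, and is very friendly to new users.
  • Widely applicable: it supports deploying clusters on bare metal, VMware, AWS, Azure, GCE, and hosts in many other environments, with an essentially identical deployment process.
  • Flexible: kubeadm 1.11 and later supports phased deployment, so an administrator can carry out the work as several independent steps.
  • Production-ready: kubeadm deploys Kubernetes clusters according to best practices. It enforces RBAC, sets up authentication and secure communication among the Master components and between the API Server and the kubelet, and locks down the kubelet API.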

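1. Pre-deployment preparation

Lab environment

Host    IP address  OS version         Pod CIDR       Service CIDR   VM memory
master  10.0.0.10   CentOS 7.5 (1804)  10.244.0.0/16  10.96.0.0/12   2 GB
node01  10.0.0.11   CentOS 7.5 (1804)  10.244.0.0/16  10.96.0.0/12   2 GB
node02  10.0.0.12   CentOS 7.5 (1804)  10.244.0.0/16  10.96.0.0/12   2 GB

Kubernetes officially requires at least 2 GB of memory; I personally recommend 4 GB or more.
Run the following on all machines: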

# Disable the firewall and SELinux
systemctl stop firewalld && systemctl disable firewalld && setenforce 0
# Disable the swap partition
[root@master ~]# swapoff -a
# Configure kernel parameters in sysctl.conf
[root@master ~]# cat youhua.sh    
#!/bin/sh
echo "* soft nofile 190000" >> /etc/security/limits.conf
echo "* hard nofile 200000" >> /etc/security/limits.conf
echo "* soft nproc 252144" >> /etc/security/limits.conf
echo "* hard nproc 262144" >> /etc/security/limits.conf

tee /etc/sysctl.conf <<-'EOF'
# System default settings live in /usr/lib/sysctl.d/00-system.conf.
# To override those settings, enter new settings here, or in an /etc/sysctl.d/<name>.conf file
#
# For more information, see sysctl.conf(5) and sysctl.d(5).

net.ipv4.tcp_tw_recycle = 0
net.ipv4.ip_local_port_range = 10000 61000
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.ip_forward = 1
net.core.netdev_max_backlog = 2000
net.ipv4.tcp_mem = 131072  262144  524288
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 2048
net.ipv4.tcp_low_latency = 0
net.core.rmem_default = 256960
net.core.rmem_max = 513920
net.core.wmem_default = 256960
net.core.wmem_max = 513920
net.core.somaxconn = 2048
net.core.optmem_max = 81920
net.ipv4.tcp_mem = 131072  262144  524288
net.ipv4.tcp_rmem = 8760  256960  4088000
net.ipv4.tcp_wmem = 8760  256960  4088000
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_sack = 1
net.ipv4.tcp_fack = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_syn_retries = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 1
EOF
echo "options nf_conntrack hashsize=819200" >> /etc/modprobe.d/mlx4.conf 
modprobe br_netfilter
sysctl -p
[root@master ~]# sh youhua.sh
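
Note that swapoff -a only lasts until the next reboot. A minimal sketch of making the change persistent, assuming swap is declared in /etc/fstab in the usual whitespace-separated format. The function name comment_out_swap is hypothetical, and a .bak copy of the file is kept next to it:

```shell
#!/bin/sh
# comment_out_swap FILE  (hypothetical helper, not part of the original script)
# Prefixes every active line that has a whitespace-delimited "swap" field with '#'.
# Lines that are already comments are left alone. A backup is saved as FILE.bak.
comment_out_swap() {
    sed -E -i.bak '/^[[:space:]]*#/!s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/' "$1"
}

# Example (as root, on every machine):
#   comment_out_swap /etc/fstab
```

With swap disabled permanently, the --ignore-preflight-errors=Swap option used later should no longer be needed, although keeping it is harmless.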

Preparation

**Configure hosts resolution**
[root@master ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.0.10 master
10.0.0.11 node01
10.0.0.12 node02
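
The listing above only shows the finished file. A sketch of adding the three entries idempotently, so that rerunning it does not create duplicates; add_host_entry is a hypothetical helper name, and the IP/name pairs are the ones from the lab environment:

```shell
#!/bin/sh
# add_host_entry FILE IP NAME  (hypothetical helper)
# Appends "IP NAME" to FILE unless a line already maps exactly this IP to this name.
add_host_entry() {
    file=$1; ip=$2; name=$3
    if ! awk -v ip="$ip" -v name="$name" '
            $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit found ? 0 : 1 }' "$file"
    then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Example (as root, on every machine):
#   add_host_entry /etc/hosts 10.0.0.10 master
#   add_host_entry /etc/hosts 10.0.0.11 node01
#   add_host_entry /etc/hosts 10.0.0.12 node02
```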
# Generate a key pair and send the public key to node01 and node02
[root@master ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:2sVd/Xh8vuuUOkoy+PP1VHQvsf08JuzLCWBGyfOHQPA root@master
The key's randomart image is:
+---[RSA 2048]----+
|       ...       |
|        + .    . |
|         E    o +|
|        ..+... B+|
|        S+oo..+ O|
|       o+.. o  ==|
|      ...o o + *+|
|        ..+ =.O o|
|         .oo.*+=.|
+----[SHA256]-----+
[root@master ~]# ssh-copy-id node01
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'node01 (10.0.0.11)' can't be established.
ECDSA key fingerprint is SHA256:0WjkfswyQXRv+zeS03AF9xLANd4uZtFo0YcY7kGiagA.
ECDSA key fingerprint is MD5:32:e0:54:7e:8c:a0:1c:59:17:7b:00:3a:71:89:e1:a4.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@node01's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'node01'"
and check to make sure that only the key(s) you wanted were added.

[root@master ~]# ssh-copy-id node02
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'node02 (10.0.0.12)' can't be established.
ECDSA key fingerprint is SHA256:0WjkfswyQXRv+zeS03AF9xLANd4uZtFo0YcY7kGiagA.
ECDSA key fingerprint is MD5:32:e0:54:7e:8c:a0:1c:59:17:7b:00:3a:71:89:e1:a4.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@node02's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'node02'"
and check to make sure that only the key(s) you wanted were added.

Configure repositories

# Configure the Aliyun mirror for docker-ce
[root@master ~]# wget -O /etc/yum.repos.d/docker-ce.repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
--2019-03-27 14:17:07--  https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
Resolving mirrors.aliyun.com (mirrors.aliyun.com)... 183.60.159.225, 183.60.159.230, 183.60.159.232, ...
Connecting to mirrors.aliyun.com (mirrors.aliyun.com)|183.60.159.225|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2640 (2.6K) [application/octet-stream]
Saving to: ‘/etc/yum.repos.d/docker-ce.repo’

100%[==============================================>] 2,640       --.-K/s   in 0s

2019-03-27 14:17:12 (261 MB/s) - ‘/etc/yum.repos.d/docker-ce.repo’ saved [2640/2640]
# Configure the Aliyun mirror for Kubernetes
[root@master ~]# cat /etc/yum.repos.d/k8s.repo 
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
enabled=1
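
The listing above only prints the finished repo file. One way to create it non-interactively is a here-document, sketched below. The function name write_k8s_repo is hypothetical; the file content is copied from the listing. Keep in mind that gpgcheck=0 turns off package signature verification, so the gpgkey line has no effect as written.

```shell
#!/bin/sh
# write_k8s_repo DEST  (hypothetical helper)
# Writes the Kubernetes yum repo definition shown above to DEST.
write_k8s_repo() {
    cat > "$1" <<'EOF'
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
enabled=1
EOF
}

# Example (as root):
#   write_k8s_repo /etc/yum.repos.d/k8s.repo
```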
# Copy to the node hosts

[root@master ~]# cd /etc/yum.repos.d/
[root@master yum.repos.d]# scp ./docker-ce.repo ./k8s.repo node01:/etc/yum.repos.d/
docker-ce.repo                                        100% 2640     3.4MB/s   00:00    
k8s.repo                                              100%  202   311.7KB/s   00:00    
[root@master yum.repos.d]# scp ./docker-ce.repo ./k8s.repo node02:/etc/yum.repos.d/
docker-ce.repo                                        100% 2640     2.8MB/s   00:00    
k8s.repo                                              100%  202   252.1KB/s   00:00  
# Rebuild the yum cache
[root@master yum.repos.d]# yum makecache fast
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirrors.aliyun.com
 * extras: mirrors.aliyun.com
 * updates: mirrors.aliyun.com
base                                                             | 3.6 kB  00:00:00     
docker-ce-stable                                                 | 3.5 kB  00:00:00     
extras                                                           | 3.4 kB  00:00:00     
kubernetes                                                       | 1.4 kB  00:00:00     
updates                                                          | 3.4 kB  00:00:00     
(1/2): kubernetes/primary                                        |  47 kB  00:00:05     
(2/2): updates/7/x86_64/primary_db                               | 3.4 MB  00:00:08     
kubernetes                                                                      336/336
Metadata Cache Created

Install docker-ce and the software Kubernetes requires
The highest docker-ce version officially supported for this Kubernetes release is 18.06. In my own testing 18.09 also works: initialization prints a warning but no error, and the cluster runs without problems. Here kubelet, kubeadm, and kubectl use 1.13.1, and Docker uses 18.06.

Node    Software and versions
master  docker-ce 18.06, kubelet 1.13.1, kubeadm 1.13.1, kubectl 1.13.1
node01  docker-ce 18.06, kubelet 1.13.1, kubeadm 1.13.1
node02  docker-ce 18.06, kubelet 1.13.1, kubeadm 1.13.1
# master
[root@master ~]# yum install -y kubelet-1.13.1 kubeadm-1.13.1 kubectl-1.13.1 docker-ce-18.06.3.ce-3.el7
[root@master ~]# systemctl start docker kubelet && systemctl enable docker kubelet
# node01 and node02 (run on each)
[root@node01 ~]# yum install -y kubelet-1.13.1 kubeadm-1.13.1 docker-ce-18.06.3.ce-3.el7
[root@node01 ~]# systemctl start docker kubelet && systemctl enable docker kubelet
# Preparation is now complete
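
Because the same packages have to be installed on every node, the step can be driven from the master over the SSH keys set up earlier. The sketch below only builds the command string; install_cmd is a hypothetical helper, and the package versions are the ones pinned above:

```shell
#!/bin/sh
# install_cmd ROLE  (hypothetical helper)
# Prints the yum command for a host; ROLE is "master" or anything else for a worker node.
install_cmd() {
    pkgs="kubelet-1.13.1 kubeadm-1.13.1 docker-ce-18.06.3.ce-3.el7"
    # Only the master also gets kubectl, matching the table above.
    if [ "$1" = master ]; then
        pkgs="$pkgs kubectl-1.13.1"
    fi
    printf 'yum install -y %s\n' "$pkgs"
}

# Example, run from the master:
#   for host in node01 node02; do
#       ssh "$host" "$(install_cmd node) && systemctl start docker kubelet && systemctl enable docker kubelet"
#   done
```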

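2. Cluster initialization

Every machine needs a container runtime (such as docker or frakti) and the kubelet set up first. This is because kubeadm relies on the kubelet to start the cluster components, such as kube-apiserver, kube-controller-manager, and kube-scheduler on the master, and kube-proxy on every node.
Initialize the master to create the cluster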

[root@master ~]# kubeadm init \
--kubernetes-version=v1.13.1 \
--pod-network-cidr=10.244.0.0/16 \
--service-cidr=10.96.0.0/12 \
--image-repository registry.aliyuncs.com/google_containers \
--apiserver-advertise-address=0.0.0.0 \
--ignore-preflight-errors=Swap
[init] Using Kubernetes version: v1.13.1
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.0.0.10]
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master localhost] and IPs [10.0.0.10 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master localhost] and IPs [10.0.0.10 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 21.543677 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "master" as an annotation
[mark-control-plane] Marking the node master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: jffthg.hb83xpoxjcm5vh54
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 10.0.0.10:6443 --token jffthg.hb83xpoxjcm5vh54 --discovery-token-ca-cert-hash sha256:37c446787d8ff4383a50dc5855afb8e307f4d1fff5e5cdbb9235b8af6e49e28f

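The options to the command above determine many characteristics of the cluster environment, and these settings matter for deploying and running applications in the cluster afterwards.
--kubernetes-version: the version of the Kubernetes components to deploy; it must match the kubelet version
--pod-network-cidr: the Pod address range, given as a network address in CIDR format; with the flannel network plugin, the default is 10.244.0.0/16
--service-cidr: the Service network address range, given as a network address in CIDR format; the default is 10.96.0.0/12
--apiserver-advertise-address: the IP address the API server advertises to other components; the default is 0.0.0.0, which leaves the choice of address to kubeadm (in the run above it advertised 10.0.0.10)
--ignore-preflight-errors=Swap: which preflight check errors to ignore; here it ignores the error raised when swap has not been disabled
--image-repository registry.aliyuncs.com/google_containers: the registry from which to pull the images needed for initialization. The default Kubernetes registry is hosted outside China and pulls from inside China may fail; version 1.13 supports this option for choosing a registry, and the one used here is Aliyun's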
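
The same settings can also be kept in a configuration file instead of command-line flags. The sketch below assumes the kubeadm.k8s.io/v1beta1 ClusterConfiguration format that kubeadm 1.13 accepts; write_kubeadm_config and the example path are hypothetical, and the values are the ones passed to kubeadm init above:

```shell
#!/bin/sh
# write_kubeadm_config DEST  (hypothetical helper)
# Writes a ClusterConfiguration equivalent to the flags used in this walkthrough.
write_kubeadm_config() {
    cat > "$1" <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.13.1
imageRepository: registry.aliyuncs.com/google_containers
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
EOF
}

# Example (the path is an assumption):
#   write_kubeadm_config /root/kubeadm-config.yaml
#   kubeadm init --config /root/kubeadm-config.yaml --ignore-preflight-errors=Swap
```

A file like this is easier to keep under version control than a long flag list, which helps when the cluster has to be rebuilt with identical settings.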

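Set up the kubectl configuration file
kubectl is the core tool for managing a Kubernetes cluster. By default it reads its configuration from a file named config in the hidden .kube directory under the current user's home directory, including the cluster to connect to and the certificate or token used to authenticate to it. When the cluster is initialized, kubeadm automatically generates a configuration file for this purpose, /etc/kubernetes/admin.conf; copy it to $HOME/.kube/config. The root user on the master node is used as the example here; in production this should be done as a regular user.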

[root@master ~]# mkdir -p $HOME/.kube
[root@master ~]# cp -i /etc/kubernetes/admin.conf $HOME/.kube/config 

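At this point the master node is basically configured. You can query the API server to verify that each component is healthy; if necessary, run kubeadm reset to reset the node and then initialize the cluster again.
componentstatus is abbreviated as cs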

[root@master ~]# kubectl get cs
NAME                 STATUS    MESSAGE              ERROR
scheduler            Healthy   ok                   
controller-manager   Healthy   ok                   
etcd-0               Healthy   {"health": "true"} 
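
For scripted checks, the table can be filtered so that only problem components are printed. A small sketch, assuming the column layout shown above (NAME first, STATUS second); unhealthy_components is a hypothetical name:

```shell
#!/bin/sh
# unhealthy_components  (hypothetical helper)
# Reads `kubectl get cs` output on stdin, skips the header row, and prints the
# NAME of every component whose STATUS column is not "Healthy".
unhealthy_components() {
    awk 'NR > 1 && $2 != "Healthy" { print $1 }'
}

# Example: empty output means every listed component reported Healthy.
#   kubectl get cs | unhealthy_components
```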

Join the nodes to the cluster

[root@node01 ~]#  kubeadm join 10.0.0.10:6443 --token jffthg.hb83xpoxjcm5vh54 --discovery-token-ca-cert-hash sha256:37c446787d8ff4383a50dc5855afb8e307f4d1fff5e5cdbb9235b8af6e49e28f
[preflight] Running pre-flight checks
        [WARNING Hostname]: hostname "node01" could not be reached
        [WARNING Hostname]: hostname "node01": lookup node01 on 114.114.114.114:53: no such host
[discovery] Trying to connect to API Server "10.0.0.10:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.0.0.10:6443"
[discovery] Requesting info from "https://10.0.0.10:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.0.0.10:6443"
[discovery] Successfully established connection with API Server "10.0.0.10:6443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "node01" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

Check the cluster status on the master

[root@master ~]# kubectl get nodes
NAME     STATUS     ROLES    AGE    VERSION
master   NotReady   master   10m    v1.13.1
node01   NotReady   <none>   114s   v1.13.1
node02   NotReady   <none>   107s   v1.13.1

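The status is NotReady at this point because no CNI network plugin is installed yet; a network plugin such as flannel or calico must be deployed. flannel is used as the example here.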

[root@master ~]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
podsecuritypolicy.extensions/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.extensions/kube-flannel-ds-amd64 created
daemonset.extensions/kube-flannel-ds-arm64 created
daemonset.extensions/kube-flannel-ds-arm created
daemonset.extensions/kube-flannel-ds-ppc64le created
daemonset.extensions/kube-flannel-ds-s390x created
# Wait a few minutes
[root@master ~]# kubectl get nodes 
NAME     STATUS   ROLES    AGE   VERSION
master   Ready    master   39m   v1.13.1
node01   Ready    <none>   30m   v1.13.1
node02   Ready    <none>   30m   v1.13.1
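
Instead of waiting and rechecking by hand, the check can be polled. The sketch below assumes the `kubectl get nodes` column layout shown above; all_ready and wait_for_nodes are hypothetical helpers, and a status such as Ready,SchedulingDisabled is deliberately treated as not ready:

```shell
#!/bin/sh
# all_ready  (hypothetical helper)
# Reads `kubectl get nodes` output on stdin and succeeds only when at least one
# node is listed and every node's STATUS column is exactly "Ready".
all_ready() {
    awk 'NR > 1 { seen = 1; if ($2 != "Ready") bad = 1 }
         END { exit (seen && !bad) ? 0 : 1 }'
}

# wait_for_nodes TRIES DELAY COMMAND...  (hypothetical helper)
# Runs COMMAND up to TRIES times, sleeping DELAY seconds between attempts,
# and returns 0 as soon as its output passes all_ready.
wait_for_nodes() {
    tries=$1; delay=$2; shift 2
    while [ "$tries" -gt 0 ]; do
        if "$@" | all_ready; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# Example: poll every 10 seconds for up to 5 minutes.
#   wait_for_nodes 30 10 kubectl get nodes
```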

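The Kubernetes cluster and the add-ons deployed on it provide a number of different services, such as the API Server and kube-dns deployed earlier. An API client needs to know the API Server's advertised address before it can access the cluster; an administrator can obtain this information with the “kubectl cluster-info” command: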

[root@master ~]# kubectl cluster-info
Kubernetes master is running at https://10.0.0.10:6443
KubeDNS is running at https://10.0.0.10:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To query the server and client versions of the Kubernetes cluster, use the kubectl version command:

[root@master ~]# kubectl version --short=true
Client Version: v1.13.1
Server Version: v1.13.1

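Removing a node from the cluster:
If a node needs to be removed from a running cluster, follow these steps:
1) On the master node, run the following commands to drain the Pods from the node and then remove the Node object: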

~]# kubectl drain node01 --delete-local-data --force --ignore-daemonsets
~]# kubectl delete node node01

2) Then, on the Node being removed, run the following command to reset its state and complete the removal:

~]# kubeadm reset
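
The master-side half of this procedure can be wrapped in a small function so that the Node object is only deleted when draining succeeded. This is a sketch: remove_node and the KUBECTL override are hypothetical conveniences, and the flags are the ones used in step 1 for kubectl 1.13:

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo) to print the commands
# instead of running them.
KUBECTL=${KUBECTL:-kubectl}

# remove_node NODE  (hypothetical helper, run on the master)
remove_node() {
    node=$1
    # Evict the Pods first; stop if draining fails so the Node object is not
    # deleted while workloads are still running on it.
    $KUBECTL drain "$node" --delete-local-data --force --ignore-daemonsets || return 1
    $KUBECTL delete node "$node"
}

# Example:
#   remove_node node01
# Afterwards, run `kubeadm reset` on node01 itself, as in step 2.
```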

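According to the official documentation, a token is valid for 24 hours by default, so the token created at initialization can no longer be used to join nodes once that time has passed. Use kubeadm token list to view the tokens and check whether the one from initialization has expired: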

[root@master ~]# kubeadm token list   
TOKEN                     TTL       EXPIRES                     USAGES                   DESCRIPTION                                                EXTRA GROUPS
jffthg.hb83xpoxjcm5vh54   7h        2019-03-28T16:03:51+08:00   authentication,signing   The default bootstrap token generated by 'kubeadm init'.   system:bootstrappers:kubeadm:default-node-token

In that case a new token needs to be generated, as follows:

[root@master ~]# kubeadm token create

If you do not have the --discovery-token-ca-cert-hash value, you can obtain it by running the following command chain on the master node:

[root@master ~]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | \
   openssl dgst -sha256 -hex | sed 's/^.* //'

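Now run the kubeadm join command again to add the new node to the cluster. The --discovery-token-ca-cert-hash value from initialization can still be used here, because it is derived from the same CA certificate.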

kubeadm join 192.168.56.11:6443 --token 1*** --discovery-token-ca-cert-hash 
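
Putting the pieces together: the openssl pipeline above prints only the hex digest, so it has to be prefixed with sha256: before it is passed to kubeadm join. The sketch below assembles the command and rejects malformed input early. build_join_cmd is a hypothetical helper; the format checks assume the usual bootstrap token shape (6 characters, a dot, 16 characters, all lowercase letters or digits) and a 64-digit hex hash:

```shell
#!/bin/sh
# build_join_cmd ENDPOINT TOKEN HASH  (hypothetical helper)
# Prints a kubeadm join command line, or fails if TOKEN or HASH look malformed.
build_join_cmd() {
    endpoint=$1; token=$2; hash=$3
    # Bootstrap tokens look like abcdef.0123456789abcdef
    printf '%s\n' "$token" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$' || {
        echo "unexpected token format" >&2
        return 1
    }
    # The hash must carry the sha256: prefix followed by 64 hex digits.
    printf '%s\n' "$hash" | grep -Eq '^sha256:[0-9a-f]{64}$' || {
        echo "unexpected CA cert hash format" >&2
        return 1
    }
    printf 'kubeadm join %s --token %s --discovery-token-ca-cert-hash %s\n' \
        "$endpoint" "$token" "$hash"
}

# Example, on the master (prints the command to run on the new node):
#   build_join_cmd 10.0.0.10:6443 "$(kubeadm token create)" "sha256:<digest from the openssl pipeline>"
```

kubeadm can also print a ready-made join command itself with kubeadm token create --print-join-command, which avoids the manual hash step.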

The Kubernetes cluster deployment is now complete.

References

马永亮 (Ma Yongliang). Kubernetes进阶实战 [Advanced Kubernetes in Practice] (Cloud Computing and Virtualization Technology series)
https://www.cnblogs.com/linuxk/