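Contents
I. Experiment
1. Environment
2. Deploying the Jenkins service on K8S 1.29
3. Installing the Kubernetes plugin in Jenkins
II. Problems
1. Pod creation fails
2. How to view logs with journalctl
3. How to look up the initial Jenkins password inside the container
4. Error when installing the Chinese language pack offline in Jenkins
5. Jenkins plugin errors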
I. Experiment
1. Environment
(1) Hosts
Table 1 Hosts
Host | Role | Version | IP | Notes |
master | K8S master node | 1.29.0 | 192.168.204.8 | |
node1 | K8S worker node | 1.29.0 | 192.168.204.9 | |
node2 | K8S worker node | 1.29.0 | 192.168.204.10 | Kuboard deployed |
(2) View the cluster from the master node
1) List the nodes
kubectl get node
2) Show node details
kubectl get node -o wide
(3) List the pods
[root@master ~]# kubectl get pod -A
(4) Access Kuboard
http://192.168.204.10:30080/kuboard/cluster
View the nodes
2. Deploying the Jenkins service on K8S 1.29
(1) Create the namespace on the master node
[root@master jenkins]# kubectl create ns jenkins
(2) Check the namespace in Kuboard
The jenkins namespace has been added
http://192.168.204.10:30080/kubernetes/K8S-1.29/cluster/namespace
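(3) Create the ServiceAccount
A ServiceAccount is the identity that processes (containers) running in a Pod use when calling the Kubernetes API; the ClusterRole and ClusterRoleBinding in the manifest grant that identity its permissions.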
[root@master jenkins]# vim serviceAccount.yaml
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: jenkins-admin
rules:
  - apiGroups: [""]
    resources: ["*"]
    verbs: ["*"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins-admin
  namespace: jenkins
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: jenkins-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: jenkins-admin
subjects:
  - kind: ServiceAccount
    name: jenkins-admin
    namespace: jenkins
(4) Create the resources
[root@master jenkins]# kubectl apply -f serviceAccount.yaml
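One way to confirm that the binding took effect is to impersonate the ServiceAccount with kubectl auth can-i. The sketch below is a minimal wrapper and not part of the original steps; the function name sa_can_i and the KUBECTL override variable are assumptions made for illustration.

```shell
# Hypothetical helper: ask the API server whether the jenkins-admin
# ServiceAccount may perform a verb on a resource in the jenkins namespace.
# $1: verb (for example create), $2: resource (for example pods).
# KUBECTL can be overridden (for example with "echo") for a dry run.
sa_can_i() {
  ${KUBECTL:-kubectl} auth can-i "$1" "$2" -n jenkins \
    --as=system:serviceaccount:jenkins:jenkins-admin
}
```

For example, sa_can_i create pods should print yes, while sa_can_i create deployments should print no, because the ClusterRole only covers the core API group ("").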
(5) Create the persistence manifest
Define a 5Gi PV named jenkins-pv-volume and a PVC named jenkins-pv-claim that requests 3Gi from it. The data directory is /home/jenkins and the PV is pinned to node2.
[root@master jenkins]# vim volume.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: jenkins-pv-volume
  labels:
    type: local
spec:
  storageClassName: local-storage
  claimRef:
    name: jenkins-pv-claim
    namespace: jenkins
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  local:
    path: /home/jenkins
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node2
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins-pv-claim
  namespace: jenkins
spec:
  storageClassName: local-storage
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 3Gi
(6) Create the mount directory on node2
[root@node2 ~]# mkdir /home/jenkins
[root@node2 ~]# chmod 777 /home/jenkins
(7) Create the PV resources
[root@master jenkins]# kubectl apply -f volume.yaml
Check
[root@master jenkins]# kubectl get pv
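To check the claim as well as the volume, the PVC phase can be read directly. This is a minimal sketch that assumes the names from volume.yaml; pvc_phase and the KUBECTL override variable are hypothetical helpers, not part of the original steps.

```shell
# Hypothetical helper: print the phase (for example Pending or Bound) of a PVC.
# $1: PVC name, $2: namespace.
# KUBECTL can be overridden (for example with "echo") for a dry run.
pvc_phase() {
  ${KUBECTL:-kubectl} get pvc "$1" -n "$2" -o jsonpath='{.status.phase}'
}
```

Running pvc_phase jenkins-pv-claim jenkins is expected to print Bound once the claim has been matched to jenkins-pv-volume.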
(8) Browse the Jenkins image tags on Docker Hub
https://hub.docker.com/r/jenkins/jenkins/tags
(9) Pull the image on the node in advance
On node2
[root@node2 ~]# docker pull jenkins/jenkins:2.440.3-lts-jdk17
(10) Create the Deployment manifest
[root@master jenkins]# vim deployment.yaml
The pod is scheduled to node2, where the PV lives, and the image version is 2.440.3-lts-jdk17
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins
  namespace: jenkins
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      nodeSelector:
        kubernetes.io/hostname: node2
      securityContext:
        fsGroup: 1000
        runAsUser: 1000
      serviceAccountName: jenkins-admin
      containers:
        - name: jenkins
          image: jenkins/jenkins:2.440.3-lts-jdk17
          resources:
            limits:
              memory: "2Gi"
              cpu: "1000m"
            requests:
              memory: "500Mi"
              cpu: "500m"
          ports:
            - name: httpport
              containerPort: 8080
            - name: jnlpport
              containerPort: 50000
          livenessProbe:
            httpGet:
              path: "/login"
              port: 8080
            initialDelaySeconds: 90
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 5
          readinessProbe:
            httpGet:
              path: "/login"
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          volumeMounts:
            - name: jenkins-data
              mountPath: /var/jenkins_home
      volumes:
        - name: jenkins-data
          persistentVolumeClaim:
            claimName: jenkins-pv-claim
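(11) Create the resources
[root@master jenkins]# kubectl apply -f deployment.yaml
Watch the pod
[root@master jenkins]# kubectl get pods -n jenkins -o wide -w
(12) Create the Service
[root@master jenkins]# vim service.yaml
Monitoring annotations (these tell Prometheus how to scrape metrics from the Service):
annotations: the annotation field of a Kubernetes resource, used to attach non-identifying metadata.
prometheus.io/scrape: an annotation key that, by convention, indicates whether Prometheus should scrape this Service.
true: the value of prometheus.io/scrape, meaning the Service should be scraped.
prometheus.io/path: the HTTP path that Prometheus scrapes.
prometheus.io/port: the port that Prometheus scrapes.
Configuration: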
apiVersion: v1
kind: Service
metadata:
  name: jenkins
  namespace: jenkins
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/path: /
    prometheus.io/port: '8080'
spec:
  selector:
    app: jenkins
  type: NodePort
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 32000
(13) Create the Service and check it
[root@master jenkins]# kubectl apply -f service.yaml
Check the svc
[root@master jenkins]# kubectl get svc -n jenkins
(14) Check in Kuboard
Workloads
Pods
(15) Access Jenkins
http://192.168.204.10:32000
(16) Get the password
[root@master jenkins]# kubectl logs -f jenkins-69758d74c9-6tc8s -n jenkins
The password appears in this part of the log:
Jenkins initial setup is required. An admin user has been created and a password generated.
Please use the following password to proceed to installation:
df904552c24d46999f2bfb44b0aa916e
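If the log is long, the password line can be filtered out rather than read by eye. This is a minimal sketch that assumes the password is printed alone on a line as 32 lowercase hex characters, as in the excerpt above; extract_initial_password is a hypothetical helper name.

```shell
# Hypothetical helper: read log text on stdin and print the first line that
# consists of exactly 32 lowercase hexadecimal characters.
extract_initial_password() {
  grep -E '^[0-9a-f]{32}$' | head -n 1
}
```

Usage on the master node: kubectl logs deploy/jenkins -n jenkins | extract_initial_password. Addressing the Deployment instead of a pod name avoids having to look up the generated pod suffix.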
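(17) Enter the password
Plugin installation can be skipped
Click the X in the upper-right corner to skip it
Start using Jenkins
(18) Enter the system
3. Installing the Kubernetes plugin in Jenkins
(1) Click the link in the lower-right corner of the page
Website jumps to the Chinese official site
https://www.jenkins.io/zh/
(2) Management page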
http://192.168.204.10:32000/manage
Plugin management
http://192.168.204.10:32000/manage/pluginManager/advanced
Set a mirror in China as the update site
https://mirrors.tuna.tsinghua.edu.cn/jenkins/updates/update-center.json
The Resource Root URL can also be set under System
(3) Change the password and time zone
(4) Log in again
(5) Restart Jenkins from the master node
[root@master jenkins]# kubectl delete pods jenkins-69758d74c9-6tc8s -n jenkins
(6) Check the pod
[root@master jenkins]# kubectl get pod -n jenkins
(7) Check on node2
[root@node2 var]# cd /home/jenkins/
[root@node2 jenkins]# ls
(8) Check inside the container
The contents match /home/jenkins/ on node2
[root@master jenkins]# kubectl exec -it jenkins-69758d74c9-hxb89 -n jenkins -- /bin/bash
(9) Offline plugin packages
https://updates.jenkins-ci.org/download/plugins/
Download the Chinese language pack
https://updates.jenkins-ci.org/download/plugins/localization-zh-cn/
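(10) Install the Chinese language pack offline
Install:
Done (the pod has to be restarted):
(11) Install the Kubernetes plugin
Done:
(12) Connect Jenkins to the K8S cluster
Create
Look up the Kubernetes API server address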
[root@master ~]# kubectl cluster-info
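Connection test (Jenkins runs inside the K8S cluster with a ServiceAccount already bound, so no key needs to be entered)
Test succeeded:
Connection complete:
(13) Finally, check Kuboard again
jenkins namespace
kube-system namespace
(14) Other ways to deploy Jenkins
See my blog post:
Continuous Integration and Delivery (CI/CD): Jenkins Deployment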
II. Problems
1. Pod creation fails
(1) Error
The pod stays stuck in ContainerCreating on the node and never becomes ready; READY remains 0/1
(2) Root cause analysis
① Describe the pod
[root@master jenkins]# kubectl describe pod jenkins-69758d74c9-6tc8s -n jenkins
The last events show FailedCreatePodSandBox
Warning FailedCreatePodSandBox 7m18s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "7d71985b3886817eb93f1885835a0bb869f67a4de34797266ff850f53f62af1c" network for pod "jenkins-69758d74c9-6tc8s": networkPlugin cni failed to set up pod "jenkins-69758d74c9-6tc8s_jenkins" network: plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized, failed to clean up sandbox container "7d71985b3886817eb93f1885835a0bb869f67a4de34797266ff850f53f62af1c" network for pod "jenkins-69758d74c9-6tc8s": networkPlugin cni failed to teardown pod "jenkins-69758d74c9-6tc8s_jenkins" network: plugin type="calico" failed (delete): error getting ClusterInformation: connection is unauthorized: Unauthorized]
Normal SandboxChanged 2m1s (x25 over 7m18s) kubelet Pod sandbox changed, it will be killed and re-created.
② On node2, look at the CNI-related log entries
sudo journalctl -xe | grep cni
The last entries show failed to "KillPodSandbox"
4月 19 16:50:55 node2 kubelet[51899]: E0419 16:50:55.608296 51899 kubelet.go:2032] failed to "KillPodSandbox" for "2e3c9e42-396b-4b9b-980a-b8275991b8a8" with KillPodSandboxError: "rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod \"jenkins-696cf86678-jx477_jenkins\" network: plugin type=\"calico\" failed (delete): error getting ClusterInformation: connection is unauthorized: Unauthorized"
4月 19 16:50:55 node2 kubelet[51899]: E0419 16:50:55.608390 51899 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"KillPodSandbox\" for \"2e3c9e42-396b-4b9b-980a-b8275991b8a8\" with KillPodSandboxError: \"rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod \\\"jenkins-696cf86678-jx477_jenkins\\\" network: plugin type=\\\"calico\\\" failed (delete): error getting ClusterInformation: connection is unauthorized: Unauthorized\"" pod="jenkins/jenkins-696cf86678-jx477" podUID="2e3c9e42-396b-4b9b-980a-b8275991b8a8"
4月 19 16:50:56 node2 sudo[73408]: root : TTY=pts/1 ; PWD=/etc/cni/net.d ; USER=root ; COMMAND=/bin/journalctl -xe
③ The CNI configuration files live in /etc/cni/net.d/ by default; enter the directory and check
[root@node2 net.d]# cd /etc/cni/net.d/
[root@node2 net.d]# ls
nodename is node2, which is correct
[root@node2 net.d]# vim 10-calico.conflist
④ Check the kubelet logs
[root@node2 net.d]# journalctl --since="2024-04-19 16:00:00" --until="2024-04-19 17:00:00" -fu kubelet
It shows Failed to stop sandbox
4月 19 16:56:55 node2 kubelet[51899]: E0419 16:56:55.626079 51899 kuberuntime_manager.go:1381] "Failed to stop sandbox" podSandboxID={"Type":"docker","ID":"2958227182cb84e9c4bc0d44a662316ab58355f1cb9bb8a1923225d9b37247fc"}
The last entry shows failed to "KillPodSandbox"
4月 19 16:56:55 node2 kubelet[51899]: E0419 16:56:55.626182 51899 kubelet.go:2032] failed to "KillPodSandbox" for "18e3512f-846a-42c3-a10b-6bb0a2a33533" with KillPodSandboxError: "rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod \"jenkins-69758d74c9-br846_jenkins\" network: plugin type=\"calico\" failed (delete): error getting ClusterInformation: connection is unauthorized: Unauthorized"
⑤ Check cri-docker on each node and restart the service
systemctl status cri-docker
systemctl restart cri-docker
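⑥ Summary
The calico CNI plugin on node2 is failing (its requests for ClusterInformation are rejected as Unauthorized), so it cannot set up the pod network or assign an IP, which leaves the pod stuck in ContainerCreating.
(3) Solution
Delete the calico-node pod on the faulty node so that it is recreated and resynchronizes its data.
① Delete calico-node-zwfqf
② It has been recreated
Check
③ The pod is deployed successfully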
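The delete step can be scripted so that the right calico-node pod is picked by node name. This is a minimal sketch, assuming calico-node runs in kube-system as in a default Calico install; calico_pod_on_node is a hypothetical helper that reads NAME NODE pairs from standard input.

```shell
# Hypothetical helper: from "NAME NODE" lines on stdin, print the name of the
# calico-node pod that runs on the node given as $1.
calico_pod_on_node() {
  awk -v node="$1" '$2 == node && $1 ~ /^calico-node-/ { print $1 }'
}
```

Usage on the master node: kubectl get pods -n kube-system --no-headers -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName | calico_pod_on_node node2, followed by kubectl delete pod with the printed name and -n kube-system. Because calico-node is managed by a DaemonSet, a replacement pod is created automatically.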
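2. How to view logs with journalctl
(1) Commands
1) Follow the journal, scrolling in real time
journalctl -f
2) View kernel logs
journalctl -k
3) View the logs of a specific service (add -f to follow new entries in real time)
journalctl -u kubelet
4) View logs for a given time range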
journalctl --since="2024-04-19 16:00:00" -fu kubelet
journalctl --since="2024-04-19 16:00:00" --until="2024-04-19 17:00:00" -fu kubelet
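# --until "1 hour ago" / --until now
journalctl --since "10 min ago" # show logs from the last 10 minutes
journalctl --since today # show logs since today; use yesterday for logs since yesterday
5) Show the disk space used by the journal
journalctl --disk-usage
6) Shrink the journal to a target size (removes the oldest archived journal files)
journalctl --vacuum-size=500M
7) Remove archived journal files older than a given age
journalctl --vacuum-time=1month
8) Verify the consistency of the journal files
journalctl --verify
9) Show the last num lines; if num is omitted, the last 10 lines are shown
journalctl -n [num]
10) Set the output format
journalctl -o <mode>
# mode is one of:
short, short-iso, short-precise, short-monotonic, verbose, export, json, json-pretty, json-sse, cat
11) Plain standard output: the journal is paged by default; --no-pager writes straight to standard output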
journalctl --no-pager
12) Show logs for a given process ID
journalctl _PID=22856
13) Show logs for a given user ID
journalctl _UID=33 --since=today
14) Match by systemd unit and priority
journalctl _SYSTEMD_UNIT=cron.service PRIORITY=6
15) View the help
man journalctl
journalctl -h
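Several of the options above combine well with a filter when chasing a CNI problem like the one in section 1. This is a minimal sketch; cni_failures is a hypothetical helper, and the two patterns are taken from the kubelet log excerpts in this post.

```shell
# Hypothetical helper: keep only log lines on stdin that report a CNI
# failure or a sandbox that could not be stopped.
cni_failures() {
  grep -E 'networkPlugin cni failed|Failed to stop sandbox'
}
```

Usage on the affected node: journalctl -u kubelet --since "1 hour ago" --no-pager | cni_failures.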
3. How to look up the initial Jenkins password inside the container
(1) From the master node, with kubectl exec
[root@master jenkins]# kubectl exec -it jenkins-69758d74c9-lb96b -n jenkins -- /bin/bash
jenkins@jenkins-69758d74c9-lb96b:/$ cat /var/jenkins_home/secrets/initialAdminPassword
jenkins@jenkins-69758d74c9-lb96b:/$ exit
(2) On node2, find the ID of the running container, then enter it and read the initial password
[root@node2 ~]# docker ps -a
The jenkins image ID is 4e586344183a
[root@node2 ~]# docker ps -a | grep jenkins
Look up the container ID
docker ps -a --filter ancestor=4e586344183a --format "{{.ID}}"
Enter the container
docker exec -it 0821261b4091 bash
cat /var/jenkins_home/secrets/initialAdminPassword
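4. Error when installing the Chinese language pack offline in Jenkins
(1) Error
(2) Root cause analysis
Localization Support must be installed first.
(3) Solution
Install Localization Support offline first:
Then install the Chinese language pack:
Restart Jenkins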
[root@master jenkins]# kubectl delete pods jenkins-69758d74c9-hxb89 -n jenkins
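Watch the pod (the restart completed in 68s):
Success:
5. Jenkins plugin errors
(1) Error
Update site error
Error when installing the Kubernetes plugin offline
(2) Root cause analysis
The Jenkins pod running in the K8S cluster cannot reach external domain names (name resolution fails), which is what causes the network errors.
(3) Solution
① Check the coredns pods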
[root@master ~]# kubectl get pods -n kube-system -o wide |grep coredns
② Check in Kuboard
③ Check the DNS server information
[root@master ~]# kubectl get svc -n kube-system -o wide
The DNS server IP is 10.96.0.10
④ Attempt on the node: permission denied
echo "$(sed 's/10.96.0.10/10.244.166.133/g' /etc/resolv.conf)" > /etc/resolv.conf
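⑤ Attempt inside the container via docker: permission denied
⑥ Scale out coredns
coredns was originally deployed only on node1; it is now scaled out to 3 replicas
Done: the coredns pod on node2 has IP 10.244.104.10
Check the pods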
[root@master ~]# kubectl get pods -n kube-system -o wide |grep coredns
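The scale-out in step ⑥ can be done with kubectl scale. This is a minimal sketch, assuming CoreDNS runs as the coredns Deployment in kube-system (the kubeadm default); scale_coredns and the KUBECTL override variable are hypothetical.

```shell
# Hypothetical helper: set the number of CoreDNS replicas to $1.
# KUBECTL can be overridden (for example with "echo") for a dry run.
scale_coredns() {
  ${KUBECTL:-kubectl} scale deployment coredns -n kube-system --replicas="$1"
}
```

Usage on the master node: scale_coredns 3, then rerun the kubectl get pods command above to see which node each replica landed on.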
⑦ Try editing via docker cp
Copy the file out of the container
[root@node2 /]# sudo docker cp 0821261b4091:/etc/resolv.conf ~
View the configuration file
Edit the configuration file
Copy the file back into the container: still permission denied
[root@node2 ~]# sudo docker cp resolv.conf 0821261b4091:/etc/
⑧ Enter the docker container as root
Edit the file
echo "$(sed 's/10.96.0.10/10.244.104.10/g' /etc/resolv.conf)" > /etc/resolv.conf
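The echo "$(sed ...)" form works here because /etc/resolv.conf is typically bind-mounted into the container, so it has to be overwritten in place rather than replaced by a new file. Below is a minimal sketch of the same edit as a reusable function, meant to be run as root inside the container (for example after docker exec -it -u root with the container ID). replace_nameserver is a hypothetical helper; unlike the one-liner above, it escapes the dots and only touches nameserver lines.

```shell
# Hypothetical helper: in file $1, change "nameserver $2" to "nameserver $3".
# The file is rewritten in place (same inode), which matters for bind mounts.
replace_nameserver() {
  file=$1
  old=$2
  new=$3
  # Escape dots so the old address is matched literally.
  old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
  tmp=$(mktemp)
  sed "s/^nameserver[[:space:]]\{1,\}${old_re}\$/nameserver ${new}/" "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

Usage: replace_nameserver /etc/resolv.conf 10.96.0.10 10.244.104.10. Note that a pod IP such as 10.244.104.10 changes whenever the coredns pod is recreated.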
⑨ Restart the Jenkins service
Watch the pod come back up
⑩ Jenkins now fetches the plugin information successfully
Install: