What followed from a calico-kube-controllers Pod stuck in the creating state

Date: 2024-07-07 22:19:07
Tags: License, 16, containerd, controllers, Running, kube, calico

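Background:

The course code is written for the amd64 architecture, so I could not run the labs on my main machine, which is arm64, and had to use an x64 machine instead. The first step was to set up a K8S environment (setup steps omitted here). After I ran kubeadm init, image pulls kept failing. I had left the image repository at the default official K8S address. (I avoid the domestic mirrors because they are updated too slowly, and some images cannot be found on them.) I then turned on a proxy in global mode on the host machine, but the K8S cluster still could not pull images and reported a timeout.

So I put the proxy settings into the containerd.service file, as shown below: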

root@Y76-Master01-16-181:~# cat /usr/lib/systemd/system/containerd.service 
# Copyright The containerd Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target

[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/bin/containerd

Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5

# Add the following three lines
Environment="HTTPS_PROXY=http://172.164.17.103:9999"
Environment="HTTP_PROXY=http://172.164.17.103:9999"
Environment="ALL_PROXY=socks5://172.164.17.103:9999"

# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=infinity
# Comment TasksMax if your systemd version does not supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity
OOMScoreAdjust=-999

[Install]
WantedBy=multi-user.target

After adding the proxy, I ran kubeadm init again and the K8S images were pulled to the local node successfully, which can be checked with crictl:

root@Y76-Master01-16-181:~# crictl -r unix:///var/run/containerd/containerd.sock images

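At this point all cluster components were healthy and in the Running state, so I deployed calico. The calico-node Pods all reached Running, but the calico-kube-controllers Pod stayed in the creating state. The Pod details showed: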

root@Y76-Master01-16-181:~# kubectl describe pod -n kube-system calico-kube-controllers-9449c44c5-v8ssv 


Normal   Scheduled               72s    default-scheduler  Successfully assigned kube-system/calico-kube-controllers-57b57c56f-wz4wm to y76-node01-16-182
Warning  FailedCreatePodSandBox  52s               kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "c0805304ad1009d138d00cad8b5a4d9ddfdd27b8d6a8a886d4df4690cace4452": plugin type="calico" failed (add): error getting ClusterInformation: Get "https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": net/http: TLS handshake timeout
Normal   SandboxChanged          5s (x3 over 51s)  kubelet            Pod sandbox changed, it will be killed and re-created.

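I went through a series of checks at this point but could not fix the calico-kube-controllers status. Newly created Pods failed as well, with the same error as above. Out of ideas, I restored all the virtual machines to their earlier snapshot, configured a domestic image mirror, and ran kubeadm init again. This time every component Pod reached Running.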

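Reflection:

Only the image address differed, yet the two runs ended differently, which should not happen. That reminded me of the proxy I had set up at the start (the one configured in containerd.service). I restored the snapshot once more, switched back to the official K8S image registry, and ran kubeadm init again. The same problem came back. I then changed the containerd.service configuration to the following: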

root@Y76-Master01-16-181:~# cat /usr/lib/systemd/system/containerd.service 
# Copyright The containerd Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target

[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/bin/containerd

Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5

# Changed to the following
Environment="HTTPS_PROXY=http://172.164.17.103:9999"
Environment="HTTP_PROXY=http://172.164.17.103:9999"
Environment="NO_PROXY=localhost,127.0.0.1,172.16.0.0/12,10.96.0.0/12,10.244.0.0/16"


# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=infinity
# Comment TasksMax if your systemd version does not supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity
OOMScoreAdjust=-999

[Install]
WantedBy=multi-user.target
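
As an alternative to editing the packaged unit file under /usr/lib/systemd/system (which a package upgrade may overwrite), the same three settings can live in a systemd drop-in. The sketch below only prints the drop-in content; the function name render_proxy_dropin and the file name http-proxy.conf are illustrative choices, not something the original setup used.

```shell
#!/usr/bin/env bash
# Print a systemd drop-in that sets proxy variables for a service.
# $1: proxy URL, $2: comma-separated NO_PROXY list.
# (render_proxy_dropin is a hypothetical helper name.)
render_proxy_dropin() {
  cat <<EOF
[Service]
Environment="HTTPS_PROXY=$1"
Environment="HTTP_PROXY=$1"
Environment="NO_PROXY=$2"
EOF
}

# Usage on a node (the file name http-proxy.conf is arbitrary):
#   mkdir -p /etc/systemd/system/containerd.service.d
#   render_proxy_dropin "http://172.164.17.103:9999" \
#     "localhost,127.0.0.1,172.16.0.0/12,10.96.0.0/12,10.244.0.0/16" \
#     > /etc/systemd/system/containerd.service.d/http-proxy.conf
#   systemctl daemon-reload && systemctl restart containerd
```

With a drop-in, the unit file shipped by the package stays untouched and the proxy settings are kept in one small file that is easy to copy to other nodes.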


Then sync the configuration to the other nodes and restart the containerd service:

root@Y76-Master01-16-181:~# ansible all -m copy -a "src=/usr/lib/systemd/system/containerd.service  dest=/usr/lib/systemd/system/containerd.service"

root@Y76-Master01-16-181:~# ansible all -m shell -a "systemctl daemon-reload && systemctl restart containerd.service "
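
After the restart it can be worth confirming that the running containerd process really carries the new variables. The following is a minimal sketch that reads a NUL-separated environment file such as /proc/<pid>/environ on Linux; the helper name proxy_env_of is made up for illustration.

```shell
#!/usr/bin/env bash
# Print the proxy-related variables found in a NUL-separated environ file,
# such as /proc/<pid>/environ on Linux. Prints nothing if none are set.
# (proxy_env_of is a hypothetical helper name.)
proxy_env_of() {
  tr '\0' '\n' < "$1" | grep -iE '^(https?|all|no)_proxy=' || true
}

# Usage on a node (needs root to read another process's environ):
#   proxy_env_of "/proc/$(systemctl show -p MainPID --value containerd)/environ"
# The configured values can also be listed with:
#   systemctl show containerd --property=Environment
```

If NO_PROXY is missing from the output, the unit change has not taken effect on that node yet.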

Sure enough, all Pods are now in a normal state:

root@Y76-Master01-16-181:~# kubectl get pod -n kube-system 
NAME                                          READY   STATUS    RESTARTS       AGE
calico-kube-controllers-9449c44c5-v8ssv       1/1     Running   0              92m
calico-node-97qbc                             1/1     Running   3 (38m ago)    6h1m
calico-node-bl59h                             1/1     Running   2 (178m ago)   6h1m
calico-node-rzzq7                             1/1     Running   2 (178m ago)   6h1m
coredns-567c556887-8knp9                      1/1     Running   3 (51m ago)    8h
coredns-567c556887-dwg6d                      1/1     Running   2 (178m ago)   8h
etcd-y76-master01-16-181                      1/1     Running   3 (178m ago)   8h
kube-apiserver-y76-master01-16-181            1/1     Running   3 (178m ago)   8h
kube-controller-manager-y76-master01-16-181   1/1     Running   6 (46m ago)    5h46m
kube-proxy-88nd6                              1/1     Running   2 (178m ago)   5h47m
kube-proxy-vrgtp                              1/1     Running   2 (178m ago)   5h47m
kube-proxy-z5jmc                              1/1     Running   2 (178m ago)   5h47m
kube-scheduler-y76-master01-16-181            1/1     Running   6 (46m ago)    8h

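Summary:

When troubleshooting, first recall what you did along the way, then work out what side effects each of those operations may have had. In this case I configured a proxy for containerd on the host with no exclusions. As the error above shows, it was the calico CNI plugin's request to https://10.96.0.1:443 that timed out. The plugin is launched by containerd, so it most likely picked up the same proxy variables and sent that request to the proxy. The address it wanted to reach lies in the cluster-internal ranges I had defined (10.96.0.0/12, which contains 10.96.0.1, and 10.244.0.0/16), and a peer in those ranges cannot be found by going through the proxy. Listing the ranges in NO_PROXY lets that traffic go direct.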
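
To make the effect of the NO_PROXY line concrete, here is a small sketch of the matching idea: an address bypasses the proxy if it equals an entry or falls inside a listed CIDR block. It is an illustration only, not the code any particular client runs; Go's proxy handling accepts CIDR entries in NO_PROXY, but not every tool does. The function names are hypothetical and only IPv4 is handled.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 if IPv4 address $1 falls inside CIDR block $2 (e.g. 10.96.0.0/12).
ip_in_cidr() {
  local ip=$1 net=${2%/*} bits=${2#*/}
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  (( ( $(ip_to_int "$ip") & mask ) == ( $(ip_to_int "$net") & mask ) ))
}

# Return 0 if address $1 matches any entry of the comma-separated list $2.
# Only exact matches and IPv4 CIDR blocks are handled in this sketch.
bypasses_proxy() {
  local ip=$1 entry
  local -a entries
  IFS=, read -r -a entries <<< "$2"
  for entry in "${entries[@]}"; do
    if [[ $entry == */* ]]; then
      ip_in_cidr "$ip" "$entry" && return 0
    elif [[ $entry == "$ip" ]]; then
      return 0
    fi
  done
  return 1
}

# Usage:
#   list="localhost,127.0.0.1,172.16.0.0/12,10.96.0.0/12,10.244.0.0/16"
#   bypasses_proxy 10.96.0.1 "$list" && echo "direct" || echo "via proxy"
```

With the first configuration there is no NO_PROXY list at all, so under this model a request to 10.96.0.1 matches nothing and goes to the proxy. With the second configuration it falls inside 10.96.0.0/12 and goes direct.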

 

From: https://www.cnblogs.com/Ky150/p/18288133
