Kubernetes (k8s) cluster upgrade

1. Upgrade overview

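  • Kubernetes versions are written x.y.z, where x is the major version, y the minor version, and z the patch version. Minor versions cannot be skipped when upgrading (for example, 1.28.0 -> 1.30.0 is not supported), but patch versions can be (for example, 1.28.0 -> 1.28.10).
  • Use kubelet and kubeadm packages that match the target version; ideally all components run the same version.
  • After the upgrade all containers are restarted, because the hash of the container spec has changed.
  • Each node must be drained during the upgrade so that its workloads move to other nodes.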
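
The version rule in the first bullet can be written as a small check. This is a minimal sketch, not part of kubeadm: the function name can_upgrade is hypothetical, and it assumes plain x.y.z version strings with an optional leading "v".

```shell
# Hypothetical helper: can_upgrade CURRENT TARGET
# Returns 0 when the jump follows the rule above, 1 otherwise.
# Assumes purely numeric x.y.z versions; variables are global (POSIX sh has no "local").
can_upgrade() {
  cur=${1#v}; tgt=${2#v}

  # Split x.y.z into its three parts with parameter expansion.
  cur_major=${cur%%.*}; cur_rest=${cur#*.}
  cur_minor=${cur_rest%%.*}; cur_patch=${cur_rest#*.}
  tgt_major=${tgt%%.*}; tgt_rest=${tgt#*.}
  tgt_minor=${tgt_rest%%.*}; tgt_patch=${tgt_rest#*.}

  # The major version must stay the same.
  [ "$cur_major" -eq "$tgt_major" ] || return 1

  if [ "$tgt_minor" -eq "$cur_minor" ]; then
    # Same minor version: any equal or later patch release is allowed.
    [ "$tgt_patch" -ge "$cur_patch" ]
  else
    # Different minor version: only the next minor version is allowed.
    [ "$tgt_minor" -eq $((cur_minor + 1)) ]
  fi
}
```

For example, can_upgrade 1.23.17 1.24.15 succeeds, while can_upgrade 1.28.0 1.30.0 fails because it skips 1.29.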

2. Upgrade procedure

2.1 Upgrade the first control plane node

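Upgrade control plane nodes one at a time. Pick the control plane node to upgrade first; it must have the /etc/kubernetes/admin.conf file. This walkthrough upgrades from 1.23.17 to 1.24.15; upgrades between other versions follow the same steps.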

2.1.1 Upgrade kubeadm
# Check the current node versions
$ kubectl get nodes
NAME           STATUS   ROLES                  AGE    VERSION
k8s-master01   Ready    control-plane,master   148d   v1.23.17
k8s-node01     Ready    <none>                 148d   v1.23.17
k8s-node02     Ready    <none>                 148d   v1.23.17
k8s-node03     Ready    <none>                 148d   v1.23.17

# List the kubeadm versions available for the upgrade
$ yum list --show-duplicates kubeadm |grep '1.24.'
kubeadm.x86_64                       1.24.0-0                        kubernetes 
kubeadm.x86_64                       1.24.1-0                        kubernetes 
kubeadm.x86_64                       1.24.2-0                        kubernetes 
kubeadm.x86_64                       1.24.3-0                        kubernetes 
kubeadm.x86_64                       1.24.4-0                        kubernetes 
kubeadm.x86_64                       1.24.5-0                        kubernetes 
kubeadm.x86_64                       1.24.6-0                        kubernetes 
kubeadm.x86_64                       1.24.7-0                        kubernetes 
kubeadm.x86_64                       1.24.8-0                        kubernetes 
kubeadm.x86_64                       1.24.9-0                        kubernetes 
kubeadm.x86_64                       1.24.10-0                       kubernetes 
kubeadm.x86_64                       1.24.11-0                       kubernetes 
kubeadm.x86_64                       1.24.12-0                       kubernetes 
kubeadm.x86_64                       1.24.13-0                       kubernetes 
kubeadm.x86_64                       1.24.14-0                       kubernetes 
kubeadm.x86_64                       1.24.15-0                       kubernetes 
kubeadm.x86_64                       1.24.16-0                       kubernetes 
kubeadm.x86_64                       1.24.17-0                       kubernetes

$ yum -y install kubeadm-1.24.15
$ kubeadm version -o yaml
clientVersion:
  buildDate: "2023-06-14T09:54:33Z"
  compiler: gc
  gitCommit: 2c67202dc0bb96a7a837cbfb8d72e1f34dfc2808
  gitTreeState: clean
  gitVersion: v1.24.15
  goVersion: go1.19.10
  major: "1"
  minor: "24"
  platform: linux/amd64
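
Before continuing it helps to confirm that the installed kubeadm really is the target version. The sketch below reads the YAML output shown above from stdin; the function names kubeadm_git_version and kubeadm_matches are hypothetical, and the parsing assumes the gitVersion field appears on a line of its own, as it does here.

```shell
# Hypothetical helper: reads `kubeadm version -o yaml` output on stdin
# and prints the value of the gitVersion field.
kubeadm_git_version() {
  awk '$1 == "gitVersion:" { print $2; exit }'
}

# Hypothetical helper: succeeds only when the version read from stdin
# equals the expected target passed as the first argument.
kubeadm_matches() {
  [ "$(kubeadm_git_version)" = "$1" ]
}
```

A possible use is: kubeadm version -o yaml | kubeadm_matches v1.24.15 || echo "unexpected kubeadm version".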

2.1.2 Verify the upgrade plan; it must complete without errors

$ kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.23.17
[upgrade/versions] kubeadm version: v1.24.15
W0606 18:11:22.084648  122592 version.go:104] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable.txt": Get "https://cdn.dl.k8s.io/release/stable.txt": dial tcp i/o timeout (Client.Timeout exceeded while awaiting headers)
W0606 18:11:22.084821  122592 version.go:105] falling back to the local client version: v1.24.15
[upgrade/versions] Target version: v1.24.15
[upgrade/versions] Latest version in the v1.23 series: v1.23.17

Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT   CURRENT        TARGET
kubelet     4 x v1.23.17   v1.24.15

Upgrade to the latest stable version:

COMPONENT                 CURRENT    TARGET
kube-apiserver            v1.23.17   v1.24.15
kube-controller-manager   v1.23.17   v1.24.15
kube-scheduler            v1.23.17   v1.24.15
kube-proxy                v1.23.17   v1.24.15
CoreDNS                   v1.8.6     v1.8.6
etcd                      3.5.6-0    3.5.6-0

You can now apply the upgrade by executing the following command:

        kubeadm upgrade apply v1.24.15


The table below shows the current state of component configs as understood by this version of kubeadm.
Configs that have a "yes" mark in the "MANUAL UPGRADE REQUIRED" column require manual config upgrade or
resetting to kubeadm defaults before a successful upgrade can be performed. The version to manually
upgrade to is denoted in the "PREFERRED VERSION" column.

API GROUP                 CURRENT VERSION   PREFERRED VERSION   MANUAL UPGRADE REQUIRED
kubeproxy.config.k8s.io   v1alpha1          v1alpha1            no
kubelet.config.k8s.io     v1beta1           v1beta1             no
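
The component config table at the end of the plan is easy to overlook. As a rough sketch, the check below scans the plan output for config rows whose last column is "yes"; the function name plan_configs_ok is hypothetical, and it assumes the rows look like the ones above (an API group ending in .config.k8s.io followed by whitespace-separated columns).

```shell
# Hypothetical helper: reads `kubeadm upgrade plan` output on stdin.
# Prints every component config that needs a manual upgrade and
# returns non-zero if at least one was found.
plan_configs_ok() {
  awk '
    $1 ~ /\.config\.k8s\.io$/ && $NF == "yes" {
      print "manual upgrade required: " $1
      bad = 1
    }
    END { exit bad + 0 }
  '
}
```

A possible use is: kubeadm upgrade plan | plan_configs_ok && echo "no manual config upgrade needed".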

2.1.3 Run the upgrade command to upgrade the control plane components

$ kubeadm upgrade apply v1.24.15
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade/version] You have chosen to change the cluster version to "v1.24.15"
[upgrade/versions] Cluster version: v1.23.17
[upgrade/versions] kubeadm version: v1.24.15
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y 
[upgrade/prepull] Pulling images required for setting up a Kubernetes cluster
[upgrade/prepull] This might take a minute or two, depending on the speed of your internet connection
[upgrade/prepull] You can also perform this action in beforehand using 'kubeadm config images pull'
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.24.15" (timeout: 5m0s)...
[upgrade/etcd] Upgrading to TLS for etcd
[upgrade/staticpods] Preparing for "etcd" upgrade
[upgrade/staticpods] Current and new manifests of etcd are equal, skipping upgrade
[upgrade/etcd] Waiting for etcd to become available
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests2379417340"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Renewing apiserver certificate
[upgrade/staticpods] Renewing apiserver-kubelet-client certificate
[upgrade/staticpods] Renewing front-proxy-client certificate
[upgrade/staticpods] Renewing apiserver-etcd-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2024-06-06-18-17-42/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 1 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Renewing controller-manager.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2024-06-06-18-17-42/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 1 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Renewing scheduler.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2024-06-06-18-17-42/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
[apiclient] Found 1 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upgrade/postupgrade] Removing the deprecated label node-role.kubernetes.io/master='' from all control plane Nodes. After this step only the label node-role.kubernetes.io/control-plane='' will be present on control plane Nodes.
[upgrade/postupgrade] Adding the new taint &Taint{Key:node-role.kubernetes.io/control-plane,Value:,Effect:NoSchedule,TimeAdded:<nil>,} to all control plane Nodes. After this step both taints &Taint{Key:node-role.kubernetes.io/control-plane,Value:,Effect:NoSchedule,TimeAdded:<nil>,} and &Taint{Key:node-role.kubernetes.io/master,Value:,Effect:NoSchedule,TimeAdded:<nil>,} should be present on control plane Nodes.
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.24.15". Enjoy!

[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.
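
At this point the control plane components run the new version, but the kubelets do not. To see which nodes still need a kubelet upgrade, the output of kubectl get nodes can be filtered. This is a small sketch: the function name nodes_not_at is hypothetical, and it assumes the default table output with a header row and VERSION as the last column.

```shell
# Hypothetical helper: reads `kubectl get nodes` output on stdin and prints
# the names of nodes whose VERSION column differs from the given target.
# Assumes the default output: one header line, VERSION in the last column.
nodes_not_at() {
  awk -v target="$1" 'NR > 1 && $NF != target { print $1 }'
}
```

A possible use is: kubectl get nodes | nodes_not_at v1.24.15, which prints one node name per line until every kubelet has been upgraded.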

2.1.4 Drain the node: mark it unschedulable and evict all workloads to prepare it for maintenance

$ kubectl drain k8s-master01 --ignore-daemonsets
node/k8s-master01 cordoned
WARNING: ignoring DaemonSet-managed Pods: default/ds-ng-d6g58, kube-flannel/kube-flannel-ds-qmpmf, kube-system/kube-proxy-tc92t
evicting pod kube-system/coredns-65c54cc984-sjd7x
pod/coredns-65c54cc984-sjd7x evicted
node/k8s-master01 drained
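
On busier nodes a plain drain may stop on pods that use emptyDir volumes, or wait a long time on evictions. The wrapper below is only a sketch of how the drain step could be parameterized. The function name drain_node and the DRY_RUN variable are hypothetical, and the extra flags (--delete-emptydir-data, which discards emptyDir data, and a 300s timeout) are assumptions that were not used in this walkthrough.

```shell
# Hypothetical wrapper around `kubectl drain`.
# Set DRY_RUN=1 to print the command instead of running it.
drain_node() {
  node=$1
  # Build the command line once so that printing and running stay in sync.
  set -- kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data --timeout=300s
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

Running DRY_RUN=1 drain_node k8s-master01 prints the full command so it can be reviewed before the node is actually drained.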

2.1.5 Upgrade kubelet and kubectl

$ yum -y install kubelet-1.24.15 kubectl-1.24.15
$ systemctl daemon-reload
$ systemctl restart kubelet
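
The kubelet can take a few seconds to come back after the restart, so it may be worth waiting for it before uncordoning the node. The generic retry helper below is a sketch under that assumption; the function name retry and the chosen attempt count and delay are hypothetical.

```shell
# Hypothetical helper: retry ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds, at most ATTEMPTS times, sleeping DELAY
# seconds between attempts. Returns 1 if every attempt failed.
retry() {
  attempts=$1; delay=$2; shift 2
  i=1
  while ! "$@"; do
    # Give up once the attempt budget is used.
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}
```

A possible use is: retry 12 5 systemctl is-active --quiet kubelet, which waits up to about a minute for the kubelet service to report active.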

2.1.6 Uncordon the node: mark it schedulable again to bring it back online

$ kubectl uncordon k8s-master01
node/k8s-master01 uncordoned

2.2 Upgrade the other control plane nodes


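The steps are the same as for the first control plane node, with two differences:

  • kubeadm upgrade plan does not need to be run again.
  • Run kubeadm upgrade node instead of kubeadm upgrade apply.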
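
Put together, the command sequence for one additional control plane node could look like the sketch below. It only prints the commands so they can be reviewed first. The function name upgrade_steps_secondary and the node name k8s-master02 in the usage note are hypothetical, and the package commands assume the same yum setup as in section 2.1.

```shell
# Hypothetical helper: upgrade_steps_secondary NODE VERSION
# Prints (does not run) the upgrade commands for an additional control
# plane node, mirroring section 2.1 without the plan/apply steps.
upgrade_steps_secondary() {
  node=$1; version=$2
  cat <<EOF
yum -y install kubeadm-$version
kubeadm upgrade node
kubectl drain $node --ignore-daemonsets
yum -y install kubelet-$version kubectl-$version
systemctl daemon-reload
systemctl restart kubelet
kubectl uncordon $node
EOF
}
```

For example, upgrade_steps_secondary k8s-master02 1.24.15 lists the seven commands in order. The yum, kubeadm, and systemctl lines run on the node being upgraded; the kubectl lines can run from any machine with cluster access.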

