The Rancher Dashboard was unreachable.
My Rancher instance is deployed with Docker, so I checked the Rancher log:
docker logs [container-name]
An excerpt of the errors:
.
.
2024-03-24 06:52:27.085313 I | embed: ready to serve client requests
2024-03-24 06:52:27.085567 I | etcdserver: published {Name:default ClientURLs:[http://localhost:2379]} to cluster cdf818194e3a8c32
2024-03-24 06:52:27.087033 N | embed: serving insecure client requests on 127.0.0.1:2379, this is strongly discouraged!
2024/03/24 06:52:27 [INFO] Waiting for server to become available: Get "https://127.0.0.1:6443/version?timeout=15m0s": dial tcp 127.0.0.1:6443: connect: connection refused
2024/03/24 06:52:29 [INFO] Waiting for server to become available: the server has asked for the client to provide credentials
# Nearly all subsequent errors are: Waiting for server to become available: the server has asked for the client to provide credentials
.
.
My initial guess was that the K8S cluster's certificates were the problem.
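To pin down when the authentication failures began, the log can be filtered for the first line carrying the credentials error. The following is a minimal sketch; the function name first_auth_error is a hypothetical helper of mine, not part of Rancher or Docker.

```shell
#!/bin/sh
# Hypothetical helper: read a log on stdin and print the first line that
# reports the "provide credentials" error, so that its timestamp can be
# compared with the certificate expiry time.
first_auth_error() {
    grep -m 1 'asked for the client to provide credentials'
}

# Intended usage (not run here; [container-name] is left as a placeholder):
#   docker logs [container-name] 2>&1 | first_auth_error
```

The function only reads stdin, so it can also be pointed at a saved copy of the log.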
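I then switched to the master node and tried to list the pods. The certificates had indeed expired, and the expiry time is consistent with when the Rancher log started reporting failures: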
[root@k8s-master-1 ~]# kubectl get pods
Unable to connect to the server: x509: certificate has expired or is not yet valid: current time 2024-03-26T17:06:36+08:00 is after 2024-03-23T11:19:33Z
Check the certificate expiration dates
(on kubeadm 1.20 and later the command should be: kubeadm certs check-expiration)
[root@k8s-master-1 ~]# kubeadm alpha certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration
W0326 17:32:32.371486 1768144 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Mar 23, 2024 11:19 UTC   <invalid>                               no
apiserver                  Mar 23, 2024 11:19 UTC   <invalid>       ca                      no
apiserver-etcd-client      Mar 23, 2024 11:19 UTC   <invalid>       etcd-ca                 no
apiserver-kubelet-client   Mar 23, 2024 11:19 UTC   <invalid>       ca                      no
controller-manager.conf    Mar 23, 2024 11:19 UTC   <invalid>                               no
etcd-healthcheck-client    Mar 23, 2024 11:19 UTC   <invalid>       etcd-ca                 no
etcd-peer                  Mar 23, 2024 11:19 UTC   <invalid>       etcd-ca                 no
etcd-server                Mar 23, 2024 11:19 UTC   <invalid>       etcd-ca                 no
front-proxy-client         Mar 23, 2024 11:19 UTC   <invalid>       front-proxy-ca          no
scheduler.conf             Mar 23, 2024 11:19 UTC   <invalid>                               no
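When the table is long, the expired entries can be picked out mechanically. This is a small sketch that assumes the output format shown above, where an expired certificate has <invalid> in the RESIDUAL TIME column; list_expired_certs is a hypothetical name of mine.

```shell
#!/bin/sh
# Hypothetical helper: read the check-expiration table on stdin and print
# the name (first column) of every row marked <invalid>.
list_expired_certs() {
    awk '/<invalid>/ { print $1 }'
}

# Intended usage (not run here):
#   kubeadm alpha certs check-expiration | list_expired_certs
```

An empty result means no row is marked <invalid> in the table that was fed in.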
Renew all certificates:
[root@k8s-master-1 ~]# kubeadm alpha certs renew all
Checking again shows that the certificates have been renewed, but only for one year:
[root@k8s-master-1 ~]# kubeadm alpha certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration
W0326 17:40:08.152879 1776164 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Mar 26, 2025 09:40 UTC   364d                                    no
apiserver                  Mar 26, 2025 09:40 UTC   364d            ca                      no
apiserver-etcd-client      Mar 26, 2025 09:40 UTC   364d            etcd-ca                 no
apiserver-kubelet-client   Mar 26, 2025 09:40 UTC   364d            ca                      no
controller-manager.conf    Mar 26, 2025 09:40 UTC   364d                                    no
etcd-healthcheck-client    Mar 26, 2025 09:40 UTC   364d            etcd-ca                 no
etcd-peer                  Mar 26, 2025 09:40 UTC   364d            etcd-ca                 no
etcd-server                Mar 26, 2025 09:40 UTC   364d            etcd-ca                 no
front-proxy-client         Mar 26, 2025 09:40 UTC   364d            front-proxy-ca          no
scheduler.conf             Mar 26, 2025 09:40 UTC   364d                                    no
The following are the certificate renewal steps for a cluster whose control-plane components run as Docker containers:
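# 1. Back up the certificates (very important)
cp -r /etc/kubernetes /etc/kubernetes_bak
# 2. Check the certificate expiration dates (note: the command differs from the one in older versions)
kubeadm certs check-expiration
# 3. Renew the certificates (proceed with caution)
kubeadm certs renew all
# 4. Restart the four containers: etcd, kube-apiserver, kube-controller-manager, kube-scheduler
#    (check whether more than one etcd container exists and whether the name collides with another one, e.g. kuboard's)
for i in k8s_etcd kube-apiserver kube-controller-manager kube-scheduler; do
    echo "....restart container $i...."
    # Take the container ID from the first column, skipping the pause containers
    docker ps | grep "$i" | grep -v pause | cut -d " " -f1 | xargs docker restart
done
# Or restart them manually one by one: look up each container ID ...
docker ps | grep k8s_etcd
docker ps | grep k8s_kube-apiserver
docker ps | grep k8s_kube-controller-manager
docker ps | grep k8s_kube-scheduler
# ... then restart it (replace container_id with the ID found above)
docker restart container_id
# 5. Check again to confirm that the renewal succeeded
kubeadm certs check-expiration
# The steps above must be performed on every master node
# 6. Update the kubeconfig (on each master node where ~/.kube/config references the old certificate)
cp -f /etc/kubernetes/admin.conf ~/.kube/config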
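Because a plain grep for etcd can also match unrelated containers (the kuboard case mentioned in step 4), it may help to preview exactly which containers would be restarted. The sketch below rests on an assumption of mine: that the containers follow the k8s_<container>_<pod>_<namespace>_... naming pattern and that the control-plane pods live in kube-system. The function name select_control_plane_ids is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "ID NAME" pairs on stdin and print the IDs of
# the control-plane containers. Pause containers (named k8s_POD_...) and
# containers outside the kube-system namespace are not matched.
# Assumption: names look like k8s_<container>_<pod>_<namespace>_<uid>_<attempt>.
select_control_plane_ids() {
    awk '$2 ~ /^k8s_(etcd|kube-apiserver|kube-controller-manager|kube-scheduler)_/ &&
         $2 ~ /_kube-system_/ { print $1 }'
}

# Intended usage (not run here): preview first, restart only once the list looks right.
#   docker ps --format '{{.ID}} {{.Names}}' | select_control_plane_ids
#   docker ps --format '{{.ID}} {{.Names}}' | select_control_plane_ids | xargs docker restart
```

Matching on the full name prefix and the namespace is stricter than the grep in step 4, so it is worth comparing the preview against docker ps before relying on it.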
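After completing the steps above, I restarted Rancher, and the Dashboard was accessible again.
When initializing a new cluster, I also recommend setting the certificates to expire after 10 years; see the following for one way to do it: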
https://blog.csdn.net/xiaoyaoyun518/article/details/134161291