22:58:25 CST stdout: [master2]
Job for etcd.service failed because a timeout was exceeded.
See "systemctl status etcd.service" and "journalctl -xeu etcd.service" for details.
22:58:25 CST message: [master2]
start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
Job for etcd.service failed because a timeout was exceeded.
See "systemctl status etcd.service" and "journalctl -xeu etcd.service" for details.: Process exited with status 1
22:59:51 CST stdout: [master1]
Job for etcd.service failed because a timeout was exceeded.
See "systemctl status etcd.service" and "journalctl -xeu etcd.service" for details.
22:59:51 CST message: [master1]
start etcd failed: Failed to exec command: sudo -E /bin/bash -c "systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd"
Job for etcd.service failed because a timeout was exceeded.
See "systemctl status etcd.service" and "journalctl -xeu etcd.service" for details.: Process exited with status 1
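At first I assumed this was a bug or a version problem in kk, the KubeSphere installer. I tried several versions and the problem was still there. Checking the system log,
I only saw: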
Oct 20 22:59:45 v2141 etcd[2304]: rejected connection from "192.168.122.142:46836" (error "remote error: tls: bad certificate", ServerName "")
Oct 20 22:59:45 v2141 etcd[2304]: rejected connection from "192.168.122.142:46848" (error "remote error: tls: bad certificate", ServerName "")
Oct 20 22:59:45 v2141 etcd[2304]: rejected connection from "192.168.122.143:47460" (error "remote error: tls: bad certificate", ServerName "")
Oct 20 22:59:45 v2141 etcd[2304]: rejected connection from "192.168.122.143:47470" (error "remote error: tls: bad certificate", ServerName "")
That was all I had until I ran the following by hand:
systemctl daemon-reload && systemctl restart etcd && systemctl enable etcd
tail -f /var/log/syslog
which showed:
Oct 20 22:59:45 v2141 etcd[2304]: rejected connection from "192.168.122.143:47470" (error "remote error: tls: bad certificate", ServerName "")
Oct 20 22:59:45 v2141 etcd[2304]: health check for peer 4b6b6b04950cb4b0 could not connect: x509: certificate is valid for 127.0.0.1, ::1, 192.168.122.121, 192.168.122.122, 192.168.122.123, 192.168.122.124, 192.168.122.125, 192.168.122.126, 192.168.122.127, not 192.168.122.143
Oct 20 22:59:45 v2141 etcd[2304]: health check for peer 4b6b6b04950cb4b0 could not connect: x509: certificate is valid for 127.0.0.1, ::1, 192.168.122.121, 192.168.122.122, 192.168.122.123, 192.168.122.124, 192.168.122.125, 192.168.122.126, 192.168.122.127, not 192.168.122.143
Oct 20 22:59:45 v2141 etcd[2304]: rejected connection from "192.168.122.142:46860" (error "remote error: tls: bad certificate", ServerName "")
Oct 20 22:59:45 v2141 etcd[2304]: rejected connection from "192.168.122.142:46862" (error "remote error: tls: bad certificate", ServerName "")
Oct 20 22:59:45 v2141 etcd[2304]: health check for peer 8959ed642c954627 could not connect: x509: certificate is valid for 127.0.0.1, ::1, 192.168.122.121, 192.168.122.122, 192.168.122.123, 192.168.122.124, 192.168.122.125, 192.168.122.126, 192.168.122.127, not 192.168.122.142
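The x509 errors above mean the certificate a peer presents does not list that peer's own address. A minimal sketch of checking this directly on a node, assuming openssl is installed: the helper name san_has_ip, the certificate path, and the NODE_IP variable are all assumptions made for illustration, not something kk provides.

```shell
# Succeeds if the IP given as $1 appears as an exact "IP Address:" entry
# in `openssl x509 -noout -text` output read from stdin.
san_has_ip() {
    tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
        | grep -Fxq "IP Address:$1"
}

# Assumed values: adjust the path and the address to the node being checked.
CERT=/etc/ssl/etcd/ssl/member-master1.pem
NODE_IP=192.168.122.143

if [ -f "$CERT" ]; then
    if openssl x509 -in "$CERT" -noout -text | san_has_ip "$NODE_IP"; then
        echo "$NODE_IP is listed in $CERT"
    else
        echo "$NODE_IP is NOT listed in $CERT"
    fi
fi
```

The match is done on whole entries rather than substrings, so 192.168.122.12 is not mistaken for 192.168.122.121.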
The certificates were left over from another, earlier cluster, which md5sum confirmed:
root@master1:~# ip a s eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 96:2c:8f:a0:7b:02 brd ff:ff:ff:ff:ff:ff
altname enp6s18
inet 192.168.122.121/24 brd 192.168.122.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::942c:8fff:fea0:7b02/64 scope link
valid_lft forever preferred_lft forever
root@master1:~# md5sum /etc/ssl/etcd/ssl/admin-master1-key.pem
c29eacd168e16ce0acb23f7b9540dd18 /etc/ssl/etcd/ssl/admin-master1-key.pem
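The md5sum check above can be turned into a direct comparison between the key installed on the node and the copy kept in kk's working directory. This is only a sketch: the helper name same_md5 is made up here, and the location of the key inside the kubekey folder is an assumption that may differ between kk versions.

```shell
# Succeeds if the two files given as $1 and $2 have the same MD5 checksum.
same_md5() {
    [ "$(md5sum < "$1")" = "$(md5sum < "$2")" ]
}

# Path on the node is taken from the output above; the kubekey path is assumed.
NODE_KEY=/etc/ssl/etcd/ssl/admin-master1-key.pem
LOCAL_KEY=kubekey/pki/etcd/admin-master1-key.pem

if [ -f "$NODE_KEY" ] && [ -f "$LOCAL_KEY" ]; then
    if same_md5 "$NODE_KEY" "$LOCAL_KEY"; then
        echo "node key matches the copy cached in the kubekey folder"
    else
        echo "node key differs from the copy in the kubekey folder"
    fi
fi
```

If the checksums match for a cluster that was supposedly rebuilt from scratch, the installer most likely reused the cached key instead of generating a new one.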
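Tearing down the cluster with ./kk delete cluster -f config-sample.yaml does not delete the configuration previously generated in the kubekey folder, such as the keys, so remove that folder as well: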
./kk delete cluster -f config-sample.yaml && rm -rf kubekey
After that, everything worked.
From: https://blog.51cto.com/first01/7968232