
A Roundup of Kubernetes Problems Encountered in Production

Date: 2023-04-17 09:45:00
Tags: Kubernetes, lib, encountered, database, roundup, lost, mysql, var, found

1. Pod fails to run because of permissions on a mounted volume

# Debug: override the command field so the container stays up, then enter it and check which uid the application runs as
spec:
  containers:
  - command:
    - /bin/sh
    - -c
    - sleep 500000

# Use an initContainer to change the directory permissions
spec:
  initContainers:
  - command:
    - /bin/sh
    - -c
    - chmod 777 /prometheus
    image: busybox
    imagePullPolicy: IfNotPresent
    name: volume-permissions
    securityContext:
      runAsUser: 0
    volumeMounts:
    - mountPath: /prometheus
      name: prometheus-data
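
As an alternative to `chmod 777` in an init container, the pod-level `securityContext.fsGroup` field asks the kubelet to make the volume group-owned by a given GID, for volume types that support ownership management. The following is only a sketch under assumptions: the function name `render_fsgroup_patch`, the placeholder GID 65534, and the workload name and namespace in the usage comment are all hypothetical, and the real GID should come from running `id` in the debug container described above.

```shell
#!/bin/sh
# Print a strategic-merge patch that sets the pod-level fsGroup.
# Assumption: 65534 is only a placeholder GID; use the value that `id`
# reports inside the debug container.
render_fsgroup_patch() {
  gid="${1:-65534}"
  cat <<EOF
spec:
  template:
    spec:
      securityContext:
        fsGroup: ${gid}
EOF
}

# Usage (workload name and namespace are hypothetical):
#   render_fsgroup_patch 65534 > fsgroup-patch.yaml
#   kubectl -n monitoring patch statefulset prometheus --patch-file fsgroup-patch.yaml
```

Compared with a world-writable directory, this keeps access limited to one group, but it may have no effect on volume types that do not support ownership changes, in which case the init container remains the fallback.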

2. Database initialization fails because the mounted volume contains a default lost+found directory

Initializing database
2023-04-12T08:11:26.631401Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2023-04-12T08:11:26.636640Z 0 [ERROR] --initialize specified but the data directory has files in it. Aborting.
2023-04-12T08:11:26.636700Z 0 [ERROR] Aborting

# Debug: override the command field so the container stays up, then enter it and delete the lost+found directory
spec:
  containers:
  - command:
    - /bin/sh
    - -c
    - sleep 500000

# Inside the container, delete lost+found/
mysql@flashcatcloud-nightingale-database-0:/$ cd /var/lib/mysql
mysql@flashcatcloud-nightingale-database-0:/var/lib/mysql$ ls
lost+found
mysql@flashcatcloud-nightingale-database-0:/var/lib/mysql$ rm -r lost+found/
mysql@flashcatcloud-nightingale-database-0:/var/lib/mysql$ ls 
mysql@flashcatcloud-nightingale-database-0:/var/lib/mysql$ 

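# Or delete the lost+found directory with an initContainer
# (remove only lost+found: `rm -rf /var/lib/mysql/*` would wipe the database on every pod restart)
spec:
  initContainers:
  - command:
    - /bin/sh
    - -c
    - rm -rf /var/lib/mysql/lost+found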
    image: busybox
    imagePullPolicy: IfNotPresent
    name: volume-permissions
    resources: {}
    securityContext:
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/lib/mysql/
      name: database-data
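
Another way to keep lost+found out of the data directory is to mount a subdirectory of the volume with `subPath`, so that lost+found stays at the volume root and MySQL sees an empty directory. This is a sketch, not the author's method: the function name `render_mysql_volume_mount` and the subdirectory name `mysql` are hypothetical, and the fragment is meant to be merged by hand into the database container's spec.

```shell
#!/bin/sh
# Print a volumeMounts fragment that mounts a subdirectory of the volume
# instead of its root, so lost+found is not visible in /var/lib/mysql.
render_mysql_volume_mount() {
  subpath="${1:-mysql}"   # hypothetical subdirectory name
  cat <<EOF
volumeMounts:
- mountPath: /var/lib/mysql
  name: database-data
  subPath: ${subpath}
EOF
}

# Usage: print the fragment and merge it into the database container spec
#   render_mysql_volume_mount mysql
```

The subdirectory still has to be writable by the database user, so the permission fix from item 1 may be needed as well.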

3. Container stays in the Terminating state

# Check the kubelet log on the node where the pod is running:
failed to "KillPodSandbox" for "a594f4a1-c67b-42c5-84ea-62f7fb1e386d" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to check network namespace closed: remove netns: unlinkat /var/run/netns/cni-b70f6268-4fed-8c40-73f4-2e0ad0d325f4: device or resource busy"

# Fix
echo 1 > /proc/sys/fs/may_detach_mounts 
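
Writing to /proc only lasts until the next reboot. A minimal sketch of persisting the setting through a sysctl drop-in file follows. The function name `persist_may_detach_mounts` and the file name `99-may-detach-mounts.conf` are hypothetical, and the sketch assumes that the `fs.may_detach_mounts` key is not present on every kernel (to my knowledge it is specific to RHEL/CentOS 7 kernels), so it checks before writing.

```shell
#!/bin/sh
# Persist fs.may_detach_mounts=1 in a sysctl drop-in file.
# Both arguments are optional and exist mainly to make the function testable.
persist_may_detach_mounts() {
  proc_file="${1:-/proc/sys/fs/may_detach_mounts}"
  conf_file="${2:-/etc/sysctl.d/99-may-detach-mounts.conf}"   # hypothetical file name

  # Skip kernels that do not expose this sysctl at all.
  if [ ! -e "$proc_file" ]; then
    echo "skipped: $proc_file does not exist on this kernel"
    return 0
  fi

  printf 'fs.may_detach_mounts = 1\n' >"$conf_file" || return 1
  echo "written: $conf_file"
}

# Usage (as root), then reload all sysctl configuration files:
#   persist_may_detach_mounts
#   sysctl --system
```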

# Reference: sysctl settings for a production Kubernetes cluster, in pure shell
https://www.boysec.cn/boy/f0530e00.html

4. Certificate trust errors when pulling images from a private registry

x509: certificate signed by unknown authority

# 1. Container runtime: Docker
cat >/etc/docker/daemon.json <<EOF
{
	"data-root": "/var/lib/docker",
	"registry-mirrors": ["https://registry.cn-hangzhou.aliyuncs.com", "https://harbor.example.com"],
	"insecure-registries": ["harbor.example.com"],
	"live-restore": true,
	"exec-opts": ["native.cgroupdriver=systemd"],
	"storage-driver": "overlay2",
	"log-driver": "json-file",
	"log-opts": {
		"max-size": "500m",
		"max-file": "3"
	}
}
EOF
systemctl restart docker.service
systemctl status docker.service

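# 2. Container runtime: containerd
# (hosts.toml is read only when config_path = "/etc/containerd/certs.d" is set
#  under [plugins."io.containerd.grpc.v1.cri".registry] in /etc/containerd/config.toml)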
mkdir -p /etc/containerd/certs.d/harbor.example.com/
cat >/etc/containerd/certs.d/harbor.example.com/hosts.toml <<EOF
[host."https://harbor.example.com"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF

cat >>/etc/containerd/config.toml <<EOF
          [plugins."io.containerd.grpc.v1.cri".registry.configs."harbor.example.com".auth]
            username = "admin"
            password = "Harbor12345"
EOF

systemctl restart containerd.service
systemctl status containerd.service
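
`skip_verify = true` turns off certificate verification entirely. If the registry's CA certificate is available, containerd can be told to trust it instead through the `ca` field of hosts.toml. The sketch below assumes the CA file has already been copied to the node; the function name `write_hosts_toml` and the default CA path are hypothetical, and the file is only honored when `config_path` in config.toml points at the certs.d directory.

```shell
#!/bin/sh
# Write a hosts.toml that trusts the registry's CA instead of skipping
# TLS verification. All three arguments are optional.
write_hosts_toml() {
  registry="${1:-harbor.example.com}"
  certs_dir="${2:-/etc/containerd/certs.d}"
  ca_file="${3:-$certs_dir/$registry/ca.crt}"   # assumed location of the registry CA

  mkdir -p "$certs_dir/$registry" || return 1
  cat >"$certs_dir/$registry/hosts.toml" <<EOF
server = "https://$registry"

[host."https://$registry"]
  capabilities = ["pull", "resolve", "push"]
  ca = "$ca_file"
EOF
}

# Usage (as root), after copying the CA certificate into place:
#   write_hosts_toml harbor.example.com
#   systemctl restart containerd.service
```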

5.

From: https://www.cnblogs.com/wang-hongwei/p/17324802.html
