首页 > 其他分享 >K8S环境部署Prometheus

K8S环境部署Prometheus

时间:2024-06-13 13:10:19浏览次数:21  
标签:k8s created 部署 prometheus -- Prometheus yaml K8S root

K8S环境部署Prometheus

记录在K8S 1.18版本环境下部署Prometheus 0.5版本。

1. 下载kube-prometheus仓库

git clone https://github.com/coreos/kube-prometheus.git
cd kube-prometheus

笔者安装的K8S版本是1.18 ,prometheus选择配套的分支release-0.5:

# 切换到release-0.5
git checkout remotes/origin/release-0.5 -b 0.5

K8S和Pormetheus的配套关系:

kube-prometheus stack Kubernetes 1.14 Kubernetes 1.15 Kubernetes 1.16 Kubernetes 1.17 Kubernetes 1.18
release-0.3
release-0.4
release-0.5
HEAD

最新的版本配套关系查看kube-prometheus官方仓库:https://github.com/prometheus-operator/kube-prometheus ,可以切换版本查看配套关系。

2. 查看manifest

[root@k8s-master kube-prometheus]# cd manifests/
[root@k8s-master manifests]# ll
total 1684
-rw-r--r-- 1 root root     405 Jun 12 16:20 alertmanager-alertmanager.yaml
-rw-r--r-- 1 root root     973 Jun 12 16:20 alertmanager-secret.yaml
-rw-r--r-- 1 root root      96 Jun 12 16:20 alertmanager-serviceAccount.yaml
-rw-r--r-- 1 root root     254 Jun 12 16:20 alertmanager-serviceMonitor.yaml
-rw-r--r-- 1 root root     308 Jun 12 16:22 alertmanager-service.yaml
-rw-r--r-- 1 root root     550 Jun 12 16:20 grafana-dashboardDatasources.yaml
-rw-r--r-- 1 root root 1405645 Jun 12 16:20 grafana-dashboardDefinitions.yaml
-rw-r--r-- 1 root root     454 Jun 12 16:20 grafana-dashboardSources.yaml
-rw-r--r-- 1 root root    7539 Jun 12 16:20 grafana-deployment.yaml
-rw-r--r-- 1 root root      86 Jun 12 16:20 grafana-serviceAccount.yaml
-rw-r--r-- 1 root root     208 Jun 12 16:20 grafana-serviceMonitor.yaml
-rw-r--r-- 1 root root     238 Jun 12 16:22 grafana-service.yaml
-rw-r--r-- 1 root root     376 Jun 12 16:20 kube-state-metrics-clusterRoleBinding.yaml
-rw-r--r-- 1 root root    1651 Jun 12 16:20 kube-state-metrics-clusterRole.yaml
-rw-r--r-- 1 root root    1925 Jun 12 16:20 kube-state-metrics-deployment.yaml
-rw-r--r-- 1 root root     192 Jun 12 16:20 kube-state-metrics-serviceAccount.yaml
-rw-r--r-- 1 root root     829 Jun 12 16:20 kube-state-metrics-serviceMonitor.yaml
-rw-r--r-- 1 root root     403 Jun 12 16:20 kube-state-metrics-service.yaml
-rw-r--r-- 1 root root     266 Jun 12 16:20 node-exporter-clusterRoleBinding.yaml
-rw-r--r-- 1 root root     283 Jun 12 16:20 node-exporter-clusterRole.yaml
-rw-r--r-- 1 root root    2775 Jun 12 16:20 node-exporter-daemonset.yaml
-rw-r--r-- 1 root root      92 Jun 12 16:20 node-exporter-serviceAccount.yaml
-rw-r--r-- 1 root root     711 Jun 12 16:20 node-exporter-serviceMonitor.yaml
-rw-r--r-- 1 root root     355 Jun 12 16:20 node-exporter-service.yaml
-rw-r--r-- 1 root root     292 Jun 12 16:20 prometheus-adapter-apiService.yaml
-rw-r--r-- 1 root root     396 Jun 12 16:20 prometheus-adapter-clusterRoleAggregatedMetricsReader.yaml
-rw-r--r-- 1 root root     304 Jun 12 16:20 prometheus-adapter-clusterRoleBindingDelegator.yaml
-rw-r--r-- 1 root root     281 Jun 12 16:20 prometheus-adapter-clusterRoleBinding.yaml
-rw-r--r-- 1 root root     188 Jun 12 16:20 prometheus-adapter-clusterRoleServerResources.yaml
-rw-r--r-- 1 root root     219 Jun 12 16:20 prometheus-adapter-clusterRole.yaml
-rw-r--r-- 1 root root    1378 Jun 12 16:20 prometheus-adapter-configMap.yaml
-rw-r--r-- 1 root root    1344 Jun 12 16:20 prometheus-adapter-deployment.yaml
-rw-r--r-- 1 root root     325 Jun 12 16:20 prometheus-adapter-roleBindingAuthReader.yaml
-rw-r--r-- 1 root root      97 Jun 12 16:20 prometheus-adapter-serviceAccount.yaml
-rw-r--r-- 1 root root     236 Jun 12 16:20 prometheus-adapter-service.yaml
-rw-r--r-- 1 root root     269 Jun 12 16:20 prometheus-clusterRoleBinding.yaml
-rw-r--r-- 1 root root     216 Jun 12 16:20 prometheus-clusterRole.yaml
-rw-r--r-- 1 root root     621 Jun 12 16:20 prometheus-operator-serviceMonitor.yaml
-rw-r--r-- 1 root root     751 Jun 12 16:20 prometheus-prometheus.yaml
-rw-r--r-- 1 root root     293 Jun 12 16:20 prometheus-roleBindingConfig.yaml
-rw-r--r-- 1 root root     983 Jun 12 16:20 prometheus-roleBindingSpecificNamespaces.yaml
-rw-r--r-- 1 root root     188 Jun 12 16:20 prometheus-roleConfig.yaml
-rw-r--r-- 1 root root     820 Jun 12 16:20 prometheus-roleSpecificNamespaces.yaml
-rw-r--r-- 1 root root   86744 Jun 12 16:20 prometheus-rules.yaml
-rw-r--r-- 1 root root      93 Jun 12 16:20 prometheus-serviceAccount.yaml
-rw-r--r-- 1 root root    6829 Jun 12 16:20 prometheus-serviceMonitorApiserver.yaml
-rw-r--r-- 1 root root     395 Jun 12 16:20 prometheus-serviceMonitorCoreDNS.yaml
-rw-r--r-- 1 root root    6172 Jun 12 16:20 prometheus-serviceMonitorKubeControllerManager.yaml
-rw-r--r-- 1 root root    6778 Jun 12 16:20 prometheus-serviceMonitorKubelet.yaml
-rw-r--r-- 1 root root     347 Jun 12 16:20 prometheus-serviceMonitorKubeScheduler.yaml
-rw-r--r-- 1 root root     247 Jun 12 16:20 prometheus-serviceMonitor.yaml
-rw-r--r-- 1 root root     297 Jun 12 16:21 prometheus-service.yaml
drwxr-xr-x 2 root root    4096 Jun 12 16:20 setup

3. 修改镜像源

修改prometheus-operator,prometheus,alertmanager,kube-state-metrics,node-exporter,prometheus-adapter的镜像源为中科大的镜像源。

sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' setup/prometheus-operator-deployment.yaml
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' prometheus-prometheus.yaml 
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' alertmanager-alertmanager.yaml
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' kube-state-metrics-deployment.yaml
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' node-exporter-daemonset.yaml
sed -i 's/quay.io/quay.mirrors.ustc.edu.cn/g' prometheus-adapter-deployment.yaml

4. 修改promethes,alertmanager,grafana的service类型为NodePort类型

为了可以从外部访问 prometheus,alertmanager,grafana,我们这里修改 promethes,alertmanager,grafana的 service 类型为 NodePort 类型。

  1. 修改 prometheus 的 service
[root@k8s-master kube-prometheus]# cat manifests/prometheus-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort        # 增加NodePort配置
  ports:
  - name: web
    port: 9090
    targetPort: web
    nodePort: 30090       # 增加NodePort配置
  selector:
    app: prometheus
    prometheus: k8s
  sessionAffinity: ClientIP
  1. 修改 grafana 的 service
[root@k8s-master kube-prometheus]# cat manifests/grafana-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  type: NodePort      # 增加NodePort配置
  ports:
  - name: http
    port: 3000
    targetPort: http
    nodePort: 32000   # 增加NodePort配置
  selector:
    app: grafana
  1. 修改 alertmanager 的 service
[root@k8s-master kube-prometheus]# cat manifests/alertmanager-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    alertmanager: main
  name: alertmanager-main
  namespace: monitoring
spec:
  type: NodePort    # 增加NodePort配置
  ports:
  - name: web
    port: 9093
    targetPort: web
    nodePort: 30093   # 增加NodePort配置
  selector:
    alertmanager: main
    app: alertmanager
  sessionAffinity: ClientIP

5. 安装kube-prometheus

安装CRD和prometheus-operator

[root@k8s-master manifests]# kubectl apply -f setup/
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl                                    apply
namespace/monitoring configured
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl                                    apply
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com configured
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl                                    apply
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com configured
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl                                    apply
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com configured
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl                                    apply
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com configured
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl                                    apply
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com configured
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl                                    apply
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com configured
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
service/prometheus-operator created
serviceaccount/prometheus-operator created
[root@k8s-master manifests]# kubectl get pod -n monitoring
NAME                                   READY   STATUS              RESTARTS   AGE
prometheus-operator-5cd4d464cc-b9vqq   0/2     ContainerCreating   0          16s

下载prometheus-operator镜像需要花费几分钟,等待prometheus-operator变成running状态。

安装prometheus, alertmanager, grafana, kube-state-metrics, node-exporter等资源

[root@k8s-master manifests]# kubectl apply -f .
alertmanager.monitoring.coreos.com/main created
secret/alertmanager-main created
service/alertmanager-main created
serviceaccount/alertmanager-main created
servicemonitor.monitoring.coreos.com/alertmanager created
secret/grafana-datasources created
configmap/grafana-dashboard-apiserver created
configmap/grafana-dashboard-cluster-total created
configmap/grafana-dashboard-controller-manager created
configmap/grafana-dashboard-k8s-resources-cluster created
configmap/grafana-dashboard-k8s-resources-namespace created
configmap/grafana-dashboard-k8s-resources-node created
configmap/grafana-dashboard-k8s-resources-pod created
configmap/grafana-dashboard-k8s-resources-workload created
configmap/grafana-dashboard-k8s-resources-workloads-namespace created
configmap/grafana-dashboard-kubelet created
configmap/grafana-dashboard-namespace-by-pod created
configmap/grafana-dashboard-namespace-by-workload created
configmap/grafana-dashboard-node-cluster-rsrc-use created
configmap/grafana-dashboard-node-rsrc-use created
configmap/grafana-dashboard-nodes created
configmap/grafana-dashboard-persistentvolumesusage created
configmap/grafana-dashboard-pod-total created
configmap/grafana-dashboard-prometheus-remote-write created
configmap/grafana-dashboard-prometheus created
configmap/grafana-dashboard-proxy created
configmap/grafana-dashboard-scheduler created
configmap/grafana-dashboard-statefulset created
configmap/grafana-dashboard-workload-total created
configmap/grafana-dashboards created
deployment.apps/grafana created
service/grafana created
serviceaccount/grafana created
servicemonitor.monitoring.coreos.com/grafana created
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
deployment.apps/kube-state-metrics created
service/kube-state-metrics created
serviceaccount/kube-state-metrics created
servicemonitor.monitoring.coreos.com/kube-state-metrics created
clusterrole.rbac.authorization.k8s.io/node-exporter created
clusterrolebinding.rbac.authorization.k8s.io/node-exporter created
daemonset.apps/node-exporter created
service/node-exporter created
serviceaccount/node-exporter created
servicemonitor.monitoring.coreos.com/node-exporter created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
clusterrole.rbac.authorization.k8s.io/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-adapter created
clusterrolebinding.rbac.authorization.k8s.io/resource-metrics:system:auth-delegator created
clusterrole.rbac.authorization.k8s.io/resource-metrics-server-resources created
configmap/adapter-config created
deployment.apps/prometheus-adapter created
rolebinding.rbac.authorization.k8s.io/resource-metrics-auth-reader created
service/prometheus-adapter created
serviceaccount/prometheus-adapter created
clusterrole.rbac.authorization.k8s.io/prometheus-k8s created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-k8s created
servicemonitor.monitoring.coreos.com/prometheus-operator created
prometheus.monitoring.coreos.com/k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s-config created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s-config created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
role.rbac.authorization.k8s.io/prometheus-k8s created
prometheusrule.monitoring.coreos.com/prometheus-k8s-rules created
service/prometheus-k8s created
serviceaccount/prometheus-k8s created
servicemonitor.monitoring.coreos.com/prometheus created
servicemonitor.monitoring.coreos.com/kube-apiserver created
servicemonitor.monitoring.coreos.com/coredns created
servicemonitor.monitoring.coreos.com/kube-controller-manager created
servicemonitor.monitoring.coreos.com/kube-scheduler created
servicemonitor.monitoring.coreos.com/kubelet created

等待monitoring命名空间下的pod都变为运行:

[root@k8s-master ~]# kubectl get  pod -n monitoring
NAME                                   READY   STATUS    RESTARTS   AGE
alertmanager-main-0                    2/2     Running   0          157m
alertmanager-main-1                    2/2     Running   0          157m
alertmanager-main-2                    2/2     Running   0          157m
grafana-5c55845445-gh8g7               1/1     Running   0          20h
kube-state-metrics-75f946484-lqrbf     3/3     Running   0          20h
node-exporter-5h5cs                    2/2     Running   0          20h
node-exporter-f28gj                    2/2     Running   0          20h
node-exporter-w9rhr                    2/2     Running   0          20h
prometheus-adapter-7d68d6f886-qwrfg    1/1     Running   0          20h
prometheus-k8s-0                       3/3     Running   0          20h
prometheus-k8s-1                       3/3     Running   0          20h
prometheus-operator-5cd4d464cc-b9vqq   2/2     Running   0          20h

博主部署测试遇到如下问题,解决方式记录如下:

  1. alertmanager-main的三个容器启动失败,状态为crashLoopBackOff,pod中的其中一个容器无法启动。
  Warning  Unhealthy  11m (x5 over 11m)    kubelet, k8s-node2  Liveness probe failed: Get http://10.244.2.8:9093/-/healthy: dial tcp 10.244.2.8:9093: connect: connection refused
  Warning  Unhealthy  10m (x10 over 11m)   kubelet, k8s-node2  Readiness probe failed: Get http://10.244.2.8:9093/-/ready: dial tcp 10.244.2.8:9093: connect: connection refused

下面的解决方法参考自:https://github.com/prometheus-operator/kube-prometheus/issues/653

# 暂停更新,修改如下资源文件,增加paused:true
kubectl -n monitoring edit alertmanagers.monitoring.coreos.com
...
spec:
  image: quay.io/prometheus/alertmanager:v0.23.0
  nodeSelector:
    kubernetes.io/os: linux
  paused: true
  podMetadata:
    labels:
...

[root@k8s-master ~]# kubectl -n monitoring get statefulset.apps/alertmanager-main -o yaml > dump.yaml
# 修改alertmanager-main.yaml,在spec.template.spec添加hostNetwork: true,在文件的234行左右的位置
[root@k8s-master manifests]# vi dump.yaml
...
    spec:
      hostNetwork: true   # 增加的内容
      containers:
      - args:
        - --config.file=/etc/alertmanager/config/alertmanager.yaml
...
# 删除livenessProbe和readinessProbe探针
[root@k8s-master manifests]# vi dump.yaml
...
        livenessProbe:
          failureThreshold: 10
          httpGet:
            path: /-/healthy
            port: web
            scheme: HTTP
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 3
...
        readinessProbe:
          failureThreshold: 10
          httpGet:
            path: /-/ready
            port: web
            scheme: HTTP
          initialDelaySeconds: 3
          periodSeconds: 5
          successThreshold: 1
          timeoutSeconds: 3
...

# 删除原有的statefulset,重新创建
[root@k8s-master ~]# kubectl delete statefulset.apps/alertmanager-main -n monitoring
[root@k8s-master ~]# kubectl create -f dump.yaml
  1. 其中一个alertmanager状态为pendding,查看原因为不满足节点调度要求。

解决方法如下,

# 去除污点
kubectl describe node k8s-master | grep Taints
kubectl taint nodes k8s-master node-role.kubernetes.io/master-

6. 访问prometheus,alert-manager,grafana

  1. 访问prometheus

浏览器打开http://192.168.0.51:30090,192.168.0.51为master的IP

  1. 访问alert-manager

浏览器打开http://192.168.0.51:30093

  1. 访问grafana

浏览器打开http://192.168.0.51:32000

用户名/密码:admin/admin

标签:k8s,created,部署,prometheus,--,Prometheus,yaml,K8S,root
From: https://www.cnblogs.com/lldhsds/p/18245690

相关文章

  • FlowUs本地部署:数据自主权与定制化服务的完美融合|FlowUs息流企业级解决方案本地私有化
    在当今数字化时代,企业对数据的控制和安全性要求越来越高,同时,用户对软件的使用习惯也趋向多样化。针对这些需求,FlowUs作为一款多功能的协作平台,提供了灵活的解决方案。FlowUs本地部署对于有本地化需求的客户,FlowUs支持企业私有化部署服务。这意味着企业可以根据自己的需求,在本......
  • GitLab-CI (自动化集成&部署)
    GitlabCI/CD是一款用于持续集成(CI),持续交付(CD)的工具,相似的工具有Jenkins、TravisCI、GoCD等。Gitlab的CI/CD算是比较简单的了,只需要依靠一份".gitlab-ci.yml",将该文件随代码上传,Gitlab就会自动执行相应的任务,从而实现CI/CD。gitlab-runner可实现cicd1.安装gitlab-runnerdoc......
  • Springboot计算机毕业设计影院订票选座微信小程序【附源码】开题+论文+mysql+程序+部
    本系统(程序+源码)带文档lw万字以上 文末可获取一份本项目的java源码和数据库参考。系统程序文件列表开题报告内容研究背景随着移动互联网的快速发展,人们的生活方式发生了深刻变化,特别是在娱乐消费领域。电影院作为重要的文化娱乐场所,其订票方式也逐渐从传统的线下购票转向......
  • Springboot计算机毕业设计影院购票微信小程序【附源码】开题+论文+mysql+程序+部署
    本系统(程序+源码)带文档lw万字以上 文末可获取一份本项目的java源码和数据库参考。系统程序文件列表开题报告内容研究背景在当今数字化时代,移动互联网的普及率日益提升,智能手机成为人们生活中不可或缺的工具。微信小程序,作为一种轻量级应用,凭借其便捷性和易用性,受到了广大......
  • 实战 k8s----初识
    什么是k8s?k8s是谷歌开源的一套完整的容器管理平台,方便我们直接管理容器应用。谷歌称之为,kubernetes,[kubə’netis],(跟我一起读库波尔耐题思,重音在耐的音上),由于字母太多,我们简称为k8s,8代表k-->s之间的8个字母。kubernetes译为舵手,标识是一个航海舵。而容器直译为集装箱,也就是舵手......
  • Springboot计算机毕业设计英语学习课程微信小程序【附源码】开题+论文+mysql+程序+部
    本系统(程序+源码)带文档lw万字以上 文末可获取一份本项目的java源码和数据库参考。系统程序文件列表开题报告内容研究背景随着移动互联网的迅猛发展,微信已成为人们生活中不可或缺的一部分。微信小程序作为微信生态系统的重要组成部分,以其便捷性、即用即走的特点,受到了广大......
  • Springboot计算机毕业设计英语学习小程序【附源码】开题+论文+mysql+程序+部署
    本系统(程序+源码)带文档lw万字以上 文末可获取一份本项目的java源码和数据库参考。系统程序文件列表开题报告内容研究背景在全球化的大背景下,英语作为国际通用语言,其重要性不言而喻。然而,传统的英语学习方式往往受到时间、地点和资源的限制,使得学习者难以获得高效、便捷的......
  • K8S部署Metrics-Server
    K8S部署Metrics-Server1)下载manifest的YAMLwgethttps://github.com/kubernetes-sigs/metrics-server/releases/latest/download/high-availability-1.21+.yaml2)编辑需要在添加-–kubelet-insecure-tlscontainers:-args:---kubelet-insecure-t......
  • 微信AI机器人使用说明-2024本地部署版(非wechaty)
     一、效果演示微信机器人实现的功能,先看视频的演示效果:2024年最新稳定的本地部署AI微信机器人使用方法演示可以对话可以语音可以绘画支持主账号管理好友权限管理免费体验AI好友: yuzitao-716二、支持功能1.绑定主账号:绑定主账号之后,主账号可以给其他用户、其他群组......
  • SQLCMD 密码中的 K8S 秘密用法始终为空
    我试图使用K8Ssecret密码连接到SQL服务器,但无论我使用什么语法或方法,密码总是空的。如果我硬编码密码,则一切正常。我还可以使用此命令在POD中打印密码,它还会返回存储在密码中的密码,因此POD可以实际访问密码。kubectlexec-itpodname--printenvMSS......