
64. K8S - Deploying Prometheus and Grafana on K8S [Usage]


1. Checking the running status

After installation, the first thing to do is check how everything is running.

1.1. Pod status

]# kubectl -n monitoring get pods -o wide
NAME                                  READY   STATUS    RESTARTS   AGE   IP              NODE      NOMINATED NODE   READINESS GATES
alertmanager-main-0                   2/2     Running   0          96m   10.244.3.7      node1     <none>           <none>
alertmanager-main-1                   2/2     Running   0          96m   10.244.4.8      node2     <none>           <none>
alertmanager-main-2                   2/2     Running   0          96m   10.244.4.9      node2     <none>           <none>
blackbox-exporter-84bb6f6bd9-2tr2q    3/3     Running   0          95m   10.244.3.9      node1     <none>           <none>
grafana-7bdbdbcb4b-67qsj              1/1     Running   0          74m   10.244.3.13     node1     <none>           <none>
kube-state-metrics-c7c57885f-scxdh    3/3     Running   0          94m   10.244.3.10     node1     <none>           <none>
node-exporter-27bgj                   2/2     Running   0          93m   192.168.10.27   master2   <none>           <none>
node-exporter-cnzhw                   2/2     Running   0          93m   192.168.10.30   node2     <none>           <none>
node-exporter-knqgv                   2/2     Running   0          93m   192.168.10.29   node1     <none>           <none>
node-exporter-qwbb6                   2/2     Running   0          93m   192.168.10.26   master1   <none>           <none>
prometheus-adapter-67d7695cb7-7wf9j   1/1     Running   0          95m   10.244.4.10     node2     <none>           <none>
prometheus-adapter-67d7695cb7-vbdkr   1/1     Running   0          95m   10.244.3.8      node1     <none>           <none>
prometheus-k8s-0                      2/2     Running   0          93m   10.244.3.12     node1     <none>           <none>
prometheus-k8s-1                      2/2     Running   0          93m   10.244.4.11     node2     <none>           <none>
prometheus-operator-ffcc9958-2dbgn    2/2     Running   0          94m   10.244.3.11     node1     <none>           <none>
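
All Pods should be Running with every container ready. Instead of eyeballing the table, a quick scripted check is possible (a minimal sketch; the timeout value is arbitrary):

]# kubectl -n monitoring wait --for=condition=Ready pods --all --timeout=300s
# Prints "pod/<name> condition met" for each Pod, or times out if any Pod never becomes Ready.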

1.2. Service status

1.2.1. Listing the Services

]# kubectl -n monitoring get svc
NAME                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
alertmanager-main       NodePort    10.100.113.107   <none>        9093:30093/TCP,8080:30081/TCP   97m
alertmanager-operated   ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP      97m
blackbox-exporter       ClusterIP   10.105.55.97     <none>        9115/TCP,19115/TCP              96m
grafana                 NodePort    10.102.101.236   <none>        3000:30030/TCP                  106m
kube-state-metrics      ClusterIP   None             <none>        8443/TCP,9443/TCP               95m
node-exporter           ClusterIP   None             <none>        9100/TCP                        94m
prometheus-adapter      ClusterIP   10.110.224.24    <none>        443/TCP                         96m
prometheus-k8s          NodePort    10.104.132.49    <none>        9090:30090/TCP,8080:30080/TCP   93m
prometheus-operated     ClusterIP   None             <none>        9090/TCP                        93m
prometheus-operator     ClusterIP   None             <none>        8443/TCP                        95m

Note: CLUSTER-IP = None means the Service is a headless Service.
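
A headless Service has no cluster IP; its DNS name resolves directly to the backing Pod IPs. A minimal sketch to verify this (the test Pod name and busybox image tag are illustrative):

]# kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup prometheus-operated.monitoring.svc.cluster.local
# The answer lists the prometheus-k8s Pod IPs (10.244.3.12 and 10.244.4.11) instead of a single cluster IP.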

1.2.2. Analysis of the exposed Service ports

For a NodePort Service, the PORT(S) column reads <Service port>:<NodePort>/<protocol>.

alertmanager-main
type: NodePort
9093:30093/TCP,8080:30081/TCP

alertmanager web port:
Service port: 9093
NodePort: 30093

alertmanager metrics port:
Service port: 8080
NodePort: 30081
--------------------------------
grafana
type: NodePort
3000:30030/TCP

Service port: 3000
NodePort: 30030
--------------------------------
prometheus-k8s
type: NodePort
9090:30090/TCP,8080:30080/TCP

prometheus web port:
Service port: 9090
NodePort: 30090

prometheus metrics port:
Service port: 8080
NodePort: 30080
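
Based on the NodePorts above, each web UI is reachable on any node's IP. A minimal sketch of a command-line check (the node IP is taken from the Pod listing in section 1.1; the health endpoints are the ones built into each component):

]# curl -s http://192.168.10.29:30090/-/healthy     # Prometheus
]# curl -s http://192.168.10.29:30093/-/healthy     # Alertmanager
]# curl -s http://192.168.10.29:30030/api/health    # Grafana (returns a small JSON document)
# Each endpoint should answer with an HTTP 200 / "healthy" style response.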

1.3. Querying Endpoints

# Mainly checks how each Service maps to its Endpoints
]# kubectl -n monitoring get endpoints
NAME                    ENDPOINTS                                                              AGE
alertmanager-main       10.244.3.7:8080,10.244.4.8:8080,10.244.4.9:8080 + 3 more...            112m
alertmanager-operated   10.244.3.7:9094,10.244.4.8:9094,10.244.4.9:9094 + 6 more...            112m
blackbox-exporter       10.244.3.9:9115,10.244.3.9:19115                                       110m
grafana                 10.244.3.13:3000                                                       120m
kube-state-metrics      10.244.3.10:8443,10.244.3.10:9443                                      110m
node-exporter           192.168.10.26:9100,192.168.10.27:9100,192.168.10.29:9100 + 1 more...   109m
prometheus-adapter      10.244.3.8:6443,10.244.4.10:6443                                       111m
prometheus-k8s          10.244.3.12:8080,10.244.4.11:8080,10.244.3.12:9090 + 1 more...         108m
prometheus-operated     10.244.3.12:9090,10.244.4.11:9090                                      108m
prometheus-operator     10.244.3.11:8443                                                       110m

1.4. Querying the status of the Prometheus resource

]# kubectl -n monitoring get prometheus
NAME   VERSION   DESIRED   READY   RECONCILED   AVAILABLE   AGE
k8s    2.41.0    2         2       True         True        109m

2. Queries via the Prometheus web UI

2.1. Targets page

2.2. Graph page

2.2.1. Querying collected data

For example, to query the CPU usage of every Pod in the K8S cluster, the following query can be used:

Tip: the metric name is container_cpu_usage_seconds_total, and the PromQL expression is:

sum(rate(container_cpu_usage_seconds_total{image!="", pod!=""}[1m])) by (pod)
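
The same expression can also be run outside the UI through the Prometheus HTTP API, for example via the NodePort shown in section 1.2 (a minimal sketch; the node IP is taken from the Pod listing above):

]# curl -s 'http://192.168.10.29:30090/api/v1/query' \
     --data-urlencode 'query=sum(rate(container_cpu_usage_seconds_total{image!="", pod!=""}[1m])) by (pod)'
# Returns JSON; .data.result contains one sample per Pod with its current CPU usage in cores.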

 

2.3. Rules page

Many rules have already been added automatically. They are worth studying later; if you are unsure how to write your own rules, they make a good reference.

 

2.4. How to modify the Prometheus configuration file

2.4.1. Extracting the configuration file

# Decompress prometheus.yaml
]# kubectl -n monitoring get secrets prometheus-k8s -o jsonpath='{.data.prometheus\.yaml\.gz}' | base64 -d | gzip -d

# Using alertmanager as an example here
]# kubectl -n monitoring get secrets alertmanager-main -o jsonpath='{.data.alertmanager\.yaml}' | base64 -d 
"global":
  "resolve_timeout": "5m"
"inhibit_rules":
- "equal":
  - "namespace"
  - "alertname"
  "source_matchers":
  - "severity = critical"
  "target_matchers":
  - "severity =~ warning|info"
- "equal":
  - "namespace"
  - "alertname"
  "source_matchers":
  - "severity = warning"
  "target_matchers":
  - "severity = info"
- "equal":
  - "namespace"
  "source_matchers":
  - "alertname = InfoInhibitor"
  "target_matchers":
  - "severity = info"
"receivers":
- "name": "Default"
- "name": "Watchdog"
- "name": "Critical"
- "name": "null"
"route":
  "group_by":
  - "namespace"
  "group_interval": "5m"
  "group_wait": "30s"
  "receiver": "Default"
  "repeat_interval": "12h"
  "routes":
  - "matchers":
    - "alertname = Watchdog"
    "receiver": "Watchdog"
  - "matchers":
    - "alertname = InfoInhibitor"
    "receiver": "null"
  - "matchers":
    - "severity = critical"
    "receiver": "Critical"

2.4.2. Writing the output to a file

]# kubectl -n monitoring get secrets prometheus-k8s -o jsonpath='{.data.prometheus\.yaml\.gz}' | base64 -d | gzip -d >prometheus.yaml.v1

2.4.3. After editing, gzip the file again and base64-encode it

gzip -c prometheus.yaml.v1 | base64 -w0    # -w0 avoids line wrapping so the value can be pasted into the Secret in one piece

2.4.4. Applying the change with kubectl edit

Then update the Secret value with kubectl edit.
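
A minimal sketch of the whole round-trip described above (the file names are illustrative; note that if the prometheus-operator still manages this Secret, a manual edit may later be overwritten):

# 1. Extract and decompress the current configuration
]# kubectl -n monitoring get secrets prometheus-k8s -o jsonpath='{.data.prometheus\.yaml\.gz}' | base64 -d | gzip -d > prometheus.yaml.v1

# 2. Edit prometheus.yaml.v1, then re-compress and re-encode it
]# gzip -c prometheus.yaml.v1 | base64 -w0 > prometheus.yaml.gz.b64

# 3. Replace the value of data."prometheus.yaml.gz" with the contents of prometheus.yaml.gz.b64
]# kubectl -n monitoring edit secret prometheus-k8s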

3. Queries via the Grafana web UI

3.1. Login account and password

admin / admin (Grafana asks you to change the password on first login)

3.2. A data source is already configured by default

3.2.1. Viewing the data source configuration

3.2.2. Why the data source is configured automatically

]# vi kube-prometheus-0.12.0/manifests/prom_adapter/prometheusAdapter-deployment.yaml
    spec:
      automountServiceAccountToken: true
      containers:
      - args:
        - --cert-dir=/var/run/serving-cert
        - --config=/etc/adapter/config.yaml
        - --logtostderr=true
        - --metrics-relist-interval=1m
        - --prometheus-url=http://prometheus-k8s.monitoring.svc:9090/ # the Prometheus URL is already configured here
        - --secure-port=6443
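
Grafana's own datasource provisioning can also be inspected directly. A minimal sketch, assuming the default kube-prometheus object names (a Secret called grafana-datasources with the key datasources.yaml):

]# kubectl -n monitoring get secret grafana-datasources -o jsonpath='{.data.datasources\.yaml}' | base64 -d
# Shows the provisioned datasource pointing at http://prometheus-k8s.monitoring.svc:9090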

3.2.3. Adding dashboards

This is not repeated here; see this article: https://www.cnblogs.com/ygbh/p/17299339.html#_label3_2_1_2

 

4. Queries via the AlertManager web UI

5. Custom Ingress-nginx for prometheus, grafana and alertmanager

5.1. Creating the Ingress resource

5.1.1. Defining the resource manifest

kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-prometheus
  namespace: monitoring
  annotations:
    kubernetes.io/ingress.class: "nginx"
    prometheus.io/http_probe: "true"
spec:
  rules:
  - host: alert.localprom.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: alertmanager-main
            port:
              number: 9093
  - host: grafana.localprom.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: grafana
            port:
              number: 3000
  - host: prom.localprom.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-k8s
            port:
              number: 9090
EOF

5.1.2. Checking the Ingress resource status

]# kubectl get ingress -n monitoring 
NAME                 CLASS    HOSTS                                                          ADDRESS   PORTS   AGE
ingress-prometheus   <none>   alert.localprom.com,grafana.localprom.com,prom.localprom.com             80      13s

5.2. Configuring hosts

Add the following entries to the hosts file of the client machine (192.168.10.222 here is the address at which ingress-nginx is reachable):

192.168.10.222 prom.localprom.com
192.168.10.222 grafana.localprom.com
192.168.10.222 alert.localprom.com

5.3. Access testing

5.3.1. prometheus

5.3.2. grafana

5.3.3. alertmanager
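
Since the browser screenshots are not reproduced here, the three hosts can also be checked from the command line (a minimal sketch; the IP and host names follow the sections above, and -L follows Grafana's login redirect):

]# curl -sL -o /dev/null -w '%{http_code}\n' -H 'Host: prom.localprom.com' http://192.168.10.222/
]# curl -sL -o /dev/null -w '%{http_code}\n' -H 'Host: grafana.localprom.com' http://192.168.10.222/
]# curl -sL -o /dev/null -w '%{http_code}\n' -H 'Host: alert.localprom.com' http://192.168.10.222/
# Each request should return 200 once the Ingress is routing correctly.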

6. Adding controller-manager and scheduler monitoring to Prometheus

6.1. Requirement

By default, Prometheus does not scrape metrics from the controller-manager and scheduler.

6.1.1. Targets screenshot

6.1.2. Configuration workflow

For Prometheus to monitor these K8S components, two points need attention (see the sketch after this list):
1. Kubernetes must expose the controller-manager and scheduler listening addresses, and dedicated Endpoints and Service objects must be created for them.
2. Prometheus scrapes them according to the kubernetes-serviceMonitorKubeScheduler.yaml and kubernetes-serviceMonitorKubeControllerManager.yaml manifests, so the labels selected through the custom Services must match the values expected by kube-prometheus.
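
A minimal sketch for checking which labels the ServiceMonitors select, assuming the default kube-prometheus ServiceMonitor names kube-controller-manager and kube-scheduler in the monitoring namespace:

]# kubectl -n monitoring get servicemonitor kube-controller-manager -o jsonpath='{.spec.selector}{"\n"}'
]# kubectl -n monitoring get servicemonitor kube-scheduler -o jsonpath='{.spec.selector}{"\n"}'
# Both selectors match on app.kubernetes.io/name, which is why the custom Services defined below carry that label.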

6.2. Exposing the listening addresses [modify on all master nodes]

# After the change the static Pods are reloaded automatically; no service restart is needed
]# kubectl -n kube-system get pods -o wide| grep -E 'schedu|control'
calico-kube-controllers-74846594dd-76m7g   1/1     Running   0                   9d      10.244.1.2      master2   <none>           <none>
kube-controller-manager-master1            1/1     Running   0                   48s     192.168.10.26   master1   <none>           <none>
kube-controller-manager-master2            1/1     Running   0                   2m2s    192.168.10.27   master2   <none>           <none>
kube-scheduler-master1                     1/1     Running   0                   49s     192.168.10.26   master1   <none>           <none>
kube-scheduler-master2                     1/1     Running   0                   2m42s   192.168.10.27   master2   <none>           <none>

6.2.1. controller-manager changes

]# vi /etc/kubernetes/manifests/kube-controller-manager.yaml 
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=0.0.0.0                   # changed from the default 127.0.0.1 so the metrics port is reachable from other nodes
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.244.0.0/16

6.2.2. scheduler changes

]# vi /etc/kubernetes/manifests/kube-scheduler.yaml 
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=0.0.0.0                   # changed from the default 127.0.0.1 so the metrics port is reachable from other nodes
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
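
To confirm on a master node that both components now listen on all interfaces rather than only on 127.0.0.1, a quick check (run directly on the master) is:

]# ss -tnlp | grep -E '10257|10259'
# The local address for both ports should show 0.0.0.0 or * instead of 127.0.0.1.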

6.3. Collecting controller-manager data

6.3.1. Creating the resource manifest

kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: kube-controller-manager
  namespace: kube-system
  labels:
    app.kubernetes.io/name: kube-controller-manager
spec:
  type: ClusterIP
  clusterIP: None
  ports:
    - name: https-metrics
      port: 10257
      targetPort: 10257
      protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-controller-manager
  namespace: kube-system
  labels:
    app.kubernetes.io/name: kube-controller-manager
subsets:
- addresses:
  - ip: 192.168.10.26
  - ip: 192.168.10.27
  ports:
    - name: https-metrics
      port: 10257
      protocol: TCP
EOF

# Attribute note: the addresses are the master node IPs; list every master node here.

6.3.2. Checking the running status

]# kubectl -n kube-system get svc
NAME                      TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                        AGE
kube-controller-manager   ClusterIP   None           <none>        10257/TCP                      114s

]# kubectl -n kube-system get endpoints
NAME                      ENDPOINTS                                 AGE
kube-controller-manager   192.168.10.26:10257,192.168.10.27:10257   2m1s

6.4. Collecting scheduler data

6.4.1. Creating the resource manifest

kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: kube-scheduler
  namespace: kube-system
  labels:
    app.kubernetes.io/name: kube-scheduler
spec:
  type: ClusterIP
  clusterIP: None
  ports:
    - name: https-metrics
      port: 10259
      targetPort: 10259
      protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-scheduler
  namespace: kube-system
  labels:
    app.kubernetes.io/name: kube-scheduler
subsets:
- addresses:
  - ip: 192.168.10.26
  - ip: 192.168.10.27
  ports:
    - name: https-metrics
      port: 10259
      protocol: TCP
EOF

# Attribute note: the addresses are the master node IPs; list every master node here.

6.4.2. Checking the running status

]# kubectl -n kube-system get endpoints
NAME                      ENDPOINTS                                                                 AGE
kube-scheduler            192.168.10.26:10259,192.168.10.27:10259                                   53s

]# kubectl -n kube-system get svc
NAME             TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)     AGE
kube-scheduler   ClusterIP   None         <none>        10259/TCP   57s

6.5. Checking whether the new targets appear in Prometheus

The new targets have now been added.
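
The same can be verified without the UI through the Prometheus targets API (a minimal sketch; the node IP and NodePort follow section 1.2, and jq is assumed to be available):

]# curl -s http://192.168.10.29:30090/api/v1/targets | jq -r '.data.activeTargets[].labels.job' | sort | uniq -c
# The job list should now include kube-controller-manager and kube-scheduler.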

 

 

    本文介绍了Prometheus插件造成长尾请求现象的原因,以及如何解决这个问题。作者屠正松,ApacheAPISIXPMCMember。原文链接现象在APISIX社区中,曾有部分用户陆续反馈一种神秘现象:部分请求延迟较长。具体表现为:当流量请求进入一个正常部署的APISIX集群时,偶尔会出现部分请......