一、Deploy Prometheus from the binary release
Omitted here; see the earlier post in this series or look it up yourself.
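For completeness, a minimal sketch of a binary install; the release version and install path are assumptions (the path matches the /usr/local/prometheus directory used later in this post):

# minimal sketch of a binary install; version 2.53.0 and the paths are assumptions, adjust to your environment
wget https://github.com/prometheus/prometheus/releases/download/v2.53.0/prometheus-2.53.0.linux-amd64.tar.gz
tar xzf prometheus-2.53.0.linux-amd64.tar.gz
mv prometheus-2.53.0.linux-amd64 /usr/local/prometheus
/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml &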
二、Create a token for Prometheus to access the api-server
1. Get the token
kubectl get sa -n monitoring monitor                 # the ServiceAccount created for Prometheus in the previous post
kubectl get sa -n monitoring monitor -o yaml         # each ServiceAccount gets a secret created for it by default
kubectl get secrets -n monitoring monitor-token-585gg -o jsonpath='{.data.token}'   # read the token from that secret
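Note: on Kubernetes 1.24 and later, ServiceAccounts no longer get a long-lived secret created automatically. If no monitor-token-* secret exists in your cluster, you can request a token directly; the duration below is an assumption and is capped by the api-server's token expiration settings:

kubectl create token monitor -n monitoring --duration=8760h   # prints a bearer token for the monitor ServiceAccount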
2. Verify
token=$(kubectl get secrets -n monitoring monitor-token-585gg -o jsonpath='{.data.token}' | base64 -d)   # the token is base64-encoded and must be decoded
curl --cacert /etc/kubernetes/pki/ca.crt -H "Authorization: Bearer ${token}" https://192.168.10.89:6443/api/v1/nodes/k8s-node1/proxy/metrics/cadvisor | head   # replace with your api-server address and node name

# HELP cadvisor_version_info A metric with a constant '1' value labeled by kernel version, OS version, docker version, cadvisor version & cadvisor revision.
# TYPE cadvisor_version_info gauge
cadvisor_version_info{cadvisorRevision="",cadvisorVersion="",dockerVersion="",kernelVersion="3.10.0-1160.108.1.el7.x86_64",osVersion="CentOS Linux 7 (Core)"} 1
# HELP container_blkio_device_usage_total Blkio Device bytes usage
# TYPE container_blkio_device_usage_total counter
container_blkio_device_usage_total{container="",device="/dev/vda",id="/",image="",major="253",minor="0",name="",namespace="",operation="Async",pod=""} 3.12744192e+09 1725501954356
............
3. Save the token to a file
kubectl get secrets -n monitoring monitor-token-585gg -o jsonpath='{.data.token}' | base64 -d > k8s-cluster.token   # save the decoded token to a file
scp -P 15678 k8s-cluster.token 192.168.10.91:/usr/local/prometheus/   # copy it to the Prometheus server
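It is worth repeating the check from the Prometheus server itself with the copied file, so both network access and the token are confirmed before editing the scrape config. The -k flag mirrors the insecure_skip_verify used below; the api-server address is the same one as above:

curl -sk -H "Authorization: Bearer $(cat /usr/local/prometheus/k8s-cluster.token)" https://192.168.10.89:6443/api/v1/nodes | head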
三、Create the Prometheus scrape jobs
1. Scrape the api-server
- job_name: 'kubernetes-apiservers-monitor'
  metrics_path: /metrics
  scheme: https
  tls_config:
    insecure_skip_verify: true        # our certificates are self-signed, so verification must be skipped
  bearer_token_file: /usr/local/prometheus/k8s-cluster.token   # path to the token file we generated
  kubernetes_sd_configs:
  - role: endpoints
    api_server: https://192.168.10.89:6443   # the api-server address of the k8s cluster
    tls_config:
      insecure_skip_verify: true
    bearer_token_file: /usr/local/prometheus/k8s-cluster.token
  relabel_configs:
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    action: keep
    regex: default;kubernetes;https
  - target_label: __address__
    replacement: 192.168.10.89:6443
The result looks like this:
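As a quick sanity check you can also query the up metric for this job through the Prometheus HTTP API; the Prometheus address and port below are assumptions (9090 on the server the token was copied to):

curl -sG 'http://192.168.10.91:9090/api/v1/query' --data-urlencode 'query=up{job="kubernetes-apiservers-monitor"}'   # the value should be 1 for a healthy target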
2. Scrape the nodes
- job_name: 'kubernetes-node-monitor'
  metrics_path: /metrics
  scheme: http                        # note: the scheme here is http
  tls_config:
    insecure_skip_verify: true
  bearer_token_file: /usr/local/prometheus/k8s-cluster.token
  kubernetes_sd_configs:
  - role: node
    api_server: https://192.168.10.89:6443
    tls_config:
      insecure_skip_verify: true
    bearer_token_file: /usr/local/prometheus/k8s-cluster.token
  relabel_configs:
  - source_labels: [__address__]
    regex: '(.*):10250'
    replacement: '${1}:9100'
    target_label: __address__
    action: replace
  - source_labels: [__meta_kubernetes_node_label_failure_domain_beta_kubernetes_io_region]
    regex: '(.*)'
    replacement: 'NODE'
    action: replace
    target_label: Type
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
The result looks like this:
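The rewrite from port 10250 to 9100 assumes node_exporter is already listening on every node on port 9100. If a node target stays down, check it directly from the Prometheus server; the node IP below is a placeholder:

curl -s http://<node-ip>:9100/metrics | head   # replace <node-ip> with one of your node addresses; assumes node_exporter on port 9100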
3. Scrape pods
Because this is a freshly built cluster with no applications running yet, I removed the prometheus_io_scrape-related rule from my Prometheus config, so every pod gets scraped.
- job_name: 'kubernetes-pod-monitor'
  tls_config:
    insecure_skip_verify: true
  bearer_token_file: /usr/local/prometheus/k8s-cluster.token
  kubernetes_sd_configs:
  - role: pod
    api_server: https://192.168.10.89:6443
    tls_config:
      insecure_skip_verify: true
    bearer_token_file: /usr/local/prometheus/k8s-cluster.token
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]   # if, like me, you have a new cluster where no pod carries the prometheus annotations, you can delete this rule
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
The result looks like this:
Because every pod is scraped and some pods do not expose a /metrics endpoint, those targets will show errors. Once your workloads do expose metrics, keep the annotation rule above and mark only the pods you want scraped, as sketched below.
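For reference, a hypothetical pod manifest showing the annotations the relabel rules above act on; the pod name, image and port are made up:

apiVersion: v1
kind: Pod
metadata:
  name: demo-app                      # hypothetical name
  annotations:
    prometheus.io/scrape: "true"      # matched by the keep rule on ..._annotation_prometheus_io_scrape
    prometheus.io/path: "/metrics"    # copied into __metrics_path__
    prometheus.io/port: "8080"        # rewritten into __address__
spec:
  containers:
  - name: demo-app
    image: demo/app:latest            # hypothetical image
    ports:
    - containerPort: 8080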
The other scrape jobs are configured the same way as with an in-cluster deployment; the only real differences are the certificate and token settings covered above, so they are not repeated here.