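I. Introduction to cAdvisor
Pod metrics are collected with cAdvisor, an open-source project from Google. cAdvisor gathers information about every container running on a machine and also provides a basic query UI and an HTTP interface, which makes it easy for other components such as Prometheus to scrape the data.
cAdvisor monitors the resources and containers on a node in real time and collects performance data, including CPU usage, memory usage, network throughput, and filesystem usage.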
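II. Deploying cAdvisor as a DaemonSet
1. Prepare the manifest
Reference manifests: https://github.com/google/cadvisor/tree/master/deploy/kubernetes/base
The reference manifests are set up for kustomize; I left that part out. The manifest I used is as follows: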
apiVersion: v1
kind: Namespace
metadata:
  name: cadvisor  # custom namespace; change as needed
---
apiVersion: apps/v1  # for Kubernetes versions before 1.9.0 use apps/v1beta2
kind: DaemonSet
metadata:
  name: cadvisor
  namespace: cadvisor
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
spec:
  selector:
    matchLabels:
      name: cadvisor
  template:
    metadata:
      labels:
        name: cadvisor
    spec:
      tolerations:  # tolerate the master's NoSchedule taint; check the actual taints with kubectl describe node
      - effect: NoSchedule
        key: node-role.kubernetes.io/control-plane  # your taint key may differ from mine; verify it
      hostNetwork: true
      containers:
      - name: cadvisor
        image: gcr.io/cadvisor/cadvisor:v0.39.3  # gcr.io is not reachable from mainland China by default; arrange your own mirror
        resources:
          requests:
            memory: 400Mi
            cpu: 400m
          limits:
            memory: 2000Mi
            cpu: 800m
        securityContext:
          privileged: true  # privileged mode is required
        volumeMounts:  # the readOnly mount options from the reference manifest were removed
        - name: rootfs
          mountPath: /rootfs
        - name: var-run
          mountPath: /var/run
        - name: sys
          mountPath: /sys
        - name: docker
          mountPath: /var/lib/docker
        ports:
        - name: http
          containerPort: 8080
          hostPort: 8080  # if omitted, it matches the container port; adjust to your environment
          protocol: TCP
      volumes:
      - name: rootfs
        hostPath:
          path: /
      - name: var-run
        hostPath:
          path: /var/run
      - name: sys
        hostPath:
          path: /sys
      - name: docker
        hostPath:
          path: /var/lib/containerd/  # my container runtime is containerd; with Docker, change this to /var/lib/docker
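The hostPath of the `docker` volume depends on the container runtime, as noted in the last comment of the manifest. A minimal sketch of that choice, assuming a POSIX shell; the function name `runtime_data_dir` is made up for illustration, and the two paths are simply the ones mentioned in the manifest comments:

```shell
# runtime_data_dir RUNTIME_VERSION
# Maps a node's containerRuntimeVersion string (for example
# "containerd://1.6.8" or "docker://20.10.7") to the host directory
# used for the "docker" volume in the manifest.
runtime_data_dir() {
  case "$1" in
    containerd://*) echo /var/lib/containerd ;;
    docker://*)     echo /var/lib/docker ;;
    *)              echo "unknown runtime: $1" >&2; return 1 ;;
  esac
}

# Usage against a live cluster (not executed here):
#   rt=$(kubectl get nodes -o jsonpath='{.items[0].status.nodeInfo.containerRuntimeVersion}')
#   runtime_data_dir "$rt"
```

This only looks at the first node; if nodes in your cluster run different runtimes, check each of them.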
2. Apply the manifest
kubectl apply -f daemonset.yaml
kubectl get pods -n cadvisor
NAME             READY   STATUS    RESTARTS   AGE
cadvisor-5d2wq   1/1     Running   0          5m
cadvisor-lgb2b   1/1     Running   0          5m
cadvisor-wsvh7   1/1     Running   0          5m
netstat -tnlp|grep 8080  # the port should match the hostPort in the manifest
3. Verify through the web UI
Open port 8080 on a cluster node in a browser.
Check the metrics endpoint (/metrics).
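The /metrics output is long, so when checking it from a terminal it helps to filter for one metric at a time. A rough sketch, assuming a POSIX shell with grep; `metric_samples` is a hypothetical helper name, and the node IP in the usage comment is just one of the example addresses from these notes:

```shell
# metric_samples METRIC_NAME
# Reads Prometheus text-format output on stdin and prints only the sample
# lines of the given metric, skipping "# HELP" and "# TYPE" comment lines
# and metrics whose names merely start with METRIC_NAME.
metric_samples() {
  grep -E "^$1(\{| )"
}

# Usage against a node running cAdvisor (not executed here):
#   curl -s http://192.168.100.131:8080/metrics \
#     | metric_samples container_memory_usage_bytes | head -n 5
```

If this prints nothing for a metric you expect, first confirm that the cAdvisor pod on that node is Running and that port 8080 is reachable.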
III. Common cAdvisor metrics and example queries
Common examples
(1) CPU usage per container (CPU cores in use)
    sum(irate(container_cpu_usage_seconds_total{image!=""}[1m])) without (cpu)
(2) Memory usage per container (bytes)
    container_memory_usage_bytes{image!=""}
(3) Network receive rate per container (bytes/second)
    sum(rate(container_network_receive_bytes_total{image!=""}[1m])) without(interface)
(4) Network transmit rate per container (bytes/second)
    sum(rate(container_network_transmit_bytes_total{image!=""}[1m])) without(interface)
(5) Filesystem read rate per container (bytes/second)
    sum(rate(container_fs_reads_bytes_total{image!=""}[1m])) without (device)
(6) Filesystem write rate per container (bytes/second)
    sum(rate(container_fs_writes_bytes_total{image!=""}[1m])) without (device)
(7) Network receive rate (bytes/second, averaged over the last 1 minute), grouped by container name with name=~".+"
    sum(rate(container_network_receive_bytes_total{name=~".+"}[1m])) by (name)
(8) Network transmit rate (bytes/second, averaged over the last 1 minute), grouped by container name with name=~".+"
    sum(rate(container_network_transmit_bytes_total{name=~".+"}[1m])) by (name)
(9) System CPU usage summed over all containers (1-minute rate)
    sum(rate(container_cpu_system_seconds_total[1m]))
(10) System CPU usage per container (rate over a 1-minute window)
    sum(irate(container_cpu_system_seconds_total{image!=""}[1m])) without (cpu)
(11) CPU usage per container, as a percentage of one core
    sum(rate(container_cpu_usage_seconds_total{name=~".+"}[1m])) by (name) * 100
(12) Total CPU usage of all containers, as a percentage of one core
    sum(sum(rate(container_cpu_usage_seconds_total{name=~".+"}[1m])) by (name) * 100)
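To see why query (11) yields a percentage: container_cpu_usage_seconds_total is a counter of CPU-seconds, so its increase divided by the elapsed time is the number of cores in use, and multiplying by 100 gives the percentage of one core. The arithmetic can be sketched as below, assuming a POSIX shell with awk. This is a simplification, not how Prometheus implements rate(), which also handles counter resets and extrapolates to the window edges. `cpu_percent` is a hypothetical name.

```shell
# cpu_percent V1 T1 V2 T2
# V1, V2: two readings of container_cpu_usage_seconds_total (CPU-seconds)
# T1, T2: the timestamps of those readings, in seconds
# Prints the average CPU usage between the two readings as a percentage
# of one core, with one decimal place.
cpu_percent() {
  awk -v v1="$1" -v t1="$2" -v v2="$3" -v t2="$4" \
    'BEGIN { printf "%.1f\n", (v2 - v1) / (t2 - t1) * 100 }'
}

# Example: the counter grew from 120 to 150 CPU-seconds in 60 seconds,
# i.e. half a core on average.
#   cpu_percent 120 0 150 60
```

A container using two full cores would show 200 here, which is why per-container values above 100 are normal on multi-core nodes.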
IV. Configuring Prometheus to scrape cAdvisor
1. Configure Prometheus
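vim /usr/local/prometheus/prometheus.yml  # append a job at the end of the file, under scrape_configs
  - job_name: "cadvisor"
    static_configs:  # change these to your cluster's node IPs and the cAdvisor port
      - targets: ["192.168.100.131:8080","192.168.100.132:8080","192.168.100.133:8080"]
curl -X POST http://127.0.0.1:9090/-/reload  # needs hot reload enabled (--web.enable-lifecycle); otherwise restart Prometheus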
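With more than a few nodes, typing the targets list by hand is error-prone. One possible way to generate it, assuming a POSIX shell; `cadvisor_targets` is a hypothetical helper name, and the kubectl call in the usage comment assumes your nodes report an InternalIP address:

```shell
# cadvisor_targets PORT IP [IP...]
# Prints a targets list in the same inline form used in prometheus.yml,
# e.g. ["10.0.0.1:8080","10.0.0.2:8080"].
cadvisor_targets() {
  port=$1
  shift
  out=""
  for ip in "$@"; do
    # Append a comma only when the list already has an entry.
    out="${out:+$out,}\"$ip:$port\""
  done
  printf '[%s]\n' "$out"
}

# Usage against a live cluster (not executed here):
#   ips=$(kubectl get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}')
#   cadvisor_targets 8080 $ips
```

The output can be pasted after `targets:` in the job definition. For clusters whose nodes change often, a service discovery configuration is likely a better fit than a static list.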
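2. Verify the cAdvisor data in Prometheus
V. Grafana: monitoring Pods with a dashboard template
1. Create a new dashboard
2. Import the template; the template ID used here is 14282
3. View the dashboard data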
From: https://www.cnblogs.com/panwenbin-logs/p/18385045