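Before you start
Before following the steps below, make sure that Kubernetes (k8s), Prometheus, and Prometheus Operator are already deployed on the server. Deploying these components is well documented elsewhere and is not covered here.
The overall architecture of the Prometheus monitoring setup is roughly as shown in the diagram below. Interested readers can study it on their own; it is not explained in detail here.
1. Deploying the Deployment (workload) and the Service
The YAML configuration can be modeled on the following: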
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: elasticsearch-exporter
  name: elasticsearch-exporter
  namespace: prometheus-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: elasticsearch-exporter
  template:
    metadata:
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '9114'
        prometheus.io/path: '/metrics'
      labels:
        app: elasticsearch-exporter
    spec:
      containers:
        - command:
            - '/bin/elasticsearch_exporter'
            # To pass a username and password, use the form: --es.uri=http://username:password@localhost:9200
            - '--es.uri=http://admin:[email protected]:9200'
          image: prometheuscommunity/elasticsearch-exporter:v1.5.0
          imagePullPolicy: IfNotPresent
          name: elasticsearch-exporter
          ports:
            - containerPort: 9114
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: elasticsearch-exporter
  name: elasticsearch-exporter-svc
  namespace: prometheus-exporter
spec:
  ports:
    - name: http
      port: 9114
      protocol: TCP
      targetPort: 9114
  selector:
    app: elasticsearch-exporter
  type: ClusterIP
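Notes:
1> The elasticsearch_exporter project (maintained under prometheus-community) documents the corresponding flags and templates for this YAML: https://github.com/prometheus-community/elasticsearch_exporter
2> Choose the elasticsearch exporter image version that fits your needs. The official image repository is: https://hub.docker.com/r/prometheuscommunity/elasticsearch-exporter/tags
3> Screenshots of a successful deployment:
(1) Deployment (workload)
(2) Service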
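Once the Deployment and Service are up, the exporter should serve metrics on port 9114 under /metrics. The following is a minimal Python sketch of how that output could be checked for cluster health. It is not part of the original setup: the sample text, the `cluster="demo"` label, and the helper names `parse_samples` and `cluster_color` are all hypothetical, and the parser is deliberately simplified (it ignores optional timestamps and escaped characters in label values). Fetching the text itself, for example through a port-forward and an HTTP GET, is left out.

```python
import re

# Hypothetical sample of exporter output; a real exporter returns many more series.
SAMPLE_METRICS = """\
# TYPE elasticsearch_cluster_health_status gauge
elasticsearch_cluster_health_status{cluster="demo",color="green"} 1
elasticsearch_cluster_health_status{cluster="demo",color="red"} 0
elasticsearch_cluster_health_status{cluster="demo",color="yellow"} 0
elasticsearch_cluster_health_number_of_nodes{cluster="demo"} 3
"""

# Simplified pattern: metric name, optional {labels}, then a single value.
_SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_LINE.match(line)
        if match is None:
            continue  # skip lines this simplified parser does not understand
        labels = dict(_LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def cluster_color(samples):
    """Return the color whose health-status series has value 1, or None."""
    for name, labels, value in samples:
        if name == "elasticsearch_cluster_health_status" and value == 1:
            return labels.get("color")
    return None


if __name__ == "__main__":
    print(cluster_color(parse_samples(SAMPLE_METRICS)))
```

If the reported color is `red` or `yellow`, the corresponding alert rules in section 3 would be expected to fire once Prometheus scrapes the target.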
2. Creating the ServiceMonitor configuration file
The YAML configuration file is as follows:
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: elasticsearch-exporter
  # The name below appears truncated; Kubernetes object names must end with
  # an alphanumeric character, so complete it before applying.
  name: elasticsearch-exporter-
  namespace: prometheus-exporter
spec:
  endpoints:
    - honorLabels: true
      interval: 1m
      path: /metrics
      port: http
      scheme: http
      params:
        target:
          - 'es-cluster.monitorsoftware:9200'
      relabelings:
        - sourceLabels: [__param_target]
          targetLabel: instance
  namespaceSelector:
    matchNames:
      - prometheus-exporter
  selector:
    matchLabels:
      app: elasticsearch-exporter
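Notes:
1> Prometheus Operator discovers monitoring targets through ServiceMonitor resources and scrapes them. A ServiceMonitor describes how metrics are collected from a Service.
- Through a ServiceMonitor, Prometheus Operator automatically identifies Services that carry the specified labels and collects metrics from them.
- ServiceMonitor resources are themselves discovered automatically by Prometheus Operator.
2> The Prometheus monitoring flow is as follows:
3> Screenshots of a successful deployment:
(1) ServiceMonitor deployment
(2) Prometheus after successful deployment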
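The label-based discovery described in note 1> can be illustrated with a small Python sketch. It only mimics the selection logic of `namespaceSelector.matchNames` and `selector.matchLabels` from the ServiceMonitor above; it is not how Prometheus Operator is implemented. The function names and the two extra Services in the list are hypothetical and exist only to show what gets filtered out.

```python
def labels_match(match_labels, labels):
    """True if every key/value pair in match_labels is present in labels."""
    return all(labels.get(key) == value for key, value in match_labels.items())


def select_services(services, match_labels, match_names):
    """Keep Services in an allowed namespace whose labels satisfy match_labels."""
    return [
        svc
        for svc in services
        if svc["namespace"] in match_names
        and labels_match(match_labels, svc["labels"])
    ]


# The first entry mirrors the Service from section 1; the others are hypothetical.
SERVICES = [
    {
        "name": "elasticsearch-exporter-svc",
        "namespace": "prometheus-exporter",
        "labels": {"app": "elasticsearch-exporter"},
    },
    {
        "name": "other-svc",  # wrong label, so it is not selected
        "namespace": "prometheus-exporter",
        "labels": {"app": "other"},
    },
    {
        "name": "elasticsearch-exporter-svc",  # right label, wrong namespace
        "namespace": "default",
        "labels": {"app": "elasticsearch-exporter"},
    },
]

if __name__ == "__main__":
    selected = select_services(
        SERVICES,
        match_labels={"app": "elasticsearch-exporter"},
        match_names=["prometheus-exporter"],
    )
    for svc in selected:
        print(svc["namespace"], svc["name"])
```

This is why the `app: elasticsearch-exporter` label on the Service and the `matchLabels` entry in the ServiceMonitor must agree: if they differ, nothing is selected and no target shows up in Prometheus.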
3. Configuring Prometheus alert rules
The PrometheusRule configuration is as follows:
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: elasticsearch-exporter-rules
  namespace: k8s-monitor-system
spec:
  groups:
    - name: elasticsearch-exporter
      rules:
        - alert: es-ElasticsearchHealthyNodes
          expr: elasticsearch_cluster_health_number_of_nodes < 3
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Elasticsearch Healthy Nodes (instance {{ $labels.instance }})
            description: "Missing node in Elasticsearch cluster\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
        - alert: es-ElasticsearchClusterRed
          expr: elasticsearch_cluster_health_status{color="red"} == 1
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Elasticsearch Cluster Red (instance {{ $labels.instance }})
            description: "Elastic Cluster Red status\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
        - alert: es-ElasticsearchClusterYellow
          expr: elasticsearch_cluster_health_status{color="yellow"} == 1
          for: 0m
          labels:
            severity: warning
          annotations:
            summary: Elasticsearch Cluster Yellow (instance {{ $labels.instance }})
            description: "Elastic Cluster Yellow status\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
        - alert: es-ElasticsearchDiskOutOfSpace
          expr: elasticsearch_filesystem_data_available_bytes / elasticsearch_filesystem_data_size_bytes * 100 < 10
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Elasticsearch disk out of space (instance {{ $labels.instance }})
            description: "The disk usage is over 90%\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
        - alert: es-ElasticsearchHeapUsageTooHigh
          expr: (elasticsearch_jvm_memory_used_bytes{area="heap"} / elasticsearch_jvm_memory_max_bytes{area="heap"}) * 100 > 90
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: Elasticsearch Heap Usage Too High (instance {{ $labels.instance }})
            description: "The heap usage is over 90%\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
        - alert: es-ElasticsearchHealthyDataNodes
          expr: elasticsearch_cluster_health_number_of_data_nodes < 3
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: Elasticsearch Healthy Data Nodes (instance {{ $labels.instance }})
            description: "Missing data node in Elasticsearch cluster\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
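Notes:
1> The PrometheusRule configuration can be based on the rule templates published at: https://awesome-prometheus-alerts.grep.to/rules#elasticsearch
2> Screenshot of a successful deployment:
4. Grafana dashboard
4.1 Grafana dashboards are available at: https://grafana.com/grafana/dashboards
The officially recommended dashboard template ID is 14191.
4.2 The resulting dashboard looks as follows: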
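To make the thresholds in the rules above concrete, here is a small Python sketch that applies the same comparisons to one hand-written snapshot of metric values. It is only an illustration of the arithmetic: the snapshot values and the function names are hypothetical, real evaluation happens inside Prometheus per time series, and the `for: 2m` hold time on the heap rule is not modeled.

```python
def disk_free_percent(available_bytes, size_bytes):
    """Mirror of: available_bytes / size_bytes * 100."""
    return available_bytes / size_bytes * 100


def heap_used_percent(used_bytes, max_bytes):
    """Mirror of: (used_bytes / max_bytes) * 100."""
    return used_bytes / max_bytes * 100


def firing_alerts(snapshot):
    """Return the names of the rules whose condition holds for this snapshot."""
    alerts = []
    if snapshot["number_of_nodes"] < 3:
        alerts.append("es-ElasticsearchHealthyNodes")
    if snapshot["color"] == "red":
        alerts.append("es-ElasticsearchClusterRed")
    if snapshot["color"] == "yellow":
        alerts.append("es-ElasticsearchClusterYellow")
    if disk_free_percent(snapshot["disk_available"], snapshot["disk_size"]) < 10:
        alerts.append("es-ElasticsearchDiskOutOfSpace")
    if heap_used_percent(snapshot["heap_used"], snapshot["heap_max"]) > 90:
        alerts.append("es-ElasticsearchHeapUsageTooHigh")
    if snapshot["number_of_data_nodes"] < 3:
        alerts.append("es-ElasticsearchHealthyDataNodes")
    return alerts


GIB = 2**30

# Hypothetical values: one node missing, yellow status, about 5% disk free.
SNAPSHOT = {
    "number_of_nodes": 2,
    "number_of_data_nodes": 3,
    "color": "yellow",
    "disk_available": 5 * GIB,
    "disk_size": 100 * GIB,
    "heap_used": 2 * GIB,
    "heap_max": 4 * GIB,
}

if __name__ == "__main__":
    for name in firing_alerts(SNAPSHOT):
        print(name)
```

Note that the disk rule compares the free percentage against 10, which is why its description speaks of usage above 90%. The node-count thresholds of 3 presumably reflect a three-node cluster and would need adjusting for clusters of a different size.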
From: https://www.cnblogs.com/cndarren/p/16911487.html