Basic Usage of Prometheus

Tags: basics, job, scrape, Prometheus, prometheus, monitoring, usage, configs

Reference:
https://www.prometheus.wang/
The author's summary is excellent and well worth following along with.

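The origin of Prometheus

Prometheus was inspired by Google's Borgmon monitoring system. Former Google engineers began developing it as open-source software at SoundCloud in 2012, and it was announced publicly in early 2015. In May 2016 it became the second project, after Kubernetes, to formally join the CNCF, and version 1.0 was released in July of that year. Version 2.0, built on a completely new storage layer, followed in late 2017 and integrates better with container and cloud platforms.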

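Monitoring goals

A monitoring system needs to support both white-box and black-box monitoring effectively.

White-box monitoring

White-box monitoring exposes the actual internal running state of a system. By watching its metrics, you can anticipate problems before they occur.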

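Black-box monitoring

Black-box monitoring relies on checks such as HTTP probes and TCP probes, so that when a system or service fails, the responsible people can be notified quickly and deal with it. Building a complete monitoring system serves the following goals:

Long-term trend analysis
Comparative analysis
Alerting
Fault analysis and localization
Data visualization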
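
As an illustration of the TCP probe mentioned under black-box monitoring, here is a minimal Python sketch. It is not how Prometheus implements probing; the function name `tcp_probe` and its default timeout are assumptions made for this example. It only checks whether a TCP handshake to a given address succeeds.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `tcp_probe("localhost", 9090)` would report whether something is listening on the Prometheus port used later in this post.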

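Comparison with common monitoring systems

Advantages: easy to manage; monitors the internal running state of services; a powerful data model; a powerful query language, PromQL; efficient; scalable; easy to integrate; visualization; openness.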

Installing Prometheus Server

Download location

https://prometheus.io/download/

Installing with Docker

docker run -p 9090:9090 -v /etc/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus

Default configuration file

prometheus.yml:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']

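Installing Node Exporter

Introduction to Node Exporter

In the Prometheus architecture, Prometheus Server does not monitor specific targets directly. Its main job is to collect and store data and to serve queries against it. To monitor something concrete, such as a host's CPU usage, we therefore need an Exporter. Prometheus periodically pulls monitoring samples from the HTTP endpoint the Exporter exposes (usually /metrics).

As this description suggests, "Exporter" is a fairly open concept. It can be a standalone program running separately from the monitored target, or it can be built directly into the target. All that matters is that it provides monitoring samples to Prometheus in the standard format.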
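
To make the Exporter idea concrete, here is a minimal sketch of a standalone exporter written with only the Python standard library. It is an illustration, not Node Exporter's implementation: the metric name `demo_uptime_seconds`, the helper names, and port 8000 are all assumptions chosen for this example. It serves a single gauge at /metrics in the Prometheus text exposition format.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()


def render_metrics(uptime_seconds: float) -> str:
    """Render one gauge in the Prometheus text exposition format."""
    lines = [
        "# HELP demo_uptime_seconds Seconds since this demo exporter started.",
        "# TYPE demo_uptime_seconds gauge",
        f"demo_uptime_seconds {uptime_seconds}",
    ]
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served, matching the default path Prometheus scrapes.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(time.time() - START_TIME).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, answering scrape requests on the given (hypothetical) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the server. Adding `localhost:8000` as a target in a scrape job would then let Prometheus collect the gauge like any other metric.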

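To collect host metrics such as CPU, memory, and disk usage, we can use Node Exporter.

Download it from the same Prometheus download page linked above.

After extracting the archive, start it directly: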
./node_exporter
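
Once running, Node Exporter serves its samples as plain text (on port 9100, as the scrape configuration further down shows). To show what that text looks like to a consumer, here is a simplified Python sketch of a parser. It is an assumption-laden illustration rather than the official parser: `parse_metrics` and the two regular expressions are names invented here, and escape sequences inside label values are left as-is.

```python
import re

# Metric name, optional {labels}, then the sample value.
# A trailing timestamp, if present, is ignored.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str):
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and # HELP / # TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

Given a line such as `node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67`, the function yields the metric name, a dictionary with the `cpu` and `mode` labels, and the value as a float.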

Configuring Prometheus to scrape Node Exporter

Edit prometheus.yml and add the following under the scrape_configs section:

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  # Scrape monitoring data from Node Exporter
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']

Restart Prometheus Server

Open http://localhost:9090 to reach the Prometheus Server UI. Enter "up" in the expression box and click the Execute button. You should see results like the following:

up{instance="localhost:9090",job="prometheus"}    1
up{instance="localhost:9100",job="node"}    1

A value of 1 means the target was scraped successfully, while 0 means the scrape failed.
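
The same `up` check can be done programmatically through the Prometheus HTTP API at `/api/v1/query`. The Python sketch below is one possible way to do it, assuming the server address http://localhost:9090 used in this post; the function names `build_query_url`, `down_targets`, and `check_up` are invented for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the given PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def down_targets(response: dict) -> list:
    """Return (job, instance) pairs whose `up` sample is not 1."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error')}")
    down = []
    for series in response["data"]["result"]:
        # Each instant-vector entry carries a [timestamp, "value"] pair.
        _timestamp, value = series["value"]
        if float(value) != 1.0:
            labels = series["metric"]
            down.append((labels.get("job"), labels.get("instance")))
    return down


def check_up(base_url: str = "http://localhost:9090") -> list:
    """Query a running Prometheus server and list targets that are down."""
    with urlopen(build_query_url(base_url, "up"), timeout=5) as resp:
        return down_targets(json.load(resp))
```

With both targets healthy, as in the output above, `check_up()` would return an empty list.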

Visualizing monitoring data

Creating a dashboard with Grafana

docker run -d -p 3000:3000 grafana/grafana

After you add Prometheus as a data source in Grafana, its data becomes available for building dashboards.

From： https://www.cnblogs.com/jasmine456/p/17986494