
prometheus

Date: 2023-06-29 14:56:01
Tags: __ labels Prometheus server prometheus time


https://prometheus.io/

From metrics to insight

Power your metrics and alerting with the leading
open-source monitoring solution.

 

Architecture

https://juejin.cn/post/7201757033321267258

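  • Prometheus Server: collects and stores time series data.
  • Client Library: instruments application code. When Prometheus scrapes the instance's HTTP endpoint, the client library returns the current state of all tracked metrics to the Prometheus server.
  • Exporters: Prometheus supports many exporters. An exporter collects metrics data and exposes it for the Prometheus server to scrape; any program that provides monitoring data to the Prometheus server can be called an exporter.
  • Alertmanager: receives alerts from the Prometheus server, then deduplicates them, groups them, and routes them to the appropriate receiver, which sends the notification. Common receivers include email, WeChat, DingTalk, and Slack.
  • Grafana: monitoring dashboards that visualize the monitoring data.
  • Pushgateway: target hosts can push their data to the Pushgateway, and the Prometheus server then pulls all of it from the Pushgateway.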

[Figure: Prometheus architecture diagram (image.png)]

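As the diagram above shows, the Prometheus ecosystem consists mainly of the Prometheus server, exporters, the Pushgateway, Alertmanager, Grafana, and the web UI. The Prometheus server itself has three parts: Retrieval, Storage, and PromQL.

  • Retrieval scrapes monitoring metrics from active targets.
  • Storage writes the collected data to disk.
  • PromQL is the query language module that Prometheus provides.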

 

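Prometheus workflow

1) The Prometheus server periodically pulls monitoring metrics from active (up) targets. Targets become known to the server through statically configured jobs or through service discovery, and pull is the default collection mode. Targets can instead push their data to the Pushgateway, which the Prometheus server then scrapes. Data for many components can also be collected through an exporter that ships with the component.

2) The Prometheus server saves the collected metrics to local disk (its built-in time series database) or to remote storage.

3) The collected metrics are stored as time series. Prometheus evaluates the configured alerting rules and sends any alerts that fire to Alertmanager.

4) Alertmanager sends the alerts to the configured receivers, such as email, WeChat, or DingTalk.

5) The web UI bundled with Prometheus accepts PromQL queries for exploring the monitoring data.

6) Grafana can use Prometheus as a data source and display the monitoring data graphically.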

 

Understanding time series

https://www.prometheus.wang/promql/what-is-prometheus-metrics-and-labels.html

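In Section 1.2 of the source linked above, Prometheus collects sample data for all monitoring metrics of the current host through the HTTP service exposed by Node Exporter. For example: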

# HELP node_cpu Seconds the cpus spent in each mode.
# TYPE node_cpu counter
node_cpu{cpu="cpu0",mode="idle"} 362812.7890625
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 3.0703125

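Each line that does not start with # is one monitoring sample collected by Node Exporter: node_cpu and node_load1 are the metric names, the labels in braces describe the characteristics and dimensions of the sample, and the floating-point number is the sample's value.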
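To make the line structure concrete, here is a minimal Python sketch that splits the example output above into name, labels, and value. It is a simplified illustration, not the official parser: the helper name parse_samples is made up for this example, and the regular expressions do not handle optional timestamps or escaped characters inside label values.

```python
import re

# Simplified patterns: metric name, optional {labels}, then the value.
# Lines carrying a trailing timestamp or escaped quotes are not handled.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_samples(text):
    """Return (name, labels, value) tuples for every non-comment line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # not a sample line this sketch understands
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


EXAMPLE = """\
# HELP node_cpu Seconds the cpus spent in each mode.
# TYPE node_cpu counter
node_cpu{cpu="cpu0",mode="idle"} 362812.7890625
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 3.0703125
"""

for name, labels, value in parse_samples(EXAMPLE):
    print(name, labels, value)
```

A metric without braces, such as node_load1, simply yields an empty label dictionary.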

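Samples

Prometheus stores all collected samples as time series in an in-memory database and periodically persists them to disk. A time series is a sequence of timestamped values stored in time order, which we call a vector. Each time series is identified by its metric name and a set of labels (labelset). As shown below, the time series can be pictured as a matrix of numbers with time along the horizontal axis: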

  ^
  │   . . . . . . . . . . . . . . . . .   . .   node_cpu{cpu="cpu0",mode="idle"}
  │     . . . . . . . . . . . . . . . . . . .   node_cpu{cpu="cpu0",mode="system"}
  │     . . . . . . . . . .   . . . . . . . .   node_load1{}
  │     . . . . . . . . . . . . . . . .   . .  
  v
    <------------------ time ---------------->

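Each point in a time series is called a sample. A sample has three parts:

  • Metric: the metric name plus the labelset that describes the sample's characteristics;
  • Timestamp: a timestamp with millisecond precision;
  • Value: a float64 value representing the sample.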
<--------------- metric ---------------------><-timestamp -><-value->
http_request_total{status="200", method="GET"}@1434417560938 => 94355
http_request_total{status="200", method="GET"}@1434417561287 => 94334

http_request_total{status="404", method="GET"}@1434417560938 => 38473
http_request_total{status="404", method="GET"}@1434417561287 => 38544

http_request_total{status="200", method="POST"}@1434417560938 => 4748
http_request_total{status="200", method="POST"}@1434417561287 => 4785
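One way to picture this model is as a small data structure. The sketch below is only an illustration of the concept, not how Prometheus stores data internally; the names Sample, make_sample, and group_into_series are hypothetical. It groups the six samples listed above by metric name and labelset, so each distinct combination becomes one time series.

```python
from collections import defaultdict
from typing import NamedTuple


class Sample(NamedTuple):
    name: str            # metric name
    labels: frozenset    # set of (label, value) pairs
    timestamp_ms: int    # millisecond-precision timestamp
    value: float         # float64 sample value


def make_sample(name, labels, timestamp_ms, value):
    """Build a Sample from a plain dict of labels."""
    return Sample(name, frozenset(labels.items()), timestamp_ms, float(value))


def group_into_series(samples):
    """Group samples by (name, labels); each group is one time series."""
    series = defaultdict(list)
    for s in samples:
        series[(s.name, s.labels)].append((s.timestamp_ms, s.value))
    for points in series.values():
        points.sort()  # order each series by timestamp
    return dict(series)


samples = [
    make_sample("http_request_total", {"status": "200", "method": "GET"}, 1434417560938, 94355),
    make_sample("http_request_total", {"status": "200", "method": "GET"}, 1434417561287, 94334),
    make_sample("http_request_total", {"status": "404", "method": "GET"}, 1434417560938, 38473),
    make_sample("http_request_total", {"status": "404", "method": "GET"}, 1434417561287, 38544),
    make_sample("http_request_total", {"status": "200", "method": "POST"}, 1434417560938, 4748),
    make_sample("http_request_total", {"status": "200", "method": "POST"}, 1434417561287, 4785),
]

series = group_into_series(samples)
for (name, labels), points in series.items():
    print(name, dict(labels), points)
```

The six samples collapse into three series of two points each, because only the labelset distinguishes one series of http_request_total from another.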

 

Label and relabel

https://grafana.com/blog/2022/03/21/how-relabeling-in-prometheus-works/#available-actions

Prometheus labels

Labels are sets of key-value pairs that allow us to characterize and organize what’s actually being measured in a Prometheus metric.

For example, when measuring HTTP latency, we might use labels to record the HTTP method and status returned, which endpoint was called, and which server was responsible for the request.

Each unique combination of key-value label pairs is stored as a new time series in Prometheus, so labels are crucial for understanding the data’s cardinality and unbounded sets of values should be avoided as labels.
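As a rough illustration of why unbounded label values are a problem, the following Python sketch computes the upper bound on the number of time series a single metric can produce. The label names and value lists are invented for this example, and the bound assumes every combination of values actually occurs.

```python
from math import prod


def max_series_count(label_values):
    """Upper bound on series for one metric: the product of the
    number of distinct values each label can take."""
    return prod(len(values) for values in label_values.values())


# Hypothetical label values for an HTTP latency metric.
http_latency_labels = {
    "method": ["GET", "POST", "PUT", "DELETE"],
    "status": ["200", "404", "500"],
    "endpoint": ["/api/comments", "/api/users"],
    "server": ["web01", "web02", "web03"],
}

bounded = max_series_count(http_latency_labels)
print(bounded)  # 4 * 3 * 2 * 3 = 72

# Adding a label with one value per user multiplies the total.
with_user_id = dict(http_latency_labels)
with_user_id["user_id"] = [str(i) for i in range(10_000)]
print(max_series_count(with_user_id))  # 72 * 10000 = 720000
```

A label such as a user ID, which has no fixed set of values, multiplies the series count by the number of users, which is why such values are better kept out of labels.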

Internal labels

But what about metrics with no labels? Prometheus also provides some internal labels for us. These begin with two underscores and are removed after all relabeling steps are applied; that means they will not be available afterwards unless we explicitly copy them into other labels.

Some of these special labels available to us are

Label name            Description
__name__              The scraped metric’s name
__address__           host:port of the scrape target
__scheme__            URI scheme of the scrape target
__metrics_path__      Metrics endpoint of the scrape target
__param_<name>        The value of the first URL parameter <name> passed to the target
__scrape_interval__   The target’s scrape interval (experimental)
__scrape_timeout__    The target’s timeout (experimental)
__meta_               Special labels set by the Service Discovery mechanism
__tmp                 Special prefix used to temporarily store label values before discarding them
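To show how several of the internal labels listed above fit together, here is a hedged Python sketch that assembles a scrape URL from them and then drops the double-underscore labels, as happens after relabeling. The function names and the example target are assumptions made for illustration; the defaults used (http, /metrics, and instance falling back to __address__) follow Prometheus's documented behaviour.

```python
from urllib.parse import urlencode


def scrape_url(labels):
    """Assemble the URL to scrape from a target's internal labels."""
    scheme = labels.get("__scheme__", "http")
    address = labels["__address__"]
    path = labels.get("__metrics_path__", "/metrics")
    # Every __param_<name> label becomes the URL parameter <name>.
    params = {
        key[len("__param_"):]: value
        for key, value in labels.items()
        if key.startswith("__param_")
    }
    url = f"{scheme}://{address}{path}"
    if params:
        url += "?" + urlencode(params)
    return url


def final_target_labels(labels):
    """Drop labels starting with two underscores, as happens after
    relabeling; instance defaults to the value of __address__."""
    visible = {k: v for k, v in labels.items() if not k.startswith("__")}
    visible.setdefault("instance", labels["__address__"])
    return visible


# Hypothetical target: a blackbox_exporter probe of google.com.
target = {
    "__scheme__": "http",
    "__address__": "localhost:9115",
    "__metrics_path__": "/probe",
    "__param_target": "google.com",
    "__param_module": "http_2xx",
    "job": "blackbox",
}

print(scrape_url(target))
print(final_target_labels(target))
```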

So now that we understand what the input is for the various relabel_config rules, how do we create one? And what can they actually be used for?

 

The base <relabel_config> block

A <relabel_config> consists of seven fields. These are:

  • source_labels
  • separator (default = ;)
  • target_label
  • regex (default = (.*))
  • modulus
  • replacement (default = $1)
  • action (default = replace)

A Prometheus configuration may contain an array of relabeling steps; they are applied to the label set in the order they’re defined in. Omitted fields take on their default value, so these steps will usually be shorter.

source_labels and separator

Let’s start off with source_labels. It expects an array of one or more label names, which are used to select the respective label values. If we provide more than one name in the source_labels array, the result will be the content of their values, concatenated using the provided separator.

As an example, consider the following two metrics

my_custom_counter_total{server="webserver01",subsystem="kata"} 192  1644075044000
my_custom_counter_total{server="sqldatabase",subsystem="kata"} 147  1644075044000

The following relabel_config

source_labels: [subsystem, server]
separator: "@"

would extract these values.

kata@webserver01
kata@sqldatabase
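The concatenation step, and the default replace action built on top of it, can be sketched in Python as follows. This is an approximation for illustration only: Prometheus uses Go's RE2 regular expressions rather than Python's re module, the helper names are invented, and the target label combined is a hypothetical name not taken from the article.

```python
import re


def join_source_labels(labels, source_labels, separator=";"):
    """Concatenate the values of source_labels; a missing label is empty."""
    return separator.join(labels.get(name, "") for name in source_labels)


def apply_replace(labels, source_labels, target_label,
                  separator=";", regex="(.*)", replacement="$1"):
    """Approximate the default 'replace' action on one label set."""
    value = join_source_labels(labels, source_labels, separator)
    match = re.fullmatch(regex, value)  # the regex must match the whole value
    if match is None:
        return dict(labels)  # no match: the label set is left unchanged
    # Convert $1 / ${1} style references to Python's \g<1> syntax.
    template = re.sub(r"\$\{?(\w+)\}?", r"\\g<\1>", replacement)
    new_value = match.expand(template)
    result = dict(labels)
    if new_value:
        result[target_label] = new_value
    else:
        result.pop(target_label, None)  # an empty value removes the label
    return result


metric = {"server": "webserver01", "subsystem": "kata"}

print(join_source_labels(metric, ["subsystem", "server"], "@"))
# kata@webserver01

print(apply_replace(metric, ["subsystem", "server"], "combined", separator="@"))
```

With the default regex (.*) and replacement $1, the whole concatenated string is written to the target label, which is why many relabeling steps only need source_labels and target_label.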

 

PromQL

https://prometheus.io/docs/prometheus/latest/querying/examples/

Simple time series selection

Return all time series with the metric http_requests_total:

http_requests_total

Return all time series with the metric http_requests_total and the given job and handler labels:

http_requests_total{job="apiserver", handler="/api/comments"}

Return a whole range of time (in this case 5 minutes up to the query time) for the same vector, making it a range vector:

http_requests_total{job="apiserver", handler="/api/comments"}[5m]

Note that an expression resulting in a range vector cannot be graphed directly, but it can be viewed in the tabular ("Console") view of the expression browser.
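Outside the expression browser, the same selectors can be sent to the Prometheus HTTP API. The sketch below builds the request URL for the /api/v1/query endpoint; the base address http://localhost:9090 is an assumption (9090 is the default port), and run_query is shown for completeness but not called because it needs a running server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?" + urlencode({"query": promql})


def run_query(base_url, promql):
    """Send the query and return the decoded JSON response.
    Requires a reachable Prometheus server."""
    with urlopen(instant_query_url(base_url, promql)) as response:
        return json.load(response)


BASE_URL = "http://localhost:9090"  # assumed local Prometheus server
QUERIES = [
    'http_requests_total',
    'http_requests_total{job="apiserver", handler="/api/comments"}',
    'http_requests_total{job="apiserver", handler="/api/comments"}[5m]',
]

for promql in QUERIES:
    print(instant_query_url(BASE_URL, promql))
```

The braces, quotes, and brackets in a selector must be URL-encoded, which urlencode takes care of here.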

 

Monitoring Linux host metrics with the Node Exporter

https://prometheus.io/docs/guides/node-exporter/

 

Simple Demo

https://github.com/fanqingsong/docker-prometheus

Prometheus Monitoring

This repository contains minimal Prometheus Server, NodeExporter, BlackBoxExporter, AlertManager and Grafana implementation for monitoring various services. You can use this repository to monitor a bare-metal Linux instance or to monitor Apache, NGINX or other HTTP based services using Prometheus.

 

Monitoring a Bare-Metal Linux Server

To monitor a stand-alone Linux server, check out the v1.0 tag of the repository, which contains all the configuration for monitoring a stand-alone Linux server. Then run docker-compose up -d and you're good to go. (With tag v1.0 you have to map alerts manually.)

 

Monitoring HTTP-based Web Services

The v1.1 tag of the repository monitors two HTTP-based web services by default: an Apache httpd server and an NGINX server, both running in Docker containers. If either or both of them go down, Prometheus will fire alerts in the form of emails, as specified in the config.yml file in the AlertManager folder.

 

https://github.com/prometheus/blackbox_exporter

Checking the results

Visiting http://localhost:9115/probe?target=google.com&module=http_2xx will return metrics for an HTTP probe against google.com. The probe_success metric indicates whether the probe succeeded. Adding a debug=true parameter will return debug information for that probe.
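A small Python sketch of the same check follows, under the assumption that the blackbox exporter is reachable at localhost:9115 as in the URL above. The helper names are invented, and SAMPLE_RESPONSE is a hand-written illustration of the response shape, not captured exporter output.

```python
from urllib.parse import urlencode


def probe_url(exporter, target, module, debug=False):
    """Build the /probe URL for a blackbox exporter at host:port."""
    params = {"target": target, "module": module}
    if debug:
        params["debug"] = "true"
    return f"http://{exporter}/probe?" + urlencode(params)


def probe_succeeded(metrics_text):
    """Return True/False from the probe_success line, or None if absent."""
    for line in metrics_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "probe_success":
            return float(parts[1]) == 1.0
    return None


# Illustrative response text only; a real response has more metrics.
SAMPLE_RESPONSE = """\
# TYPE probe_success gauge
probe_success 1
"""

print(probe_url("localhost:9115", "google.com", "http_2xx"))
print(probe_succeeded(SAMPLE_RESPONSE))
```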

 https://www.cnblogs.com/cyleon/p/12876897.html

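HTTP probe: define request header information; check the HTTP status, HTTP response headers, and HTTP body content
TCP probe: monitor the port status of service components; define and monitor application-layer protocols
ICMP probe: host liveness checks
POST probe: API connectivity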

https://github.com/prometheus/node_exporter

If you are new to Prometheus and node_exporter there is a simple step-by-step guide.

The node_exporter listens on HTTP port 9100 by default. See the --help output for more options.

 

Custom exporter

https://github.com/prometheus/client_python#counter

from prometheus_client import start_http_server, Summary
import random
import time

# Create a metric to track time spent and requests made.
REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request')

# Decorate function with metric.
@REQUEST_TIME.time()
def process_request(t):
    """A dummy function that takes some time."""
    time.sleep(t)

if __name__ == '__main__':
    # Start up the server to expose the metrics.
    start_http_server(8000)
    # Generate some requests.
    while True:
        process_request(random.random())

 

From: https://www.cnblogs.com/lightsong/p/17514175.html
