https://prometheus.io/download/#alertmanager
https://github.com/prometheus/alertmanager
Introductory articles
https://blog.csdn.net/weixin_42171272/article/details/139112335
https://zhuanlan.zhihu.com/p/703090367
https://blog.csdn.net/namelijink/article/details/135487104
wget https://github.com/prometheus/alertmanager/releases/download/v0.27.0/alertmanager-0.27.0.linux-amd64.tar.gz
tar -zxf alertmanager-0.27.0.linux-amd64.tar.gz
cd alertmanager-0.27.0.linux-amd64/
vim alertmanager.yml
Edit the configuration file:
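global:
  # Time after which an alert is declared resolved if Alertmanager receives no further updates for it (default: 5m).
  resolve_timeout: 5m
  # external_url is not a valid key in alertmanager.yml and makes config loading fail;
  # set the external URL with the --web.external-url flag at startup instead.
  # external_url: 'http://xxxxx:8893'
  smtp_from: '[email protected]'            # sender address for notification emails
  smtp_smarthost: 'smtp.qq.com:25'        # SMTP server (host:port) used to send mail
  smtp_auth_username: '[email protected]'   # SMTP login user name
  smtp_auth_password: 'xxxxxxxxxxxxj'     # mailbox authorization code
route:
  # Labels used to group incoming alerts; for example, alerts with item=test and alertname=Disk are batched into a single group.
  group_by: ['alertname', 'item']
  # How long to wait before sending the first notification for a new group, so that alerts arriving close together go out in one message.
  group_wait: 30s
  # How long to wait before notifying about new alerts added to a group that has already been notified.
  group_interval: 300s
  # How long to wait before re-sending a notification for a group that is still firing.
  # For email, do not set this too low, or the SMTP server may reject the frequent messages.
  repeat_interval: 4h
  # Default receiver; must match one of the names under receivers below.
  receiver: 'email'
# Notification templates (optional)
#templates:
#  - '/usr/local/alertmanager/template/*.tmpl'
receivers:
  # Mailbox that receives the alert emails
  - name: 'email'
    email_configs:
      - to: 'xxxxxxxx'
inhibit_rules:   # inhibition rules
  # source_match / target_match are deprecated in favor of source_matchers / target_matchers, but are still accepted.
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance', 'prod']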
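The inhibition rule above can be read as: a firing critical alert mutes a warning alert when both carry the same values for every label listed in equal. The following is only a minimal sketch of that logic, not Alertmanager's actual implementation; the function name is_inhibited and the plain-dict representation of alerts are assumptions made for illustration. As in Alertmanager, a label missing on both sides counts as equal (both are treated as empty).

```python
# Hypothetical, simplified model of the inhibit rule in the config above.
RULE = {
    "source_match": {"severity": "critical"},
    "target_match": {"severity": "warning"},
    "equal": ["alertname", "dev", "instance", "prod"],
}


def matches(labels, wanted):
    """True if every key/value pair in `wanted` is present in `labels`."""
    return all(labels.get(key) == value for key, value in wanted.items())


def is_inhibited(target, firing_alerts, rule=RULE):
    """Return True if the label set `target` is muted by any firing alert."""
    if not matches(target, rule["target_match"]):
        return False
    for source in firing_alerts:
        if source is target:
            continue  # an alert never inhibits itself
        if not matches(source, rule["source_match"]):
            continue
        # Missing labels are compared as empty strings, so labels absent
        # on both sides (for example dev, prod) count as equal.
        if all(source.get(name, "") == target.get(name, "") for name in rule["equal"]):
            return True
    return False


critical = {"alertname": "Disk", "instance": "a", "severity": "critical"}
warning_same = {"alertname": "Disk", "instance": "a", "severity": "warning"}
warning_other = {"alertname": "Disk", "instance": "b", "severity": "warning"}

print(is_inhibited(warning_same, [critical]))   # same alertname and instance
print(is_inhibited(warning_other, [critical]))  # different instance
```

With this rule, only the warning on the same instance as the critical alert is suppressed; the warning on another instance is still delivered.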
Start
cd /usr/local/prometheusAlert/
./alertmanager --web.external-url="http://xxxx:8893" --web.listen-address="0.0.0.0:8893" --cluster.listen-address="0.0.0.0:8894" --config.file=alertmanager.yml
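Stop the process (the [a] pattern keeps grep from matching itself; plain kill sends SIGTERM so Alertmanager can shut down cleanly, use kill -9 only if it does not exit)
ps -ef | grep '[a]lertmanager' | awk '{print $2}' | xargs kill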
Start in the background
nohup ./alertmanager --web.external-url="http://xxxxx:8893" --web.listen-address="0.0.0.0:8893" --cluster.listen-address="0.0.0.0:8894" --config.file=alertmanager.yml > server_alertmanager.log 2>&1 &
Open in a browser
http://xxxxxxx:8893/#/alerts
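Besides opening the UI, the instance can be probed from a script. The sketch below assumes the standard /-/healthy management endpoint of Alertmanager and a base URL of http://localhost:8893 (the host is a placeholder; the port is the one used in the start command). The helper names health_url and is_healthy are made up for this example.

```python
import urllib.error
import urllib.request


def health_url(base_url):
    """Build the health-check URL from the Alertmanager base URL."""
    return base_url.rstrip("/") + "/-/healthy"


def is_healthy(base_url, timeout=3.0):
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure or a non-2xx response.
        return False


if __name__ == "__main__":
    # Placeholder address: replace localhost with the real host.
    print(is_healthy("http://localhost:8893"))
```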
Official configuration documentation
https://prometheus.io/docs/alerting/latest/configuration/#route
Alert JSON (the payload sent to the receiver)
{
  "receiver": "web\\.hook",
  "status": "resolved",
  "alerts": [
    {
      "status": "resolved",
      "labels": {
        "alertname": "ckExceptionAlert",
        "app": "gateway",
        "appName": "getindex",
        "severity": "warning"
      },
      "annotations": {
        "description": "Current alert value: 3.0508474576271185",
        "summary": "System monitoring: service exceptions in the last 5 minutes"
      },
      "startsAt": "2024-08-07T10:51:10.04Z",
      "endsAt": "2024-08-07T10:52:40.04Z",
      "generatorURL": "http://http://8.219.198.22:9090/graph?g0.expr=sum+by+%28appName%29+%28increase%28app_invoke_error_count_total%5B5m%5D%29%29+%3E+1\u0026g0.tab=1",
      "fingerprint": "a01e6c598cd57929"
    }
  ],
  "groupLabels": {
    "alertname": "ckExceptionAlert"
  },
  "commonLabels": {
    "alertname": "ckExceptionAlert",
    "app": "gateway",
    "appName": "getindex",
    "severity": "warning"
  },
  "commonAnnotations": {
    "description": "Current alert value: 3.0508474576271185",
    "summary": "System monitoring: service exceptions in the last 5 minutes"
  },
  "externalURL": "http://http://8.219.198.22:8893",
  "version": "4",
  "groupKey": "{}:{alertname=\"ckExceptionAlert\"}",
  "truncatedAlerts": 0
}
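A receiver that gets a payload of this shape usually only needs a few fields from each entry in alerts. The following is a rough sketch of how such a payload could be turned into one-line summaries; the function name summarize and the output format are assumptions for illustration, and the sample payload is a shortened version of the one above.

```python
import json


def summarize(payload):
    """Return one summary line per alert in an Alertmanager payload dict."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        lines.append(
            "[{status}] {name} ({severity}): {summary}".format(
                status=alert.get("status", "unknown"),
                name=labels.get("alertname", "?"),
                severity=labels.get("severity", "?"),
                summary=annotations.get("summary", ""),
            )
        )
    return lines


# Shortened sample with the same field names as the payload above.
SAMPLE = json.loads("""
{
  "status": "resolved",
  "alerts": [
    {
      "status": "resolved",
      "labels": {"alertname": "ckExceptionAlert", "severity": "warning"},
      "annotations": {"summary": "service exceptions in the last 5 minutes"}
    }
  ]
}
""")

for line in summarize(SAMPLE):
    print(line)
```

Missing fields fall back to placeholders instead of raising, since not every alert rule sets every label or annotation.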