一、环境搭建
docker-compose安装mysql
mkdir /data/mysql -p cd /data/mysql cat > docker-compose.yaml <<"EOF" version: '3.1' services: db: image: mysql:8.0 restart: always container_name: mysql environment: TZ: Asia/Shanghai LANG: en_US.UTF-8 MYSQL_ROOT_PASSWORD: 123456 command: --default-authentication-plugin=mysql_native_password --character-set-server=utf8mb4 --collation-server=utf8mb4_general_ci --lower_case_table_names=1 --performance_schema=1 --sql-mode="" --skip-log-bin volumes: #- /data/mysql/conf:/etc/mysql/conf.d #数据文件挂载 - /data/mysql/data:/var/lib/mysql #数据文件挂载 ports: - 3306:3306 EOF
启动:
docker-compose up -d
二、监控mysql
2.1 创建监控用户
[root@test mysql]# docker exec -it mysql mysql -u root -p Enter password: 输入密码:123456 # 创建用户: CREATE USER 'exporter'@'%' IDENTIFIED BY 'password' WITH MAX_USER_CONNECTIONS 3; GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'%'; flushprivileges # 验证登录 docker exec -it mysql mysql -uexporter -p
2.2 mysql_exporter安装
2.2.1 二进制安装
下载地址:https://github.com/prometheus/mysqld_exporter/releases/download/v0.15.1/mysqld_exporter-0.15.1.linux-amd64.tar.gz
https://github.com/prometheus/mysqld_exporter/releases# 1.下载
wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.15.1/mysqld_exporter-0.15.1.linux-amd64.tar.gz tar vxf mysqld_exporter-0.14.0.linux-amd64.tar.gz mkdir /opt/prometheus/ mv mysqld_exporter-0.15.1.linux-amd64 /opt/prometheus/mysqld_exporter # 2.创建用户 useradd -M -s /usr/sbin/nologin prometheus # 3.创建连接数据库文件 cat >/opt/prometheus/mysqld_exporter/.mysqld_exporter.cnf << "EOF" [client] user=exporter password=password host=192.168.10.100 port=3306 EOF # 4.更改exporter 文件夹权限 chmod 600 /opt/prometheus/mysqld_exporter/.mysqld_exporter.cnf chown prometheus:prometheus -R /opt/prometheus # 5.创建systemd服务 cat > /etc/systemd/system/mysqld_exporter.service <<"EOF" [Unit] Description=Prometheus MySQL Exporter After=network.target [Service] Type=simple User=prometheus Group=prometheus Restart=always ExecStart=/opt/prometheus/mysqld_exporter/mysqld_exporter \ --config.my-cnf=/opt/prometheus/mysqld_exporter/.mysqld_exporter.cnf \ --collect.auto_increment.columns \ --collect.info_schema.processlist \ --collect.binlog_size \ --collect.info_schema.tablestats \ --collect.info_schema.innodb_metrics \ --collect.info_schema.userstats \ --collect.info_schema.tables \ --collect.perf_schema.tablelocks \ --collect.perf_schema.indexiowaits \ --collect.perf_schema.tableiowaits \ --collect.slave_status [Install] WantedBy=multi-user.target EOF # 6.启动mysqld_exporter systemctl daemon-reload systemctl start mysqld_exporter systemctl enable mysqld_exporter
2.2.2 docker或docker-compose安装
docker-compose运行安装
mkdir /data/mysqld_exporter -p cd /data/mysqld_exporter cat >docker-compose.yaml<<"EOF" version: '3.3' services: mysqld-exporter: image: prom/mysqld-exporter container_name: mysqld-exporter restart: always command: - '--collect.info_schema.processlist' - '--collect.info_schema.innodb_metrics' - '--collect.info_schema.tablestats' - '--collect.info_schema.tables' - '--collect.info_schema.userstats' - '--collect.engine_innodb_status' - '--config.my-cnf=/my.cnf' volumes: - ./my.cnf:/my.cnf ports: - 9104:9104 EOF
创建my.cnf配置文件
cat > my.cnf << "EOF" [client] host=192.168.11.62 user=exporter password=password port=3306 EOF
启动:docker-compose up -d
2.3 参数解释
Name |
MySQL Version |
Description |
collect.auto_increment.columns |
5.1 |
从 information_schema 收集 auto_increment 列和最大值. |
collect.binlog_size |
5.1 |
收集所有注册的binlog文件的当前大小 |
collect.engine_innodb_status |
5.1 |
收集SHOW ENGINE INNODB STATUS |
collect.engine_tokudb_status |
5.6 |
收集SHOW ENGINE TOKUDB STATUS . |
collect.global_status |
5.1 |
收集SHOW GLOBAL STATUS(默认启用) |
collect.global_variables |
5.1 |
收集SHOW GLOBAL VARIABLES(默认启用) |
collect.info_schema.clientstats |
5.5 |
如果以 userstat=1 运行,设置为 true 以收集客户端统计信息 |
collect.info_schema.innodb_metrics |
5.6 |
从 information_schema.innodb_metrics收集指标 |
collect.info_schema.innodb_tablespaces |
5.7 |
从information_schema.innodb_sys_tablespaces收集指标 |
collect.info_schema.innodb_cmp |
5.5 |
从information_schema.innodb_cmp收集 InnoDB 压缩表指标。 |
collect.info_schema.innodb_cmpmem |
5.5 |
从information_schema.innodb_cmpmem缓冲池压缩指标。 |
collect.info_schema.processlist |
5.1 |
从 information_schema.processlist 收集线程状态计数 |
collect.info_schema.processlist.min_time |
5.1 |
线程必须处于要计算的每个状态的最短时间。 (默认值:0) |
collect.info_schema.query_response_time |
5.5 |
如果 query_response_time_stats 为 ON,则收集查询响应时间分布。 |
collect.info_schema.replica_host |
5.6 |
从 information_schema.replica_host_status 收集指标。 |
collect.info_schema.tables |
5.1 |
从information_schema.tables收集指标。 |
collect.info_schema.tables.databases |
5.1 |
要为其收集表统计信息的数据库列表,或为所有 |
collect.info_schema.tablestats |
5.1 |
如果以 userstat=1 运行,设置为 true 以收集表统计信息。 |
collect.info_schema.schemastats |
5.1 |
如果以 userstat=1 运行,设置为 true 以收集架构统计信息 |
collect.info_schema.userstats |
5.1 |
如果以 userstat=1 运行,设置为 true 以收集用户统计信息。 |
collect.mysql.user |
5.5 |
从 mysql.user 表中收集数据 |
collect.perf_schema.eventsstatements |
5.6 |
从 performance_schema.events_statements_summary_by_digest 收集指标。 |
collect.perf_schema.eventsstatements.digest_text_limit |
5.6 |
规范化语句文本的最大长度。 (默认值:120) |
collect.perf_schema.eventsstatements.limit |
5.6 |
按响应时间限制事件语句摘要的数量。 (默认值:250) |
collect.perf_schema.eventsstatements.timelimit |
5.6 |
以秒为单位限制“last_seen”事件语句的存在时间。 (默认值:86400) |
collect.perf_schema.eventsstatementssum |
5.7 |
从 performance_schema.events_statements_summary_by_digest 汇总收集指标。 |
collect.perf_schema.eventswaits |
5.5 |
从 performance_schema.events_waits_summary_global_by_event_name 收集指标 |
collect.perf_schema.file_events |
5.6 |
从 performance_schema.file_summary_by_event_name 收集指标 |
collect.perf_schema.file_instances |
5.5 |
从 performance_schema.file_summary_by_instance 收集指标。 |
collect.perf_schema.file_instances.remove_prefix |
5.5 |
删除 performance_schema.file_summary_by_instance 中的路径前缀。 |
collect.perf_schema.indexiowaits |
5.6 |
从 performance_schema.table_io_waits_summary_by_index_usage 收集指标。 |
collect.perf_schema.memory_events |
5.7 |
从 performance_schema.memory_summary_global_by_event_name 收集指标。 |
collect.perf_schema.memory_events.remove_prefix |
5.7 |
删除 performance_schema.memory_summary_global_by_event_name 中的仪器前缀。 |
collect.perf_schema.tableiowaits |
5.6 |
从 performance_schema.table_io_waits_summary_by_table 收集指标。 |
collect.perf_schema.tablelocks |
5.6 |
从 performance_schema.table_lock_waits_summary_by_table 收集指标。 |
collect.perf_schema.replication_group_members |
5.7 |
从 performance_schema.replication_group_members 收集指标。 |
collect.perf_schema.replication_group_member_stats |
5.7 |
从 performance_schema.replication_group_member_stats 收集指标。 |
collect.perf_schema.replication_applier_status_by_worker |
5.7 |
从 performance_schema.replication_applier_status_by_worker 收集指标。 |
collect.slave_status |
5.1 |
从 SHOW SLAVE STATUS 收集(默认启用) |
collect.slave_hosts |
5.1 |
从 SHOW SLAVE HOSTS 收集 |
collect.heartbeat |
5.1 |
从心跳收集 |
collect.heartbeat.database |
5.1 |
从哪里收集心跳数据的数据库。 (默认:心跳) |
collect.heartbeat.table |
5.1 |
从哪里收集心跳数据的表。 (默认:心跳) |
collect.heartbeat.utc |
5.1 |
使用 UTC 作为当前服务器的时间戳(使用 调用 )。 (默认值:假) |
2.4 Prometheus配置
配置prometheus去采集(拉取)mysql_exporter的监控样本数据
cd /data/docker-prometheus #在scrape_configs(搜刮配置):下面增加如下配置: cat >> prometheus/prometheus.yml << "EOF" - job_name: 'mysqld_exporter' static_configs: - targets: ['192.168.10.100:9104'] labels: instance: test服务器 EOF
重新加载配置并检查:
curl -X POST http://localhost:9090/-/reload
http://192.168.10.100:9104/metrics
2.5 常用监控指标
mysql_up # 服务器是否在线 mysql_global_status_uptime # 运行时长,单位 s delta(mysql_global_status_bytes_received[1m]) # 网络接收的 bytes delta(mysql_global_status_bytes_sent[1m]) # 网络发送的 bytes mysql_global_status_threads_connected # 当前的客户端连接数 mysql_global_variables_max_connections # 允许的最大连接数 mysql_global_status_threads_running # 正在执行命令的客户端连接数,即非 sleep 状态 delta(mysql_global_status_aborted_connects[1m]) # 客户端建立连接失败的连接数,比如登录失败 delta(mysql_global_status_aborted_clients[1m]) # 客户端连接之后,未正常关闭的连接数 delta(mysql_global_status_commands_total{command="xx"}[1m]) > 0 # 每分钟各种命令的次数 delta(mysql_global_status_handlers_total{handler="xx"}[1m]) > 0 # 每分钟各种操作的次数 delta(mysql_global_status_handlers_total{handler="commit"}[1m]) > 0 # 每分钟 commit 的次数 delta(mysql_global_status_table_locks_immediate[1m]) # 请求获取锁,且立即获得的请求数 delta(mysql_global_status_table_locks_waited[1m]) # 请求获取锁,但需要等待的请求数。该值越少越好 delta(mysql_global_status_queries[1m]) # 每分钟的查询数 delta(mysql_global_status_slow_queries[1m]) # 慢查询数。如果未启用慢查询日志,则为 0 mysql_global_status_innodb_page_size # innodb 数据页的大小,单位 bytes mysql_global_variables_innodb_buffer_pool_size # innodb_buffer_pool 的限制体积 mysql_global_status_buffer_pool_pages{state="data"} # 包含数据的数据页数,包括洁页、脏页 mysql_global_status_buffer_pool_dirty_pages # 脏页数 锁指标 mysql_global_status_innodb_row_lock_current_waits #当前正在等待的 InnoDB 行锁数量。 mysql_global_status_innodb_row_lock_time #从服务器启动以来的总 InnoDB 行锁等待时间(以毫秒为单位)。 mysql_global_status_innodb_row_lock_time_avg #每次等待 InnoDB 行锁的平均时间(以毫秒为单位) mysql_global_status_innodb_row_lock_time_max #单次等待 InnoDB 行锁的最长时间(以毫秒为单位)。 mysql_global_status_innodb_row_lock_waits #从服务器启动以来的总 InnoDB 行锁等待次数。
2.6 mysql触发器告警规则配置
cat >> prometheus/rules/mysqld.yml <<"EOF" groups: - name: MySQL rules: - alert: MysqlDown expr: mysql_up == 0 for: 30s labels: severity: critical annotations: summary: "MySQL Down,实例:{{ $labels.instance }}" description: "MySQL_exporter连不上MySQL了,当前状态为:{{ $value }}" - alert: MysqlTooManyConnections expr: max_over_time(mysql_global_status_threads_connected[1m]) / mysql_global_variables_max_connections * 100 > 80 for: 2m labels: severity: warning annotations: summary: "Mysql连接数过多告警,实例:{{ $labels.instance }}" description: "MySQL连接数>80%,当前值:{{ $value }}" - alert: MysqlHighThreadsRunning expr: max_over_time(mysql_global_status_threads_running[1m]) > 20 for: 2m labels: severity: warning annotations: summary: "Mysql运行的线程过多,实例:{{ $labels.instance }}" description: "Mysql运行的线程 > 20,当前运行的线程:{{ $value }}" - alert: MysqlSlowQueries expr: increase(mysql_global_status_slow_queries[2m]) > 0 for: 2m labels: severity: warning annotations: summary: "Mysql慢日志告警,实例:{{ $labels.instance }}" description: "MySQL在过去2分钟有新的{{ $value }}条慢查询" #MySQL innodb 日志写入停滞 - alert: MysqlInnodbLogWaits expr: rate(mysql_global_status_innodb_log_waits[15m]) > 10 for: 0m labels: severity: warning annotations: summary: "MySQL innodb日志等待,实例:{{ $labels.instance }}" description: "MySQL innodb日志写入停滞,当前值: {{ $value }}" - alert: MysqlRestarted expr: mysql_global_status_uptime < 60 for: 0m labels: severity: info annotations: summary: "MySQL 重启,实例:{{ $labels.instance }}" description: "不到一分钟前,MySQL重启过" - alert: RowLockCurrentWaits expr: mysql_global_status_innodb_row_lock_current_waits > 0 for: 1m labels: severity: info annotations: summary: "MySQL有锁等待,实例:{{ $labels.instance }}" description: "当前有{{ $value }}个锁等待" EOF
检查配置并加载
docker exec -it prometheus promtool check config /etc/prometheus/prometheus.yml
curl -X POST http://localhost:9090/-/reload
页面检查:
http://192.168.10.14:9090/targets?search=
http://192.168.10.14:9090/rules
http://192.168.10.14:9090/alerts?search=
2.7 grafana dashboard展示
如果没有图形显示,可以edit下监控框,将里面的参数拿到http://192.168.10.14:9090/graph中搜索看看是否有这个参数
grafana展示prometheus从mysql_exporter收集到的的数据
https://grafana.com/grafana/dashboards/7362 # 大部分数据都展示不出来
https://github.com/percona/grafana-dashboards/tree/main/dashboards/MySQL
Top Process States和 Process States图形修改为如下:
mysql_info_schema_threads 替换成:
mysql_info_schema_processlist_threads
https://grafana.com/grafana/dashboards/20016-mysql-8-0/
id:20016
数据库表监控
https://grafana.com/grafana/dashboards/9625
注意:2个图表没有数据,是因为只支持percona server 和 mariadb
Buffer pool size of total ram图形问题
如果mysql安装在linux服务上,并且这台linux服务器也是用node_exporter对其监控。那Buffer pool size of total ram这个图形面板显示不正常,解决如下:
(mysql_global_variables_innodb_buffer_pool_size{instance="$host"} * 100) / on (instance) node_memory_MemTotal_bytes{instance="$host"}
修改为,然后在看看图形显示
(label_replace(mysql_global_variables_innodb_buffer_pool_size{instance="$host"}, "nodename", "$1", "instance", "(.*):.*") * 100) / on(nodename) (label_replace(node_memory_MemTotal_bytes, "nodename", "$1", "instance", "(.*):.*"))
标签:status,--,global,collect,prometheus,innodb,监控,mysql,schema From: https://www.cnblogs.com/yangmeichong/p/18156342