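Deploying Prometheus on Kubernetes and Implementing HPA with Custom Metrics (Installation, Configuration, and Implementation, End to End)

Tags: http, custom, Kubernetes, prometheus, metrics, Prometheus, myapp, name

1. Install kube-prometheus

kube-prometheus is a ready-to-use monitoring stack for Kubernetes clusters. It bundles Prometheus, prometheus-adapter, Alertmanager, and a set of exporters, so you can collect and visualize monitoring data for the resources in your cluster with little setup.

1.1 Clone the kube-prometheus repository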

git clone https://github.com/coreos/kube-prometheus.git

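Next, check the compatibility matrix between kube-prometheus releases and Kubernetes versions, and switch to the branch that matches your cluster: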

cd kube-prometheus/
git checkout release-0.11

1.2 Deploy all components

kubectl create -f manifests/setup
kubectl create -f manifests/

1.3 Verify the kube-prometheus deployment

List the pods; they should include Prometheus, prometheus-adapter, and the other components.

kubectl get pods -n monitoring

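List the current APIService resources (for example with kubectl get apiservice); you should see "v1beta1.metrics.k8s.io", which is served by prometheus-adapter.

Through this API we can view the CPU and memory usage of pods and nodes. This confirms that kube-prometheus is installed.

Our goal, however, is an HPA driven by custom metrics, which takes a few more steps.

1.4 (Optional) Expose the Prometheus web UI outside the cluster

To reach the Prometheus web UI from outside the cluster, edit the Prometheus Service manifest. In the kube-prometheus/manifests directory, change prometheus-service.yaml to type NodePort and set a fixed port.

After applying the change with kubectl apply -f prometheus-service.yaml, the web UI is available at http://<node-ip>:<nodePort>.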

2. Configure the custom metrics API and the adapter config for a simple custom-metric test

Reference

Using prometheus-adapter with the custom metrics API to implement custom HPA on Kubernetes: https://blog.csdn.net/weixin_43391291/article/details/142212854

2.1 Create custom-metrics.yaml

As seen in 1.3, prometheus-adapter only serves metrics.k8s.io out of the box; custom.metrics.k8s.io is not registered.

Create a custom-metrics.yaml file to register custom.metrics.k8s.io:

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  service:
    name: prometheus-adapter
    namespace: monitoring
  group: custom.metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100

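2.2 Add a simple custom rule based on an existing metric

Edit prometheus-adapter-configMap.yaml under kube-prometheus/manifests and add the rule below.

The adapter config rules define how data collected by Prometheus is converted into the format expected by the Kubernetes custom metrics API.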

- seriesQuery: 'container_memory_working_set_bytes{job="kubelet", metrics_path="/metrics/cadvisor", container!="", image!=""}'
  seriesFilters: []
  resources:
    overrides:
      namespace:
        resource: namespace
      pod:
        resource: pod
  name:
    matches: "^(.*)"
    as: "custom_mem"
  metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>,job="kubelet", metrics_path="/metrics/cadvisor", container!="", image!=""}) by (<<.GroupBy>>)

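In other words, the result of metricsQuery in Prometheus is exposed through the Kubernetes API as a metric named custom_mem.

2.3 (Optional) Notes on the custom rule

name: the name under which the data queried from Prometheus is registered in the custom metrics API. In this article it is exposed as custom_mem.

seriesQuery: selects the Prometheus series to expose. The query above,

container_memory_working_set_bytes{job="kubelet", metrics_path="/metrics/cadvisor", container!="", image!=""}

selects the working-set memory in bytes of every container scraped by the kubelet job through /metrics/cadvisor, keeping only series whose container and image labels are non-empty. In the Prometheus web UI, the query

container_memory_working_set_bytes{job="kubelet", metrics_path="/metrics/cadvisor", namespace="monitoring", container="POD", image!=""}

returns the memory of the pods in the monitoring namespace.

metricsQuery: the actual PromQL expression, in which

<<.Series>> is the metric name matched by seriesQuery

<<.LabelMatchers>> is the set of label matchers derived from the resources section (for example namespace and pod)

<<.GroupBy>> is the set of labels the result is grouped by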
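To make the three placeholders concrete, the sketch below fills them in with plain string substitution. It is only an illustration: the real prometheus-adapter renders Go templates and may emit regex matchers when several pods are requested. The function name expand_metrics_query and the pod name prometheus-k8s-0 are assumptions chosen for the example.

```python
def expand_metrics_query(template, series, label_matchers, group_by):
    """Fill the three placeholders with simple string substitution (illustrative only)."""
    matchers = ",".join(f'{key}="{value}"' for key, value in label_matchers.items())
    return (template
            .replace("<<.Series>>", series)
            .replace("<<.LabelMatchers>>", matchers)
            .replace("<<.GroupBy>>", ",".join(group_by)))


# The metricsQuery template from the custom_mem rule in 2.2
TEMPLATE = ('sum(<<.Series>>{<<.LabelMatchers>>,job="kubelet", '
            'metrics_path="/metrics/cadvisor", container!="", image!=""}) '
            'by (<<.GroupBy>>)')

query = expand_metrics_query(
    TEMPLATE,
    series="container_memory_working_set_bytes",
    # Hypothetical request: one pod in the monitoring namespace
    label_matchers={"namespace": "monitoring", "pod": "prometheus-k8s-0"},
    group_by=["pod"],
)
print(query)
```

The printed string is an ordinary PromQL query that can be pasted into the Prometheus web UI to check what the rule would return for that pod.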

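2.4 Use the new custom metric

Apply the modified files with kubectl apply, then restart prometheus-adapter (simply run kubectl delete pod prometheus-adapter-xxx -n monitoring; a new pod is created automatically). The custom API is then available.

Use the following command to list the metrics served by the v1beta1.custom.metrics.k8s.io API; the configured custom_mem should appear: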

kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1"   | jq
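If jq is not available, the same check can be scripted. The sketch below extracts the resource names from a response of this kind and looks for custom_mem. The sample document is hand-written to resemble an APIResourceList (an assumption, not captured output), and custom_metric_names is a hypothetical helper.

```python
import json


def custom_metric_names(api_resource_list_json):
    """Return the sorted resource names from an APIResourceList JSON document."""
    doc = json.loads(api_resource_list_json)
    return sorted(res["name"] for res in doc.get("resources", []))


# Hand-written sample shaped like the API response (an assumption, not real output)
SAMPLE_RESPONSE = json.dumps({
    "kind": "APIResourceList",
    "apiVersion": "v1",
    "groupVersion": "custom.metrics.k8s.io/v1beta1",
    "resources": [
        {"name": "pods/custom_mem", "namespaced": True,
         "kind": "MetricValueList", "verbs": ["get"]},
        {"name": "namespaces/custom_mem", "namespaced": False,
         "kind": "MetricValueList", "verbs": ["get"]},
    ],
})

names = custom_metric_names(SAMPLE_RESPONSE)
print(names)
print("pods/custom_mem" in names)
```

In practice the JSON would come from the kubectl get --raw command above instead of SAMPLE_RESPONSE.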

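2.5 Summary

This completes a simple custom-metric test, but all it does is re-expose an existing metric under a new name. What we actually want is a genuinely custom metric such as http_requests_total.

3. A custom application monitored by Prometheus

Reference

Demonstrating a canary release with Kubernetes (昀溪, cnblogs): https://www.cnblogs.com/rexcheny/p/10740536.html

To provide a custom metric, write a program that collects the data you care about and exposes it on a /metrics endpoint. Deploy the program to Kubernetes (Pod and Service) and make sure its pods expose /metrics. Then create a ServiceMonitor resource so that Prometheus can discover and scrape the metrics. Finally, query and visualize the custom metrics in the Prometheus web UI.

3.1 Build the custom application image

Create a myapp directory and add a Dockerfile to it.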

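# Use the official Python image as the base image
FROM python:3.7.3-slim

# Create the application directory
RUN mkdir /app

# Switch the working directory to /app inside the container (like cd)
WORKDIR /app

# Copy these two files from the Dockerfile's directory into /app
COPY myapp.py requirements.txt /app/

# Install the application's dependencies with pip; -r points to the requirements file
RUN pip install --trusted-host mirrors.aliyun.com -r requirements.txt

# Document that the container listens on port 5555
EXPOSE 5555

# Set the version number
ENV VERSION=1.0

# Container process: python myapp.py, the start command of this Python application
CMD ["python", "myapp.py"]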

Create myapp.py

import prometheus_client
from prometheus_client import Counter, Gauge
from prometheus_client import Summary, CollectorRegistry
from flask import Response, Flask
import time
import random
import os


app = Flask(__name__)

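# Create a registry; it collects all metrics, and its contents are returned on /metrics
REGISTRY = CollectorRegistry(auto_describe=False)

# Define a Counter. The Python variable name is not the metric name. A Counter only
# increases, never decreases, and resets to 0 when the program restarts.
# The first constructor argument is the metric name (the client library exposes
# counters with a _total suffix, so this becomes http_requests_total),
# the second is the HELP text,
# the third is an optional list of label names.
http_requests_total = Counter("http_requests", "Total request count of the host", ['method', 'endpoint'], registry=REGISTRY)

# A Summary exposes two series:
# request_processing_seconds_count  how many times the function was called
# request_processing_seconds_sum    total time spent in the function
request_time = Summary('request_processing_seconds', 'Time spent processing request', registry=REGISTRY)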


@app.route("/metrics")
def requests_count():
    """
    Called when /metrics is requested; returns the current metrics.
    :return: the registry contents in the Prometheus text format
    """
    return Response(prometheus_client.generate_latest(REGISTRY),
                    mimetype="text/plain")

# Used for health checks
@app.route('/healthy')
def healthy():
    return "healthy"


@app.route('/')
@request_time.time()  # must be placed below app.route
def hello_world():
    # .inc() increments by 1 by default; pass a value to add more, e.g. .inc(1.5)
    # http_requests_total.inc()
    # The call below increments the series for one label combination; the label
    # names method and endpoint were declared when the Counter was created.
    # Put this line in the handler of whichever URL you want to count.
    http_requests_total.labels(method="get", endpoint="/").inc()
    # Sleep for a random 0-1 s to simulate page response time
    time.sleep(random.random())
    html = "Hello World!" \
           "App Version: {version}"
    # Read the VERSION environment variable,
    # which the Dockerfile sets in the image
    return html.format(version=os.getenv("VERSION", "888"))


if __name__ == '__main__':
    app.run(host="0.0.0.0", port=5555)

Create requirements.txt

Flask
prometheus_client

Build the image with the following command:

docker build -t myapp:v1.0 .

Export the image with docker save -o myapp.tar myapp:v1.0, copy myapp.tar to every node of the Kubernetes cluster, and import it on each node with docker load -i ./myapp.tar.

3.2 Deploy the custom image in Kubernetes

Create myapp-dep.yaml, which defines the Deployment, Service, and ServiceMonitor:

apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "5555"
  name: myapp-svc2
  labels:
    appname: myapp-svc2
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 5555
    targetPort: 5555
  selector:
    appname: myapp2
---
kind: ServiceMonitor
apiVersion: monitoring.coreos.com/v1
metadata:
  name: myapp-svc-monitor2 
  labels:
    appname: myapp-svc2
spec:
  selector:
    matchLabels:
      appname: myapp-svc2
  endpoints: 
  - port: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy-v2.0
  labels:
    appname: myapp2
spec:
  replicas: 2
  selector:
    matchLabels:
      appname: myapp2
      release: 1.0.0
  template:
    metadata:
      name: myapp2
      labels:
        appname: myapp2
        release: 1.0.0
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "5555"
    spec:
      containers:
      - name: myapp2
        image: myapp:v1.0  # must match the tag built in 3.1
        imagePullPolicy: IfNotPresent
        resources:
          requests:
            cpu: "250m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "256Mi"
        ports:
        - name: http
          containerPort: 5555
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /healthy
            port: http
          initialDelaySeconds: 20
          periodSeconds: 10
          timeoutSeconds: 2
        readinessProbe:
          httpGet:
            path: /healthy
            port: http
          initialDelaySeconds: 20
          periodSeconds: 10
  revisionHistoryLimit: 10
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate

Deploy these three resources with kubectl create -f myapp-dep.yaml.
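A frequent reason for a target not showing up in Prometheus is a label mismatch somewhere in the chain ServiceMonitor, Service, Pod. The sketch below replays the matchLabels logic on the labels from myapp-dep.yaml. The helper selector_matches is hypothetical and covers only equality-based matchLabels, not matchExpressions.

```python
def selector_matches(match_labels, labels):
    """True if every key/value pair in match_labels is present in labels."""
    return all(labels.get(key) == value for key, value in match_labels.items())


# Labels and selectors copied from myapp-dep.yaml
service_labels = {"appname": "myapp-svc2"}
service_selector = {"appname": "myapp2"}
servicemonitor_selector = {"appname": "myapp-svc2"}
deployment_selector = {"appname": "myapp2", "release": "1.0.0"}
pod_labels = {"appname": "myapp2", "release": "1.0.0"}

checks = {
    "ServiceMonitor -> Service": selector_matches(servicemonitor_selector, service_labels),
    "Service -> Pod": selector_matches(service_selector, pod_labels),
    "Deployment -> Pod": selector_matches(deployment_selector, pod_labels),
}
for name, ok in checks.items():
    print(name, "matches" if ok else "DOES NOT MATCH")
```

Note that the ServiceMonitor selects the Service by the Service's own labels (myapp-svc2), while the Service selects pods by the pod labels (myapp2); the two are easy to confuse.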

3.3 View the metric in Prometheus

In the Prometheus web UI, the custom ServiceMonitor appears among the scrape targets.

Run the query

http_requests_total{job="myapp-svc2", pod!="", namespace="default"}

to see the http_requests_total value of both pods. (If nothing is returned, send a few requests first with curl http://service-ip:5555 and query again.)
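When the query stays empty, it can help to check what the application itself exposes before suspecting Prometheus. The sketch below sums the samples of one metric from /metrics text. The sample text is hand-written to resemble the output of myapp.py (an assumption, not captured output), and counter_total is a hypothetical helper that assumes lines of the form name{labels} value without timestamps.

```python
def counter_total(metrics_text, metric_name):
    """Sum all samples of metric_name in Prometheus text-format output."""
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        name = series.split("{", 1)[0]  # drop the label set, if any
        if name == metric_name:
            total += float(value)
    return total


# Hand-written sample resembling curl http://service-ip:5555/metrics
SAMPLE_METRICS = """# HELP http_requests_total Total request count of the host
# TYPE http_requests_total counter
http_requests_total{endpoint="/",method="get"} 3.0
# HELP request_processing_seconds Time spent processing request
# TYPE request_processing_seconds summary
request_processing_seconds_count 3.0
request_processing_seconds_sum 1.42
"""

print(counter_total(SAMPLE_METRICS, "http_requests_total"))
```

A result of 0.0 for http_requests_total means the application has not served any request on / yet, which matches the advice above to send a few requests first.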

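We have now deployed a custom program that provides the custom http_requests_total metric to Prometheus.

Next we need to add an adapter rule.

4. Configure the prometheus-adapter rule

With the custom metric in place, the next step is to add a prometheus-adapter rule that defines how the data collected by Prometheus is converted into the format expected by the Kubernetes custom metrics API.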

- seriesQuery: '{__name__=~"http_requests_total"}'
  resources:
    overrides:
      namespace:
        resource: namespace
      pod:
        resource: pod
  name:
    matches: "^(.*)_total"
    as: "lzg_http_requests_per_minute"
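  # <<.LabelMatchers>> lets the adapter restrict the query to the requested namespace and pods
  metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,job="myapp-svc2"}[1m])) by (<<.GroupBy>>) * 60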

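name: exposes the metric through the API as lzg_http_requests_per_minute.

metricsQuery: rate() returns the average per-second increase of the counter over the last minute; multiplying by 60 turns it into requests per minute.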
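The arithmetic behind the query can be sketched in a few lines. This is a simplification: PromQL's rate() also extrapolates to the window edges and compensates for counter resets, which the hypothetical requests_per_minute helper below ignores. The sample numbers are made up.

```python
def requests_per_minute(count_start, count_end, window_seconds):
    """Per-second increase of a counter over a window, scaled to one minute."""
    per_second = (count_end - count_start) / window_seconds
    return per_second * 60


# Hypothetical counter readings taken 60 seconds apart for one pod
print(requests_per_minute(1200, 1500, 60))  # 300 requests in 60 s -> 300.0
```

With the HPA target used later in this article (400 per pod), a pod at 300 requests per minute would not trigger a scale-out on its own.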

Apply the change with kubectl apply and restart prometheus-adapter.

The following command shows the newly configured metric:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/lzg_http_requests_per_minute" |jq
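The values in this response are Kubernetes quantities, so a value such as "412500m" means 412.5, not 412500. The sketch below converts them to plain numbers per pod. The sample document and pod names are hand-written assumptions shaped like a MetricValueList, and parse_quantity is a hypothetical helper that handles only the decimal suffixes listed in it.

```python
import json
from decimal import Decimal

# Decimal exponent for each supported quantity suffix
_EXPONENTS = {"n": -9, "u": -6, "m": -3, "k": 3, "M": 6, "G": 9}


def parse_quantity(text):
    """Convert a quantity such as '4500m' or '12' to a float (decimal suffixes only)."""
    if text[-1] in _EXPONENTS:
        return float(Decimal(text[:-1]).scaleb(_EXPONENTS[text[-1]]))
    return float(Decimal(text))


def per_pod_values(metric_value_list_json):
    """Map each described pod name to its metric value."""
    doc = json.loads(metric_value_list_json)
    return {item["describedObject"]["name"]: parse_quantity(item["value"])
            for item in doc.get("items", [])}


# Hand-written sample (pod names and values are made up)
SAMPLE = json.dumps({
    "kind": "MetricValueList",
    "apiVersion": "custom.metrics.k8s.io/v1beta1",
    "items": [
        {"describedObject": {"kind": "Pod", "namespace": "default", "name": "myapp-pod-a"},
         "metricName": "lzg_http_requests_per_minute", "value": "412500m"},
        {"describedObject": {"kind": "Pod", "namespace": "default", "name": "myapp-pod-b"},
         "metricName": "lzg_http_requests_per_minute", "value": "90"},
    ],
})

values = per_pod_values(SAMPLE)
print(values)
```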

Deploy the HPA

kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta2  # removed in Kubernetes 1.26; use autoscaling/v2 on 1.23+
metadata:
  name: test-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp-deploy-v2.0
  # autoscale between 2 and 10 replicas
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: lzg_http_requests_per_minute
      target:
        type: AverageValue
        averageValue: 400
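To get a feel for how this target drives scaling, the sketch below applies the replica formula from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), with the default 10% tolerance and the min/max bounds of test-hpa. It is a simplified model: stabilization windows, unready pods, and missing metrics are ignored, and the per-pod values are made up.

```python
import math


def desired_replicas(current_replicas, per_pod_values, target_average,
                     min_replicas=2, max_replicas=10, tolerance=0.1):
    """Simplified HPA calculation for a Pods metric with an averageValue target."""
    current_average = sum(per_pod_values) / len(per_pod_values)
    ratio = current_average / target_average
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas  # close enough to the target: no change
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the bounds configured in the HPA
    return max(min_replicas, min(max_replicas, desired))


# Two pods averaging 1000 requests per minute against a target of 400
print(desired_replicas(2, [900, 1100], 400))  # ratio 2.5 -> 5 replicas
```

With these numbers the Deployment would grow from 2 to 5 replicas; once traffic stops, the ratio drops below 1 and the result is clamped at minReplicas.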

5. Test the HPA

Use the hey tool to send requests and exercise the HPA.

hey -n 2000 -c 20 http://service-ip:5555

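The HPA then scales the Deployment out under load and back in once the load stops; kubectl get hpa test-hpa -w shows the replica count changing.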

From： https://blog.csdn.net/m0_57949696/article/details/143066976