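1. Background.
I wanted to deploy a Prometheus monitoring stack on a K8S cluster to monitor system and application metrics such as resource usage, performance, and custom metrics. I will skip the deployment plan and details here and share them in a later post; this post is about a problem I ran into. Once all the Prometheus components had been deployed successfully, I noticed that the grafana Service was of type "ClusterIP", which is only reachable from inside the cluster, so I could not open it from a browser. I edited the grafana Service YAML and changed "ClusterIP" to "NodePort", only to find that grafana was still unreachable from the browser via node IP + port.
2. Running "kubectl get all -n monitoring" shows that the grafana Service is of type ClusterIP.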
# kubectl get all -n monitoring
NAME READY STATUS RESTARTS AGE
pod/alertmanager-main-0 2/2 Running 0 4h47m
pod/blackbox-exporter-5d668b5c6-f9fds 3/3 Running 0 5h17m
pod/grafana-68fd49fd99-jhs25 1/1 Running 0 5h17m
pod/kube-state-metrics-78ddfd78fd-8blqk 3/3 Running 0 5h17m
pod/node-exporter-6lvps 2/2 Running 0 5h17m
pod/node-exporter-8pw78 2/2 Running 0 5h17m
pod/node-exporter-mmnbc 2/2 Running 0 5h17m
pod/node-exporter-p49nq 2/2 Running 1 (5h13m ago) 5h17m
pod/node-exporter-v7fvb 2/2 Running 0 5h17m
pod/prometheus-adapter-5485575f49-8r4gd 1/1 Running 0 5h17m
pod/prometheus-adapter-5485575f49-974m8 1/1 Running 0 5h17m
pod/prometheus-k8s-0 2/2 Running 0 5h15m
pod/prometheus-operator-5b687bfbb8-7djk2 2/2 Running 0 5h17m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-main ClusterIP 10.96.23.99 <none> 9093/TCP,8080/TCP 5h17m
service/alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 5h15m
service/blackbox-exporter ClusterIP 10.96.38.236 <none> 9115/TCP,19115/TCP 5h17m
service/grafana ClusterIP 10.96.112.113 <none> 3000/TCP 5h17m
service/kube-state-metrics ClusterIP None <none> 8443/TCP,9443/TCP 5h17m
service/node-exporter ClusterIP None <none> 9100/TCP 5h17m
service/prometheus-adapter ClusterIP 10.96.216.177 <none> 443/TCP 5h17m
service/prometheus-k8s ClusterIP 10.96.113.84 <none> 9090/TCP,8080/TCP 5h17m
service/prometheus-operated ClusterIP None <none> 9090/TCP 5h15m
service/prometheus-operator ClusterIP None <none> 8443/TCP 5h17m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/node-exporter 5 5 5 5 5 kubernetes.io/os=linux 5h17m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/blackbox-exporter 1/1 1 1 5h17m
deployment.apps/grafana 1/1 1 1 5h17m
deployment.apps/kube-state-metrics 1/1 1 1 5h17m
deployment.apps/prometheus-adapter 2/2 2 2 5h17m
deployment.apps/prometheus-operator 1/1 1 1 5h17m
NAME DESIRED CURRENT READY AGE
replicaset.apps/blackbox-exporter-5d668b5c6 1 1 1 5h17m
replicaset.apps/grafana-68fd49fd99 1 1 1 5h17m
replicaset.apps/kube-state-metrics-78ddfd78fd 1 1 1 5h17m
replicaset.apps/prometheus-adapter-5485575f49 2 2 2 5h17m
replicaset.apps/prometheus-operator-5b687bfbb8 1 1 1 5h17m
NAME READY AGE
statefulset.apps/alertmanager-main 1/1 5h15m
statefulset.apps/prometheus-k8s 1/1 5h15m
3. Change the grafana Service type from "ClusterIP" to "NodePort".
# kubectl edit svc grafana -n monitoring
service/grafana edited
# kubectl get svc grafana -n monitoring -oyaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2024-04-10T01:16:27Z"
  labels:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 10.4.0
  name: grafana
  namespace: monitoring
  resourceVersion: "804268"
  uid: c6b751b4-9710-4159-8051-0e73660577ca
spec:
  clusterIP: 10.96.112.113
  clusterIPs:
  - 10.96.112.113
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http
    nodePort: 32440
    port: 3000
    protocol: TCP
    targetPort: http
  selector:
    app.kubernetes.io/component: grafana
    app.kubernetes.io/name: grafana
    app.kubernetes.io/part-of: kube-prometheus
  sessionAffinity: None
  type: NodePort
  # Changing type from ClusterIP to NodePort is all that is needed.
status:
  loadBalancer: {}
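For reference, the edit in this step comes down to a single field. The fragment below is a minimal sketch of the relevant part of the Service spec, not the full object. The commented-out nodePort line is an optional addition that was not part of the original edit, and its value is hypothetical: if it is left out, Kubernetes picks a free port from the NodePort range (30000-32767 by default).

```yaml
# Sketch of the fields that matter in "kubectl edit svc grafana -n monitoring".
spec:
  type: NodePort          # was: ClusterIP
  ports:
  - name: http
    port: 3000            # Service port inside the cluster
    protocol: TCP
    targetPort: http      # named container port on the grafana pod
    # nodePort: 30300     # optional, hypothetical value: pin a fixed port instead of a random one
```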
4. Running "kubectl get all -n monitoring" again shows that grafana is now of type NodePort.
# kubectl get all -n monitoring
NAME READY STATUS RESTARTS AGE
pod/alertmanager-main-0 2/2 Running 0 5h1m
pod/blackbox-exporter-5d668b5c6-f9fds 3/3 Running 0 5h32m
pod/grafana-68fd49fd99-jhs25 1/1 Running 0 5h32m
pod/kube-state-metrics-78ddfd78fd-8blqk 3/3 Running 0 5h32m
pod/node-exporter-6lvps 2/2 Running 0 5h31m
pod/node-exporter-8pw78 2/2 Running 0 5h31m
pod/node-exporter-mmnbc 2/2 Running 0 5h31m
pod/node-exporter-p49nq 2/2 Running 1 (5h27m ago) 5h31m
pod/node-exporter-v7fvb 2/2 Running 0 5h31m
pod/prometheus-adapter-5485575f49-8r4gd 1/1 Running 0 5h31m
pod/prometheus-adapter-5485575f49-974m8 1/1 Running 0 5h31m
pod/prometheus-k8s-0 2/2 Running 0 5h29m
pod/prometheus-operator-5b687bfbb8-7djk2 2/2 Running 0 5h31m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/alertmanager-main ClusterIP 10.96.23.99 <none> 9093/TCP,8080/TCP 5h32m
service/alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 5h29m
service/blackbox-exporter ClusterIP 10.96.38.236 <none> 9115/TCP,19115/TCP 5h32m
service/grafana NodePort 10.96.112.113 <none> 3000:31735/TCP 5h32m
service/kube-state-metrics ClusterIP None <none> 8443/TCP,9443/TCP 5h32m
service/node-exporter ClusterIP None <none> 9100/TCP 5h31m
service/prometheus-adapter ClusterIP 10.96.216.177 <none> 443/TCP 5h31m
service/prometheus-k8s ClusterIP 10.96.113.84 <none> 9090/TCP,8080/TCP 5h31m
service/prometheus-operated ClusterIP None <none> 9090/TCP 5h29m
service/prometheus-operator ClusterIP None <none> 8443/TCP 5h31m
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/node-exporter 5 5 5 5 5 kubernetes.io/os=linux 5h32m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/blackbox-exporter 1/1 1 1 5h32m
deployment.apps/grafana 1/1 1 1 5h32m
deployment.apps/kube-state-metrics 1/1 1 1 5h32m
deployment.apps/prometheus-adapter 2/2 2 2 5h31m
deployment.apps/prometheus-operator 1/1 1 1 5h31m
NAME DESIRED CURRENT READY AGE
replicaset.apps/blackbox-exporter-5d668b5c6 1 1 1 5h32m
replicaset.apps/grafana-68fd49fd99 1 1 1 5h32m
replicaset.apps/kube-state-metrics-78ddfd78fd 1 1 1 5h32m
replicaset.apps/prometheus-adapter-5485575f49 2 2 2 5h31m
replicaset.apps/prometheus-operator-5b687bfbb8 1 1 1 5h31m
NAME READY AGE
statefulset.apps/alertmanager-main 1/1 5h29m
statefulset.apps/prometheus-k8s 1/1 5h29m
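5. Accessing grafana at 10.0.0.104 on port 31735 fails.
My node IP is 10.0.0.104; use the IP of one of your own cluster nodes plus your own NodePort to reach the grafana service.
6. I searched online and found that the cause was a network restriction.
The network policies in the monitoring namespace were blocking the traffic. The fix is to delete them, which lifts the restriction on traffic to the pods. Wait a moment for the change to take effect, and grafana then loads normally in the browser.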
# kubectl delete networkpolicy --all -n monitoring
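Deleting every policy lifts the restrictions for all components in the namespace, not just grafana. If you would rather keep the other policies, a narrower option is to add a policy that only opens the grafana port. The manifest below is a sketch under assumptions: the name grafana-allow-nodeport is made up, the pod label is taken from the Service selector shown in step 3, and port 3000 is the grafana port from the Service. Network policies are additive, so this rule applies alongside any existing ones.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: grafana-allow-nodeport    # hypothetical name
  namespace: monitoring
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: grafana   # same label as the Service selector
  policyTypes:
  - Ingress
  ingress:
  - ports:                        # no "from" block: any source may connect
    - protocol: TCP
      port: 3000
```

Save it to a file and apply it with "kubectl apply -f <file>". I have not tested this variant on the cluster above, so treat it as a starting point.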