Week 8: Resource Limits and Affinity
1. Kubernetes Container, Pod, and Namespace memory and CPU limits
1) If a running container defines no resource limits (memory, CPU, etc.) but the namespace defines a LimitRange, the container inherits the default limits from that LimitRange.
2) If the namespace defines no LimitRange either, the container can consume up to the host's maximum available resources, until nothing is left and the host's OOM Killer is triggered.
CPU is limited in units of cores; values can be whole cores, fractional cores, or millicores (m/milli).
2 = 2 cores = 200%, 0.5 = 500m = 50%, 1.2 = 1200m = 120%
Memory is limited in bytes; valid units are E, P, T, G, M, K and Ei, Pi, Ti, Gi, Mi, Ki.
1536Mi = 1.5Gi
requests: the resources a node must have available, at minimum, for the kube-scheduler to place the pod on it
limits: the upper bound of resources the pod may use once it is running
Examples of limits applied to a single container
YAML configuration with no CPU or memory limits
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
kind: Deployment
metadata:
name: limit-test-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels: #rs or deployment
app: limit-test-pod
# matchExpressions:
# - {key: app, operator: In, values: [ng-deploy-80,ng-rs-81]}
template:
metadata:
labels:
app: limit-test-pod
spec:
containers:
- name: limit-test-container
image: lorel/docker-stress-ng
#resources:
# limits:
# memory: "1024Mi"
# requests:
# memory: "1024Mi"
#command: ["stress"]
args: ["--vm", "3", "--vm-bytes", "256M"]
#nodeSelector:
# env: group1
Configuration after adding CPU and memory limits
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
kind: Deployment
metadata:
name: limit-test-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels: #rs or deployment
app: limit-test-pod
# matchExpressions:
# - {key: app, operator: In, values: [ng-deploy-80,ng-rs-81]}
template:
metadata:
labels:
app: limit-test-pod
spec:
containers:
- name: limit-test-container
image: lorel/docker-stress-ng
resources:
limits:
cpu: "1"
memory: "512Mi"
requests:
memory: "512M"
cpu: "1"
#command: ["stress"]
args: ["--vm", "3", "--vm-bytes", "256M"]
#nodeSelector:
# env: group1
As the query output shows, CPU and memory are now both limited to the requested values.
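One way to confirm the limits took effect (the grep just narrows the output; kubectl top assumes metrics-server is installed):
kubectl describe pod -n test -l app=limit-test-pod | grep -A 6 "Limits"
kubectl top pod -n test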
Create a LimitRange
apiVersion: v1
kind: LimitRange
metadata:
name: limitrange-magedu
namespace: test
spec:
limits:
- type: Container #the resource type being constrained
max:
cpu: "2" #maximum CPU for a single container
memory: "2Gi" #maximum memory for a single container
min:
cpu: "500m" #minimum CPU for a single container
memory: "512Mi" #minimum memory for a single container
default:
cpu: "500m" #default CPU limit for a single container
memory: "512Mi" #default memory limit for a single container
defaultRequest:
cpu: "500m" #default CPU request for a single container
memory: "512Mi" #default memory request for a single container
maxLimitRequestRatio:
cpu: 2 #maximum CPU limit/request ratio is 2
memory: 2 #maximum memory limit/request ratio is 2
- type: Pod
max:
cpu: "4" #maximum CPU for a single Pod
memory: "4Gi" #maximum memory for a single Pod
- type: PersistentVolumeClaim
max:
storage: 50Gi #maximum requests.storage for a PVC
min:
storage: 30Gi #minimum requests.storage for a PVC
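Apply the LimitRange and confirm it is active in the namespace (the file name here is illustrative):
kubectl apply -f limitrange-magedu.yaml
kubectl describe limitrange limitrange-magedu -n test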
Set the CPU limits so that the Pod's total exceeds 4, then check whether the Pods can be created:
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
app: magedu-wordpress-deployment-label
name: magedu-wordpress-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels:
app: magedu-wordpress-selector
template:
metadata:
labels:
app: magedu-wordpress-selector
spec:
containers:
- name: magedu-wordpress-nginx-container
image: nginx:1.16.1
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
resources:
limits:
cpu: 3
memory: 1Gi
requests:
cpu: 800m
memory: 512Mi
- name: magedu-wordpress-php-container
image: php:5.6-fpm-alpine
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
resources:
limits:
cpu: 2
#cpu: 2
memory: 1Gi
requests:
cpu: 2000m
memory: 512Mi
---
kind: Service
apiVersion: v1
metadata:
labels:
app: magedu-wordpress-service-label
name: magedu-wordpress-service
namespace: test
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
nodePort: 30133
selector:
app: magedu-wordpress-selector
Creation fails; the reason can be seen by inspecting the generated Deployment object:
kubectl get deployments -n test magedu-wordpress-deployment -o yaml
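If the Deployment object alone is not conclusive, the LimitRange admission error usually also shows up in the ReplicaSet's events:
kubectl describe rs -n test -l app=magedu-wordpress-selector
kubectl get events -n test --sort-by=.metadata.creationTimestamp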
After adjusting the values to satisfy the LimitRange (limits equal to requests at 512m, limit/request ratio within 2, each container within the 2-CPU maximum, and the Pod total within 4), create it again:
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
app: magedu-wordpress-deployment-label
name: magedu-wordpress-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels:
app: magedu-wordpress-selector
template:
metadata:
labels:
app: magedu-wordpress-selector
spec:
containers:
- name: magedu-wordpress-nginx-container
image: nginx:1.16.1
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
resources:
limits:
cpu: 512m
memory: 512Mi
requests:
cpu: 512m
memory: 512Mi
- name: magedu-wordpress-php-container
image: php:5.6-fpm-alpine
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
resources:
limits:
cpu: 512m
#cpu: 2
memory: 512Mi
requests:
cpu: 512m
memory: 512Mi
---
kind: Service
apiVersion: v1
metadata:
labels:
app: magedu-wordpress-service-label
name: magedu-wordpress-service
namespace: test
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
nodePort: 30133
selector:
app: magedu-wordpress-selector
2. nodeSelector, nodeName, and node affinity/anti-affinity
nodeSelector
The nodes must first be given the corresponding labels: node selection is used to place specific pods onto nodes carrying specific labels, such as GPU nodes or SSD nodes.
Label the nodes with an ssd disktype
kubectl label node 192.168.44.16 disktype="ssd"
kubectl label node 192.168.44.17 disktype="ssd"
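A quick check that the labels were applied:
kubectl get nodes -l disktype=ssd --show-labels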
Schedule the pods onto the nodes labeled ssd; the YAML:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: magedu-tomcat-app2-deployment-label
name: magedu-tomcat-app2-deployment
namespace: test
spec:
replicas: 2
selector:
matchLabels:
app: magedu-tomcat-app2-selector
template:
metadata:
labels:
app: magedu-tomcat-app2-selector
spec:
containers:
- name: magedu-tomcat-app2-container
image: tomcat:7.0.94-alpine
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
resources:
limits:
cpu: 1
memory: "512Mi"
requests:
cpu: 500m
memory: "512Mi"
nodeSelector:
disktype: ssd
Apply the deployment:
kubectl apply -f case1-nodeSelector.yaml
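Verify where the replicas landed:
kubectl get pod -n test -o wide -l app=magedu-tomcat-app2-selector
The section title also mentions nodeName and node affinity, which this case does not exercise; two minimal sketches follow, reusing values from this environment (pod-spec fragments, not complete manifests). nodeName pins a pod to one named node and bypasses label matching, while nodeAffinity is the more expressive successor to nodeSelector:
spec:
  nodeName: 192.168.44.16 #schedule directly onto this node
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution: #hard requirement, equivalent to the nodeSelector above
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd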
3. Pod affinity and pod anti-affinity
1. Overview of pod affinity and pod anti-affinity
Valid operators for pod affinity and anti-affinity are In, NotIn, Exists and DoesNotExist.
In pod anti-affinity configuration, topologyKey must not be empty in either requiredDuringSchedulingIgnoredDuringExecution or preferredDuringSchedulingIgnoredDuringExecution (an empty topologyKey is not allowed).
For pod anti-affinity expressed with requiredDuringSchedulingIgnoredDuringExecution, the admission controller LimitPodHardAntiAffinityTopology was introduced to ensure topologyKey can only be kubernetes.io/hostname; if topologyKey should be usable for other custom topologies, the admission controller can be modified or disabled.
2. Tests
Create an nginx pod as the target for the pod affinity and anti-affinity tests
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: python-nginx-deployment-label
name: python-nginx-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels:
app: python-nginx-selector
template:
metadata:
labels:
app: python-nginx-selector
project: python
spec:
containers:
- name: python-nginx-container
image: nginx:1.20.2-alpine
#command: ["/apps/tomcat/bin/run_tomcat.sh"]
#imagePullPolicy: IfNotPresent
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
- containerPort: 443
protocol: TCP
name: https
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
# resources:
# limits:
# cpu: 2
# memory: 2Gi
# requests:
# cpu: 500m
# memory: 1Gi
---
kind: Service
apiVersion: v1
metadata:
labels:
app: python-nginx-service-label
name: python-nginx-service
namespace: test
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 80
nodePort: 30014
- name: https
port: 443
protocol: TCP
targetPort: 443
nodePort: 30453
selector:
app: python-nginx-selector
project: python #one or more selector labels; target pods must carry these labels
kubectl apply -f case4-4.1-nginx.yaml
root@k8s-master1:/opt/dockerfile/k8s-data/20220821/Affinit-case# kubectl get po -n test
NAME READY STATUS RESTARTS AGE
python-nginx-deployment-5c658bf86b-nv7sm 1/1 Running 0 45s
ubuntu1804 0/1 Completed 0 31d
Pod affinity: soft (preferred) affinity
Use soft affinity to co-schedule with the nginx pod onto the same node
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: magedu-tomcat-app2-deployment-label
name: magedu-tomcat-app2-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels:
app: magedu-tomcat-app2-selector
template:
metadata:
labels:
app: magedu-tomcat-app2-selector
spec:
containers:
- name: magedu-tomcat-app2-container
image: tomcat:7.0.94-alpine
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
podAffinity: #pod affinity
#requiredDuringSchedulingIgnoredDuringExecution: #hard affinity: schedule only if the match succeeds, otherwise refuse to schedule
preferredDuringSchedulingIgnoredDuringExecution: #soft affinity: schedule into a matching topology when possible, otherwise let Kubernetes schedule freely
- weight: 100
podAffinityTerm:
labelSelector: #label selector
matchExpressions: #expression-based matching
- key: project
operator: In
values:
- python
topologyKey: kubernetes.io/hostname
namespaces:
- test
kubectl apply -f case4-4.2-podaffinity-preferredDuring.yaml
root@k8s-master1:/opt/dockerfile/k8s-data/20220821/Affinit-case# kubectl get po -o wide -n test
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
magedu-tomcat-app2-deployment-6cf5fbbbb8-gln59 1/1 Running 0 39s 172.20.169.137 192.168.44.16 <none> <none>
python-nginx-deployment-5c658bf86b-nv7sm 1/1 Running 0 12m 172.20.169.175 192.168.44.16 <none> <none>
Pod affinity: hard (required) affinity
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: magedu-tomcat-app2-deployment-label
name: magedu-tomcat-app2-deployment
namespace: test
spec:
replicas: 2
selector:
matchLabels:
app: magedu-tomcat-app2-selector
template:
metadata:
labels:
app: magedu-tomcat-app2-selector
spec:
containers:
- name: magedu-tomcat-app2-container
image: tomcat:7.0.94-alpine
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
podAffinity:
requiredDuringSchedulingIgnoredDuringExecution: #hard affinity
- labelSelector:
matchExpressions:
- key: project
operator: In
values:
- python
topologyKey: "kubernetes.io/hostname"
namespaces:
- test
kubectl apply -f case4-4.3-podaffinity-requiredDuring.yaml
root@k8s-master1:/opt/dockerfile/k8s-data/20220821/Affinit-case# kubectl get po -n test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
magedu-tomcat-app2-deployment-5b7848b84b-252nm 1/1 Running 0 13s 172.20.169.188 192.168.44.16 <none> <none>
magedu-tomcat-app2-deployment-5b7848b84b-wcrbd 1/1 Running 0 13s 172.20.169.190 192.168.44.16 <none> <none>
python-nginx-deployment-5c658bf86b-nv7sm 1/1 Running 0 5h30m 172.20.169.175 192.168.44.16 <none> <none>
Pod anti-affinity: hard (required) anti-affinity
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: magedu-tomcat-app2-deployment-label
name: magedu-tomcat-app2-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels:
app: magedu-tomcat-app2-selector
template:
metadata:
labels:
app: magedu-tomcat-app2-selector
spec:
containers:
- name: magedu-tomcat-app2-container
image: tomcat:7.0.94-alpine
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: project
operator: In
values:
- python
topologyKey: "kubernetes.io/hostname"
namespaces:
- test
kubectl apply -f case4-4.4-podAntiAffinity-requiredDuring.yaml
root@k8s-master1:/opt/dockerfile/k8s-data/20220821/Affinit-case# kubectl get po -n test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
magedu-tomcat-app2-deployment-56d999f8cd-k8sb2 1/1 Running 0 9s 172.20.108.176 192.168.44.17 <none> <none>
python-nginx-deployment-5c658bf86b-nv7sm 1/1 Running 0 5h40m 172.20.169.175 192.168.44.16 <none> <none>
Pod anti-affinity: soft (preferred) anti-affinity
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: magedu-tomcat-app2-deployment-label
name: magedu-tomcat-app2-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels:
app: magedu-tomcat-app2-selector
template:
metadata:
labels:
app: magedu-tomcat-app2-selector
spec:
containers:
- name: magedu-tomcat-app2-container
image: tomcat:7.0.94-alpine
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: project
operator: In
values:
- pythonx
topologyKey: kubernetes.io/hostname
namespaces:
- test
root@k8s-master1:/opt/dockerfile/k8s-data/20220821/Affinit-case# kubectl get po -n test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
magedu-tomcat-app2-deployment-7d4c777b6d-mcp5v 1/1 Running 0 73s 172.20.169.186 192.168.44.16 <none> <none>
python-nginx-deployment-5c658bf86b-nv7sm 1/1 Running 0 5h47m 172.20.169.175 192.168.44.16 <none> <none>
Because the anti-affinity is only soft (preferred), the two pods are not forced onto different nodes, and here they ended up on the same one (note also that the selector value pythonx does not match the nginx pod's project=python label, so the preference has no effect).
4. Taints and tolerations
Taints are used by a node to repel Pod scheduling; they work in the opposite direction from affinity: a tainted node repels pods.
Tolerations are used by a Pod to tolerate a node's taints, i.e. new pods can still be scheduled onto a node even if it carries taints.
https://kubernetes.io/zh/docs/concepts/scheduling-eviction/taint-and-toleration/
The three taint effects:
NoSchedule: Kubernetes will not schedule Pods onto a Node carrying this taint
kubectl taint nodes 172.31.7.111 key1=value1:NoSchedule #set a taint
node/172.31.7.111 tainted
kubectl describe node 172.31.7.111 #view taints
Taints: key1=value1:NoSchedule
kubectl taint node 172.31.7.111 key1:NoSchedule- #remove the taint
node/172.31.7.111 untainted
PreferNoSchedule: Kubernetes will try to avoid scheduling Pods onto a Node carrying this taint
NoExecute: Kubernetes will not schedule Pods onto a Node carrying this taint, and Pods already running on the Node are forcibly evicted
kubectl taint nodes 172.31.7.111 key1=value1:NoExecute
tolerations
A toleration defines which node taints a Pod accepts; with a matching toleration, the Pod can be scheduled onto a node carrying that taint.
Taint matching based on operator:
If operator is Exists, the toleration needs no value and matches the taint by key (and effect) alone.
If operator is Equal, a value must be specified and it must equal the taint's value.
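For comparison with the Equal-based example below, a minimal Exists-style toleration sketch (the key reuses key1 from the example; any value of that taint key is tolerated):
tolerations:
- key: "key1"
  operator: "Exists"
  effect: "NoSchedule"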
Example
Taint a node
kubectl taint nodes 192.168.44.16 key1=value1:NoSchedule
node/192.168.44.16 tainted
kubectl describe nodes 192.168.44.16
Taints: key1=value1:NoSchedule
Set the Pod's toleration
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: magedu-tomcat-app1-deployment-label
name: magedu-tomcat-app1-deployment
namespace: test
spec:
replicas: 3
selector:
matchLabels:
app: magedu-tomcat-app1-selector
template:
metadata:
labels:
app: magedu-tomcat-app1-selector
spec:
containers:
- name: magedu-tomcat-app1-container
#image: harbor.magedu.net/magedu/tomcat-app1:v7
image: tomcat:7.0.93-alpine
imagePullPolicy: IfNotPresent
ports:
- containerPort: 8080
protocol: TCP
name: http
tolerations:
- key: "key1"
operator: "Equal"
value: "value1"
effect: "NoSchedule"
---
kind: Service
apiVersion: v1
metadata:
labels:
app: magedu-tomcat-app1-service-label
name: magedu-tomcat-app1-service
namespace: test
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
#nodePort: 40003
selector:
app: magedu-tomcat-app1-selector
kubectl apply -f case5.1-taint-tolerations.yaml
root@k8s-master1:/opt/dockerfile/k8s-data/20220821/Affinit-case# kubectl get po -n test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
magedu-tomcat-app1-deployment-785c799fc9-fj2rj 1/1 Running 0 55s 172.20.169.191 192.168.44.16 <none> <none>
Remove the toleration and observe where the pods are scheduled
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: magedu-tomcat-app1-deployment-label
name: magedu-tomcat-app1-deployment
namespace: test
spec:
replicas: 3
selector:
matchLabels:
app: magedu-tomcat-app1-selector
template:
metadata:
labels:
app: magedu-tomcat-app1-selector
spec:
containers:
- name: magedu-tomcat-app1-container
#image: harbor.magedu.net/magedu/tomcat-app1:v7
image: tomcat:7.0.93-alpine
imagePullPolicy: IfNotPresent
ports:
- containerPort: 8080
protocol: TCP
name: http
---
kind: Service
apiVersion: v1
metadata:
labels:
app: magedu-tomcat-app1-service-label
name: magedu-tomcat-app1-service
namespace: test
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
#nodePort: 40003
selector:
app: magedu-tomcat-app1-selector
kubectl apply -f case5.2-notaint-tolerations.yaml
root@k8s-master1:/opt/dockerfile/k8s-data/20220821/Affinit-case# kubectl get po -o wide -n test
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
magedu-tomcat-app1-deployment-5849d9d85b-d5rgq 1/1 Running 0 7s 172.20.108.182 192.168.44.17 <none> <none>
magedu-tomcat-app1-deployment-5849d9d85b-fzkh8 1/1 Running 0 7s 172.20.38.21 192.168.44.15 <none> <none>
magedu-tomcat-app1-deployment-5849d9d85b-x4npw 1/1 Running 0 7s 172.20.38.16 192.168.44.15 <none> <none>
After removing the toleration, pods can no longer be scheduled onto node 192.168.44.16.
Remove the node taint
kubectl taint nodes 192.168.44.16 key1:NoSchedule-
kubectl describe nodes 192.168.44.16
Taints: <none>
5. Pod eviction
Node-pressure eviction is the process by which the kubelet proactively terminates Pods to reclaim memory, disk space, and other resources on the node. The kubelet monitors the node's CPU, memory, disk space, filesystem inodes and similar resources; when one or more of them reaches a specific consumption level, the kubelet forcibly evicts one or more Pods on the node to prevent an OOM (OutOfMemory) condition caused by the node being unable to allocate resources.
https://kubernetes.io/zh/docs/concepts/scheduling-eviction/node-pressure-eviction/
Host memory:
memory.available #available memory on the node, default threshold <100Mi
nodefs is the node's main filesystem, used for local disk volumes, emptyDir volumes, log storage and so on; by default it is /var/lib/kubelet/, or the mount
directory specified by the kubelet --root-dir flag
nodefs.inodesFree #available inodes on nodefs, default <5%
nodefs.available #available space on nodefs, default <10%
imagefs is an optional filesystem used by the container runtime to store container images and container writable layers.
imagefs.inodesFree #percentage of available inodes on imagefs
imagefs.available #percentage of available disk space on imagefs, default <15%
pid.available #percentage of available PIDs
Example:
evictionHard:
imagefs.inodesFree: 5%
imagefs.available: 15%
memory.available: 300Mi
nodefs.available: 10%
nodefs.inodesFree: 5%
pid.available: 5%
Eviction implemented by kube-controller-manager:
eviction after a node goes down
Eviction implemented by the kubelet:
pod eviction based on node load, resource utilization and so on.
Eviction (node eviction) automatically and forcibly evicts pods when node resources are insufficient, so that the node itself keeps running normally.
Kubernetes evicts Pods based on their QoS (quality of service) class; there are currently three QoS classes:
Guaranteed: #limits and requests are equal; highest class, evicted last
resources:
limits:
cpu: 500m
memory: 256Mi
requests:
cpu: 500m
memory: 256Mi
Burstable: #limits and requests are not equal; middle class, evicted after BestEffort but before Guaranteed
resources:
limits:
cpu: 500m
memory: 256Mi
requests:
cpu: 256m
memory: 128Mi
BestEffort: #no limits set at all (resources is empty); lowest class, evicted first
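A quick way to check which QoS class each pod was assigned:
kubectl get pod -n test -o custom-columns=NAME:.metadata.name,QOS:.status.qosClass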
Eviction conditions
eviction-signal: the eviction trigger signal the kubelet watches on the node to decide whether to evict, e.g. the value of memory.available obtained
via cgroupfs, which is then matched against the threshold.
operator: the comparison operator used to decide whether the resource amount matches the condition and should trigger eviction.
quantity: the threshold amount, i.e. the resource value the comparison is based on, such as memory.available: 300Mi or nodefs.available: 10%.
For example: nodefs.available<10%
This expression triggers the eviction signal when the node's available disk space drops below 10%.
Soft eviction
Soft eviction does not evict the pod immediately; a grace period can be configured, and only if the condition still holds when the grace period expires does the kubelet forcibly kill the pod and trigger eviction.
Soft eviction settings
eviction-soft: the soft eviction condition, e.g. memory.available < 1.5Gi; if the condition persists longer than the specified grace period,
Pod eviction is triggered.
eviction-soft-grace-period: the soft eviction grace period, e.g. memory.available=1m30s, which defines how long the soft eviction condition
must hold before Pod eviction is triggered.
eviction-max-pod-grace-period: the maximum allowed pod termination grace period (in seconds) used when terminating Pods because a soft
eviction condition was met.
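A minimal sketch of the corresponding fields in the kubelet configuration file (the same /var/lib/kubelet/config.yaml shown below for hard eviction); the thresholds here are illustrative, not recommendations:
evictionSoft:
  memory.available: "1.5Gi"
  nodefs.available: "15%"
evictionSoftGracePeriod:
  memory.available: "1m30s"
  nodefs.available: "2m"
evictionMaxPodGracePeriod: 60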
Hard eviction
Hard eviction conditions have no grace period; when a hard eviction threshold is reached, the kubelet immediately kills the pod and evicts it.
The kubelet has the following default hard eviction thresholds (they can be adjusted):
memory.available<100Mi
nodefs.available<10%
imagefs.available<15%
nodefs.inodesFree<5% (Linux nodes)
kubelet service unit file:
vim /etc/systemd/system/kubelet.service
kubelet configuration file:
vim /var/lib/kubelet/config.yaml
evictionHard:
imagefs.available: 15%
memory.available: 300Mi
nodefs.available: 10%
nodefs.inodesFree: 5%
6. Setting up ELK
Install Elasticsearch
dpkg -i elasticsearch-7.12.1-amd64.deb
Edit the configuration file
root@es3:~# cat /etc/elasticsearch/elasticsearch.yml|grep -v "#"
cluster.name: es-cluster
node.name: node-3
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.44.22
http.port: 9200
discovery.seed_hosts: ["192.168.44.20", "192.168.44.21","192.168.44.22"]
cluster.initial_master_nodes: ["192.168.44.20", "192.168.44.21","192.168.44.22"]
action.destructive_requires_name: true
Start the Elasticsearch service
systemctl restart elasticsearch.service
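Once all three nodes are started, cluster health can be checked from any of them:
curl http://192.168.44.20:9200/_cat/nodes?v
curl http://192.168.44.20:9200/_cluster/health?pretty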
Install elasticsearch-head to manage Elasticsearch
Install Kibana
dpkg -i kibana-7.12.1-amd64.deb
Edit the configuration file
root@es1:~# cat /etc/kibana/kibana.yml |egrep -v "#|^$"
server.port: 5601
server.host: "192.168.44.20"
elasticsearch.hosts: ["http://192.168.44.21:9200"]
i18n.locale: "zh-CN"
Start the service
systemctl restart kibana.service
Install ZooKeeper and Kafka
Due to limited resources, only a single ZooKeeper/Kafka node is installed rather than a cluster.
Install the JDK
apt update
apt install openjdk-8-jdk
Install ZooKeeper
zoo.cfg configuration file
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
## Metrics Providers
#
# https://prometheus.io Metrics Exporter
#metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
#metricsProvider.httpPort=7000
#metricsProvider.exportJvmInfo=true
#server.1=172.31.4.101:2888:3888
#server.2=172.31.4.102:2888:3888
#server.3=172.31.4.103:2888:3888
tar -zxf apache-zookeeper-3.6.3-bin.tar.gz -C /apps/
cd /apps/
cd apache-zookeeper-3.6.3-bin/
bin/zkServer.sh start
root@ubuntu20:/apps/apache-zookeeper-3.6.3-bin/conf# ss -tnl|grep 2181
LISTEN 0 50 *:2181 *:*
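Optionally confirm ZooKeeper is healthy (it reports standalone mode here, since only one node is deployed):
bin/zkServer.sh status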
Install Kafka (server.properties configuration):
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# see kafka.server.KafkaConfig for additional details and defaults
############################# Server Basics #############################
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0
############################# Socket Server Settings #############################
# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
listeners=PLAINTEXT://192.168.44.23:9092
# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092
# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
# The number of threads that the server uses for receiving requests from the network and sending responses to the network
num.network.threads=3
# The number of threads that the server uses for processing requests, which may include disk I/O
num.io.threads=8
# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400
# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400
# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600
############################# Log Basics #############################
# A comma separated list of directories under which to store log files
log.dirs=/data/kafka-logs
# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1
# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1
############################# Internal Topic Settings #############################
# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 is recommended to ensure availability such as 3.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
############################# Log Flush Policy #############################
# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
# 1. Durability: Unflushed data may be lost if you are not using replication.
# 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
# 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.
# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000
############################# Log Retention Policy #############################
# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.
# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=2
# A size-based retention policy for logs. Segments are pruned from the log unless the remaining
# segments drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824
# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824
# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000
############################# Zookeeper #############################
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=localhost:2181
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=18000
############################# Group Coordinator Settings #############################
# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
group.initial.rebalance.delay.ms=0
bin/kafka-server-start.sh -daemon config/server.properties
root@ubuntu20:/apps/kafka_2.13-3.1.1/config# ss -tnl|grep 9092
LISTEN 0 50 *:9092 *:*
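A hedged way to verify the broker before wiring up logstash: from the Kafka install directory, create a throwaway topic and list topics (the topic name is arbitrary):
bin/kafka-topics.sh --bootstrap-server 192.168.44.23:9092 --create --topic test-topic --partitions 1 --replication-factor 1
bin/kafka-topics.sh --bootstrap-server 192.168.44.23:9092 --list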
Inspect the data with Kafka Tool
Deploy logstash with a DaemonSet
Build the image
The Dockerfile:
FROM logstash:7.12.1
USER root
WORKDIR /usr/share/logstash
#RUN rm -rf config/logstash-sample.conf
ADD logstash.yml /usr/share/logstash/config/logstash.yml
ADD logstash.conf /usr/share/logstash/pipeline/logstash.conf
Build and push the image:
nerdctl build -t harbor.jackedu.net/baseimage/logstash:v7.12.1-json-file-log-v1 .
nerdctl push harbor.jackedu.net/baseimage/logstash:v7.12.1-json-file-log-v1
Install logstash (this instance consumes from Kafka and writes to Elasticsearch)
dpkg -i logstash-7.12.1-amd64.deb
Configure the logstash pipeline file
input {
kafka {
bootstrap_servers => "192.168.44.23:9092"
topics => ["jsonfile-log-topic"]
codec => "json"
}
}
output {
#if [fields][type] == "app1-access-log" {
if [type] == "jsonfile-daemonset-applog" {
elasticsearch {
hosts => ["192.168.44.20:9200","192.168.44.21:9200","192.168.44.22:9200"]
index => "jsonfile-daemonset-applog-%{+YYYY.MM.dd}"
}}
if [type] == "jsonfile-daemonset-syslog" {
elasticsearch {
hosts => ["192.168.44.20:9200","192.168.44.21:9200","192.168.44.22:9200"]
index => "jsonfile-daemonset-syslog-%{+YYYY.MM.dd}"
}}
}
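Before starting the service, the pipeline can optionally be syntax-checked; the conf.d file name below is illustrative (use whatever name the pipeline above was saved as):
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash-daemonset-kafka-to-es.conf -t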
Start the service
systemctl start logstash.service
Deploy logstash in DaemonSet mode
logstash is deployed to each node as a log collection agent.
Build the logstash image
Dockerfile
FROM logstash:7.12.1
USER root
WORKDIR /usr/share/logstash
#RUN rm -rf config/logstash-sample.conf
ADD logstash.yml /usr/share/logstash/config/logstash.yml
ADD logstash.conf /usr/share/logstash/pipeline/logstash.conf
logstash.conf
input {
file {
path => "/var/log/applog/catalina.out"
start_position => "beginning"
type => "app1-sidecar-catalina-log"
}
file {
path => "/var/log/applog/localhost_access_log.*.txt"
start_position => "beginning"
type => "app1-sidecar-access-log"
}
}
output {
if [type] == "app1-sidecar-catalina-log" {
kafka {
bootstrap_servers => "${KAFKA_SERVER}"
topic_id => "${TOPIC_ID}"
batch_size => 16384 #batch size for the Kafka producer, in bytes
codec => "${CODEC}"
} }
if [type] == "app1-sidecar-access-log" {
kafka {
bootstrap_servers => "${KAFKA_SERVER}"
topic_id => "${TOPIC_ID}"
batch_size => 16384
codec => "${CODEC}"
}}
}
logstash.yml
http.host: "0.0.0.0"
#xpack.monitoring.elasticsearch.hosts: [ "http://elasticsearch:9200" ]
Build and push the image:
nerdctl build -t harbor.magedu.net/baseimages/logstash:v7.12.1-sidecar .
nerdctl push harbor.magedu.net/baseimages/logstash:v7.12.1-sidecar
Create the DaemonSet collection workload
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: logstash-elasticsearch
namespace: kube-system
labels:
k8s-app: logstash-logging
spec:
selector:
matchLabels:
name: logstash-elasticsearch
template:
metadata:
labels:
name: logstash-elasticsearch
spec:
tolerations:
# this toleration is to have the daemonset runnable on master nodes
# remove it if your masters can't run pods
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
containers:
- name: logstash-elasticsearch
image: harbor.jackedu.net/baseimage/logstash:v7.12.1-json-file-log-v1
env:
- name: "KAFKA_SERVER"
value: "192.168.44.23:9092"
- name: "TOPIC_ID"
value: "jsonfile-log-topic"
- name: "CODEC"
value: "json"
volumeMounts:
- name: varlog #host system log volume
mountPath: /var/log #mount point for the host system logs
- name: varlibdockercontainers #container log volume; the path must match the collection path in the logstash config
#mountPath: /var/lib/docker/containers #mount path when using docker
mountPath: /var/log/pods #mount path when using containerd; must match logstash's log collection path
readOnly: false
terminationGracePeriodSeconds: 30
volumes:
- name: varlog
hostPath:
path: /var/log #host system logs
- name: varlibdockercontainers
hostPath:
#path: /var/lib/docker/containers #host log path when using docker
path: /var/log/pods #host log path when using containerd
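Apply the DaemonSet (the file name is illustrative) and confirm one logstash pod is running per node:
kubectl apply -f logstash-daemonset-jsonfile-kafka.yaml
kubectl get pod -A | grep logstash-elasticsearch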
kube-system logstash-elasticsearch-dnrxl 1/1 Running 0 5s
kube-system logstash-elasticsearch-fkjgw 1/1 Running 0 5s
kube-system logstash-elasticsearch-jvqjv 1/1 Running 0 5s
kube-system logstash-elasticsearch-zgv4j 1/1 Running 0 5s
Verify the data in Kafka
Open the Kafka tool and check the logs collected by logstash
Verify Elasticsearch
Create a Kibana index pattern
1. Stack Management
2. Kibana index patterns
Create the index pattern
3. Enter the index pattern name
Create index patterns for all of the logs and name them
Select the index pattern
Review the configured index pattern
View the collected logs in Kibana
Select Analytics ---> Discover
The queried logs are displayed
Sidecar log collection
Write the Dockerfile
FROM logstash:7.12.1
USER root
WORKDIR /usr/share/logstash
ADD logstash.yml /usr/share/logstash/config/logstash.yml
ADD logstash.conf /usr/share/logstash/pipeline/logstash.conf
Prepare the configuration files
logstash.yml
http.host: "0.0.0.0"
#xpack.monitoring.elasticsearch.hosts: [ "http://elasticsearch:9200" ]
logstash.conf
input {
file {
path => "/var/log/applog/catalina.out"
start_position => "beginning"
type => "app1-sidecar-catalina-log"
}
file {
path => "/var/log/applog/localhost_access_log.*.txt"
start_position => "beginning"
type => "app1-sidecar-access-log"
}
}
output {
if [type] == "app1-sidecar-catalina-log" {
kafka {
bootstrap_servers => "${KAFKA_SERVER}"
topic_id => "${TOPIC_ID}"
batch_size => 16384 #batch size for the Kafka producer, in bytes
codec => "${CODEC}"
} }
if [type] == "app1-sidecar-access-log" {
kafka {
bootstrap_servers => "${KAFKA_SERVER}"
topic_id => "${TOPIC_ID}"
batch_size => 16384
codec => "${CODEC}"
}}
}
Build the image
nerdctl build -t harbor.jackedu.net/baseimage/logstash:v7.12.1-sidecar .
nerdctl push harbor.jackedu.net/baseimage/logstash:v7.12.1-sidecar
Run the web service, deployed together with tomcat in a single pod
Deploy tomcat and logstash into the same pod
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: web-tomcat-app1-deployment-label
name: web-tomcat-app1-deployment #name of this Deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels:
app: web-tomcat-app1-selector
template:
metadata:
labels:
app: web-tomcat-app1-selector
spec:
containers:
- name: sidecar-container
image: harbor.jackedu.net/baseimage/logstash:v7.12.1-sidecar
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
env:
- name: "KAFKA_SERVER"
value: "192.168.44.23:9092"
- name: "TOPIC_ID"
value: "tomcat-app1-topic"
- name: "CODEC"
value: "json"
volumeMounts:
- name: applogs
mountPath: /var/log/applog
- name: web-tomcat-app1-container
#image: registry.cn-hangzhou.aliyuncs.com/zhangshijie/tomcat-app1:v1
image: harbor.jackedu.net/app/tomcat-app1:v3
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
resources:
limits:
cpu: 1
memory: "512Mi"
requests:
cpu: 500m
memory: "512Mi"
volumeMounts:
- name: applogs
mountPath: /apps/tomcat/logs
startupProbe:
httpGet:
path: /myapp/index.html
port: 8080
initialDelaySeconds: 5 #wait 5s before the first probe
failureThreshold: 3 #consecutive failures before the probe is considered failed
periodSeconds: 3 #probe interval
readinessProbe:
httpGet:
#path: /monitor/monitor.html
path: /myapp/index.html
port: 8080
initialDelaySeconds: 5
periodSeconds: 3
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 3
livenessProbe:
httpGet:
#path: /monitor/monitor.html
path: /myapp/index.html
port: 8080
initialDelaySeconds: 5
periodSeconds: 3
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 3
volumes:
- name: applogs #emptyDir volume shared by the app container and the sidecar so the sidecar can collect the app's logs
emptyDir: {}
Write the Service YAML
---
kind: Service
apiVersion: v1
metadata:
labels:
app: magedu-tomcat-app1-service-label
name: magedu-tomcat-app1-service
namespace: test
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
nodePort: 40080
selector:
app: web-tomcat-app1-selector
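Apply the Deployment and Service together and confirm the two-container pod is up (the file name is illustrative):
kubectl apply -f tomcat-app1-sidecar-logstash.yaml
kubectl get pod -n test -l app=web-tomcat-app1-selector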
Configure the logstash server
Modify its configuration file
input {
kafka {
bootstrap_servers => "192.168.44.23:9092"
topics => ["tomcat-app1-topic"]
codec => "json"
}
}
output {
#if [fields][type] == "app1-access-log" {
if [type] == "app1-sidecar-access-log" {
elasticsearch {
hosts => ["192.168.44.20:9200","192.168.44.21:9200","192.168.44.22:9200"]
index => "sidecar-app1-accesslog-%{+YYYY.MM.dd}"
}
}
#if [fields][type] == "app1-catalina-log" {
if [type] == "app1-sidecar-catalina-log" {
elasticsearch {
hosts => ["192.168.44.20:9200","192.168.44.21:9200","192.168.44.22:9200"]
index => "sidecar-app1-catalinalog-%{+YYYY.MM.dd}"
}
}
# stdout {
# codec => rubydebug
# }
}
Note: the type values here match the custom type values defined in logstash.conf inside the sidecar image.
Check the configuration file syntax
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash-sidecar-kafka-to-es.conf -t
Restart the service
systemctl restart logstash.service
Results in Elasticsearch
Display in Kibana