Week 8: Resource Limits and Affinity
1. Kubernetes Container, Pod, and Namespace memory and CPU limits
1) If a running container defines no resource limits (memory, CPU, etc.) but the namespace defines a LimitRange, the container inherits the default limits from that LimitRange.
2) If the namespace defines no LimitRange either, the container can consume up to the host's maximum available resources, until nothing is left and the host's OOM Killer is triggered.
CPU is limited in units of cores; values can be whole cores, fractional cores, or millicores (m/milli).
2 = 2 cores = 200%, 0.5 = 500m = 50%, 1.2 = 1200m = 120%
Memory is limited in bytes; valid units are E, P, T, G, M, K and Ei, Pi, Ti, Gi, Mi, Ki.
1536Mi = 1.5Gi
requests: the resources a node must have available, at minimum, for the kube-scheduler to place the pod on it
limits: the upper bound of resources the pod may use once it is running
Examples of limits applied to a single container
YAML configuration with no CPU or memory limits
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
kind: Deployment
metadata:
name: limit-test-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels: #rs or deployment
app: limit-test-pod
# matchExpressions:
# - {key: app, operator: In, values: [ng-deploy-80,ng-rs-81]}
template:
metadata:
labels:
app: limit-test-pod
spec:
containers:
- name: limit-test-container
image: lorel/docker-stress-ng
#resources:
# limits:
# memory: "1024Mi"
# requests:
# memory: "1024Mi"
#command: ["stress"]
args: ["--vm", "3", "--vm-bytes", "256M"]
#nodeSelector:
# env: group1
Configuration after adding CPU and memory limits
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
kind: Deployment
metadata:
name: limit-test-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels: #rs or deployment
app: limit-test-pod
# matchExpressions:
# - {key: app, operator: In, values: [ng-deploy-80,ng-rs-81]}
template:
metadata:
labels:
app: limit-test-pod
spec:
containers:
- name: limit-test-container
image: lorel/docker-stress-ng
resources:
limits:
cpu: "1"
memory: "512Mi"
requests:
memory: "512M"
cpu: "1"
#command: ["stress"]
args: ["--vm", "3", "--vm-bytes", "256M"]
#nodeSelector:
# env: group1
As the query output shows, CPU and memory are now both limited to the requested values.
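One way to confirm the limits took effect (the grep just narrows the output; kubectl top assumes metrics-server is installed):
kubectl describe pod -n test -l app=limit-test-pod | grep -A 6 "Limits"
kubectl top pod -n test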
Create a LimitRange
apiVersion: v1
kind: LimitRange
metadata:
name: limitrange-magedu
namespace: test
spec:
limits:
- type: Container #the resource type being constrained
max:
cpu: "2" #maximum CPU for a single container
memory: "2Gi" #maximum memory for a single container
min:
cpu: "500m" #minimum CPU for a single container
memory: "512Mi" #minimum memory for a single container
default:
cpu: "500m" #default CPU limit for a single container
memory: "512Mi" #default memory limit for a single container
defaultRequest:
cpu: "500m" #default CPU request for a single container
memory: "512Mi" #default memory request for a single container
maxLimitRequestRatio:
cpu: 2 #maximum CPU limit/request ratio is 2
memory: 2 #maximum memory limit/request ratio is 2
- type: Pod
max:
cpu: "4" #maximum CPU for a single Pod
memory: "4Gi" #maximum memory for a single Pod
- type: PersistentVolumeClaim
max:
storage: 50Gi #maximum requests.storage for a PVC
min:
storage: 30Gi #minimum requests.storage for a PVC
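Apply the LimitRange and confirm it is active in the namespace (the file name here is illustrative):
kubectl apply -f limitrange-magedu.yaml
kubectl describe limitrange limitrange-magedu -n test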
Set the CPU limits so that the Pod's total exceeds 4, then check whether the Pods can be created:
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
app: magedu-wordpress-deployment-label
name: magedu-wordpress-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels:
app: magedu-wordpress-selector
template:
metadata:
labels:
app: magedu-wordpress-selector
spec:
containers:
- name: magedu-wordpress-nginx-container
image: nginx:1.16.1
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
resources:
limits:
cpu: 3
memory: 1Gi
requests:
cpu: 800m
memory: 512Mi
- name: magedu-wordpress-php-container
image: php:5.6-fpm-alpine
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
resources:
limits:
cpu: 2
#cpu: 2
memory: 1Gi
requests:
cpu: 2000m
memory: 512Mi
---
kind: Service
apiVersion: v1
metadata:
labels:
app: magedu-wordpress-service-label
name: magedu-wordpress-service
namespace: test
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
nodePort: 30133
selector:
app: magedu-wordpress-selector
Creation fails; the reason can be seen by inspecting the generated Deployment object:
kubectl get deployments -n test magedu-wordpress-deployment -o yaml
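If the Deployment object alone is not conclusive, the LimitRange admission error usually also shows up in the ReplicaSet's events:
kubectl describe rs -n test -l app=magedu-wordpress-selector
kubectl get events -n test --sort-by=.metadata.creationTimestamp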
After adjusting the values to satisfy the LimitRange (limits equal to requests at 512m, limit/request ratio within 2, each container within the 2-CPU maximum, and the Pod total within 4), create it again:
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
app: magedu-wordpress-deployment-label
name: magedu-wordpress-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels:
app: magedu-wordpress-selector
template:
metadata:
labels:
app: magedu-wordpress-selector
spec:
containers:
- name: magedu-wordpress-nginx-container
image: nginx:1.16.1
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
resources:
limits:
cpu: 512m
memory: 512Mi
requests:
cpu: 512m
memory: 512Mi
- name: magedu-wordpress-php-container
image: php:5.6-fpm-alpine
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
resources:
limits:
cpu: 512m
#cpu: 2
memory: 512Mi
requests:
cpu: 512m
memory: 512Mi
---
kind: Service
apiVersion: v1
metadata:
labels:
app: magedu-wordpress-service-label
name: magedu-wordpress-service
namespace: test
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
nodePort: 30133
selector:
app: magedu-wordpress-selector
2. nodeSelector, nodeName, and node affinity/anti-affinity
nodeSelector
The nodes must first be given the corresponding labels: node selection is used to place specific pods onto nodes carrying specific labels, such as GPU nodes or SSD nodes.
Label the nodes with an ssd disktype
kubectl label node 192.168.44.16 disktype="ssd"
kubectl label node 192.168.44.17 disktype="ssd"
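A quick check that the labels were applied:
kubectl get nodes -l disktype=ssd --show-labels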
Schedule the pods onto the nodes labeled ssd; the YAML:
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: magedu-tomcat-app2-deployment-label
name: magedu-tomcat-app2-deployment
namespace: test
spec:
replicas: 2
selector:
matchLabels:
app: magedu-tomcat-app2-selector
template:
metadata:
labels:
app: magedu-tomcat-app2-selector
spec:
containers:
- name: magedu-tomcat-app2-container
image: tomcat:7.0.94-alpine
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
resources:
limits:
cpu: 1
memory: "512Mi"
requests:
cpu: 500m
memory: "512Mi"
nodeSelector:
disktype: ssd
Apply the deployment:
kubectl apply -f case1-nodeSelector.yaml
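Verify where the replicas landed:
kubectl get pod -n test -o wide -l app=magedu-tomcat-app2-selector
The section title also mentions nodeName and node affinity, which this case does not exercise; two minimal sketches follow, reusing values from this environment (pod-spec fragments, not complete manifests). nodeName pins a pod to one named node and bypasses label matching, while nodeAffinity is the more expressive successor to nodeSelector:
spec:
  nodeName: 192.168.44.16 #schedule directly onto this node
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution: #hard requirement, equivalent to the nodeSelector above
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd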
3. Pod affinity and pod anti-affinity
1. Overview of pod affinity and pod anti-affinity
Valid operators for pod affinity and anti-affinity are In, NotIn, Exists and DoesNotExist.
In pod anti-affinity configuration, topologyKey must not be empty in either requiredDuringSchedulingIgnoredDuringExecution or preferredDuringSchedulingIgnoredDuringExecution (an empty topologyKey is not allowed).
For pod anti-affinity expressed with requiredDuringSchedulingIgnoredDuringExecution, the admission controller LimitPodHardAntiAffinityTopology was introduced to ensure topologyKey can only be kubernetes.io/hostname; if topologyKey should be usable for other custom topologies, the admission controller can be modified or disabled.
2. Tests
Create an nginx pod as the target for the pod affinity and anti-affinity tests
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: python-nginx-deployment-label
name: python-nginx-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels:
app: python-nginx-selector
template:
metadata:
labels:
app: python-nginx-selector
project: python
spec:
containers:
- name: python-nginx-container
image: nginx:1.20.2-alpine
#command: ["/apps/tomcat/bin/run_tomcat.sh"]
#imagePullPolicy: IfNotPresent
imagePullPolicy: Always
ports:
- containerPort: 80
protocol: TCP
name: http
- containerPort: 443
protocol: TCP
name: https
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
# resources:
# limits:
# cpu: 2
# memory: 2Gi
# requests:
# cpu: 500m
# memory: 1Gi
---
kind: Service
apiVersion: v1
metadata:
labels:
app: python-nginx-service-label
name: python-nginx-service
namespace: test
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 80
nodePort: 30014
- name: https
port: 443
protocol: TCP
targetPort: 443
nodePort: 30453
selector:
app: python-nginx-selector
project: python #one or more selector labels; target pods must carry these labels
kubectl apply -f case4-4.1-nginx.yaml
root@k8s-master1:/opt/dockerfile/k8s-data/20220821/Affinit-case# kubectl get po -n test
NAME READY STATUS RESTARTS AGE
python-nginx-deployment-5c658bf86b-nv7sm 1/1 Running 0 45s
ubuntu1804 0/1 Completed 0 31d
Pod affinity: soft (preferred) affinity
Use soft affinity to co-schedule with the nginx pod onto the same node
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: magedu-tomcat-app2-deployment-label
name: magedu-tomcat-app2-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels:
app: magedu-tomcat-app2-selector
template:
metadata:
labels:
app: magedu-tomcat-app2-selector
spec:
containers:
- name: magedu-tomcat-app2-container
image: tomcat:7.0.94-alpine
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
podAffinity: #pod affinity
#requiredDuringSchedulingIgnoredDuringExecution: #hard affinity: schedule only if the match succeeds, otherwise refuse to schedule
preferredDuringSchedulingIgnoredDuringExecution: #soft affinity: schedule into a matching topology when possible, otherwise let Kubernetes schedule freely
- weight: 100
podAffinityTerm:
labelSelector: #label selector
matchExpressions: #expression-based matching
- key: project
operator: In
values:
- python
topologyKey: kubernetes.io/hostname
namespaces:
- test
kubectl apply -f case4-4.2-podaffinity-preferredDuring.yaml
root@k8s-master1:/opt/dockerfile/k8s-data/20220821/Affinit-case# kubectl get po -o wide -n test
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
magedu-tomcat-app2-deployment-6cf5fbbbb8-gln59 1/1 Running 0 39s 172.20.169.137 192.168.44.16 <none> <none>
python-nginx-deployment-5c658bf86b-nv7sm 1/1 Running 0 12m 172.20.169.175 192.168.44.16 <none> <none>
Pod affinity: hard (required) affinity
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: magedu-tomcat-app2-deployment-label
name: magedu-tomcat-app2-deployment
namespace: test
spec:
replicas: 2
selector:
matchLabels:
app: magedu-tomcat-app2-selector
template:
metadata:
labels:
app: magedu-tomcat-app2-selector
spec:
containers:
- name: magedu-tomcat-app2-container
image: tomcat:7.0.94-alpine
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
podAffinity:
requiredDuringSchedulingIgnoredDuringExecution: #hard affinity
- labelSelector:
matchExpressions:
- key: project
operator: In
values:
- python
topologyKey: "kubernetes.io/hostname"
namespaces:
- test
kubectl apply -f case4-4.3-podaffinity-requiredDuring.yaml
root@k8s-master1:/opt/dockerfile/k8s-data/20220821/Affinit-case# kubectl get po -n test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
magedu-tomcat-app2-deployment-5b7848b84b-252nm 1/1 Running 0 13s 172.20.169.188 192.168.44.16 <none> <none>
magedu-tomcat-app2-deployment-5b7848b84b-wcrbd 1/1 Running 0 13s 172.20.169.190 192.168.44.16 <none> <none>
python-nginx-deployment-5c658bf86b-nv7sm 1/1 Running 0 5h30m 172.20.169.175 192.168.44.16 <none> <none>
Pod anti-affinity: hard (required) anti-affinity
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: magedu-tomcat-app2-deployment-label
name: magedu-tomcat-app2-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels:
app: magedu-tomcat-app2-selector
template:
metadata:
labels:
app: magedu-tomcat-app2-selector
spec:
containers:
- name: magedu-tomcat-app2-container
image: tomcat:7.0.94-alpine
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: project
operator: In
values:
- python
topologyKey: "kubernetes.io/hostname"
namespaces:
- test
kubectl apply -f case4-4.4-podAntiAffinity-requiredDuring.yaml
root@k8s-master1:/opt/dockerfile/k8s-data/20220821/Affinit-case# kubectl get po -n test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
magedu-tomcat-app2-deployment-56d999f8cd-k8sb2 1/1 Running 0 9s 172.20.108.176 192.168.44.17 <none> <none>
python-nginx-deployment-5c658bf86b-nv7sm 1/1 Running 0 5h40m 172.20.169.175 192.168.44.16 <none> <none>
Pod anti-affinity: soft (preferred) anti-affinity
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: magedu-tomcat-app2-deployment-label
name: magedu-tomcat-app2-deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels:
app: magedu-tomcat-app2-selector
template:
metadata:
labels:
app: magedu-tomcat-app2-selector
spec:
containers:
- name: magedu-tomcat-app2-container
image: tomcat:7.0.94-alpine
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: project
operator: In
values:
- pythonx
topologyKey: kubernetes.io/hostname
namespaces:
- test
root@k8s-master1:/opt/dockerfile/k8s-data/20220821/Affinit-case# kubectl get po -n test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
magedu-tomcat-app2-deployment-7d4c777b6d-mcp5v 1/1 Running 0 73s 172.20.169.186 192.168.44.16 <none> <none>
python-nginx-deployment-5c658bf86b-nv7sm 1/1 Running 0 5h47m 172.20.169.175 192.168.44.16 <none> <none>
Because the anti-affinity is only soft (preferred), the two pods are not forced onto different nodes, and here they ended up on the same one (note also that the selector value pythonx does not match the nginx pod's project=python label, so the preference has no effect).
4. Taints and tolerations
Taints are used by a node to repel Pod scheduling; they work in the opposite direction from affinity: a tainted node repels pods.
Tolerations are used by a Pod to tolerate a node's taints, i.e. new pods can still be scheduled onto a node even if it carries taints.
https://kubernetes.io/zh/docs/concepts/scheduling-eviction/taint-and-toleration/
The three taint effects:
NoSchedule: Kubernetes will not schedule Pods onto a Node carrying this taint
kubectl taint nodes 172.31.7.111 key1=value1:NoSchedule #set a taint
node/172.31.7.111 tainted
kubectl describe node 172.31.7.111 #view taints
Taints: key1=value1:NoSchedule
kubectl taint node 172.31.7.111 key1:NoSchedule- #remove the taint
node/172.31.7.111 untainted
PreferNoSchedule: Kubernetes will try to avoid scheduling Pods onto a Node carrying this taint
NoExecute: Kubernetes will not schedule Pods onto a Node carrying this taint, and Pods already running on the Node are forcibly evicted
kubectl taint nodes 172.31.7.111 key1=value1:NoExecute
tolerations
A toleration defines which node taints a Pod accepts; with a matching toleration, the Pod can be scheduled onto a node carrying that taint.
Taint matching based on operator:
If operator is Exists, the toleration needs no value and matches the taint by key (and effect) alone.
If operator is Equal, a value must be specified and it must equal the taint's value.
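For comparison with the Equal-based example below, a minimal Exists-style toleration sketch (the key reuses key1 from the example; any value of that taint key is tolerated):
tolerations:
- key: "key1"
  operator: "Exists"
  effect: "NoSchedule"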
Example
Taint a node
kubectl taint nodes 192.168.44.16 key1=value1:NoSchedule
node/192.168.44.16 tainted
kubectl describe nodes 192.168.44.16
Taints: key1=value1:NoSchedule
Set the Pod's toleration
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: magedu-tomcat-app1-deployment-label
name: magedu-tomcat-app1-deployment
namespace: test
spec:
replicas: 3
selector:
matchLabels:
app: magedu-tomcat-app1-selector
template:
metadata:
labels:
app: magedu-tomcat-app1-selector
spec:
containers:
- name: magedu-tomcat-app1-container
#image: harbor.magedu.net/magedu/tomcat-app1:v7
image: tomcat:7.0.93-alpine
imagePullPolicy: IfNotPresent
ports:
- containerPort: 8080
protocol: TCP
name: http
tolerations:
- key: "key1"
operator: "Equal"
value: "value1"
effect: "NoSchedule"
---
kind: Service
apiVersion: v1
metadata:
labels:
app: magedu-tomcat-app1-service-label
name: magedu-tomcat-app1-service
namespace: test
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
#nodePort: 40003
selector:
app: magedu-tomcat-app1-selector
kubectl apply -f case5.1-taint-tolerations.yaml
root@k8s-master1:/opt/dockerfile/k8s-data/20220821/Affinit-case# kubectl get po -n test -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
magedu-tomcat-app1-deployment-785c799fc9-fj2rj 1/1 Running 0 55s 172.20.169.191 192.168.44.16 <none> <none>
Remove the toleration and observe where the pods are scheduled
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: magedu-tomcat-app1-deployment-label
name: magedu-tomcat-app1-deployment
namespace: test
spec:
replicas: 3
selector:
matchLabels:
app: magedu-tomcat-app1-selector
template:
metadata:
labels:
app: magedu-tomcat-app1-selector
spec:
containers:
- name: magedu-tomcat-app1-container
#image: harbor.magedu.net/magedu/tomcat-app1:v7
image: tomcat:7.0.93-alpine
imagePullPolicy: IfNotPresent
ports:
- containerPort: 8080
protocol: TCP
name: http
---
kind: Service
apiVersion: v1
metadata:
labels:
app: magedu-tomcat-app1-service-label
name: magedu-tomcat-app1-service
namespace: test
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
#nodePort: 40003
selector:
app: magedu-tomcat-app1-selector
kubectl apply -f case5.2-notaint-tolerations.yaml
root@k8s-master1:/opt/dockerfile/k8s-data/20220821/Affinit-case# kubectl get po -o wide -n test
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
magedu-tomcat-app1-deployment-5849d9d85b-d5rgq 1/1 Running 0 7s 172.20.108.182 192.168.44.17 <none> <none>
magedu-tomcat-app1-deployment-5849d9d85b-fzkh8 1/1 Running 0 7s 172.20.38.21 192.168.44.15 <none> <none>
magedu-tomcat-app1-deployment-5849d9d85b-x4npw 1/1 Running 0 7s 172.20.38.16 192.168.44.15 <none> <none>
After removing the toleration, pods can no longer be scheduled onto node 192.168.44.16.
Remove the node taint
kubectl taint nodes 192.168.44.16 key1:NoSchedule-
kubectl describe nodes 192.168.44.16
Taints: <none>
5. Pod eviction
Node-pressure eviction is the process by which the kubelet proactively terminates Pods to reclaim memory, disk space, and other resources on the node. The kubelet monitors the node's CPU, memory, disk space, filesystem inodes and similar resources; when one or more of them reaches a specific consumption level, the kubelet forcibly evicts one or more Pods on the node to prevent an OOM (OutOfMemory) condition caused by the node being unable to allocate resources.
https://kubernetes.io/zh/docs/concepts/scheduling-eviction/node-pressure-eviction/
Host memory:
memory.available #available memory on the node, default threshold <100Mi
nodefs is the node's main filesystem, used for local disk volumes, emptyDir volumes, log storage and so on; by default it is /var/lib/kubelet/, or the mount
directory specified by the kubelet --root-dir flag
nodefs.inodesFree #available inodes on nodefs, default <5%
nodefs.available #available space on nodefs, default <10%
imagefs is an optional filesystem used by the container runtime to store container images and container writable layers.
imagefs.inodesFree #percentage of available inodes on imagefs
imagefs.available #percentage of available disk space on imagefs, default <15%
pid.available #percentage of available PIDs
Example:
evictionHard:
imagefs.inodesFree: 5%
imagefs.available: 15%
memory.available: 300Mi
nodefs.available: 10%
nodefs.inodesFree: 5%
pid.available: 5%
Eviction implemented by kube-controller-manager:
eviction after a node goes down
Eviction implemented by the kubelet:
pod eviction based on node load, resource utilization and so on.
Eviction (node eviction) automatically and forcibly evicts pods when node resources are insufficient, so that the node itself keeps running normally.
Kubernetes evicts Pods based on their QoS (quality of service) class; there are currently three QoS classes:
Guaranteed: #limits and requests are equal; highest class, evicted last
resources:
limits:
cpu: 500m
memory: 256Mi
requests:
cpu: 500m
memory: 256Mi
Burstable: #limits and requests are not equal; middle class, evicted after BestEffort but before Guaranteed
resources:
limits:
cpu: 500m
memory: 256Mi
requests:
cpu: 256m
memory: 128Mi
BestEffort: #no limits set at all (resources is empty); lowest class, evicted first
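A quick way to check which QoS class each pod was assigned:
kubectl get pod -n test -o custom-columns=NAME:.metadata.name,QOS:.status.qosClass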
Eviction conditions
eviction-signal: the eviction trigger signal the kubelet watches on the node to decide whether to evict, e.g. the value of memory.available obtained
via cgroupfs, which is then matched against the threshold.
operator: the comparison operator used to decide whether the resource amount matches the condition and should trigger eviction.
quantity: the threshold amount, i.e. the resource value the comparison is based on, such as memory.available: 300Mi or nodefs.available: 10%.
For example: nodefs.available<10%
This expression triggers the eviction signal when the node's available disk space drops below 10%.
Soft eviction
Soft eviction does not evict the pod immediately; a grace period can be configured, and only if the condition still holds when the grace period expires does the kubelet forcibly kill the pod and trigger eviction.
Soft eviction settings
eviction-soft: the soft eviction condition, e.g. memory.available < 1.5Gi; if the condition persists longer than the specified grace period,
Pod eviction is triggered.
eviction-soft-grace-period: the soft eviction grace period, e.g. memory.available=1m30s, which defines how long the soft eviction condition
must hold before Pod eviction is triggered.
eviction-max-pod-grace-period: the maximum allowed pod termination grace period (in seconds) used when terminating Pods because a soft
eviction condition was met.
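A minimal sketch of the corresponding fields in the kubelet configuration file (the same /var/lib/kubelet/config.yaml shown below for hard eviction); the thresholds here are illustrative, not recommendations:
evictionSoft:
  memory.available: "1.5Gi"
  nodefs.available: "15%"
evictionSoftGracePeriod:
  memory.available: "1m30s"
  nodefs.available: "2m"
evictionMaxPodGracePeriod: 60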
Hard eviction
Hard eviction conditions have no grace period; when a hard eviction threshold is reached, the kubelet immediately kills the pod and evicts it.
The kubelet has the following default hard eviction thresholds (they can be adjusted):
memory.available<100Mi
nodefs.available<10%
imagefs.available<15%
nodefs.inodesFree<5% (Linux nodes)
kubelet service unit file:
vim /etc/systemd/system/kubelet.service
kubelet configuration file:
vim /var/lib/kubelet/config.yaml
evictionHard:
imagefs.available: 15%
memory.available: 300Mi
nodefs.available: 10%
nodefs.inodesFree: 5%
6. Setting up ELK
Install Elasticsearch
dpkg -i elasticsearch-7.12.1-amd64.deb
Edit the configuration file
root@es3:~# cat /etc/elasticsearch/elasticsearch.yml|grep -v "#"
cluster.name: es-cluster
node.name: node-3
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.44.22
http.port: 9200
discovery.seed_hosts: ["192.168.44.20", "192.168.44.21","192.168.44.22"]
cluster.initial_master_nodes: ["192.168.44.20", "192.168.44.21","192.168.44.22"]
action.destructive_requires_name: true
Start the Elasticsearch service
systemctl restart elasticsearch.service
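Once all three nodes are started, cluster health can be checked from any of them:
curl http://192.168.44.20:9200/_cat/nodes?v
curl http://192.168.44.20:9200/_cluster/health?pretty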
Install elasticsearch-head to manage Elasticsearch
Install Kibana
dpkg -i kibana-7.12.1-amd64.deb
Edit the configuration file
root@es1:~# cat /etc/kibana/kibana.yml |egrep -v "#|^$"
server.port: 5601
server.host: "192.168.44.20"
elasticsearch.hosts: ["http://192.168.44.21:9200"]
i18n.locale: "zh-CN"
Start the service
systemctl restart kibana.service
Install ZooKeeper and Kafka
Due to limited resources, only a single ZooKeeper/Kafka node is installed rather than a cluster.
Install the JDK
apt update
apt install openjdk-8-jdk
Install ZooKeeper
zoo.cfg configuration file
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
## Metrics Providers
#
# https://prometheus.io Metrics Exporter
#metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
#metricsProvider.httpPort=7000
#metricsProvider.exportJvmInfo=true
#server.1=172.31.4.101:2888:3888
#server.2=172.31.4.102:2888:3888
#server.3=172.31.4.103:2888:3888
tar -zxf apache-zookeeper-3.6.3-bin.tar.gz -C /apps/
cd /apps/
cd apache-zookeeper-3.6.3-bin/
bin/zkServer.sh start
root@ubuntu20:/apps/apache-zookeeper-3.6.3-bin/conf# ss -tnl|grep 2181
LISTEN 0 50 *:2181 *:*
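Optionally confirm ZooKeeper is healthy (it reports standalone mode here, since only one node is deployed):
bin/zkServer.sh status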
Install Kafka (server.properties configuration):
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# see kafka.server.KafkaConfig for additional details and defaults
############################# Server Basics #############################
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0
############################# Socket Server Settings #############################
# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
listeners=PLAINTEXT://192.168.44.23:9092
# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092
# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
# The number of threads that the server uses for receiving requests from the network and sending responses to the network
num.network.threads=3
# The number of threads that the server uses for processing requests, which may include disk I/O
num.io.threads=8
# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400
# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400
# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600
############################# Log Basics #############################
# A comma separated list of directories under which to store log files
log.dirs=/data/kafka-logs
# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1
# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1
############################# Internal Topic Settings #############################
# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 is recommended to ensure availability such as 3.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
############################# Log Flush Policy #############################
# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
# 1. Durability: Unflushed data may be lost if you are not using replication.
# 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
# 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.
# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000
############################# Log Retention Policy #############################
# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.
# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=2
# A size-based retention policy for logs. Segments are pruned from the log unless the remaining
# segments drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824
# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824
# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000
############################# Zookeeper #############################
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=localhost:2181
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=18000
############################# Group Coordinator Settings #############################
# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
group.initial.rebalance.delay.ms=0
bin/kafka-server-start.sh -daemon config/server.properties
root@ubuntu20:/apps/kafka_2.13-3.1.1/config# ss -tnl|grep 9092
LISTEN 0 50 *:9092 *:*
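A hedged way to verify the broker before wiring up logstash: from the Kafka install directory, create a throwaway topic and list topics (the topic name is arbitrary):
bin/kafka-topics.sh --bootstrap-server 192.168.44.23:9092 --create --topic test-topic --partitions 1 --replication-factor 1
bin/kafka-topics.sh --bootstrap-server 192.168.44.23:9092 --list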
Inspect the data with Kafka Tool
Deploy logstash with a DaemonSet
Build the image
The Dockerfile:
FROM logstash:7.12.1
USER root
WORKDIR /usr/share/logstash
#RUN rm -rf config/logstash-sample.conf
ADD logstash.yml /usr/share/logstash/config/logstash.yml
ADD logstash.conf /usr/share/logstash/pipeline/logstash.conf
Build and push the image:
nerdctl build -t harbor.jackedu.net/baseimage/logstash:v7.12.1-json-file-log-v1 .
nerdctl push harbor.jackedu.net/baseimage/logstash:v7.12.1-json-file-log-v1
Install logstash (this instance consumes from Kafka and writes to Elasticsearch)
dpkg -i logstash-7.12.1-amd64.deb
Configure the logstash pipeline file
input {
kafka {
bootstrap_servers => "192.168.44.23:9092"
topics => ["jsonfile-log-topic"]
codec => "json"
}
}
output {
#if [fields][type] == "app1-access-log" {
if [type] == "jsonfile-daemonset-applog" {
elasticsearch {
hosts => ["192.168.44.20:9200","192.168.44.21:9200","192.168.44.22:9200"]
index => "jsonfile-daemonset-applog-%{+YYYY.MM.dd}"
}}
if [type] == "jsonfile-daemonset-syslog" {
elasticsearch {
hosts => ["192.168.44.20:9200","192.168.44.21:9200","192.168.44.22:9200"]
index => "jsonfile-daemonset-syslog-%{+YYYY.MM.dd}"
}}
}
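Before starting the service, the pipeline can optionally be syntax-checked; the conf.d file name below is illustrative (use whatever name the pipeline above was saved as):
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash-daemonset-kafka-to-es.conf -t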
Start the service
systemctl start logstash.service
Deploy logstash in DaemonSet mode
logstash is deployed to each node as a log collection agent.
Build the logstash image
Dockerfile
FROM logstash:7.12.1
USER root
WORKDIR /usr/share/logstash
#RUN rm -rf config/logstash-sample.conf
ADD logstash.yml /usr/share/logstash/config/logstash.yml
ADD logstash.conf /usr/share/logstash/pipeline/logstash.conf
logstash.conf
input {
file {
path => "/var/log/applog/catalina.out"
start_position => "beginning"
type => "app1-sidecar-catalina-log"
}
file {
path => "/var/log/applog/localhost_access_log.*.txt"
start_position => "beginning"
type => "app1-sidecar-access-log"
}
}
output {
if [type] == "app1-sidecar-catalina-log" {
kafka {
bootstrap_servers => "${KAFKA_SERVER}"
topic_id => "${TOPIC_ID}"
batch_size => 16384 #batch size for the Kafka producer, in bytes
codec => "${CODEC}"
} }
if [type] == "app1-sidecar-access-log" {
kafka {
bootstrap_servers => "${KAFKA_SERVER}"
topic_id => "${TOPIC_ID}"
batch_size => 16384
codec => "${CODEC}"
}}
}
logstash.yml
http.host: "0.0.0.0"
#xpack.monitoring.elasticsearch.hosts: [ "http://elasticsearch:9200" ]
Build and push the image:
nerdctl build -t harbor.magedu.net/baseimages/logstash:v7.12.1-sidecar .
nerdctl push harbor.magedu.net/baseimages/logstash:v7.12.1-sidecar
Create the DaemonSet collection workload
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: logstash-elasticsearch
namespace: kube-system
labels:
k8s-app: logstash-logging
spec:
selector:
matchLabels:
name: logstash-elasticsearch
template:
metadata:
labels:
name: logstash-elasticsearch
spec:
tolerations:
# this toleration is to have the daemonset runnable on master nodes
# remove it if your masters can't run pods
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
containers:
- name: logstash-elasticsearch
image: harbor.jackedu.net/baseimage/logstash:v7.12.1-json-file-log-v1
env:
- name: "KAFKA_SERVER"
value: "192.168.44.23:9092"
- name: "TOPIC_ID"
value: "jsonfile-log-topic"
- name: "CODEC"
value: "json"
volumeMounts:
- name: varlog #host system log volume
mountPath: /var/log #mount point for the host system logs
- name: varlibdockercontainers #container log volume; the path must match the collection path in the logstash config
#mountPath: /var/lib/docker/containers #mount path when using docker
mountPath: /var/log/pods #mount path when using containerd; must match logstash's log collection path
readOnly: false
terminationGracePeriodSeconds: 30
volumes:
- name: varlog
hostPath:
path: /var/log #host system logs
- name: varlibdockercontainers
hostPath:
#path: /var/lib/docker/containers #host log path when using docker
path: /var/log/pods #host log path when using containerd
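Apply the DaemonSet (the file name is illustrative) and confirm one logstash pod is running per node:
kubectl apply -f logstash-daemonset-jsonfile-kafka.yaml
kubectl get pod -A | grep logstash-elasticsearch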
kube-system logstash-elasticsearch-dnrxl 1/1 Running 0 5s
kube-system logstash-elasticsearch-fkjgw 1/1 Running 0 5s
kube-system logstash-elasticsearch-jvqjv 1/1 Running 0 5s
kube-system logstash-elasticsearch-zgv4j 1/1 Running 0 5s
Verify the data in Kafka
Open the Kafka tool and check the logs collected by logstash
Verify Elasticsearch
Create a Kibana index pattern
1. Stack Management
2. Kibana index patterns
Create the index pattern
3. Enter the index pattern name
Create index patterns for all of the logs and name them
Select the index pattern
Review the configured index pattern
View the collected logs in Kibana
Select Analytics ---> Discover
The queried logs are displayed
Sidecar log collection
Write the Dockerfile
FROM logstash:7.12.1
USER root
WORKDIR /usr/share/logstash
ADD logstash.yml /usr/share/logstash/config/logstash.yml
ADD logstash.conf /usr/share/logstash/pipeline/logstash.conf
Prepare the configuration files
logstash.yml
http.host: "0.0.0.0"
#xpack.monitoring.elasticsearch.hosts: [ "http://elasticsearch:9200" ]
logstash.conf
input {
file {
path => "/var/log/applog/catalina.out"
start_position => "beginning"
type => "app1-sidecar-catalina-log"
}
file {
path => "/var/log/applog/localhost_access_log.*.txt"
start_position => "beginning"
type => "app1-sidecar-access-log"
}
}
output {
if [type] == "app1-sidecar-catalina-log" {
kafka {
bootstrap_servers => "${KAFKA_SERVER}"
topic_id => "${TOPIC_ID}"
batch_size => 16384 #batch size for the Kafka producer, in bytes
codec => "${CODEC}"
} }
if [type] == "app1-sidecar-access-log" {
kafka {
bootstrap_servers => "${KAFKA_SERVER}"
topic_id => "${TOPIC_ID}"
batch_size => 16384
codec => "${CODEC}"
}}
}
Build the image
nerdctl build -t harbor.jackedu.net/baseimage/logstash:v7.12.1-sidecar .
nerdctl push harbor.jackedu.net/baseimage/logstash:v7.12.1-sidecar
Run the web service, deployed together with tomcat in a single pod
Deploy tomcat and logstash into the same pod
kind: Deployment
#apiVersion: extensions/v1beta1
apiVersion: apps/v1
metadata:
labels:
app: web-tomcat-app1-deployment-label
name: web-tomcat-app1-deployment #name of this Deployment
namespace: test
spec:
replicas: 1
selector:
matchLabels:
app: web-tomcat-app1-selector
template:
metadata:
labels:
app: web-tomcat-app1-selector
spec:
containers:
- name: sidecar-container
image: harbor.jackedu.net/baseimage/logstash:v7.12.1-sidecar
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
env:
- name: "KAFKA_SERVER"
value: "192.168.44.23:9092"
- name: "TOPIC_ID"
value: "tomcat-app1-topic"
- name: "CODEC"
value: "json"
volumeMounts:
- name: applogs
mountPath: /var/log/applog
- name: web-tomcat-app1-container
#image: registry.cn-hangzhou.aliyuncs.com/zhangshijie/tomcat-app1:v1
image: harbor.jackedu.net/app/tomcat-app1:v3
imagePullPolicy: IfNotPresent
#imagePullPolicy: Always
ports:
- containerPort: 8080
protocol: TCP
name: http
env:
- name: "password"
value: "123456"
- name: "age"
value: "18"
resources:
limits:
cpu: 1
memory: "512Mi"
requests:
cpu: 500m
memory: "512Mi"
volumeMounts:
- name: applogs
mountPath: /apps/tomcat/logs
startupProbe:
httpGet:
path: /myapp/index.html
port: 8080
initialDelaySeconds: 5 #wait 5s before the first probe
failureThreshold: 3 #consecutive failures before the probe is considered failed
periodSeconds: 3 #probe interval
readinessProbe:
httpGet:
#path: /monitor/monitor.html
path: /myapp/index.html
port: 8080
initialDelaySeconds: 5
periodSeconds: 3
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 3
livenessProbe:
httpGet:
#path: /monitor/monitor.html
path: /myapp/index.html
port: 8080
initialDelaySeconds: 5
periodSeconds: 3
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 3
volumes:
- name: applogs #emptyDir volume shared by the app container and the sidecar so the sidecar can collect the app's logs
emptyDir: {}
Write the Service YAML
---
kind: Service
apiVersion: v1
metadata:
labels:
app: magedu-tomcat-app1-service-label
name: magedu-tomcat-app1-service
namespace: test
spec:
type: NodePort
ports:
- name: http
port: 80
protocol: TCP
targetPort: 8080
nodePort: 40080
selector:
app: web-tomcat-app1-selector
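Apply the Deployment and Service together and confirm the two-container pod is up (the file name is illustrative):
kubectl apply -f tomcat-app1-sidecar-logstash.yaml
kubectl get pod -n test -l app=web-tomcat-app1-selector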
Configure the logstash server
Modify its configuration file
input {
kafka {
bootstrap_servers => "192.168.44.23:9092"
topics => ["tomcat-app1-topic"]
codec => "json"
}
}
output {
#if [fields][type] == "app1-access-log" {
if [type] == "app1-sidecar-access-log" {
elasticsearch {
hosts => ["192.168.44.20:9200","192.168.44.21:9200","192.168.44.22:9200"]
index => "sidecar-app1-accesslog-%{+YYYY.MM.dd}"
}
}
#if [fields][type] == "app1-catalina-log" {
if [type] == "app1-sidecar-catalina-log" {
elasticsearch {
hosts => ["192.168.44.20:9200","192.168.44.21:9200","192.168.44.22:9200"]
index => "sidecar-app1-catalinalog-%{+YYYY.MM.dd}"
}
}
# stdout {
# codec => rubydebug
# }
}
Note: the type values here match the custom type values defined in logstash.conf inside the sidecar image.
Check the configuration file syntax
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash-sidecar-kafka-to-es.conf -t
Restart the service
systemctl restart logstash.service
Results in Elasticsearch
Display in Kibana