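1. Scheduling concepts
In Kubernetes, scheduling means matching Pods to suitable nodes so that the kubelet can run them. Preemption is the process of terminating lower-priority Pods so that higher-priority Pods can be scheduled and run. Eviction is the process of proactively terminating one or more Pods on a node that is short of resources.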
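2. CronJob scheduled tasks
A CronJob runs tasks periodically on a schedule in k8s, much like crontab on Linux.
Note: CronJob schedules are evaluated against the time of kube-controller-manager, so make sure the time on the controller-manager is accurate.
2.1 Configuration file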
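apiVersion: batch/v1
kind: CronJob # scheduled task
metadata:
  name: cron-job-test # name of the CronJob
spec:
  concurrencyPolicy: Allow # concurrency policy: Allow permits concurrent runs; Forbid skips the new run while the previous one is still running; Replace cancels the still-running Job and starts the new one
  failedJobsHistoryLimit: 1 # number of failed Jobs to keep
  successfulJobsHistoryLimit: 3 # number of successful Jobs to keep
  suspend: false # whether to suspend the CronJob; if true, no new runs are started
  # startingDeadlineSeconds: 30 # deadline in seconds for starting a Job that missed its scheduled time; should not be less than 10, because the controller only checks about every 10 seconds
  schedule: "* * * * *" # cron schedule (here: every minute)
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: busybox
            image: busybox:1.28
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure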
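To show how the concurrency and deadline fields interact, here is a minimal sketch of a variant of the manifest above. The name `cron-job-forbid-example`, the 5-minute schedule, the 60-second deadline and the `sleep 30` command are all assumptions chosen for illustration, not values from the original setup.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cron-job-forbid-example    # hypothetical name
spec:
  # Fields are: minute hour day-of-month month day-of-week
  schedule: "*/5 * * * *"          # every 5 minutes
  concurrencyPolicy: Forbid        # skip a run while the previous Job is still active
  startingDeadlineSeconds: 60      # a run may start up to 60 s late; later than that it counts as missed
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: busybox
            image: busybox:1.28
            imagePullPolicy: IfNotPresent
            command: ["/bin/sh", "-c", "date; sleep 30"]
          restartPolicy: OnFailure
```

With `Forbid`, a run that would overlap a still-active Job is skipped rather than queued, so this policy suits tasks that must never run twice at once.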
2.2 Running the CronJob
[root@k8s-master job]# kubectl create -f cron-job-pd.yaml
cronjob.batch/cron-job-test created
[root@k8s-master job]# kubectl get cronjobs
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
cron-job-test * * * * * False 0 <none> 11s
[root@k8s-master job]# kubectl get po
NAME READY STATUS RESTARTS AGE
configfile-po 0/1 Completed 0 26h
dns-test 1/1 Running 2 (36h ago) 3d21h
emptydir-volume-pod 2/2 Running 44 (58m ago) 23h
fluentd-59k8k 1/1 Running 1 (36h ago) 3d3h
fluentd-hhtls 1/1 Running 1 (36h ago) 3d3h
host-volume-pod 1/1 Running 0 23h
nfs-volume-pod-1 1/1 Running 0 21h
nfs-volume-pod-2 1/1 Running 0 21h
nginx-deploy-6fb8d6548-8khhv 1/1 Running 29 (54m ago) 29h
nginx-deploy-6fb8d6548-fd9tx 1/1 Running 29 (54m ago) 29h
nginx-sc-0 1/1 Running 0 3h20m
[root@k8s-master job]# kubectl get cronjobs
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
cron-job-test * * * * * False 0 42s 2m16s
[root@k8s-master job]# kubectl get po
NAME READY STATUS RESTARTS AGE
configfile-po 0/1 Completed 0 26h
cron-job-test-28484150-wkbgp 0/1 Completed 0 2m4s
cron-job-test-28484151-886j6 0/1 Completed 0 64s
cron-job-test-28484152-srjb4 0/1 Completed 0 4s
dns-test 1/1 Running 2 (36h ago) 3d21h
emptydir-volume-pod 2/2 Running 46 (35s ago) 23h
fluentd-59k8k 1/1 Running 1 (36h ago) 3d3h
fluentd-hhtls 1/1 Running 1 (36h ago) 3d3h
host-volume-pod 1/1 Running 0 23h
nfs-volume-pod-1 1/1 Running 0 21h
nfs-volume-pod-2 1/1 Running 0 21h
nginx-deploy-6fb8d6548-8khhv 1/1 Running 29 (56m ago) 29h
nginx-deploy-6fb8d6548-fd9tx 1/1 Running 29 (56m ago) 29h
nginx-sc-0 1/1 Running 0 3h22m
[root@k8s-master job]# kubectl logs -f cron-job-test-28484150-wkbgp
Tue Feb 27 15:50:19 UTC 2024
Hello from the Kubernetes cluster
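3. Init containers (InitContainer)
- Compared with postStart: first, an InitContainer is guaranteed to run before the container's EntryPoint, whereas postStart is not; second, postStart is better suited to running a few commands, while an InitContainer is a container in its own right and can perform more complex initialization in a different base image.
3.1 Configure the initContainers field in the pod template: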
spec:
  template:
    spec:
      initContainers:
      - image: nginx:1.20
        imagePullPolicy: IfNotPresent
        command: ["sh", "-c", "sleep 10; echo 'inited' >> /init.log"]
        name: init-test
3.2 Modify the existing deploy resource to add the initContainers block
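As a rough sketch, the edited Deployment might look like the manifest below. The name `nginx-deploy` comes from the pod listings earlier in these notes; the replica count, the `app: nginx-deploy` label and the main container's name and image are assumptions, since the original manifest is not shown.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deploy            # taken from the pod listings; adjust to your own resource
spec:
  replicas: 2                   # assumption
  selector:
    matchLabels:
      app: nginx-deploy         # assumed label
  template:
    metadata:
      labels:
        app: nginx-deploy
    spec:
      initContainers:           # run to completion, in order, before the app containers start
      - name: init-test
        image: nginx:1.20
        imagePullPolicy: IfNotPresent
        command: ["sh", "-c", "sleep 10; echo 'inited' >> /init.log"]
      containers:
      - name: nginx             # assumed main container
        image: nginx:1.20
        imagePullPolicy: IfNotPresent
```

Note that `/init.log` is written to the init container's own filesystem, so the main container will not see it unless both containers mount a shared volume.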
3.3 After the deploy is updated, the new pods go through an init phase
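4. Taints and tolerations
- Node affinity is a property of Pods that attracts them to a particular set of nodes (either as a preference or as a hard requirement). Taints are the opposite: they allow a node to repel a particular set of Pods.
- Tolerations are applied to Pods. A toleration allows the scheduler to place a Pod on a node that carries a matching taint. Tolerations allow scheduling but do not guarantee it: the scheduler also evaluates other parameters as part of its work.
- Taints and tolerations work together to keep Pods off unsuitable nodes. One or more taints can be applied to a node, which means the node will not accept any Pod that does not tolerate those taints.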
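4.1 Taint
- Taint: set on a node. Once a node is tainted, k8s keeps Pods away from it unless the Pod declares that it tolerates the taint. A node can carry several taints, in which case a Pod must tolerate all of them to be scheduled onto that node.
- Effects of a taint:
  - NoSchedule: Pods that do not tolerate the taint are not scheduled onto the node, but Pods already running on the node are not evicted.
  - NoExecute: Pods that do not tolerate the taint are evicted immediately; Pods that tolerate it stay on the node.
    - If tolerationSeconds is not set, the Pod can keep running indefinitely.
    - With tolerationSeconds: 3600, the Pod keeps running on the node for another 3600 seconds and is evicted after that.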
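The tolerationSeconds behaviour described above could be expressed with a toleration like the following sketch. The key `tag` and value `test` are borrowed from the taint used later in these notes; pairing them with the NoExecute effect is an assumption made purely for illustration.

```yaml
# Fragment of a Pod spec (spec.tolerations); key and value are illustrative
tolerations:
- key: "tag"
  operator: "Equal"
  value: "test"
  effect: "NoExecute"
  tolerationSeconds: 3600   # stay on the node for 3600 s after the taint appears, then be evicted
```

Leaving out `tolerationSeconds` makes the toleration unbounded, so the Pod stays on the node for as long as the taint exists.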
4.1.1 Add a taint to node1 (k8s-node-01)
[root@k8s-master volume]# kubectl taint node k8s-node-01 tag=test:NoSchedule
node/k8s-node-01 tainted
[root@k8s-master volume]# kubectl describe no k8s-node-01
Name: k8s-node-01
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
ingress=true
kubernetes.io/arch=amd64
kubernetes.io/hostname=k8s-node-01
kubernetes.io/os=linux
type=microsvc
Annotations: flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"66:39:6c:7a:92:99"}
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: true
flannel.alpha.coreos.com/public-ip: 10.10.10.178
kubeadm.alpha.kubernetes.io/cri-socket: /var/run/cri-dockerd.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Mon, 19 Feb 2024 22:58:42 +0800
# The newly added taint is shown here
Taints: tag=test:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: k8s-node-01
AcquireTime: <unset>
RenewTime: Wed, 28 Feb 2024 01:17:51 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Mon, 26 Feb 2024 11:32:55 +0800 Mon, 26 Feb 2024 11:32:55 +0800 FlannelIsUp Flannel is running on this node
MemoryPressure False Wed, 28 Feb 2024 01:17:30 +0800 Mon, 26 Feb 2024 11:32:41 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 28 Feb 2024 01:17:30 +0800 Mon, 26 Feb 2024 11:32:41 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Wed, 28 Feb 2024 01:17:30 +0800 Mon, 26 Feb 2024 11:32:41 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Wed, 28 Feb 2024 01:17:30 +0800 Tue, 27 Feb 2024 01:59:54 +0800 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 10.10.10.177
Hostname: k8s-node-01
Capacity:
cpu: 2
ephemeral-storage: 62575768Ki
hugepages-2Mi: 0
memory: 3861288Ki
pods: 110
Allocatable:
cpu: 2
ephemeral-storage: 57669827694
hugepages-2Mi: 0
memory: 3758888Ki
pods: 110
System Info:
Machine ID: 9ee2b84718d0437fa9ea4380bdb34024
System UUID: A90F4D56-48C7-6739-A05A-A22B33EC7C5F
Boot ID: 6cb5ab07-c82b-4404-8f48-7b9abafe52f1
Kernel Version: 3.10.0-1160.el7.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://25.0.3
Kubelet Version: v1.25.0
Kube-Proxy Version: v1.25.0
PodCIDR: 10.2.2.0/24
PodCIDRs: 10.2.2.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default fluentd-59k8k 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d4h
default nginx-deploy-69ccc996f9-wqp55 100m (5%) 200m (10%) 128Mi (3%) 128Mi (3%) 61m
ingress-nginx ingress-nginx-controller-jn65t 100m (5%) 0 (0%) 90Mi (2%) 0 (0%) 2d5h
kube-flannel kube-flannel-ds-glkkb 100m (5%) 0 (0%) 50Mi (1%) 0 (0%) 8d
kube-system coredns-c676cc86f-pdsl6 100m (5%) 0 (0%) 70Mi (1%) 170Mi (4%) 6d14h
kube-system kube-proxy-n2w92 0 (0%) 0 (0%) 0 (0%) 0 (0%) 8d
kube-system metrics-server-7bb86dcf48-hfpb5 100m (5%) 0 (0%) 200Mi (5%) 0 (0%) 3d3h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 500m (25%) 200m (10%)
memory 538Mi (14%) 298Mi (8%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
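4.1.2 Check the taints on the master
- When ingress-nginx was installed earlier, it could not be deployed on the master node precisely because the master node is tainted
- The master node's taint is: Taints: node-role.kubernetes.io/control-plane:NoSchedule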
[root@k8s-master volume]# kubectl describe no k8s-master
Name: k8s-master
Roles: control-plane
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
ingress=true
kubernetes.io/arch=amd64
kubernetes.io/hostname=k8s-master
kubernetes.io/os=linux
node-role.kubernetes.io/control-plane=
node.kubernetes.io/exclude-from-external-load-balancers=
Annotations: flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"c2:fd:ef:b4:ea:aa"}
flannel.alpha.coreos.com/backend-type: vxlan
flannel.alpha.coreos.com/kube-subnet-manager: true
flannel.alpha.coreos.com/public-ip: 10.10.10.100
kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/cri-dockerd.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Mon, 19 Feb 2024 22:04:42 +0800
Taints: node-role.kubernetes.io/control-plane:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: k8s-master
AcquireTime: <unset>
RenewTime: Wed, 28 Feb 2024 01:21:44 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Thu, 22 Feb 2024 18:30:31 +0800 Thu, 22 Feb 2024 18:30:31 +0800 FlannelIsUp Flannel is running on this node
MemoryPressure False Wed, 28 Feb 2024 01:18:30 +0800 Mon, 19 Feb 2024 22:04:38 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 28 Feb 2024 01:18:30 +0800 Mon, 19 Feb 2024 22:04:38 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Wed, 28 Feb 2024 01:18:30 +0800 Mon, 19 Feb 2024 22:04:38 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Wed, 28 Feb 2024 01:18:30 +0800 Mon, 19 Feb 2024 23:35:15 +0800 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 10.10.10.100
Hostname: k8s-master
Capacity:
cpu: 2
ephemeral-storage: 62575768Ki
hugepages-2Mi: 0
memory: 3861288Ki
pods: 110
Allocatable:
cpu: 2
ephemeral-storage: 57669827694
hugepages-2Mi: 0
memory: 3758888Ki
pods: 110
System Info:
Machine ID: 9ee2b84718d0437fa9ea4380bdb34024
System UUID: AE134D56-9F2E-B64D-9BA2-6368B1379B3A
Boot ID: e0bf44a5-6a0d-4fc0-923d-f5d63089b93f
Kernel Version: 3.10.0-1160.el7.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://25.0.3
Kubelet Version: v1.25.0
Kube-Proxy Version: v1.25.0
PodCIDR: 10.2.0.0/24
PodCIDRs: 10.2.0.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
kube-flannel kube-flannel-ds-tpm8x 100m (5%) 0 (0%) 50Mi (1%) 0 (0%) 8d
kube-system coredns-c676cc86f-q7hcw 100m (5%) 0 (0%) 70Mi (1%) 170Mi (4%) 6d14h
kube-system etcd-k8s-master 100m (5%) 0 (0%) 100Mi (2%) 0 (0%) 8d
kube-system kube-apiserver-k8s-master 250m (12%) 0 (0%) 0 (0%) 0 (0%) 8d
kube-system kube-controller-manager-k8s-master 200m (10%) 0 (0%) 0 (0%) 0 (0%) 8d
kube-system kube-proxy-xtllb 0 (0%) 0 (0%) 0 (0%) 0 (0%) 8d
kube-system kube-scheduler-k8s-master 100m (5%) 0 (0%) 0 (0%) 0 (0%) 8d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 850m (42%) 0 (0%)
memory 220Mi (5%) 170Mi (4%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
4.1.3 Removing a taint
[root@k8s-master volume]# kubectl taint no k8s-master node-role.kubernetes.io/control-plane:NoSchedule-
node/k8s-master untainted
4.1.4 Check which nodes the pods in k8s are running on
[root@k8s-master volume]# kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
dns-test 1/1 Running 2 (37h ago) 3d22h 10.2.1.58 k8s-node-02 <none> <none>
fluentd-59k8k 1/1 Running 1 (37h ago) 3d5h 10.2.2.34 k8s-node-01 <none> <none>
fluentd-hhtls 1/1 Running 1 (37h ago) 3d4h 10.2.1.59 k8s-node-02 <none> <none>
nginx-deploy-69ccc996f9-stgcl 1/1 Running 1 (8m23s ago) 68m 10.2.1.78 k8s-node-02 <none> <none>
nginx-deploy-69ccc996f9-wqp55 1/1 Running 1 (8m8s ago) 68m 10.2.2.71 k8s-node-01 <none> <none>
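4.1.5 Test this scenario: delete the nginx pods and see whether they can be recreated on the master
4.1.6 Test adding the taint back to the master, this time with the effect NoExecute
- The NoExecute effect evicts the pods currently on the node that do not tolerate the taint; pods managed by a controller are then recreated on other nodes
- The master is now tainted and node1 is tainted as well, so the new pods end up on node2.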
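The node state behind this test can be sketched as the relevant fragments of the two Node objects. This is an illustration of what `spec.taints` would contain, not captured output; a live object carries many more fields.

```yaml
# Sketch only: the taint-related part of each Node object in this test
apiVersion: v1
kind: Node
metadata:
  name: k8s-master
spec:
  taints:
  - key: node-role.kubernetes.io/control-plane
    effect: NoExecute     # e.g. set with: kubectl taint node k8s-master node-role.kubernetes.io/control-plane:NoExecute
---
apiVersion: v1
kind: Node
metadata:
  name: k8s-node-01
spec:
  taints:
  - key: tag
    value: test
    effect: NoSchedule    # set in 4.1.1
```

k8s-node-02 carries no taint in this setup, which is why pods without tolerations can only land there.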
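4.2 Toleration
- Toleration: set on a pod. When a pod without tolerations is scheduled, it is not placed on a tainted node; only when the pod's tolerations match all of a node's taints can it be scheduled onto that node
4.2.1 k8s-node-01 has a taint with the effect NoSchedule
- The taint's key-value is: tag=test
- The effect is NoSchedule: "Pods that do not tolerate the taint are not scheduled onto the node, but Pods already running on the node are not evicted."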
[root@k8s-master volume]# kubectl describe no k8s-node-01 | grep -i tain
Taints: tag=test:NoSchedule
Container Runtime Version: docker://25.0.3
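4.2.2 Under the pod's spec, configure a toleration with the effect NoSchedule and the operator Equal
- operator: "Equal" means the toleration matches a node's taint only when the toleration on the pod and the taint on the node are the same
tolerations:
- key: tag # key of the taint
  value: test # value of the taint
  effect: "NoSchedule" # effect of the taint
  operator: "Equal" # value must equal the taint's value; can also be set to Exists, which only requires the key to exist, in which case value is not configured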
4.2.3 Modify the tolerations in the deploy resource's pod template
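Tolerations are a pod-level field, so in a Deployment they sit under `spec.template.spec`, next to `containers`. A minimal sketch of the relevant fragment follows; the container name and image are assumptions, and unrelated fields are left out.

```yaml
# Fragment of a Deployment; only the fields relevant to tolerations are shown
spec:
  template:
    spec:
      tolerations:          # sibling of containers, not nested inside a container
      - key: "tag"
        operator: "Equal"
        value: "test"
        effect: "NoSchedule"
      containers:
      - name: nginx         # assumed container name and image
        image: nginx:1.20
```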
4.2.4 Check how the pods change
- Pods can now be created on the node whose taint they tolerate; the nodes that were usable before are unaffected
4.2.5 Change the toleration in the deploy resource's pod template: effect NoSchedule, operator Exists
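With the Exists operator the toleration from 4.2.2 would shrink to something like this sketch, which reuses the `tag` key from the taint on k8s-node-01:

```yaml
# Fragment of the pod template (spec.template.spec.tolerations)
tolerations:
- key: "tag"
  operator: "Exists"      # matches the key regardless of its value; value must be left out
  effect: "NoSchedule"
```

Because no value is compared, this toleration keeps matching even if the taint's value on the node is later changed.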
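4.2.6 Check how the pods change
- The taint's effect is NoSchedule, so the node cannot normally be used; but with the Exists operator the toleration matches as soon as the taint's key is present, so after the deploy is updated pods can be created on this node.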