Resource scheduling (nodeSelector, nodeAffinity, Taints, Tolerations)
1. nodeSelector
nodeSelector is the simplest way to constrain where a pod may run. It is a field of the pod spec that matches against node labels.
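For orientation, here is a minimal sketch of where the field sits in a Pod manifest (the pod name is illustrative; the full demo below uses a Deployment):

apiVersion: v1
kind: Pod
metadata:
  name: nginx                # illustrative name
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:              # the pod may only run on nodes carrying this label
    disktype: ssd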
Use --show-labels to view the labels of a specified node:
[root@master haproxy]# kubectl get node node1 --show-labels
NAME    STATUS   ROLES    AGE     VERSION   LABELS
node1   Ready    <none>   4d12h   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux
[root@master haproxy]#
If no extra labels have been added, you will only see the default labels shown above. We can add a label to a specific node with the kubectl label node command:
[root@master haproxy]# kubectl label node node1 disktype=ssd
node/node1 labeled
[root@master haproxy]# kubectl get node node1 --show-labels    //the new label is now visible
NAME    STATUS   ROLES    AGE     VERSION   LABELS
node1   Ready    <none>   4d12h   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux
[root@master haproxy]#
You can also delete the specified labels through kubectl label node
[root@master haproxy]# kubectl label node node1 disktype-
node/node1 labeled
[root@master haproxy]# kubectl get node node1 --show-labels
NAME    STATUS   ROLES    AGE     VERSION   LABELS
node1   Ready    <none>   4d12h   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux
[root@master haproxy]#
Re-add the label, then create a pod that uses the nodeSelector field to bind it to that node:
[root@master haproxy]# kubectl label node node1 disktype=ssd
node/node1 labeled
[root@master haproxy]# kubectl get node node1 --show-labels
NAME    STATUS   ROLES    AGE     VERSION   LABELS
node1   Ready    <none>   4d12h   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux
[root@master haproxy]#
[root@master haproxy]# cat test.yml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: httpd2
  name: httpd2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpd2
  template:
    metadata:
      labels:
        app: httpd2
    spec:
      nodeSelector:          # schedule only onto nodes labeled disktype=ssd
        disktype: ssd
      containers:
      - image: 3199560936/httpd:v0.4
        name: httpd2
---
apiVersion: v1
kind: Service
metadata:
  name: httpd2
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: httpd2
[root@master haproxy]# kubectl create -f test.yml
deployment.apps/httpd2 created
service/httpd2 created
[root@master haproxy]#
View which node the pod was scheduled to:
[root@master haproxy]# kubectl get pod -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
httpd2-fd86fb676-l7gk4   1/1     Running   0          61s   10.244.1.45   node1   <none>           <none>
[root@master haproxy]#
As you can see, the pod was forcibly scheduled to the node carrying the disktype=ssd label.
2. nodeAffinity
nodeAffinity is the node affinity scheduling policy, introduced as a more expressive replacement for nodeSelector. Two kinds of node affinity expression are currently supported:
- requiredDuringSchedulingIgnoredDuringExecution: the rules must be met before the pod can be scheduled onto a node; this is a hard limit.
- preferredDuringSchedulingIgnoredDuringExecution: the scheduler tries to place the pod on a node that satisfies the rules, but does not insist on it; this is a soft limit. Multiple preference rules can be given weight values to define their relative priority.
IgnoredDuringExecution means: if the labels of the node a pod is running on change while the pod is running, so that the pod's node affinity rules are no longer satisfied, the system ignores the label change and the pod keeps running on that node.
The operators supported by nodeAffinity matchExpressions include the following (a short sketch using several of them follows the list):
- In: the label's value is in a given list
- NotIn: the label's value is not in a given list
- Exists: the label exists
- DoesNotExist: the label does not exist
- Gt: the label's value is greater than a given value
- Lt: the label's value is less than a given value
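A minimal sketch of the less common operators (the env and gpu-count label keys are made up for this example):

# Sketch only: "env" and "gpu-count" are hypothetical label keys.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: env
          operator: NotIn      # the node's env label must not be "test"
          values: ["test"]
        - key: disktype
          operator: Exists     # the disktype label only has to be present
        - key: gpu-count
          operator: Gt         # the label value, parsed as an integer, must be > 1
          values: ["1"]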
Precautions for nodeAffinity rule setting are as follows:
- If both nodeSelector and nodeAffinity are defined, a node must satisfy both before the pod can be scheduled onto it (see the sketch after this list).
- If nodeAffinity specifies multiple nodeSelectorTerms, matching any one of them is enough (the terms are ORed).
- If a nodeSelectorTerm contains multiple matchExpressions, a node must satisfy all of them (the expressions are ANDed) before the pod can run on it.
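For the first point, a hedged sketch of a Pod spec that combines both mechanisms (the pod name is illustrative; kubernetes.io/os=linux is one of the default node labels shown earlier): only a node that carries disktype=ssd and matches the kubernetes.io/os expression is eligible.

apiVersion: v1
kind: Pod
metadata:
  name: both-constraints       # illustrative name
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:                # condition 1: must be satisfied
    disktype: ssd
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:     # condition 2: must also be satisfied
        - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values: ["linux"]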
[root@master haproxy]# cat test.yml
apiVersion: v1
kind: Pod
metadata:
  name: test1
  labels:
    app: nginx
spec:
  containers:
  - name: test1
    image: nginx
    imagePullPolicy: IfNotPresent
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            values:
            - ssd
            operator: In
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 10
        preference:
          matchExpressions:
          - key: name
            values:
            - test
            operator: In
[root@master haproxy]#
The node2 host is also labeled disktype=ssd
[root@master haproxy]# kubectl label node node2 disktype=ssd
node/node2 labeled
[root@master haproxy]# kubectl get node node2 --show-labels
NAME    STATUS   ROLES    AGE     VERSION   LABELS
node2   Ready    <none>   4d12h   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=node2,kubernetes.io/os=linux
[root@master haproxy]#
Test
Label node1 with name=sb
[root@master ~]# kubectl label node node1 name=sb
node/node1 labeled
[root@master ~]# kubectl get node node1 --show-labels
NAME    STATUS   ROLES    AGE     VERSION   LABELS
node1   Ready    <none>   4d12h   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux,name=sb
[root@master ~]#
Create a pod and check the result:
[root@master haproxy]# cat httpd.yml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: httpd2
  name: httpd2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpd2
  template:
    metadata:
      labels:
        app: httpd2
    spec:
      containers:
      - image: 3199560936/httpd:v0.4
        name: httpd2
---
apiVersion: v1
kind: Service
metadata:
  name: httpd2
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: httpd2
[root@master haproxy]#
[root@master haproxy]# kubectl apply -f httpd.yml
deployment.apps/httpd2 created
service/httpd2 created
[root@master haproxy]#
[root@master haproxy]# kubectl get pod -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
httpd2-fd86fb676-b8pqx   1/1     Running   0          13s   10.244.1.46   node1   <none>           <none>
[root@master haproxy]#
Delete name=sb and view the test results
[root@master haproxy]# kubectl label node node1 name-
node/node1 labeled
[root@master haproxy]# kubectl get node node1 --show-labels
NAME    STATUS   ROLES    AGE     VERSION   LABELS
node1   Ready    <none>   4d12h   v1.20.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/arch=amd64,kubernetes.io/hostname=node1,kubernetes.io/os=linux
[root@master haproxy]#
[root@master haproxy]# kubectl apply -f haproxy.yml
deployment.apps/haproxy created
service/haproxy created
[root@master haproxy]# kubectl get pod -o wide
NAME                       READY   STATUS              RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
haproxy-74f8f5c6cf-6pf9w   0/1     ContainerCreating   0          8s    <none>        node2   <none>           <none>
httpd2-fd86fb676-xggxk     1/1     Running             0          65s   10.244.1.47   node1   <none>           <none>
[root@master haproxy]#
In summary, the pod above is first required to run on a node labeled disktype=ssd; when several nodes carry that label, a node labeled name=sb is preferred.
3. Taints and Tolerations
Taints: prevent Pods from being scheduled onto a particular Node
Tolerations: allow a Pod to be scheduled onto a Node that carries Taints
Application scenarios:
- Dedicated nodes: nodes are grouped and managed per business line; such a Node should not be schedulable by default, and Pods are placed there only when they are configured with a matching taint toleration (see the sketch after this list)
- Nodes with special hardware: some nodes carry SSDs or GPUs; such a Node should not be schedulable by default, and Pods are placed there only when they tolerate the taint
- Taint-based eviction
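A hedged sketch of the dedicated-node case (the key dedicated and value web are invented for the example): taint the node, then give the privileged Pod a matching toleration.

# Hypothetical example: the key "dedicated" and value "web" are made up.
# Taint the node so that ordinary Pods are kept off it:
#   kubectl taint node node1 dedicated=web:NoSchedule
apiVersion: v1
kind: Pod
metadata:
  name: web-pod               # illustrative name
spec:
  containers:
  - name: web
    image: nginx
  tolerations:                # lets this Pod be scheduled despite the taint
  - key: "dedicated"
    operator: "Equal"
    value: "web"
    effect: "NoSchedule"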
Effect description
In the example above, the value of effect is NoSchedule. Briefly, effect can take the following values:
- NoSchedule: if a pod does not declare a toleration for this Taint, the system will not schedule the pod onto the node carrying it
- PreferNoSchedule: the soft version of NoSchedule; if a pod does not tolerate this Taint, the system tries to avoid scheduling it onto the node, but this is not mandatory
- NoExecute: defines the eviction behaviour used to handle node failures. A Taint with effect NoExecute affects pods already running on the node as follows:
  - pods without a matching toleration are evicted immediately
  - pods with a matching toleration but no tolerationSeconds value stay on the node indefinitely
  - pods with a matching toleration and a tolerationSeconds value are evicted after that many seconds
- Since Kubernetes 1.6, an alpha feature has been available that represents node problems as Taints (currently only for the node being unreachable or not ready, i.e. the corresponding NodeCondition "Ready" being Unknown or False). When the TaintBasedEvictions feature is enabled (by adding TaintBasedEvictions=true to the --feature-gates parameter), the NodeController automatically adds such Taints to the node, and the previous eviction logic based on the node's "Ready" status is disabled. Note that, to preserve the existing rate limits on pod eviction during node failures, the system adds these Taints gradually in a rate-limited way, which prevents large numbers of pods from being evicted in certain situations (for example when the master temporarily loses contact with nodes). The mechanism is compatible with tolerationSeconds, which lets a pod define how long it stays on a failed node before being evicted (a small sketch follows).
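For instance, a toleration with tolerationSeconds in a Pod spec might be written as below; the 300-second value mirrors the default tolerations visible in the kubectl describe pod output later in this section:

tolerations:
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 300     # stay on an unreachable node for 300s, then be evicted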
Taint
[root@master ~]# kubectl describe node master
......(output omitted)......
CreationTimestamp:  Sat, 18 Dec 2021 22:07:52 -0500
Taints:             node-role.kubernetes.io/master:NoSchedule   //prevents pods from being scheduled onto this node
Unschedulable:      false
Tolerations (taint toleration)
[root@master ~]# kubectl describe pod httpd2-fd86fb676-xnrcc
......(output omitted)......
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
                 //allows the pod to be scheduled to, and remain on, a node carrying these Taints
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  57s   default-scheduler  Successfully assigned default/httpd2-fd86fb676-xnrcc to node1
  Normal  Pulled     56s   kubelet            Container image "3199560936/httpd:v0.4" already present on machine
  Normal  Created    56s   kubelet            Created container httpd2
  Normal  Started    56s   kubelet            Started container httpd2
[root@master ~]#
Add a taint to a node
Format: kubectl taint node [node] key=value:[effect]
Where [effect] can be:
- NoSchedule: the node will definitely not be scheduled
- PreferNoSchedule: try to avoid scheduling onto the node; not mandatory
- NoExecute: not only will new Pods not be scheduled, existing Pods on the Node are also evicted
A taint toleration is added to the Pod configuration through the tolerations field, for example as sketched below.
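A minimal sketch of that field, written to tolerate the value-less disktype:NoSchedule taint used in the example below (note that the demo manifests in this section do not declare it, which is why the haproxy pod ends up on node2):

tolerations:
- key: "disktype"
  operator: "Exists"     # matches a taint with key "disktype" regardless of its value
  effect: "NoSchedule"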
Add a taint with key disktype
[root@master ~]# kubectl taint node node1 disktype:NoSchedule
node/node1 tainted
[root@master ~]#
View
[root@master ~]# kubectl describe node node1
......(output omitted)......
CreationTimestamp:  Sat, 18 Dec 2021 22:10:36 -0500
Taints:             disktype:NoSchedule      //the taint has been added successfully
Unschedulable:      false
......(output omitted)......
[root@master ~]#
Create a container to test
[root@master haproxy]# cat haproxy.yml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: haproxy
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: haproxy
  template:
    metadata:
      labels:
        app: haproxy
    spec:
      containers:
      - image: 93quan/haproxy:v1-alpine
        imagePullPolicy: Always
        env:
        - name: RSIP
          value: "10.106.56.19 10.96.149.182"
        name: haproxy
        ports:
        - containerPort: 80
          hostPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: haproxy
  namespace: default
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: haproxy
  type: NodePort
[root@master haproxy]#
[root@master haproxy]# kubectl create -f haproxy.yml
deployment.apps/haproxy created
service/haproxy created
View
[root@master haproxy]# kubectl get pods -o wide    #node1 is tainted, so the newly created container lands on node2
NAME                       READY   STATUS              RESTARTS   AGE     IP            NODE    NOMINATED NODE   READINESS GATES
haproxy-74f8f5c6cf-k8867   0/1     ContainerCreating   0          2m47s   <none>        node2   <none>           <none>
httpd2-fd86fb676-xnrcc     1/1     Running             0          17m     10.244.1.44   node1   <none>           <none>
[root@master haproxy]#
Remove a taint
Syntax: kubectl taint node [node] key:[effect]-
[root@master haproxy]# kubectl taint node node1 disktype-
node/node1 untainted
[root@master haproxy]#
View
[root@master haproxy]# kubectl describe node node1
......(output omitted)......
CreationTimestamp:  Sat, 18 Dec 2021 22:10:36 -0500
Taints:             <none>                   //the taint has been removed successfully
Unschedulable:      false