Simply put: k8s taints and tolerations

Taints and tolerations

Taints and tolerations work together to keep Pods away from nodes that are not appropriate for them.

One or more taints can be applied to a node; this marks the node so that it will not accept any Pods that do not tolerate those taints.

Applying tolerations to Pods means that those Pods can (but are not required to) be scheduled onto nodes with matching taints.
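
For a quick sense of how the two sides pair up: the command below taints a node, and only Pods that declare the matching toleration can still land on it (a minimal illustration; the node name, key and value are arbitrary placeholders):

kubectl taint nodes node1 dedicated=gpu:NoSchedule

tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "gpu"
  effect: "NoSchedule"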

Taints

1. Composition of a taint

Using the kubectl taint command, you can set a taint on a Node. Once a Node carries a taint, there is a mutually exclusive relationship between the Node and Pods: Pods that do not tolerate the taint will not be scheduled to that Node, and Pods already running on the Node may even be evicted.

key=value:effect

Each taint has a key and a value as its label, where the value can be empty; the effect describes what the taint does (its representation on the Node object is sketched after this list). Currently the taint effect supports the following three options:

● NoSchedule: tells k8s not to schedule Pods onto a Node that carries this taint
● PreferNoSchedule: tells k8s to try to avoid scheduling Pods onto a Node that carries this taint
● NoExecute: tells k8s not to schedule Pods onto a Node that carries this taint, and to evict the Pods already running on it
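
On the Node object itself, taints live under spec.taints; a taint written on the command line as key1=value1:NoSchedule corresponds to the following entry (a sketch of the stored form, not a file used in this article):

spec:
  taints:
  - key: "key1"
    value: "value1"
    effect: "NoSchedule"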

2. Setting, viewing and removing taints

# Set a taint
kubectl taint nodes node1 key1=value1:NoSchedule
# View it: in the node description, look for the Taints field
kubectl describe node node1
# Result:
Taints:             key1=value1:NoSchedule
# Remove the taint
kubectl taint nodes node1 key1:NoSchedule-
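
Besides kubectl describe, the taints of every node can be listed in one go with a jsonpath query (just a convenience sketch; it is not used in the walkthrough below):

kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'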

Example:

Taint node1

# Set the taint
[root@localhost ~]# kubectl taint nodes node1 key1=value1:NoSchedule

Test with a Deployment of three replicas:

[root@localhost ~]# vi dp.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
   name: myapp-deploy
   namespace: default
spec:
   replicas: 3
   selector:
     matchLabels:
       app: myapp
       release: stabel
   template:
     metadata:
       labels:
         app: myapp
         release: stabel
         env: test
     spec: 
       containers:
        - name: myapp
          image: nginx
          imagePullPolicy: IfNotPresent
          ports:
           - name: http
             containerPort: 80

Apply dp.yaml

[root@localhost ~]# kubectl apply -f dp.yaml

All three Pods are scheduled to node2 (node1 is tainted):

[root@localhost ~]# kubectl get pod -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
myapp-deploy-5db79b5cb6-c6phz   1/1     Running   0          26s   10.244.2.30   node2   <none>           <none>
myapp-deploy-5db79b5cb6-gkqlc   1/1     Running   0          26s   10.244.2.29   node2   <none>           <none>
myapp-deploy-5db79b5cb6-qkjb7   1/1     Running   0          26s   10.244.2.28   node2   <none>           <none>

Remove the taint

[root@localhost ~]# kubectl taint nodes node1 key1:NoSchedule-

The existing Pods stay on node2; after one of them is deleted, its replacement can be scheduled to node1:

[root@localhost ~]# kubectl get pod -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
myapp-deploy-5db79b5cb6-c6phz   1/1     Running   0          35m   10.244.2.30   node2   <none>           <none>
myapp-deploy-5db79b5cb6-gkqlc   1/1     Running   0          35m   10.244.2.29   node2   <none>           <none>
myapp-deploy-5db79b5cb6-qkjb7   1/1     Running   0          35m   10.244.2.28   node2   <none>           <none>

[root@localhost ~]# kubectl delete pod myapp-deploy-5db79b5cb6-c6phz
pod "myapp-deploy-5db79b5cb6-c6phz" deleted

[root@localhost ~]# kubectl get pod -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
myapp-deploy-5db79b5cb6-dpcjf   1/1     Running   0          6s    10.244.1.34   node1   <none>           <none>
myapp-deploy-5db79b5cb6-gkqlc   1/1     Running   0          35m   10.244.2.29   node2   <none>           <none>
myapp-deploy-5db79b5cb6-qkjb7   1/1     Running   0          35m   10.244.2.28   node2   <none>           <none>

Taint node1 with NoExecute (the Pod running on node1 is evicted and recreated on node2):

[root@localhost ~]# kubectl get pod -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
myapp-deploy-5db79b5cb6-dpcjf   1/1     Running   0          6s    10.244.1.34   node1   <none>           <none>
myapp-deploy-5db79b5cb6-gkqlc   1/1     Running   0          35m   10.244.2.29   node2   <none>           <none>
myapp-deploy-5db79b5cb6-qkjb7   1/1     Running   0          35m   10.244.2.28   node2   <none>           <none>

[root@localhost ~]# kubectl taint nodes node1 key1=value1:NoExecute
node/node1 tainted

[root@localhost ~]# kubectl get pod -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
myapp-deploy-5db79b5cb6-gkqlc   1/1     Running   0          52m   10.244.2.29   node2   <none>           <none>
myapp-deploy-5db79b5cb6-qkjb7   1/1     Running   0          52m   10.244.2.28   node2   <none>           <none>
myapp-deploy-5db79b5cb6-x9kc9   1/1     Running   0          2s    10.244.2.31   node2   <none>           <none>

Tolerations

Depending on the taint's effect (NoSchedule, PreferNoSchedule or NoExecute), a tainted Node repels Pods to a certain extent: Pods are not scheduled to it, or are even evicted from it. However, we can set tolerations on a Pod; a Pod with a matching toleration can tolerate the taint and can be scheduled onto the tainted Node.

pod.spec.tolerations

tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoSchedule"
  tolerationSeconds: 3600
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoExecute"
- key: "key2"
  operator: "Exists"
  effect: "NoSchedule"

● the key, value and effect must match the taint set on the Node
● if the operator is Exists, the value is ignored
● tolerationSeconds describes how long a Pod can keep running on the Node after it is marked for eviction (see the sketch after this list)
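
As an illustration of tolerationSeconds, Kubernetes itself adds the NoExecute taint node.kubernetes.io/unreachable to a node that becomes unreachable; a Pod can declare how long it is willing to stay on such a node before being evicted (a sketch; the 300 seconds is an arbitrary choice):

tolerations:
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 300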

1. When no key is specified, all taint keys are tolerated (Pods can then also be scheduled to the master node):

tolerations:
- operator: "Exists"

2. When no effect is specified, all effects of taints with that key are tolerated

tolerations:
- key: "key"
  operator: "Exists"

3. When there are multiple master nodes, the following setting can be used to avoid wasting their resources (not tested yet; see the sketch after the command)

kubectl taint nodes Node-Name node-role.kubernetes.io/master=:PreferNoSchedule
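
Untested here as well, but the idea would be to first remove the hard NoSchedule taint that kubeadm places on the master and then re-add it as a soft preference (the node name master is a placeholder; newer clusters use the key node-role.kubernetes.io/control-plane instead of .../master):

# Remove the default hard taint (note the trailing "-")
kubectl taint nodes master node-role.kubernetes.io/master:NoSchedule-
# Add it back as a soft preference
kubectl taint nodes master node-role.kubernetes.io/master=:PreferNoSchedule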

Examples:

Example 1: use the following toleration

tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoExecute"
[root@localhost ~]# vi dep-tolerator.yml 
apiVersion: apps/v1
kind: Deployment
metadata:
   name: myapp-deploy
   namespace: default
spec:
   replicas: 3
   selector:
     matchLabels:
       app: myapp
       release: stabel
   template:
     metadata:
       labels:
         app: myapp
         release: stabel
         env: test
     spec: 
       tolerations:
        - key: key1
          operator: Equal
          value: value1
          effect: NoExecute
       containers:
        - name: myapp
          image: nginx
          imagePullPolicy: IfNotPresent
          ports:
           - name: http
             containerPort: 80

Looking at the results, you can see that Pods are scheduled to both node1 and node2 (node1 still carries the NoExecute taint):

[root@localhost ~]# kubectl apply -f dep-tolerator.yml 
deployment.apps/myapp-deploy created
[root@localhost ~]# kubectl get pod -o wide
NAME                            READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
myapp-deploy-67d655bffc-7bn6n   1/1     Running   0          12s   10.244.2.32   node2   <none>           <none>
myapp-deploy-67d655bffc-9fqc2   1/1     Running   0          12s   10.244.1.36   node1   <none>           <none>
myapp-deploy-67d655bffc-vmmd4   1/1     Running   0          12s   10.244.1.35   node1   <none>           <none>

Example 2: tolerate everything, using

tolerations:
- operator: "Exists"
[root@localhost ~]# vi dep-tolerator-all.yml 
apiVersion: apps/v1
kind: Deployment
metadata:
   name: myapp-deploy
   namespace: default
spec:
   replicas: 3
   selector:
     matchLabels:
       app: myapp
       release: stabel
   template:
     metadata:
       labels:
         app: myapp
         release: stabel
         env: test
     spec: 
       tolerations:
        - operator: Exists
       containers:
        - name: myapp
          image: nginx
          imagePullPolicy: IfNotPresent
          ports:
           - name: http
             containerPort: 80
[root@localhost ~]#  kubectl apply -f dep-tolerator-all.yml

Result: Pods are scheduled even to the master node

[root@localhost ~]# kubectl get pod -o wide
NAME                          READY   STATUS    RESTARTS   AGE   IP            NODE     NOMINATED NODE   READINESS GATES
myapp-deploy-b9d684ff-2525s   1/1     Running   0          21m   10.244.2.33   node2    <none>           <none>
myapp-deploy-b9d684ff-bsrbg   1/1     Running   0          21m   10.244.0.8    master   <none>           <none>
myapp-deploy-b9d684ff-fggcg   1/1     Running   0          21m   10.244.1.37   node1    <none>           <none>

Example 3: tolerate any taint whose key is "key1" (regardless of value and effect)

tolerations:
- key: "key1"
  operator: "Exists"
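
Dropped into the same Deployment template used above, this toleration would let the Pods run on node1 whether the key1 taint uses NoSchedule or NoExecute, since no effect is specified (a sketch of the relevant part of the Pod template; not run in this article):

spec:
  tolerations:
  - key: "key1"
    operator: "Exists"
  containers:
  - name: myapp
    image: nginx
    imagePullPolicy: IfNotPresent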
