Kubernetes workload controllers
1. Kubernetes workload controllers
A workload is an application running on Kubernetes.
Whether your workload is a single component or several components that work together, you run it in a set of Pods. In Kubernetes, a Pod represents a set of containers running on the cluster.
Kubernetes Pods have a defined lifecycle. For example, once a Pod is running in your cluster and a fatal error then occurs on the node hosting it, all Pods on that node fail. Kubernetes treats such a failure as final: even if the node later returns to healthy operation, you need to create a new Pod to recover the application.
To make life easier, however, you do not need to manage each Pod directly. Instead, you can use workload resources that manage a set of Pods on your behalf. These resources configure controllers that make sure the right number of Pods, of the right kind, are running and match the state you specified.
Common workload controllers:
A Deployment provides declarative update capabilities for Pods and ReplicaSets.
You are responsible for describing the target state in the Deployment, and the Deployment controller changes the actual state to the desired state at a controlled rate. You can define a Deployment to create a new ReplicaSet, or delete an existing Deployment and adopt its resources through the new Deployment.
Deployment is very suitable for managing stateless applications on your cluster: all Pods in a Deployment are equivalent to one another and are replaced when necessary.
```
[root@master haproxy]# cat nginx.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy
  labels:
    app: nginx
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
[root@master haproxy]# kubectl apply -f nginx.yml
deployment.apps/deploy created
[root@master haproxy]# kubectl get pod
NAME                     READY   STATUS              RESTARTS   AGE
deploy-585449566-7hsbk   0/1     ContainerCreating   0          9s
deploy-585449566-l2m4l   0/1     ContainerCreating   0          9s
deploy-585449566-pldrf   0/1     ContainerCreating   0          9s
deploy-585449566-rnh8q   0/1     ContainerCreating   0          9s
deploy-585449566-t6dw4   0/1     ContainerCreating   0          9s
[root@master haproxy]#
```
In this example:
A Deployment named deploy (indicated by the .metadata.name field) is created.
This Deployment creates five replicated Pods (indicated by the .spec.replicas field).
The .spec.selector field defines how the Deployment finds which Pods to manage. Here you simply select a label defined in the Pod template (app: nginx). However, more sophisticated selection rules are possible, as long as the Pod template itself satisfies the rule.
The template field contains the following subfields:
The Pods are labeled app: nginx via the .metadata.labels field.
The Pod template specification (the .template.spec field) instructs the Pods to run one nginx container, using the image given in the manifest (the nginx:latest Docker Hub image in this example).
The container is created and named nginx via the .spec.containers[0].name field.
The purpose of ReplicaSet is to maintain a stable set of Pod replicas that are running at any time. Therefore, it is usually used to ensure the availability of a given number of identical pods.
How ReplicaSet works
A ReplicaSet is defined by a set of fields, including a selector that specifies how to identify the Pods it can acquire, a replica count indicating how many Pods it should maintain, a Pod template specifying the Pods it should create to satisfy that count, and so on. Whenever a ReplicaSet needs to create new Pods, it uses the provided Pod template.
A ReplicaSet is linked to its Pods via the Pods' metadata.ownerReferences field, which specifies which resource owns the current object. The Pods acquired by a ReplicaSet carry their owning ReplicaSet's identifying information in this ownerReferences field. It is through this link that the ReplicaSet knows the state of the Pods it maintains and plans its actions accordingly.
A ReplicaSet identifies the Pods it can acquire by using its selector. If a Pod has no OwnerReference, or its OwnerReference is not a controller, and it matches a ReplicaSet's selector, that Pod is immediately acquired by the ReplicaSet.
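As an illustration of this linkage, a Pod owned by a ReplicaSet carries an entry in its metadata roughly like the following sketch; the ReplicaSet name and the uid here are made-up placeholders, not values from the examples in this article:

```yaml
# Hypothetical excerpt of "kubectl get pod <pod-name> -o yaml"
# for a Pod acquired by a ReplicaSet.
metadata:
  ownerReferences:
  - apiVersion: apps/v1
    kind: ReplicaSet
    name: frontend                               # owning ReplicaSet (placeholder)
    uid: 11111111-2222-3333-4444-555555555555    # placeholder uid
    controller: true            # marks this owner as the managing controller
    blockOwnerDeletion: true
```

Because `controller: true` is set, other controllers will not try to acquire this Pod even if it also matches their selectors.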
ReplicaSet ensures that a specified number of Pod replicas are running at any time. However, Deployment is a more advanced concept that manages the ReplicaSet and provides declarative updates and many other useful functions to the Pod. Therefore, we recommend using Deployment instead of directly using ReplicaSet, unless you need to customize the update business process or do not need to update at all.
This actually means that you may never need to manipulate the ReplicaSet object: instead, use Deployment and define your application in the spec section.
```
[root@master haproxy]# cat replicaset.yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: replicaset
  labels:
    app: httpd
    tier: frontend
spec:
  replicas: 5
  selector:
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
      - name: httpd
        image: httpd:latest
[root@master haproxy]#
```
```
[root@master haproxy]# kubectl apply -f replicaset.yaml
replicaset.apps/replicaset created
[root@master haproxy]# kubectl get rs
NAME               DESIRED   CURRENT   READY   AGE
deploy-585449566   5         5         5       4m16s
replicaset         5         5         0       9s
[root@master haproxy]# kubectl get pod
NAME                     READY   STATUS              RESTARTS   AGE
deploy-585449566-7hsbk   1/1     Running             0          4m23s
deploy-585449566-l2m4l   1/1     Running             0          4m23s
deploy-585449566-pldrf   1/1     Running             0          4m23s
deploy-585449566-rnh8q   1/1     Running             0          4m23s
deploy-585449566-t6dw4   1/1     Running             0          4m23s
replicaset-2x2rb         0/1     ContainerCreating   0          16s
replicaset-6dbv2         0/1     ContainerCreating   0          16s
replicaset-6qvr7         0/1     ContainerCreating   0          16s
replicaset-f78lc         0/1     ContainerCreating   0          16s
replicaset-zfrgg         0/1     ContainerCreating   0          16s
[root@master haproxy]#
```
A DaemonSet ensures that all (or some) nodes run a copy of a Pod. As nodes join the cluster, Pods are added to them; as nodes are removed from the cluster, those Pods are garbage-collected. Deleting a DaemonSet cleans up all the Pods it created.
Some typical uses of a DaemonSet:
Run the cluster daemon on each node
Run the log collection daemon on each node
Run the monitoring daemon on each node
A simple use is to start one DaemonSet, covering all nodes, for each type of daemon. A slightly more complex setup deploys multiple DaemonSets for the same kind of daemon, each with different flags and with different memory and CPU requirements for different hardware types.
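The second pattern can be sketched as a fragment like the one below, assuming nodes are labeled with a hypothetical hardware-type label; only the fields that would differ between the DaemonSet variants are shown, and the image name and resource figures are assumptions:

```yaml
# Fragment of a DaemonSet variant targeted at high-memory nodes.
# "hardware-type" is an assumed node label, not a built-in one.
spec:
  template:
    spec:
      nodeSelector:
        hardware-type: high-memory
      containers:
      - name: logging-daemon
        image: example/logging-daemon:latest   # placeholder image
        resources:
          requests:
            memory: 1Gi     # larger request than the default variant
            cpu: 200m
```

A second DaemonSet for the same daemon would use a different nodeSelector value and smaller resource requests.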
```
[root@master haproxy]# cat daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: fluentd-elasticsearch
        image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
[root@master haproxy]#
```
```
[root@master haproxy]# kubectl apply -f daemonset.yaml
daemonset.apps/fluentd-elasticsearch created
[root@master haproxy]# kubectl get pod -n kube-system
NAME                             READY   STATUS              RESTARTS   AGE
coredns-7f89b7bc75-hhvts         1/1     Running             10         5d10h
coredns-7f89b7bc75-w8gkn         1/1     Running             10         5d10h
etcd-master                      1/1     Running             10         5d10h
fluentd-elasticsearch-hd8r6      0/1     ContainerCreating   0          9s
fluentd-elasticsearch-lcbm7      0/1     ContainerCreating   0          9s
fluentd-elasticsearch-nwvkc      0/1     ContainerCreating   0          9s
kube-apiserver-master            1/1     Running             10         5d10h
kube-controller-manager-master   1/1     Running             10         5d10h
kube-flannel-ds-bbxkd            1/1     Running             8          5d10h
kube-flannel-ds-ghdmx            1/1     Running             12         5d10h
kube-flannel-ds-r6dkd            1/1     Running             8          5d10h
kube-proxy-f2z2c                 1/1     Running             8          5d10h
kube-proxy-q46l4                 1/1     Running             8          5d10h
kube-proxy-x4wwq                 1/1     Running             10         5d10h
kube-scheduler-master            1/1     Running             10         5d10h
[root@master haproxy]#
```
3. How are daemon Pods scheduled
Scheduling via default scheduler
FEATURE STATE: Kubernetes v1.23 [stable]
A DaemonSet ensures that all eligible nodes run a copy of a Pod. Normally, the Kubernetes scheduler selects the node a Pod runs on. However, DaemonSet Pods are created and scheduled by the DaemonSet controller instead, which introduces the following issues:
- Inconsistent Pod behavior: normal Pods wait in the Pending state for scheduling after they are created, but DaemonSet Pods are not created in the Pending state. This is confusing to users.
- Pod preemption is handled by the default scheduler; when preemption is enabled, the DaemonSet controller makes scheduling decisions without regard for Pod priority and preemption.
- ScheduleDaemonSetPods lets you schedule DaemonSets using the default scheduler instead of the DaemonSet controller, by adding a NodeAffinity term to the DaemonSet Pods instead of the spec.nodeName field. The default scheduler is then used to bind the Pod to the target host. If node affinity is already configured on the DaemonSet Pod, it is replaced (the original node affinity is taken into account before the target host is selected). The DaemonSet controller performs these operations only when creating or modifying DaemonSet Pods, and it does not change the DaemonSet's spec.template.
```
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchFields:
      - key: metadata.name
        operator: In
        values:
        - target-host-name
```
In addition, the system automatically adds the node.kubernetes.io/unschedulable:NoSchedule toleration to DaemonSet Pods, so the default scheduler ignores unschedulable nodes when scheduling DaemonSet Pods.
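As a sketch, the toleration added automatically would appear in the Pod spec roughly like this:

```yaml
# Toleration added to DaemonSet Pods by the system, allowing them
# to be scheduled onto nodes marked unschedulable.
tolerations:
- key: node.kubernetes.io/unschedulable
  operator: Exists
  effect: NoSchedule
```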
A Job creates one or more Pods and keeps retrying execution of the Pods until the specified number of them terminate successfully. As Pods complete successfully, the Job tracks how many have succeeded; when that number reaches the specified threshold, the task (i.e. the Job) is complete. Deleting a Job cleans up all the Pods it created. Suspending a Job deletes its active Pods until the Job is resumed.
In a simple usage scenario, you create one Job object in order to reliably run one Pod to completion. The Job object starts a new Pod if the first Pod fails or is deleted (for example, due to a node hardware failure or a node reboot).
You can also use Job to run multiple pods in parallel.
```
[root@master haproxy]# cat jobs.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4
[root@master haproxy]#
```
```
[root@master haproxy]# kubectl apply -f jobs.yaml
job.batch/pi created
[root@master haproxy]# kubectl describe jobs/pi
Name:           pi
Namespace:      default
Selector:       controller-uid=7dc2f97e-c321-4aec-a976-b8c4aa16aae8
Labels:         controller-uid=7dc2f97e-c321-4aec-a976-b8c4aa16aae8
                job-name=pi
Annotations:    <none>
Parallelism:    1
Completions:    1
Start Time:     Fri, 24 Dec 2021 09:10:20 -0500
Pods Statuses:  1 Running / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=7dc2f97e-c321-4aec-a976-b8c4aa16aae8
           job-name=pi
  Containers:
   pi:
    Image:        perl
    Port:         <none>
    Host Port:    <none>
    Command:
      perl
      -Mbignum=bpi
      -wle
      print bpi(2000)
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  9s    job-controller  Created pod: pi-sgwvx
[root@master haproxy]#
```
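For the parallel case mentioned above, a minimal sketch: adding .spec.parallelism and .spec.completions to the manifest runs several Pods at once until the desired number succeed. The name and the numeric values below are arbitrary illustrations:

```yaml
# Hypothetical parallel variant of the pi Job.
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-parallel            # placeholder name
spec:
  completions: 8               # the Job completes after 8 Pods succeed
  parallelism: 2               # run at most 2 Pods at the same time
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4
```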
FEATURE STATE: Kubernetes v1.21 [stable]
A CronJob creates Jobs on a repeating schedule.
A CronJob object is like a line in a crontab (cron table) file. It is written in Cron format and executes jobs periodically at a given scheduling time.
All CronJob schedule times are based on the time zone of the kube-controller-manager.
If your control plane runs the kube-controller-manager in a Pod or a bare container, the time zone set for that container determines the time zone used by the CronJob controller.
When creating the manifest for a CronJob resource, make sure the name you provide is a valid DNS subdomain name and is no longer than 52 characters. This is because the CronJob controller automatically appends 11 characters to the Job name it generates, and Job names are limited to a maximum of 63 characters.
CronJob is used to perform periodic actions, such as backup, report generation, etc. Each of these tasks should be configured to repeat periodically (e.g. daily / weekly / monthly); you can define the time interval at which the task starts to execute.
```
[root@master haproxy]# cat cronjob.yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
[root@master haproxy]#
```
```
[root@master haproxy]# kubectl apply -f cronjob.yaml
cronjob.batch/hello created
[root@master haproxy]# kubectl get pods
NAME                     READY   STATUS      RESTARTS   AGE
deploy-585449566-7hsbk   1/1     Running     0          12m
deploy-585449566-l2m4l   1/1     Running     0          12m
deploy-585449566-pldrf   1/1     Running     0          12m
deploy-585449566-rnh8q   1/1     Running     0          12m
deploy-585449566-t6dw4   1/1     Running     0          12m
pi-sgwvx                 0/1     Completed   0          2m12s
replicaset-2x2rb         1/1     Running     0          8m15s
replicaset-6dbv2         1/1     Running     0          8m15s
replicaset-6qvr7         1/1     Running     0          8m15s
replicaset-f78lc         1/1     Running     0          8m15s
replicaset-zfrgg         1/1     Running     0          8m15s
[root@master haproxy]#
```
6. Cron time syntax
* * * * * — the five fields are, from left to right: minute, hour, day of the month, month, and day of the week.
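A few example .spec.schedule values in this format, for illustration; the chosen value at the end is an arbitrary stand-in:

```yaml
# Example CronJob schedule expressions
# (fields: minute hour day-of-month month day-of-week):
#   "*/5 * * * *"  - every 5 minutes
#   "0 3 * * *"    - every day at 03:00
#   "0 0 * * 0"    - every Sunday at midnight
#   "30 8 1 * *"   - at 08:30 on the first day of every month
schedule: "0 3 * * *"
</imports>
```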