1. Kubernetes monitoring component: metrics-server
Installation steps
```bash
# 1. Add the open-source chart repository
[root@k8s-master ~]# helm repo add kaiyuanshe http://mirror.kaiyuanshe.cn/kubernetes/charts/
# Search for metrics-server
[root@k8s-master ~]# helm search repo metrics-server

# 2. Download the metrics-server chart
[root@k8s-master ~]# helm pull kaiyuanshe/metrics-server

# 3. Unpack it
[root@k8s-master ~]# tar -xf metrics-server-2.11.4.tgz

# 4. Edit the values.yaml file
[root@k8s-master ~]# cd metrics-server
[root@k8s-master metrics-server]# vim values.yaml
```

Replace the image source and version with the following, and replace the empty `args: []` with the argument list below (remove the brackets and put the arguments on new lines; testing shows the indentation of this list does not matter):

```yaml
image:
  repository: registry.cn-hangzhou.aliyuncs.com/linxiaowen/metrics-server
  tag: v0.4.1

args:
- --cert-dir=/tmp
- --secure-port=6443
- --metric-resolution=30s
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS
- --requestheader-username-headers=X-Remote-User
- --requestheader-group-headers=X-Remote-Group
- --requestheader-extra-headers-prefix=X-Remote-Extra-
```

If the port is occupied and you change it, it must be changed in three places across two files (note: all three must be modified at the same time):

```bash
[root@k8s-m-01 metrics-server]# grep -R '6443' ./
./templates/metrics-server-deployment.yaml: - --secure-port=6443
./templates/metrics-server-deployment.yaml: - containerPort: 6443
./values.yaml: - --secure-port=6443
```

```bash
# 5. Create the user binding
[root@k8s-master metrics-server]# kubectl create clusterrolebinding system:anonymous --clusterrole=cluster-admin --user=system:anonymous

# 6. Install metrics-server
[root@k8s-master metrics-server]# helm install metrics-server ./

# 7. Check whether the metrics-server Pod is running
[root@k8s-master metrics-server]# kubectl get pod
NAME                              READY   STATUS    RESTARTS   AGE
metrics-server-675ccccb46-84pbm   1/1     Running   0          19m

# 8. Test the commands once the service is up
[root@k8s-m-01 metrics-server]# kubectl top nodes
NAME         CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
k8s-master   110m         5%     948Mi           32%
k8s-node1    41m          2%     1013Mi          35%
k8s-node2    60m          3%     1228Mi          42%
[root@k8s-m-01 metrics-server]# kubectl top pod
NAME                                          CPU(cores)   MEMORY(bytes)
nfs-nfs-client-provisioner-8557b8c764-lc4nx   3m           6Mi
```
2. HPA automatic scaling
In production there will always be surprises. For example, traffic to the company's website may suddenly spike, so that the previously created Pods can no longer handle all of the visits, and the operations staff cannot watch the services 24 hours a day. In this situation HPA can be configured: when the load is too high it automatically scales out the number of Pod replicas to share the highly concurrent traffic, and when the traffic returns to normal it automatically scales the Pod count back down. HPA scales the number of Pods according to CPU utilization and memory utilization, so the `requests` parameter must be defined in order to use HPA.
HPA stands for Horizontal Pod Autoscaler. HPA can automatically scale the number of Pods in a ReplicationController, Deployment, or ReplicaSet based on CPU utilization (besides CPU utilization, it can also scale based on custom metrics provided by other applications). Pod autoscaling does not apply to objects that cannot be scaled, such as DaemonSets. HPA is implemented as a Kubernetes API resource plus a controller: the resource determines the behavior of the controller, and the controller periodically obtains the average CPU utilization, compares it with the target value, and then adjusts the replica count of the ReplicationController or Deployment.
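The adjustment rule the controller applies in that loop can be sketched in a few lines of Python. The formula is the one given in the Kubernetes HPA documentation; the function name here is our own illustration, not part of any API:

```python
from math import ceil

def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float) -> int:
    """Replica count the HPA controller aims for:
    desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
    """
    return ceil(current_replicas * current_metric / target_metric)

# 2 replicas running at 80% average CPU utilization against a 40% target:
print(desired_replicas(2, 80, 40))   # -> 4 (scale out)

# Load back to normal: 3 replicas at 10% against 40%:
print(desired_replicas(3, 10, 40))   # -> 1 (the HPA then clamps this to minReplicas)
```

The result is always clamped between the HPA's `minReplicas` and `maxReplicas` before being applied.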
```yaml
# Deployment resource manifest
kind: Deployment
apiVersion: apps/v1
metadata:
  name: hpa
spec:
  selector:
    matchLabels:
      app: hpa
  template:
    metadata:
      labels:
        app: hpa
    spec:
      containers:
        - name: hpa
          image: alvinos/django:v1
          resources:
            requests:          # minimum resources the container asks for
              cpu: 100m        # at least 100m CPU
              memory: 100Mi    # at least 100Mi memory
            limits:            # maximum resources the container may use
              cpu: 200m        # at most 200m CPU, never exceeded
              memory: 200Mi    # at most 200Mi memory, never exceeded
---
# HPA resource manifest
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: hpa
  namespace: default
spec:
  # Minimum and maximum number of Pods managed by the HPA
  maxReplicas: 10
  minReplicas: 2
  # The scalable object the HPA targets; the HPA dynamically adjusts
  # the Pod count of this object
  scaleTargetRef:
    kind: Deployment
    name: hpa
    apiVersion: apps/v1
  # The monitored metrics; several metric types can coexist
  metrics:
    - type: Resource   # core metrics: cpu and memory (the metrics defined
                       # in the requests/limits of the containers in the
                       # autoscaled Pods)
      resource:
        name: cpu
        # CPU threshold.
        # Formula: the average metric utilization (as a percentage) across
        # all target Pods.
        # Example: with cpu = 1000m, actual usage of 500m is 50% utilization.
        # Example: with replicas = 3 and cpu = 1000m, if pod1 uses 500m,
        # pod2 uses 300m and pod3 uses 600m, then
        # averageUtilization = (500/1000 + 300/1000 + 600/1000) / 3
        #                    = (500 + 300 + 600) / (3 * 1000) ≈ 0.4667
        # With cpu = 200m the calculation is, e.g.:
        # (40 + 41 + 42) / 600 = 0.205
        targetAverageUtilization: 40
        # Note: write 40 for 40%. If the average CPU utilization exceeds
        # 40%, the HPA scales out immediately; the Pod count ranges from
        # 2 up to 10. 200m * 0.4 = 80m (usage below 80m is the safe range).
---
# Service resource manifest
kind: Service
apiVersion: v1
metadata:
  name: hpa
spec:
  ports:
    - port: 80
      targetPort: 80
  selector:
    app: hpa
```
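The manifest above uses the `autoscaling/v2beta1` API, which has been removed from recent Kubernetes releases. On newer clusters, the same HPA would be expressed with the stable `autoscaling/v2` API (GA since v1.23); the following is a sketch with the same names and the same 40% threshold as above:

```yaml
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2
metadata:
  name: hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 40   # same 40% CPU threshold as above
```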
Unit interpretation
requests: the resources the container asks for at startup; the allocated resources must satisfy this amount.
limits: the maximum amount of resources the container is allowed to use.
Unit m: the unit of measure for CPU is the millicore (m). Multiply the number of CPU cores on a node by 1000 to get the node's total CPU capacity. For example, a node with two cores has a total CPU capacity of 2000m.
Take dual core as an example:
```yaml
resources:
  requests:
    cpu: 50m        # equal to 0.05 core
    memory: 512Mi
  limits:
    cpu: 100m       # equal to 0.1 core
    memory: 1Gi
```
Meaning: at startup the container requests 50/2000 of a core (2.5%) and is allowed to use at most 100/2000 of a core (5%).
0.05 cores divided by the 2-core total is 2.5%; 0.1 cores divided by the 2-core total is 5%.
```yaml
resources:
  requests:
    cpu: 100m       # equal to 0.1 core
    memory: 512Mi
  limits:
    cpu: 200m       # equal to 0.2 core
    memory: 1Gi
```
Meaning of the CPU unit m here: at startup the container requests 100/2000 of a core (5%) and is allowed to use at most 200/2000 of a core (10%).
0.1 cores divided by the 2-core total is 5%; 0.2 cores divided by the 2-core total is 10%.
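The arithmetic in the two examples above, and in the `averageUtilization` comments of the manifest, can be checked with a short standalone script (an illustration only, not part of any Kubernetes API):

```python
def cpu_fraction(millicores: int, node_cores: int) -> float:
    """Fraction of a node's CPU that a millicore value represents
    (a node has node_cores * 1000 millicores in total)."""
    return millicores / (node_cores * 1000)

# The dual-core examples from above:
print(cpu_fraction(50, 2))    # -> 0.025  (2.5%)
print(cpu_fraction(100, 2))   # -> 0.05   (5%)
print(cpu_fraction(200, 2))   # -> 0.1    (10%)

def average_utilization(used_m: list[int], per_pod_m: int) -> float:
    """Average CPU utilization across Pods, as described in the
    manifest comments: mean of each Pod's usage / allocation."""
    return sum(u / per_pod_m for u in used_m) / len(used_m)

# Example from the manifest: 1000m per Pod, usage of 500m / 300m / 600m
print(round(average_utilization([500, 300, 600], 1000), 4))   # -> 0.4667
# And with 200m per Pod: (40 + 41 + 42) / 600
print(round(average_utilization([40, 41, 42], 200), 3))       # -> 0.205
```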
Application
```bash
# Apply the resources
[root@k8s-master ~]# kubectl apply -f hpa.yaml
deployment.apps/hpa created

# Check the CPU usage monitored by the HPA
[root@k8s-master ~]# kubectl get horizontalpodautoscalers.autoscaling
NAME   REFERENCE        TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
hpa    Deployment/hpa   24%/40%   2         10        2          53s

# Check Pod resource usage
[root@k8s-master ~]# kubectl top pods
NAME                   CPU(cores)   MEMORY(bytes)
hpa-5cb8bcdc4f-xvkkf   11m          54Mi

# Check Pod status
[root@k8s-master ~]# kubectl get pods
NAME                   READY   STATUS    RESTARTS   AGE
hpa-5cb8bcdc4f-xvkkf   1/1     Running   0          7m4s

# Check the Service
[root@k8s-master ~]# kubectl get svc
NAME   TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
hpa    ClusterIP   10.1.159.102   <none>        80/TCP    6m50s

# Run the following on every node in the cluster to stress-test the service
[root@k8s-master ~]# while true; do curl 10.1.159.102/index; echo ''; done

# Check the Pod's CPU usage again (it rises under load, as shown below)
[root@k8s-master ~]# kubectl top pods
NAME                   CPU(cores)   MEMORY(bytes)
hpa-5cb8bcdc4f-xvkkf   163m         56Mi

# Check the Pods again: the count has grown (the HPA monitors CPU
# utilization and automatically scales the Pods out as utilization rises)
[root@k8s-master ~]# kubectl get pods
NAME                   READY   STATUS    RESTARTS   AGE
hpa-7f5d745bf9-45fkj   1/1     Running   0          4m39s
hpa-7f5d745bf9-5qb4d   1/1     Running   0          4m55s
hpa-7f5d745bf9-5vnfl   1/1     Running   0          4m40s
hpa-7f5d745bf9-fh66r   1/1     Running   0          4m55s
hpa-7f5d745bf9-fnlx4   1/1     Running   0          15m
hpa-7f5d745bf9-g7r5c   1/1     Running   0          4m55s
hpa-7f5d745bf9-qdrc7   1/1     Running   0          4m39s
hpa-7f5d745bf9-s2sbx   1/1     Running   0          4m39s
hpa-7f5d745bf9-sz9zz   1/1     Running   0          6m56s
hpa-7f5d745bf9-zw778   1/1     Running   0          16m
```
You can see that the Deployment has automatically scaled out to ten Pods.