Kubernetes horizontal Pod autoscaling (scale out and scale in)

The Horizontal Pod Autoscaler (HPA) can automatically scale the number of Pods in a ReplicationController, Deployment, ReplicaSet, or StatefulSet based on CPU utilization. Besides CPU utilization, scaling can also be driven by custom metrics provided by other applications. Horizontal Pod autoscaling does not apply to objects that cannot be scaled, such as a DaemonSet.

Horizontal Pod autoscaling is implemented as a Kubernetes API resource and a controller. The controller periodically adjusts the number of replicas in the ReplicationController or Deployment so that observed metrics, such as the Pods' average CPU utilization or average memory utilization, match the target values set by the user.
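The scaling decision roughly follows the formula documented for the HPA controller; a small worked example:

desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue)

For example, with 3 replicas averaging 100% CPU utilization against a 50% target, the controller requests ceil(3 * 100 / 50) = 6 replicas; if utilization later falls to 25%, it requests ceil(6 * 25 / 50) = 3 replicas.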

(Figure: illustration of the horizontal Pod autoscaling mechanism)

This article uses Kubernetes v1.19 and Metrics Server v0.5.1 to demonstrate horizontal Pod autoscaling.

Install Metrics Server v0.5.1

Metrics Server is the source of container resource metrics for Kubernetes' built-in autoscaling. It collects resource metrics from kubelets and exposes them to the Kubernetes API server through the Metrics API, where they are consumed by the Horizontal Pod Autoscaler.

1. Download the Metrics Server v0.5.1 deployment file. Note that Metrics Server has Kubernetes version requirements; see the Metrics Server README on GitHub.

wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.1/components.yaml  

2. The image used by Metrics Server is hosted on Google's registry, which may be unreachable from mainland China, so you can switch it to the Alibaba Cloud mirror registry. If the cluster is built locally and the kubelets do not have valid certificates, you also need to add the --kubelet-insecure-tls flag.

Modify components.yaml

Change k8s.gcr.io/metrics-server/metrics-server:v0.5.1 to registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.5.1

Add the --kubelet-insecure-tls flag

    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        # Add the --kubelet-insecure-tls flag so Metrics Server skips kubelet certificate verification
        - --kubelet-insecure-tls

3. Deploy Metrics Server

Deployment command: kubectl apply -f components.yaml

Check whether the deployment succeeded: kubectl get pods -n kube-system
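
Once the metrics-server Pod is Running, you can optionally confirm that the Metrics API is serving data (node and Pod metrics may take a minute to appear):

kubectl top nodes

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"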

Horizontal Pod Autoscaler demonstration

Automatic scaling based on CPU utilization

Use the image from the official HPA walkthrough, which defines an index.php page that performs CPU-intensive computation:

<?php
  $x = 0.0001;
  for ($i = 0; $i <= 1000000; $i++) {
    $x += sqrt($x);
  }
  echo "OK!";
?>

1. Write the HPA example manifest php-apache.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  replicas: 1
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        # Use mirrorgooglecontainers/hpa-example because the original k8s.gcr.io image may be unreachable
        image: mirrorgooglecontainers/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m

---

apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache

Deploy php-apache: kubectl apply -f php-apache.yaml

Check whether php-apache was deployed successfully: kubectl get pods

2. Create the Horizontal Pod Autoscaler. An HPA can be created with kubectl autoscale deployment <name>.

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10

The above command creates a Horizontal Pod Autoscaler that controls the php-apache Deployment and keeps the number of Pod replicas between 1 and 10. Roughly speaking, the HPA keeps the average CPU utilization across all Pods at about 50% by increasing or decreasing the number of replicas.
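
If you prefer to manage the HPA declaratively, a roughly equivalent manifest using the autoscaling/v1 API would look like the sketch below (this article itself uses the imperative command above):

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  # Target 50% average CPU utilization across all Pods
  targetCPUUtilizationPercentage: 50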

View HPA status

kubectl get hpa

3. Send a large number of requests to php-apache to increase its load.

Open a new Linux client window and start a container that sends requests to the php-apache Service in a loop.
kubectl run -i --tty load-generator --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"

Use another Linux client window to check the HPA status. After ten seconds or more, the CPU utilization of the php-apache Pods gradually rises and the number of Pods grows, until the per-Pod CPU utilization stabilizes at about 50% and scale-out stops.

kubectl get hpa
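
To follow the changes continuously instead of re-running the command, you can add the --watch flag:

kubectl get hpa php-apache --watch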

Check the number of Pods in the php-apache Deployment

kubectl get deployment php-apache


4. Stop sending requests to reduce the load on php-apache. The HPA will then reduce the number of php-apache Pods.

Press Ctrl+C in the other client window to stop the busybox load generator.

After a period of time, the number of php-apache Pods drops back to 1.
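
The scale-in does not happen immediately: by default the HPA controller applies a downscale stabilization window of about five minutes before removing replicas. The scaling decisions and their reasons are recorded as events, which you can inspect with:

kubectl describe hpa php-apache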

Autoscaling based on memory usage

First clean up: delete the load-generator Pod, the php-apache HPA, and the php-apache Deployment and Service.

kubectl delete pod load-generator

kubectl delete hpa php-apache

kubectl delete -f php-apache.yaml

The blogger's Docker Hub provides an image for testing memory usage, codingsoldier/image-test:v3. Its request handler looks like this:

import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.UUID;

import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

// Each call to /hpa/mem?value=N puts N new objects into a map held by the
// controller, so the objects are never garbage collected and the Pod's
// memory usage keeps growing.
@RestController
@RequestMapping("/hpa")
public class HpaController {

    // MemObj is a simple value object (assumed to be defined elsewhere in the image)
    public Map<String, MemObj> map = new HashMap<>();

    @RequestMapping("/mem")
    public String mem(@RequestParam("value") Integer value) {
        for (int i = 0; i < value; i++) {
            MemObj memObj = new MemObj(UUID.randomUUID().toString(), new Random().nextInt(Integer.MAX_VALUE));
            map.put(UUID.randomUUID().toString(), memObj);
        }
        return "ok";
    }

}

1. Create a new hpa-mem-test.yaml that uses the image codingsoldier/image-test:v3

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-mem-test
spec:
  selector:
    matchLabels:
      run: hpa-mem-test
  replicas: 1
  template:
    metadata:
      labels:
        run: hpa-mem-test
    spec:
      containers:
      - name: hpa-mem-test
        # Image published by the blogger on Docker Hub for testing memory usage
        image: codingsoldier/image-test:v3
        ports:
        - containerPort: 80
        resources:
          limits:
            memory: 400Mi
          requests:
            memory: 200Mi

---

apiVersion: v1
kind: Service
metadata:
  name: hpa-mem-test
  labels:
    run: hpa-mem-test
spec:
  ports:
  - port: 80
  selector:
    run: hpa-mem-test

2. Only the autoscaling/v2beta2 API version supports memory metrics. If the Kubernetes version is too old, the autoscaling/v2beta2 API is not available.
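
You can check which autoscaling API versions your cluster offers before writing the manifest:

kubectl api-versions | grep autoscaling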

Create the HPA from a manifest; create hpa-mem.yaml:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-mem-test
spec:
  # The scale target is the hpa-mem-test Deployment
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-mem-test
  minReplicas: 1
  maxReplicas: 10
  metrics:
  # Target 60% average memory utilization per Pod
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 60

3. Deploy hpa-mem-test.yaml and hpa-mem.yaml

kubectl apply -f hpa-mem-test.yaml

kubectl apply -f hpa-mem.yaml

4. Open a new Linux client window and run busybox with an interactive shell

kubectl run -i --tty busybox --image=busybox -- sh

From inside busybox, send a request to hpa-mem-test to increase its memory usage

wget -q -O- http://hpa-mem-test/hpa/mem?value=99999

5. View HPA status

kubectl get hpa

Wait ten seconds or more and check whether the HPA status changes.

The Pod's memory usage rises to about 80%. (If memory usage has not yet exceeded 60%, you can run wget -q -O- http://hpa-mem-test/hpa/mem?value=99999 again to push it higher.)

hpa-mem-test is scaled out to two Pods, the application memory usage in each Pod stabilizes around 60%, and scale-out stops.
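
You can also cross-check the per-Pod memory usage reported by Metrics Server, which is the same data the HPA consumes:

kubectl top pods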

Scaling based on other metrics

With the autoscaling/v2beta2 API version, you can use additional metrics when autoscaling the php-apache Deployment.

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  # The scale target is the php-apache Deployment
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  # Target 50% average CPU utilization per Pod
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  # Target 60% average memory utilization per Pod
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 60
  # Target an average of 1000 packets per second per Pod
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1k
  # Target a total of 10000 requests per second on the main-route Ingress
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1beta1
        kind: Ingress
        name: main-route
      target:
        type: Value
        value: 10k
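
Note that the Pods and Object metric types in this example (packets-per-second and requests-per-second) are not provided by Metrics Server; they require a custom metrics API backed by an adapter component, such as the Prometheus Adapter, so this manifest only works once such a component is installed in the cluster.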
