System load average

The system load average is the average number of processes that are in the runnable state or the uninterruptible state per unit of time. uptime reports it for the last 1, 5 and 15 minutes.
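On Linux the same three values are also exposed in /proc/loadavg. The sketch below splits such a line into its three load averages. The function name show_loadavg is an illustrative choice, and the last two fields of the sample line are made-up placeholders; only the load values come from the uptime output later in this article.

```shell
#!/bin/sh
# Print the 1-, 5- and 15-minute load averages from a /proc/loadavg-style line.
# The first three whitespace-separated fields of that line are the load averages.
show_loadavg() {
    awk '{ printf "1min=%s 5min=%s 15min=%s\n", $1, $2, $3 }'
}

# Sample line: load values from this article, remaining fields are placeholders.
echo "0.03 0.03 0.00 1/180 12503" | show_loadavg
# prints: 1min=0.03 5min=0.03 15min=0.00

# On a live Linux system:
#   show_loadavg < /proc/loadavg
```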

Runnable process: a process that is either using a CPU or waiting for one, that is, a process in the R state

Uninterruptible process: a process that is in a critical section of kernel code and cannot be interrupted. The most common case is a process waiting for disk I/O, which shows up as state D. The process is uninterruptible because a disk read or write is in flight, and interrupting it could leave the data held by the process inconsistent with the data on disk. Waiting for I/O does not consume CPU, so a high load average does not necessarily mean high CPU utilization: CPU utilization can be low while processes are waiting for I/O.
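To see how many processes are in each of these two states, the state column of ps can be tallied. This is a minimal sketch: count_rd is a hypothetical helper name, and the sample state codes are invented for illustration. It looks only at the first letter of each state code, so variants such as R+ count as R.

```shell
#!/bin/sh
# Count processes whose state code starts with R (runnable) or
# D (uninterruptible sleep). Reads one state code per line on stdin.
count_rd() {
    awk '{ s = substr($1, 1, 1)
           if (s == "R") r++
           else if (s == "D") d++ }
         END { printf "R=%d D=%d\n", r, d }'
}

# Hypothetical sample of state codes.
printf '%s\n' 'Ss' 'R+' 'D' 'S<' 'R' | count_rd
# prints: R=2 D=1

# On a live system (the ps process itself is counted as R):
#   ps -eo stat= | count_rd
```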

A CPU core can execute the code of only one process at any instant. Multiple processes appear to run at the same time because the CPU time-slices between them, which causes context switches and some loss of CPU performance.

For example, take a server with a 2-core CPU. If uptime shows a 1-minute load average of 1, then on average one process was running or uninterruptible during the last minute, which corresponds to one core being busy while the other was idle. When the load average on a 2-core CPU reaches 2, the CPU is saturated. If more processes are started at that point, they have to wait for the CPU (in pidstat the %wait value of the process rises; it shows the share of time the process spent waiting for the CPU).
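The comparison between load average and core count can be written as a small check. This is only a sketch of the rule of thumb described here; load_status and its three labels are illustrative names, not part of any tool.

```shell
#!/bin/sh
# Classify a load average ($1) against the number of CPU cores ($2).
load_status() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        if (load + 0 < cores + 0)       print "spare capacity"
        else if (load + 0 == cores + 0) print "saturated"
        else                            print "overloaded"
    }'
}

load_status 1 2   # prints: spare capacity
load_status 2 2   # prints: saturated
load_status 8 2   # prints: overloaded

# On a live Linux system:
#   load_status "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```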


Use the stress load-testing tool to reproduce the runnable and uninterruptible process states. Test environment: Ubuntu 18.04, dual-core CPU

1. Install the stress and sysstat tools (mpstat and pidstat, used below, come from the sysstat package)

apt-get install stress sysstat -y 

2. Use uptime to check the current load average, load the CPU with stress, wait 1 minute, then query the load average again and compare

root@cloud-public:~# uptime    # First query: the 1-minute load average is 0.03, which is very low
 00:10:39 up 37 days, 13:36,  1 user,  load average: 0.03, 0.03, 0.00
root@cloud-public:~# stress --cpu 1 --timeout 100    # Use stress to fully load 1 CPU core
stress: info: [12503] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd

root@cloud-public:~# uptime    # Second query: the 1-minute load average has risen to 1.05, i.e. about one core was busy over the last minute
 00:13:32 up 37 days, 13:39,  2 users,  load average: 1.05, 0.49, 0.19
root@cloud-public:~# mpstat -P ALL 3   # Use mpstat to check the usage of every CPU: one CPU is at 100% utilization
Linux 4.15.0-142-generic (cloud-public) 	12/10/2021 	_x86_64_	(2 CPU)

12:15:36 AM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
12:15:39 AM  all   50.25    0.00    0.17    0.17    0.00    0.00    0.00    0.00    0.00   49.42
12:15:39 AM    0    0.33    0.00    0.67    0.33    0.00    0.00    0.00    0.00    0.00   98.67
12:15:39 AM    1  100.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00

root@cloud-public:~# pidstat -u 5 1    # Use pidstat to check per-process usage: the stress process is at 100% CPU
Linux 4.15.0-142-generic (cloud-public) 	12/10/2021 	_x86_64_	(2 CPU)

12:19:40 AM   UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
12:19:45 AM     0      1139    0.20    0.20    0.00    0.00    0.40     1  sunloginclient
12:19:45 AM     0      2341    0.20    0.00    0.00    0.00    0.20     1  barad_agent
12:19:45 AM     0      6034    0.20    0.00    0.00    0.00    0.20     1  tat_agent
12:19:45 AM     0     14426  100.00    0.00    0.00    0.00  100.00     0  stress
12:19:45 AM     0     14470    0.00    0.20    0.00    0.00    0.20     1  pidstat
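The before/after comparison from step 2 can also be scripted by pulling the 1-minute value out of each uptime line. The sketch below assumes the English output format shown above ("load average: x, y, z"); one_min_load is a hypothetical helper name, and the two sample lines are copied from the transcript.

```shell
#!/bin/sh
# Extract the 1-minute load average from a line of uptime output.
one_min_load() {
    sed -n 's/.*load average: \([0-9.]*\),.*/\1/p'
}

before=$(echo ' 00:10:39 up 37 days, 13:36,  1 user,  load average: 0.03, 0.03, 0.00' | one_min_load)
after=$(echo ' 00:13:32 up 37 days, 13:39,  2 users,  load average: 1.05, 0.49, 0.19' | one_min_load)
echo "1-minute load: $before -> $after"
# prints: 1-minute load: 0.03 -> 1.05

# On a live system:
#   uptime | one_min_load
```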

I/O-intensive scenario:
   stress -i 1 --timeout 600    # Keeps one CPU at 100%; in mpstat the CPU time shows up mostly under %sys and %iowait
Scenario with more CPU-bound processes than cores:
  stress -c 8 --timeout 600    # 8 concurrent CPU workers push the load average to about 8, but the machine has only 2 CPU cores
  As a result, pidstat shows %wait rising for these processes, which means they are spending time waiting for the CPU
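As a rough estimate for the last scenario: only as many CPU-bound workers as there are cores can run at any instant, so the rest are waiting. The sketch below just does that subtraction; waiting_workers is an illustrative name, and the result is an approximation, not a value reported by pidstat.

```shell
#!/bin/sh
# Estimate how many CPU-bound workers ($1) are waiting for a CPU at any
# instant on a machine with $2 cores. Both arguments must be integers.
waiting_workers() {
    if [ "$1" -gt "$2" ]; then
        echo $(( $1 - $2 ))
    else
        echo 0
    fi
}

waiting_workers 8 2   # prints: 6  (stress -c 8 on the 2-core test machine)
waiting_workers 1 2   # prints: 0  (stress --cpu 1: nothing has to wait)
```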

Keywords: Cloud Server

Added by tbone05420 on Fri, 10 Dec 2021 02:48:08 +0200