Analyzing stack usage in embedded software through ARM Cortex-M assembly code

In embedded software development, a correct understanding of stack usage is essential. In bare-metal software without an RTOS, overflow of the main stack is an error that is easy to overlook. In multitasking software built on an RTOS, the process stack must be considered in addition to the main stack, and each task needs a task stack of appropriate size.
Reading the assembly code that pushes to and pops from the stack during a function call helps build that understanding. The articles I have found on this topic are based on x86 or AArch64 assembly, so here I go back to the assembly (Thumb instruction set) of the Cortex-M, the simplest processor family in the ARM series, to analyze how the stack changes during a function call and to explain in detail what each assembly instruction in the call does.

1, Basic push and pop operations

A simple C language program:

int add(int a,int b)
{
    return a+b;
}
int main(void)
{
    int c[2];
    c[0] = add(3,4);
    return c[0];
}

Compiling the above program produces the following assembly code (target: ARM Cortex-M0+, compiler: arm-none-eabi-gcc):

          add:
00000acc:   push    {r7, lr}
00000ace:   sub     sp, #8
00000ad0:   add     r7, sp, #0
00000ad2:   str     r0, [r7, #4]
00000ad4:   str     r1, [r7, #0]
00000ad6:   ldr     r2, [r7, #4]
00000ad8:   ldr     r3, [r7, #0]
00000ada:   adds    r3, r2, r3
00000adc:   movs    r0, r3
00000ade:   mov     sp, r7
00000ae0:   add     sp, #8
00000ae2:   pop     {r7, pc}
          main:
00000ae4:   push    {r7, lr}
00000ae6:   sub     sp, #8
00000ae8:   add     r7, sp, #0
00000aea:   movs    r1, #4
00000aec:   movs    r0, #3
00000aee:   bl      0xacc <add>
00000af2:   movs    r2, r0
00000af4:   movs    r3, r7
00000af6:   str     r2, [r3, #0]
00000af8:   movs    r3, r7
00000afa:   ldr     r3, [r3, #0]
00000afc:   movs    r0, r3
00000afe:   mov     sp, r7
00000b00:   add     sp, #8
00000b02:   pop     {r7, pc}         

C compilers for the ARM architecture follow the AAPCS (Procedure Call Standard for the ARM Architecture), which divides the registers into caller-saved and callee-saved registers. A function may use caller-saved registers such as r0 to r3 freely, without saving them to the stack. Callee-saved registers such as r4 to r11 must be pushed onto the stack before a function uses them and popped before it returns. In the above assembly code, r7 serves as the frame pointer: it holds a copy of the stack pointer, and the variables on the stack are addressed relative to it. Because r7 is callee-saved, each function pushes it on entry, while r0 to r3 are used without being saved.
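
As a rough illustration of this split, the following C sketch models which core registers are callee-saved and how many bytes a push of a given register set occupies. The bit-mask encoding, the helper names and the flat 4 bytes per register are assumptions made for this example; a real compiler may push additional registers, for instance to keep the stack 8-byte aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit i of a mask stands for register ri (hypothetical encoding for this sketch). */
#define REG(i)            (1u << (i))
#define CALLEE_SAVED_MASK 0x0FF0u   /* r4..r11, as described in the text */
#define REG_LR            REG(14)

/* True if register number r (0..12) is one that a function must preserve. */
static bool is_callee_saved(unsigned r)
{
    assert(r <= 12);
    return (CALLEE_SAVED_MASK & REG(r)) != 0;
}

/* Bytes that a push instruction writes for the registers in mask (4 bytes each). */
static unsigned push_bytes(uint32_t mask)
{
    unsigned count = 0;
    for (; mask != 0; mask >>= 1) {
        count += mask & 1u;
    }
    return 4u * count;
}
```

With this model, push_bytes(REG(7) | REG_LR) evaluates to 8, which matches the two words written by push {r7, lr} in the listing above.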

main function assembly code analysis:

The main function assembly code is divided into the following three parts:

1. Stack push on entry to main

00000ae4:   push    {r7, lr}
00000ae6:   sub     sp, #8
00000ae8:   add     r7, sp, #0

The push {r7, lr} instruction pushes the r7 and lr registers onto the stack. sub sp, #8 then moves the stack pointer down by 8 bytes, the size of the local variable int c[2] of main. r7 records the current stack pointer: c[0] is stored at address r7+0 and c[1] at address r7+4.

2. Call the add function

00000aea:   movs    r1, #4
00000aec:   movs    r0, #3
00000aee:   bl      0xacc <add>
00000af2:   movs    r2, r0
00000af4:   movs    r3, r7
00000af6:   str     r2, [r3, #0]

r0 and r1 hold the two arguments of the add function. The return value of add comes back in r0 and is finally stored at address r7+0, that is, the position of c[0] on the stack.

3. Stack pop and return from main

00000af8:   movs    r3, r7
00000afa:   ldr     r3, [r3, #0]
00000afc:   movs    r0, r3
00000afe:   mov     sp, r7
00000b00:   add     sp, #8
00000b02:   pop     {r7, pc} 

The return value of main is also c[0]. The assembly above loads c[0] and places it in r0. add sp, #8 then releases the local variables, and pop {r7, pc} restores the saved registers: r7 gets its old value back and the saved lr value is loaded into pc, which returns to the caller.
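
The frame that main builds in the three steps above can be modeled as a C struct, lowest address first. This is only a sketch for reasoning about offsets: the struct and helper names are hypothetical, and the layout is read off the listing (push stores the lower-numbered register at the lower address), not something the compiler exposes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the stack frame of main, lowest address first.
 * After "add r7, sp, #0" the frame pointer r7 points at the first member. */
struct main_frame {
    int32_t  c[2];      /* r7+0 and r7+4: local array reserved by sub sp, #8 */
    uint32_t saved_r7;  /* r7+8:  caller's r7, written by push {r7, lr}      */
    uint32_t saved_lr;  /* r7+12: return address, written by push {r7, lr}   */
};

/* Offset of c[i] from r7, in bytes. */
static size_t offset_of_c(size_t i)
{
    assert(i < 2);
    return offsetof(struct main_frame, c) + i * sizeof(int32_t);
}
```

The total size of the struct is 16 bytes, the same amount of stack that main occupies.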

add function assembly code analysis

The add function assembly code can also be divided into the following three parts:
1. Stack push on entry to add

00000acc:   push    {r7, lr}
00000ace:   sub     sp, #8
00000ad0:   add     r7, sp, #0
00000ad2:   str     r0, [r7, #4]
00000ad4:   str     r1, [r7, #0]

The push {r7, lr} instruction pushes r7 and lr (which holds the return address 00000af2) onto the stack. sub sp, #8 moves the stack pointer down by 8 bytes, the size of the two parameters of add. r7 records the current stack pointer. The two str instructions then copy the arguments from the registers to the stack: int a (r0) is stored at address r7+4 and int b (r1) at address r7+0.

2. add operation

00000ad6:   ldr     r2, [r7, #4]
00000ad8:   ldr     r3, [r7, #0]
00000ada:   adds    r3, r2, r3

3. Stack pop and return from add

00000adc:   movs    r0, r3
00000ade:   mov     sp, r7
00000ae0:   add     sp, #8
00000ae2:   pop     {r7, pc}

The above assembly code places the result in r0 and then pops the stack. pop {r7, pc} loads the saved return address into pc, so execution continues at 00000af2 in the main function.
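
To make the push and pop sequence of add concrete, here is a minimal C model of a full-descending stack that replays the stack-related instructions of the function. Everything in it (the struct, the word-indexed memory, the function names) is an assumption made for illustration; it ignores details such as the Thumb state bit in lr and is not a real instruction simulator.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16u

/* Tiny model of a full-descending stack: sp is a word index into mem and
 * always refers to the last pushed word. */
struct cpu {
    uint32_t mem[STACK_WORDS];
    uint32_t sp;
    uint32_t r7, lr, pc;
};

/* push {r7, lr}: the lower-numbered register ends up at the lower address. */
static void push_r7_lr(struct cpu *c)
{
    assert(c->sp >= 2);
    c->sp -= 2;
    c->mem[c->sp]     = c->r7;
    c->mem[c->sp + 1] = c->lr;
}

/* pop {r7, pc}: reload r7 and put the saved return address into pc. */
static void pop_r7_pc(struct cpu *c)
{
    assert(c->sp + 2 <= STACK_WORDS);
    c->r7 = c->mem[c->sp];
    c->pc = c->mem[c->sp + 1];
    c->sp += 2;
}

/* Replays the stack operations of add and returns the lowest sp reached. */
static uint32_t run_add(struct cpu *c)
{
    push_r7_lr(c);            /* push {r7, lr}             */
    c->sp -= 2;               /* sub sp, #8  (two words)   */
    c->r7 = c->sp;            /* add r7, sp, #0            */
    uint32_t lowest = c->sp;  /* deepest point of the call */
    c->sp = c->r7;            /* mov sp, r7                */
    c->sp += 2;               /* add sp, #8                */
    pop_r7_pc(c);             /* pop {r7, pc}              */
    return lowest;
}
```

Starting with sp at STACK_WORDS and lr set to 0x00000af2, run_add leaves sp and r7 at their original values, puts 0x00000af2 into pc, and reports a lowest sp four words (16 bytes) below the starting point.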

Stack push and pop operations in a function call

The analysis above shows that a function pushes to and pops from the stack in two ways:
1. Registers are saved and restored with the push and pop instructions;
2. Space for function parameters and local variables is reserved and released by moving the sp pointer directly.

The stack size used by the main function is 8 bytes (registers r7, lr) + 8 bytes (local variable int c[2]) = 16 bytes
The stack size used by the add function is 8 bytes (registers r7, lr) + 8 bytes (function parameters int a, int b) = 16 bytes

Because add runs while the frame of main is still on the stack, the maximum stack used by the above program is 16 + 16 = 32 bytes.

2, Factors affecting stack size

When the processor and compiler are fixed, the factors that affect stack size are the number of registers each function call has to save, the function parameters, the local variables and the function call depth. The following assembly code shows the effect of changing the local variables:

The add function remains unchanged, and the size of the local variable in main is changed. The C code is modified as follows:

int add(int a,int b)
{
    return a+b;
}
int main(void)
{
    int c[16];
    c[0] = add(3,4);
    return c[0];
}

After changing the local variable size of the main function, the assembly code is as follows:

          add:
00000acc:   push    {r7, lr}
00000ace:   sub     sp, #8
00000ad0:   add     r7, sp, #0
00000ad2:   str     r0, [r7, #4]
00000ad4:   str     r1, [r7, #0]
00000ad6:   ldr     r2, [r7, #4]
00000ad8:   ldr     r3, [r7, #0]
00000ada:   adds    r3, r2, r3
00000adc:   movs    r0, r3
00000ade:   mov     sp, r7
00000ae0:   add     sp, #8
00000ae2:   pop     {r7, pc}
          main:
00000ae4:   push    {r7, lr}
00000ae6:   sub     sp, #64 ; 0x40 
00000ae8:   add     r7, sp, #0
00000aea:   movs    r1, #4
00000aec:   movs    r0, #3
00000aee:   bl      0xacc <add>
00000af2:   movs    r2, r0
00000af4:   movs    r3, r7
00000af6:   str     r2, [r3, #0]
00000af8:   movs    r3, r7
00000afa:   ldr     r3, [r3, #0]
00000afc:   movs    r0, r3
00000afe:   mov     sp, r7
00000b00:   add     sp, #64 ; 0x40
00000b02:   pop     {r7, pc}

Comparing the assembly code before and after the change shows only two differences: at address 00000ae6, "sub sp, #8" becomes "sub sp, #64", and at address 00000b00, "add sp, #8" becomes "add sp, #64". As the local variable definition changes from int c[2] to int c[16], the amount by which sp is moved changes from 8 to 64 bytes. The stack used therefore grows with the size of the local variables.
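
The arithmetic behind this comparison can be written down as a small C sketch. The constants are taken from the two listings; the function names are hypothetical, and the model assumes the compiler reserves exactly 4 bytes per array element with no extra padding, which holds for the two cases shown but is not guaranteed in general.

```c
#include <assert.h>
#include <stdint.h>

#define SAVED_REG_BYTES 8u   /* push {r7, lr}: two 4-byte registers         */
#define ADD_FRAME_BYTES 16u  /* frame of add, from the analysis in section 1 */

/* Stack bytes used by main when its local array is int c[elems]. */
static uint32_t main_frame_bytes(uint32_t elems)
{
    assert(elems > 0);  /* C does not allow zero-length arrays */
    return SAVED_REG_BYTES + elems * (uint32_t)sizeof(int32_t);
}

/* Peak stack bytes of the program: add runs on top of the frame of main. */
static uint32_t peak_stack_bytes(uint32_t elems)
{
    return main_frame_bytes(elems) + ADD_FRAME_BYTES;
}
```

For int c[2] this gives a 16-byte frame and a 32-byte peak; for int c[16] the frame grows to 72 bytes and the peak to 88 bytes.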

Similarly, starting from the simple C program above, you can change the number of parameters of the add function, for example from add(int a, int b) to add(int a, int b, int c), and observe how that affects the stack size. You can also call another function from add, or write a recursive function, and observe how nested calls change the stack size and the stack pointer. These cases are not explained one by one here; you can try them yourself.
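
One possible way to try the recursive experiment without reading assembly is to compare the addresses of local variables at different call depths. The sketch below does this in portable C; the function names are made up for the example, and the result depends on the compiler, the optimization level and the target, so treat the number it returns as an approximation only.

```c
#include <assert.h>
#include <stdint.h>

/* Recurses depth levels and returns the address of a local in the deepest frame. */
static uintptr_t deepest_local(unsigned depth)
{
    assert(depth < 1000);               /* keep the experiment itself from overflowing */
    volatile unsigned marker = depth;   /* volatile keeps the local in memory */
    uintptr_t here = (uintptr_t)&marker;
    if (depth == 0) {
        return here;
    }
    uintptr_t deepest = deepest_local(depth - 1);
    marker = 0;                         /* work after the call, so it is not a tail call */
    return deepest;
}

/* Approximate stack bytes consumed by depth nested calls. */
static uintptr_t stack_span(unsigned depth)
{
    volatile unsigned top = 0;
    uintptr_t start = (uintptr_t)&top;
    uintptr_t end = deepest_local(depth);
    return start > end ? start - end : end - start;
}
```

Calling stack_span with increasing depths and printing the results shows how quickly nested calls consume stack on a given build.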

3, Estimating the maximum stack usage

In short, when the processor and compiler are fixed, stack usage is determined by the number of registers saved on the stack, the function parameters, the local variables and the function call depth. A simple method for estimating it is as follows.
The number of registers saved on the stack cannot be read directly from the C source code, but the function parameters and local variables can.

As mentioned earlier, C compilers for the ARM architecture follow the AAPCS (Procedure Call Standard for the ARM Architecture), which divides the registers into caller-saved and callee-saved registers. The callee-saved registers are saved on the stack, but how many of them a given function saves cannot be seen from the C source code. Instead, estimate a maximum value N for the register bytes saved per call. Let Xi be the stack size occupied by the local variables of the function at nesting level i, and Yi the stack size occupied by its parameters. If the calls nest n levels deep, the maximum stack usage can be estimated as follows:

n*N + X1 + Y1 + X2 + Y2 + ... + Xn + Yn
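
The estimate can be expressed as a short C function. In this sketch N is the assumed maximum number of register bytes saved per call, n is the nesting depth, and each array element carries the local-variable bytes and parameter bytes of one nesting level; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stack bytes contributed by one nesting level (names are hypothetical). */
struct level {
    uint32_t locals_bytes;  /* X_i */
    uint32_t params_bytes;  /* Y_i */
};

/* n*N + sum of (X_i + Y_i) over the n nesting levels. */
static uint32_t estimate_max_stack(const struct level *levels, size_t n,
                                   uint32_t max_saved_reg_bytes /* N */)
{
    assert(levels != NULL || n == 0);
    uint32_t total = (uint32_t)n * max_saved_reg_bytes;
    for (size_t i = 0; i < n; i++) {
        total += levels[i].locals_bytes + levels[i].params_bytes;
    }
    return total;
}
```

For the first example program, with N = 8, main contributing 8 bytes of locals and add contributing 8 bytes of parameters, the function returns 32, the same value found from the assembly.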

Many development toolchains can now perform stack analysis and generate stack usage reports. In practice, stack overflows can also be detected at run time by laying out the stack carefully and filling the stack space with a known data pattern, but it is always good to estimate the worst-case stack usage at the start of software design.
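
The pattern-filling technique can be sketched in C as follows. The marker value 0xA5A5A5A5, the function names and the use of a plain array to stand in for the stack region are all assumptions made for this example; on a real target the region would come from the linker script or the RTOS task definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FILL_PATTERN 0xA5A5A5A5u  /* arbitrary marker value chosen for this sketch */

/* Fill the whole stack region with the marker before the program uses it. */
static void paint_stack(uint32_t *stack, size_t words)
{
    assert(stack != NULL);
    for (size_t i = 0; i < words; i++) {
        stack[i] = FILL_PATTERN;
    }
}

/* The stack grows down from stack[words - 1]; count the words from the
 * bottom (index 0) that still hold the marker and report the rest as used. */
static size_t stack_bytes_used(const uint32_t *stack, size_t words)
{
    size_t untouched = 0;
    while (untouched < words && stack[untouched] == FILL_PATTERN) {
        untouched++;
    }
    return (words - untouched) * sizeof(uint32_t);
}
```

If stack_bytes_used ever reports the full size of the region, the marker at the lowest address has been overwritten and the stack has very likely overflowed.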

Keywords: Embedded system

Added by AutomatikStudio on Sun, 23 Jan 2022 09:26:41 +0200