Summary
What is a critical section? Why are three ways of entering it defined, when only the third can be found in the code?
We will discuss the feasibility of the three methods, drawing on what we know about the C run-time stack.
Main text
In the first, empty ucosii project (the one that contains just enough to satisfy the compiler), we defined two functions in os_cpu.h. Let's take a quick look.
//OS_CRITICAL_METHOD = 1: directly use the interrupt enable/disable instructions of the processor to implement the macros
//OS_CRITICAL_METHOD = 2: use the stack to save and restore the state of the CPU
//OS_CRITICAL_METHOD = 3: use a compiler extension to obtain the program status word and save it in the local variable cpu_sr
#define OS_CRITICAL_METHOD 3 // Method of entering critical section

#if OS_CRITICAL_METHOD == 3
#define OS_ENTER_CRITICAL() {cpu_sr = OS_CPU_SR_Save();}
#define OS_EXIT_CRITICAL() {OS_CPU_SR_Restore(cpu_sr);}
OS_CPU_SR OS_CPU_SR_Save(void);
void OS_CPU_SR_Restore(OS_CPU_SR cpu_sr);
#endif
In fact, in many real-time operating systems only the third method can be found. The first two both have problems of one kind or another. We will go through the implementation of all three and discuss their problems and advantages. My understanding of the gcc compiler and the C stack may be incomplete, so there will inevitably be omissions. Corrections are welcome.
What is a critical section
A critical section is a piece of code that accesses shared resources (such as shared devices or shared memory) that must not be accessed by several threads at the same time. Unlike a stand-alone program written in ordinary C, an operating system contains many resources that can only be used under mutual exclusion. The classic producer-consumer problem for semaphores shows that if mutually exclusive access to some resources cannot be guaranteed, some devices will misbehave (a printer, for example, can only be used by one "person" at a time), and the behaviour of the whole system may even become nondeterministic, which is unacceptable.
Methods of entering a critical section
There are many ways to enter a critical section, such as disabling interrupts, disabling scheduling and so on. The three schemes in ucosii that we introduce today are all based on disabling interrupts.
PS: disabling interrupts depends heavily on the hardware. We will explain it here using the STM32F401RE.
method1: disable interrupts directly
It is defined as follows
#define OS_ENTER_CRITICAL() (Cli())
#define OS_EXIT_CRITICAL() (Sti())

Cli
    CPSID I ; PRIMASK=1, this bit is the master interrupt mask: interrupts disabled
    BX LR   ; function return
Sti
    CPSIE I ; PRIMASK=0: interrupts enabled again
    BX LR
CPSID I sets PRIMASK, which masks every interrupt with a configurable priority, so SysTick is masked along with all the others. SysTick is the heartbeat of the system and in theory should never be turned off, but what we do inside a critical section is usually only a few lines of assembly. The time needed to execute them is several orders of magnitude shorter than the tick period and can almost be ignored, so this simple and crude way of disabling interrupts is workable here. Reportedly there are other approaches that avoid masking SysTick, but they bring further complications, so I will not go into them here.
This method has one big problem: it does not support nested critical sections, because it cannot tell which nesting level you are leaving. As soon as you leave any critical section, the interrupts that were disabled are enabled again and the outer critical section stops being protected. Refer to the figure below.
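The nesting failure can be reproduced on a host machine with a small model. The sketch below is only an illustration: `m1_primask` is a plain variable standing in for the PRIMASK bit, and `m1_enter`, `m1_exit` and `m1_inner_update` are hypothetical names, not ucosii code.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated PRIMASK bit: true means interrupts are masked.
   This is a host-side model, not real Cortex-M code. */
static bool m1_primask = false;

/* Method 1 model: enter always masks, exit always unmasks. */
static void m1_enter(void) { m1_primask = true; }
static void m1_exit(void)  { m1_primask = false; }

/* Hypothetical helper that protects itself with its own critical section. */
static void m1_inner_update(void)
{
    m1_enter();
    assert(m1_primask);   /* masked while inside the inner section */
    /* ... touch shared data ... */
    m1_exit();            /* unmasks, although the caller still needs the mask */
}

/* Returns whether interrupts are still masked after the inner critical
   section has ended but before the outer one has ended. */
bool m1_outer_still_protected(void)
{
    bool still_masked;

    m1_primask = false;
    m1_enter();                 /* outer critical section begins */
    m1_inner_update();          /* nested critical section */
    still_masked = m1_primask;  /* false: the outer section lost its protection */
    m1_exit();
    return still_masked;
}
```

Calling `m1_outer_still_protected()` returns false in this model, which is exactly the situation described above: the inner exit has already re-enabled interrupts for the outer section.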
method2: save the program status word register on the stack
A demo is as follows
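#define OS_ENTER_CRITICAL() (PushAndCli())
#define OS_EXIT_CRITICAL() (Pop())

PushAndCli
    PUSHF   ; push the status register (x86-style mnemonic, used here as pseudo-code)
    CPSID I ; disable interrupts
    BX LR
Pop
    POPF    ; pop the saved status register
    BX LR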
It looks perfect. On the way in, I save the previous state on the stack, and on the way out I pop it back. This records the state at every entry to and exit from a critical section, so no matter how deeply the sections nest there should be no problem.
In many cases this does work, but it is not safe in general. To see why, we first look at how the stack changes while a function is running.
When a function is called, the compiler automatically sets up a stack frame for it. An implementation under the x86 architecture can be found in Chapter 18 of C and Pointers.
First of all, we need to know something about bp, sp and the other registers (look them up if they are unfamiliar). Here we assume that only a0 is passed in a register and that the other parameters are passed on the stack (in practice, registers are preferred when there are few parameters, and the stack is used only when the registers run out).
int add(register int a0, int a1, int a2, int a3, int a4){
    int x = a1 + a2 + a3 + a4; //(2)
    return x;
}

int main(){
    add(5,1,2,3,4); //(1)
    return 0;
}
Before the call at (1) is executed, main places the first argument of add in register r0 and pushes the other four arguments onto the stack in reverse order. The reverse order is used because the callee does not know how many arguments it will be given; when they are pushed in reverse, the one closest to the callee's bp is always the first stack-passed argument, and the rest can be found by walking on from there. Next, the call instruction jumps to the callee and at the same time pushes the return address onto the stack. The stack now looks as shown in the figure, with the low address at the top.
Now we enter the callee at (2). At its start, the callee pushes the old bp onto the stack and sets bp to the current sp, so bp points at the slot that holds the old bp. This slot is the base of the callee's stack frame. The callee then puts the registers it has to preserve and its local variables into its own frame, in the order shown in the figure.
Now we have a complete stack. From high address to low address it holds the arguments pushed in reverse order, the return address, the old bp (the old frame pointer), and then the local variables and saved registers. Everything above bp (at lower addresses) belongs to the callee, and everything below bp (at higher addresses) belongs to main. To find an incoming argument, look from bp toward the high addresses. To find a local variable, look from bp toward the low addresses. Perfect.
When the callee has finished and is ready to return, it restores the saved register values, sets bp back to main's bp, and pops the return address into pc. Just before that pop, sp points at the return address. Note that the arguments that were passed in have not been removed from the stack at this point: only main knows how many arguments it pushed, so only main can remove them safely.
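The frame layout described above can be modelled in plain C. The following is a minimal sketch rather than real x86 code: the stack is an array indexed by word, `SimStack`, `sim_push` and `call_add_model` are hypothetical names, and 0x1234 is a placeholder return address.

```c
#include <assert.h>
#include <stdint.h>

enum { SIM_WORDS = 32 };

typedef struct {
    int32_t mem[SIM_WORDS]; /* simulated stack memory, one word per slot */
    int sp;                 /* index of the top of stack; grows toward index 0 */
    int bp;                 /* frame pointer */
} SimStack;

static void sim_push(SimStack *s, int32_t v) { s->mem[--s->sp] = v; }

/* Models the call add(5, 1, 2, 3, 4). a0 = 5 travels in a register and
   never touches the stack. Returns x, computed through bp-relative access. */
int32_t call_add_model(void)
{
    SimStack s = { {0}, SIM_WORDS, SIM_WORDS };
    int32_t x;

    /* caller: push a4..a1 in reverse order, then the return address */
    sim_push(&s, 4);
    sim_push(&s, 3);
    sim_push(&s, 2);
    sim_push(&s, 1);
    sim_push(&s, 0x1234);      /* placeholder return address */

    /* callee prologue */
    sim_push(&s, s.bp);        /* save the old bp */
    s.bp = s.sp;               /* bp now points at the saved old bp */
    s.sp -= 1;                 /* reserve one slot for the local x at [bp-1] */

    /* arguments sit on the high-address side of bp:
       [bp+1] is the return address, [bp+2] is a1, and so on */
    assert(s.mem[s.bp + 2] == 1);
    s.mem[s.bp - 1] = s.mem[s.bp + 2] + s.mem[s.bp + 3]
                    + s.mem[s.bp + 4] + s.mem[s.bp + 5];
    x = s.mem[s.bp - 1];

    /* callee epilogue: drop locals, restore old bp, pop the return address */
    s.sp = s.bp;
    s.bp = s.mem[s.sp++];
    s.sp++;                    /* return address goes to pc */

    /* the four arguments are still on the stack; only the caller removes them */
    assert(s.sp == SIM_WORDS - 4);
    return x;
}
```

In this model the arguments are found at positive offsets from bp and the local at a negative offset, and after the callee's epilogue the four pushed arguments are still on the stack, waiting for the caller to discard them.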
That covers the function stack under the x86 architecture. On the STM32F4 platform, however, I found that the implementation is different: there is no frame base pointer bp, and everything is addressed relative to sp. We can explain this with a piece of STM32F4 assembly generated by MDK. The source code and assembly code are as follows
void mypush(void);
void mypop(void);

int add1(int a, int a2, int a3, int a4, int a5, int a6, int a7){
    int x = 8, x2 = 9, x3 = 10, x4 = 11, x5 = 12, x6 = 13, x7 = 14;
    mypush();
    //So many variables are added here only to run out of registers and force the compiler to fetch data from the stack, which exposes the problem
    x = a + a2 + a3 + a4 + a5 + a6 + a7 + x + x2 + x4 + x5 + x6 + x7 + x3;
    mypop();
    return 0;
}

int main(){
    add1(1,2,3,4,5,6,7);
    return 0;
}

    EXPORT mypush
    EXPORT mypop
    PRESERVE8
    AREA |.text|, CODE, READONLY
    THUMB
mypush
    PUSH {r0}
    BX LR
mypop
    POP {r0}
    BX LR
    END
Let's walk through the assembly. Each piece of assembly is shown next to the C source it corresponds to, and the comments are fairly complete, so it should be understandable.
main function
int main(){
0x080004B0 B50E      PUSH {r1-r3,lr}        ; main is a function too: save the caller's context on entry ; sp_a
    add1(1,2,3,4,5,6,7);
0x080004B2 2007      MOVS r0,#0x07
0x080004B4 2106      MOVS r1,#0x06
0x080004B6 2205      MOVS r2,#0x05
0x080004B8 2304      MOVS r3,#0x04          ; these four lines load arguments of add1 into registers
0x080004BA E9CD2100  STRD r2,r1,[sp,#0]     ; store r2 and r1 at the addresses given by sp, that is, put them on the stack
0x080004BE 9002      STR r0,[sp,#0x08]      ; put r0 on the stack ; sp_b
                                            ; something odd here: these stores overwrite what the PUSH on the first line saved
                                            ; did the compiler forget to allocate stack space for main? That is what the code does; explanations from experts are welcome
0x080004C0 2203      MOVS r2,#0x03
0x080004C2 2102      MOVS r1,#0x02
0x080004C4 2001      MOVS r0,#0x01          ; continue loading add1's arguments into registers
0x080004C6 F7FFFFBD  BL.W add1 (0x08000444) ; call add1 ; (1)
    return 0;
0x080004CA 2000      MOVS r0,#0x00          ; r0 = 0, the value returned by return 0
}
0x080004CC BD0E      POP {r1-r3,pc}         ; restore the context saved on entry
0x080004CE F04F7040  MOV r0,#0x3000000
0x080004D2 EEE10A10  VMSR FPSCR, r0
0x080004D6 4770      BX lr
add1 function
int add1(int a, int a2, int a3, int a4, int a5, int a6, int a7){
0x08000444 E92D4FF0  PUSH {r4-r11,lr}        ; save the registers this function is about to use ; sp_c
0x08000448 B087      SUB sp,sp,#0x1C         ; allocate the stack frame of the sub function; the ARM compiler evidently computes how much stack is needed
                                             ; the space allocated here is exactly enough; change the number of parameters and you will see it is no coincidence ; sp_d
0x0800044A 4604      MOV r4,r0
0x0800044C 460D      MOV r5,r1
0x0800044E 4616      MOV r6,r2
0x08000450 461F      MOV r7,r3               ; copy the arguments passed in r0-r3 into the function's own registers
0x08000452 E9DD9A11  LDRD r9,r10,[sp,#0x44]  ; load a6 and a7, which main stored on the stack
0x08000456 F8DD8040  LDR r8,[sp,#0x40]       ; load a5, which main stored on the stack
    int x = 8, x2 = 9, x3 = 10, x4 = 11, x5 = 12, x6 = 13, x7 = 14;
0x0800045A F04F0B08  MOV r11,#0x08           ; keep the local variable x in a register
0x0800045E 2009      MOVS r0,#0x09
0x08000460 9006      STR r0,[sp,#0x18]       ; no registers are left for the other locals, so they go on the stack ; (2)
0x08000462 200A      MOVS r0,#0x0A
0x08000464 9005      STR r0,[sp,#0x14]
0x08000466 200B      MOVS r0,#0x0B
0x08000468 9004      STR r0,[sp,#0x10]
0x0800046A 200C      MOVS r0,#0x0C
0x0800046C 9003      STR r0,[sp,#0x0C]
0x0800046E 200D      MOVS r0,#0x0D
0x08000470 9002      STR r0,[sp,#0x08]
0x08000472 200E      MOVS r0,#0x0E
0x08000474 9001      STR r0,[sp,#0x04]       ; all the data that does not fit in registers is now on the stack ; (3)
    mypush();
    //So many variables are added here only to run out of registers and force the compiler to fetch data from the stack, which exposes the problem
0x08000476 F7FFFEEF  BL.W mypush (0x08000258) ; call the function, which takes no parameters
    x = a + a2 + a3 + a4 + a5 + a6 + a7 + x + x2 + x4 + x5 + x6 + x7 + x3;
0x0800047A 1960      ADDS r0,r4,r5           ; start adding the values held in registers
0x0800047C 4430      ADD r0,r0,r6
0x0800047E 4438      ADD r0,r0,r7
0x08000480 4440      ADD r0,r0,r8
0x08000482 4448      ADD r0,r0,r9
0x08000484 4450      ADD r0,r0,r10
0x08000486 EB00010B  ADD r1,r0,r11
0x0800048A 9806      LDR r0,[sp,#0x18]       ; some data is on the stack: load it and add it in ; (4)
0x0800048C 4401      ADD r1,r1,r0
0x0800048E 9804      LDR r0,[sp,#0x10]
0x08000490 4401      ADD r1,r1,r0
0x08000492 9803      LDR r0,[sp,#0x0C]
0x08000494 4401      ADD r1,r1,r0
0x08000496 9802      LDR r0,[sp,#0x08]
0x08000498 4401      ADD r1,r1,r0
0x0800049A 9801      LDR r0,[sp,#0x04]
0x0800049C 4401      ADD r1,r1,r0
0x0800049E 9805      LDR r0,[sp,#0x14]
0x080004A0 EB010B00  ADD r11,r1,r0           ; done ; (5)
    mypop();
0x080004A4 F7FFFEDA  BL.W mypop (0x0800025C) ; call mypop
    return 0;
0x080004A8 2000      MOVS r0,#0x00           ; the return value 0 goes into r0
}
0x080004AA B007      ADD sp,sp,#0x1C         ; release the stack frame of the sub function
0x080004AC E8BD8FF0  POP {r4-r11,pc}         ; restore the registers saved on entry; pc returns to main, which continues executing
push and pop functions
mypush
    PUSH {r0}
0x08000258 B401      PUSH {r0}               ; push r0; this simulates the second way of entering a critical section by pushing one value onto the stack ; sp_e
    BX LR
0x0800025A 4770      BX lr                   ; equivalent to return: the function returns
mypop
    POP {r0}
0x0800025C BC01      POP {r0}                ; pop the top of the stack into r0
    BX LR
0x0800025E 4770      BX lr                   ; function return
Skimming the assembly above, we can at least see that in code without a bp pointer, data on the stack is located relative to the stack pointer sp. Looking closely at the code between (2)-(3) and between (4)-(5), it is easy to see that the stores and the loads use the same offsets from sp. In other words, whatever happens to sp in between, the function always assumes that the following 4-byte slots hold its data:
[sp+0x04] = 0xE
[sp+0x08] = 0xD
[sp+0x0C] = 0xC ; the data the function expects
Unfortunately, we performed a push between storing the data on the stack and loading it back. The side effect of a push is sp-- (sp moves down by 4 bytes). If sp was sp0 when the data was stored, then when the data is loaded sp is no longer sp0 but sp0 - 4, so every access is off by one slot:
[sp+0x04] = ?
[sp+0x08] = 0xE
[sp+0x0C] = 0xD ; the data actually read
In the end we do not get the correct 0x69. In my run, 0x60 ended up in r11. Either way, the result has become unpredictable.
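The arithmetic behind 0x69 and 0x60 can be checked with a small host-side model of add1's frame. This is a sketch under stated assumptions: the stack is an array indexed by word, `add1_model` is a hypothetical name, and the unused frame slot at [sp0+0] is assumed to hold 0, which is consistent with the 0x60 observed above but is not guaranteed on real hardware.

```c
#include <assert.h>
#include <stdint.h>

enum { MODEL_WORDS = 32 };

/* Model of add1: every sp-relative offset is fixed at compile time, and the
   generated code assumes sp does not move between the stores and the loads. */
uint32_t add1_model(int unbalanced_push)
{
    uint32_t stack[MODEL_WORDS] = {0}; /* zero-filled: the unused slot reads as 0 (assumption) */
    uint32_t sp = 16;                  /* word index; the stack grows toward index 0 */
    uint32_t sum = 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8; /* a..a7 and x are held in registers */

    /* (2)-(3): spill the remaining locals at [sp+0x18] .. [sp+0x04] */
    stack[sp + 6] = 9;   /* x2 */
    stack[sp + 5] = 10;  /* x3 */
    stack[sp + 4] = 11;  /* x4 */
    stack[sp + 3] = 12;  /* x5 */
    stack[sp + 2] = 13;  /* x6 */
    stack[sp + 1] = 14;  /* x7 */

    if (unbalanced_push) {
        sp -= 1;         /* mypush: PUSH {r0} moves sp down by one word */
        stack[sp] = 14;  /* r0 still holds 0x0E at that point */
    }

    /* (4)-(5): reload with the same offsets, in the order the compiler used */
    sum += stack[sp + 6];
    sum += stack[sp + 4];
    sum += stack[sp + 3];
    sum += stack[sp + 2];
    sum += stack[sp + 1];
    sum += stack[sp + 5];
    return sum;
}
```

With a balanced stack the model returns 0x69. With the extra push every load lands one slot lower, x2 is never read, the unused slot is read instead, and the result is 0x60.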
Below is a simple picture of the whole stack. In the figure, sp_a, sp_b, sp_c and so on mark the successive values of sp.
Of course, you might say that if mypush first allocated its own stack space in the assembly code (sp--) and released it again when the function returns, then the local variables of add1 would be accessed correctly. That approach has a bigger logical problem, however. Once the stack pointer is moved back (sp++), the frame of mypush has been released and the data in that space no longer means anything. If you enter mypop immediately, the correct saved value will still be read back, but a single function call that uses stack space between mypush and mypop will overwrite the state information in that unprotected space, and the information is lost. This problem would occur even on x86 with a bp pointer (this paragraph is pure theory and has not been tried in practice).
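Since the paragraph above is theoretical, here is a small host-side model of the same reasoning. It is only a sketch: the stack is an array indexed by word, all names (`rel_mem`, `push_then_release`, `other_call`, `status_after`) are hypothetical, and 0xDEADBEEF stands for whatever an intervening call happens to write.

```c
#include <assert.h>
#include <stdint.h>

enum { REL_WORDS = 16 };

static uint32_t rel_mem[REL_WORDS]; /* simulated stack memory */
static int rel_sp;                  /* word index of the top of stack */

/* mypush variant that allocates its own slot and releases it on return. */
static void push_then_release(uint32_t status)
{
    rel_sp -= 1;
    rel_mem[rel_sp] = status; /* status saved inside mypush's own frame */
    rel_sp += 1;              /* frame released: the value now lies below sp */
}

/* Any other call made inside the critical section reuses the same slot. */
static void other_call(void)
{
    rel_sp -= 1;
    rel_mem[rel_sp] = 0xDEADBEEFu;
    rel_sp += 1;
}

/* mypop variant that reads the slot just below sp. */
static uint32_t pop_released(void)
{
    assert(rel_sp > 0);
    return rel_mem[rel_sp - 1];
}

/* Returns the status that mypop would restore. */
uint32_t status_after(int call_in_between)
{
    rel_sp = REL_WORDS;
    push_then_release(0x1u);
    if (call_in_between) {
        other_call();
    }
    return pop_released();
}
```

In this model `status_after(0)` still returns the saved status, while `status_after(1)` returns the garbage written by the intervening call, which is the loss of state described above.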
In addition, while looking things up I came across the compiler option -fno-defer-pop. The official explanation is as follows: "For machines that must pop arguments after a function call, always pop the arguments as soon as each function returns. At levels -O1 and higher, -fdefer-pop is the default; this allows the compiler to let arguments accumulate on the stack for several function calls and pop them all at once." Personally, I think this strategy may also affect the safety of the second scheme, but I have not looked into it further.
method3
#define OS_CRITICAL_METHOD 3 //Method of entering critical section

#if OS_CRITICAL_METHOD == 3
#define OS_ENTER_CRITICAL() {cpu_sr = OS_CPU_SR_Save();}
#define OS_EXIT_CRITICAL() {OS_CPU_SR_Restore(cpu_sr);}
OS_CPU_SR OS_CPU_SR_Save(void);
void OS_CPU_SR_Restore(OS_CPU_SR cpu_sr);
#endif

OS_CPU_SR_Save
    MRS R0, PRIMASK ; R0 is the default location of the return value, so this can be read as "return PRIMASK" when the function ends
    CPSID I
    BX LR
OS_CPU_SR_Restore
    MSR PRIMASK, R0 ; likewise, R0 holds the first argument by default, that is, the only parameter of this function, cpu_sr
    BX LR
In fact, the only difference between the second and the third method is where the saved PRIMASK lives. In the second method you have to find a place for it yourself (the stack). In the third method a place is prepared for you from the start: the compiler reserves the local variable cpu_sr, in a register or on the stack, and the value goes there. This is obviously safer, and it is the method ucosii finally adopts.
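To see why keeping the saved state in a local variable makes nesting safe, here is a host-side model of the third method. It is a sketch, not the ucosii port: `m3_primask` is a plain variable standing in for PRIMASK, `sim_sr_save` and `sim_sr_restore` are hypothetical stand-ins for the two assembly routines, and the width of `OS_CPU_SR` is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int OS_CPU_SR;   /* type name from os_cpu.h; the width is an assumption */

static bool m3_primask = false;   /* simulated PRIMASK: true = interrupts masked */

/* Stand-in for OS_CPU_SR_Save: return the old state, then mask. */
static OS_CPU_SR sim_sr_save(void)
{
    OS_CPU_SR old = m3_primask ? 1u : 0u;  /* MRS R0, PRIMASK */
    m3_primask = true;                     /* CPSID I */
    return old;
}

/* Stand-in for OS_CPU_SR_Restore: write the saved state back. */
static void sim_sr_restore(OS_CPU_SR sr)
{
    m3_primask = (sr != 0u);               /* MSR PRIMASK, R0 */
}

/* Hypothetical helper with its own (nested) critical section. */
static void m3_inner_update(void)
{
    OS_CPU_SR cpu_sr = sim_sr_save();  /* records "already masked" */
    /* ... touch shared data ... */
    sim_sr_restore(cpu_sr);            /* restores the masked state instead of unmasking */
}

/* Returns whether interrupts are still masked after the inner critical
   section has ended but before the outer one has ended. */
bool m3_outer_still_protected(void)
{
    OS_CPU_SR cpu_sr;
    bool still_masked;

    m3_primask = false;
    cpu_sr = sim_sr_save();        /* outer critical section begins */
    m3_inner_update();             /* nested critical section */
    still_masked = m3_primask;     /* true: the outer section is still protected */
    sim_sr_restore(cpu_sr);
    assert(m3_primask == false);   /* original state is back after the outer exit */
    return still_masked;
}
```

Each nesting level has its own `cpu_sr`, so the inner exit restores "masked" and only the outermost exit re-enables interrupts. Because the compiler owns the storage for `cpu_sr`, sp is never moved behind its back, which is what went wrong with the second method.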