A precise delay method based on the Cortex-M core
Preface
Why study this delay method?
- When running an operating system, we usually dedicate a hardware timer, SysTick, to the OS tick. The tick rate is generally set to 100-1000 Hz, that is, one interrupt every 1-10 ms. Many bare-metal tutorials also build their delay functions on SysTick, which inevitably leads to conflicts.
- Many people will ask: aren't there other timers? A hardware timer is indeed very accurate, and I don't deny that. But if the system enters a timer interrupt every 10 us, 1 us or 0.5 us, it is interrupted so often that threads cannot run properly. It also ties up a hardware timer that may be needed for something else.
- Then there is the ST HAL library. I (Jay) think everything about ST is good except the HAL library, which is unpleasant to work with, and there is no way around it. The HAL library has a HAL_Delay() that is also based on SysTick, which causes many inconveniences when porting an operating system. Fortunately, HAL_Delay() is a weak definition, so we can override its implementation. A core-based delay is then the best choice (in my opinion). Of course, you could also write a simple delay with a plain for loop.
- In case my word is not authoritative enough, here is a passage from The Definitive Guide to the ARM Cortex-M3: "There are remaining counters in the DWT, which are typically used for 'profiling' of program code. By programming them, you can have them emit events (in the form of trace packets) when the counter overflows. Typically, the CYCCNT register is used to measure the number of cycles taken to perform a task, which can also be used for time benchmarking purposes (it can be used to measure CPU utilization in an operating system)."
DWT in Cortex-M
In Cortex-M there is a peripheral called DWT (Data Watchpoint and Trace), which is used for system debugging and tracing.
It has a 32-bit register called CYCCNT, an up counter that records the number of core clock cycles executed: each core clock tick increments the counter by 1. Its resolution is very high and is determined by the core frequency. On the F103 series the core clock is 72 MHz, so the resolution is 1/72 MHz, about 14 ns. Since program run times are on the order of microseconds, 14 ns is more than sufficient. The longest time that can be recorded at 72 MHz is 2^32 / 72000000, about 60 s. On the H7, with a 400 MHz core clock, the resolution is as fine as 2.5 ns (1/400000000 s). On the i.MX RT1052 at 528 MHz, one core tick takes 1/528 MHz, about 1.9 ns, and the longest time that can be recorded is 2^32 / 528000000, about 8.13 s. When CYCCNT overflows, it wraps to 0 and starts counting up again.
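The figures above follow from two formulas: resolution = 1 / f and wrap time = 2^32 / f, where f is the core frequency. A minimal host-side sketch of that arithmetic is shown below; the helper names are illustrative and not part of the original code.

```c
#include <stdint.h>

/* Illustrative helpers (hypothetical names, not from the original code). */

/* Duration of one core clock tick, in nanoseconds. */
static inline double cyccnt_resolution_ns(uint32_t core_hz)
{
    return 1e9 / (double)core_hz;
}

/* Time until the 32-bit CYCCNT counter wraps back to 0, in seconds. */
static inline double cyccnt_wrap_seconds(uint32_t core_hz)
{
    return 4294967296.0 / (double)core_hz;   /* 2^32 / f */
}
```

For example, cyccnt_wrap_seconds(72000000u) gives roughly 59.65 and cyccnt_wrap_seconds(528000000u) roughly 8.13, matching the values quoted above.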
In practice, this works on the M3, M4 and M7 (it is not available on the M0).
Resolution: 1 / core frequency (s).
Implementing the delay involves three registers: DEMCR, DWT_CTRL and DWT_CYCCNT, which are used to enable the DWT, enable CYCCNT, and read the cycle count, respectively.
DEMCR
The DWT peripheral is enabled by bit 24 of another core debug register, DEMCR: write 1 to enable it (key point, this will be on the exam!).
The address of DEMCR is 0xE000EDFC.
About DWT_CYCCNT
Clear the DWT_CYCCNT register to 0 before enabling it.
Let's look at the address of DWT_CYCCNT. According to the ARM Cortex-M manual, its address is 0xE0001004, its reset value is 0, and it is readable and writable. Writing 0 to 0xE0001004 clears DWT_CYCCNT.
About CYCCNTENA
CYCCNTENA: Enable the CYCCNT counter. If not enabled, the counter does not count and no event is generated for PS sampling or CYCCNTENA. In normal use, the debugger must initialize the CYCCNT counter to 0.
CYCCNTENA is bit 0 of the DWT control register (DWT_CTRL). Writing 1 to it enables the CYCCNT counter; otherwise the counter does not run.
In summary
Steps for using the DWT's CYCCNT:
- First enable the DWT peripheral by writing 1 to bit 24 of the core debug register DEMCR.
- Clear the CYCCNT register to 0 before enabling it.
- Enable the CYCCNT register by writing 1 to CYCCNTENA, that is, bit 0 of the DWT control register.
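The three steps can be sketched as a single function. To keep the sketch runnable off-target, the registers are passed in as pointers; `dwt_regs_t`, `dwt_cyccnt_start()` and the two bit macros are hypothetical names introduced here for illustration. On real hardware the pointers would be the fixed addresses 0xE000EDFC (DEMCR), 0xE0001000 (DWT control register) and 0xE0001004 (CYCCNT).

```c
#include <stdint.h>

#define DEMCR_TRCENA_BIT        (1UL << 24)  /* DEMCR bit 24: enable the DWT */
#define DWT_CTRL_CYCCNTENA_BIT  (1UL << 0)   /* DWT control bit 0: enable CYCCNT */

/* Hypothetical bundle of the three registers involved. */
typedef struct {
    volatile uint32_t *demcr;   /* 0xE000EDFC on target */
    volatile uint32_t *ctrl;    /* 0xE0001000 on target */
    volatile uint32_t *cyccnt;  /* 0xE0001004 on target */
} dwt_regs_t;

/* Enable the DWT, clear the cycle counter, then start it counting. */
static inline void dwt_cyccnt_start(const dwt_regs_t *r)
{
    *r->demcr  |= DEMCR_TRCENA_BIT;        /* step 1: enable DWT */
    *r->cyccnt  = 0u;                      /* step 2: clear CYCCNT */
    *r->ctrl   |= DWT_CTRL_CYCCNTENA_BIT;  /* step 3: enable CYCCNT */
}
```

Both register updates use read-modify-write so that other bits in DEMCR and the DWT control register are left untouched. Projects that already include the CMSIS core headers could probably use CoreDebug->DEMCR, DWT->CYCCNT and DWT->CTRL with the CoreDebug_DEMCR_TRCENA_Msk and DWT_CTRL_CYCCNTENA_Msk masks instead of raw addresses.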
Code implementation
/**
  ******************************************************************
  * @file    core_delay.c
  * @author  fire
  * @version V1.0
  * @date    2018-xx-xx
  * @brief   Precise delay using core registers
  ******************************************************************
  * @attention
  *
  * Experimental platform: Wildfire STM32 development board
  * Forum:  http://www.firebbs.cn
  * Taobao: https://fire-stm32.taobao.com
  *
  ******************************************************************
  */

#include "./delay/core_delay.h"

/*
 **********************************************************************
 *              Time stamp related register definitions
 **********************************************************************
 */
/*
 * In Cortex-M there is a peripheral called DWT (Data Watchpoint and Trace).
 * It has a 32-bit register called CYCCNT, an up counter that records the
 * number of core clock cycles. The longest time that can be recorded is
 * 2^32 / 400000000 = 10.74 s (assuming a core frequency of 400 MHz, one
 * core tick takes 1/400 MHz = 2.5 ns).
 * When CYCCNT overflows, it wraps to 0 and starts counting up again.
 *
 * Steps to enable CYCCNT counting:
 * 1. Enable the DWT peripheral: write 1 to bit 24 of the core debug
 *    register DEMCR.
 * 2. Clear the CYCCNT register to 0 before enabling it.
 * 3. Enable the CYCCNT register: write 1 to bit 0 of DWT_CTRL
 *    (defined as the macro DWT_CR in this code).
 */
#define DWT_CR            (*(__IO uint32_t *)0xE0001000)
#define DWT_CYCCNT        (*(__IO uint32_t *)0xE0001004)
#define DEM_CR            (*(__IO uint32_t *)0xE000EDFC)

#define DEM_CR_TRCENA     (1UL << 24)
#define DWT_CR_CYCCNTENA  (1UL << 0)

/**
  * @brief  Initialize the time stamp counter
  * @param  TickPriority: unused
  * @retval HAL_OK
  * @note   This function must be called before using the delay function
  */
HAL_StatusTypeDef HAL_InitTick(uint32_t TickPriority)
{
    (void)TickPriority;

    /* Enable the DWT peripheral */
    DEM_CR |= (uint32_t)DEM_CR_TRCENA;

    /* Clear the DWT CYCCNT register to 0 */
    DWT_CYCCNT = (uint32_t)0u;

    /* Enable the Cortex-M DWT CYCCNT register */
    DWT_CR |= (uint32_t)DWT_CR_CYCCNTENA;

    return HAL_OK;
}

/**
  * @brief  Read the current time stamp
  * @param  None
  * @retval The current time stamp, i.e. the value of the DWT_CYCCNT register
  */
uint32_t CPU_TS_TmrRd(void)
{
    return ((uint32_t)DWT_CYCCNT);
}

/**
  * @brief  Get the current tick in milliseconds, derived from DWT_CYCCNT
  * @param  None
  * @retval Milliseconds since CYCCNT last wrapped
  * @note   Divide by (SysClockFreq / 1000) rather than dividing by
  *         SysClockFreq and multiplying by 1000, which would truncate
  *         to whole seconds. The value wraps together with CYCCNT.
  */
uint32_t HAL_GetTick(void)
{
    return (uint32_t)(DWT_CYCCNT / (SysClockFreq / 1000U));
}

/**
  * @brief  Precise delay using the CPU's internal 32-bit cycle counter
  * @param  us: delay length in microseconds
  * @retval None
  * @note   Before using this function, HAL_InitTick() must be called to
  *         enable the counter, or the macro CPU_TS_INIT_IN_DELAY_FUNCTION
  *         must be enabled.
  *         The maximum delay is about 2^32 / core frequency seconds
  *         (roughly 8 s at 528 MHz, i.e. 8 * 1000 * 1000 us).
  */
void CPU_TS_Tmr_Delay_US(uint32_t us)
{
    uint32_t ticks;
    uint32_t told, tnow, tcnt = 0;

    /* Optionally initialize the time stamp register inside the function. */
#if (CPU_TS_INIT_IN_DELAY_FUNCTION)
    /* Initialize the time stamp counter and clear it */
    HAL_InitTick(5);
#endif

    ticks = us * (GET_CPU_ClkFreq() / 1000000);  /* Number of ticks required */
    tcnt = 0;
    told = (uint32_t)CPU_TS_TmrRd();             /* Counter value on entry */

    while (1)
    {
        tnow = (uint32_t)CPU_TS_TmrRd();
        if (tnow != told)
        {
            /* The 32-bit counter counts up */
            if (tnow > told)
            {
                tcnt += tnow - told;
            }
            /* The counter wrapped around; the +1 counts the step
               from UINT32_MAX to 0 */
            else
            {
                tcnt += UINT32_MAX - told + tnow + 1U;
            }
            told = tnow;

            /* Exit once the elapsed time reaches the requested delay */
            if (tcnt >= ticks) break;
        }
    }
}

/*********************************************END OF FILE**********************/
#ifndef __CORE_DELAY_H
#define __CORE_DELAY_H

#include "stm32h7xx.h"

/* Get core clock frequency */
#define GET_CPU_ClkFreq()       HAL_RCC_GetSysClockFreq()
#define SysClockFreq            (218000000)

/* For convenience, HAL_InitTick() can be called inside the delay function
   to initialize the time stamp register, but the register is then
   re-initialized on every call. Set this macro to 0 and call HAL_InitTick()
   once when main() starts to avoid initializing on every call. */
#define CPU_TS_INIT_IN_DELAY_FUNCTION   0

/*******************************************************************************
 * Function declarations
 ******************************************************************************/
uint32_t CPU_TS_TmrRd(void);
HAL_StatusTypeDef HAL_InitTick(uint32_t TickPriority);

/* Before using the following functions, call HAL_InitTick() to enable the
   counter, or enable the macro CPU_TS_INIT_IN_DELAY_FUNCTION. */
/* The maximum delay is 8 seconds */
void CPU_TS_Tmr_Delay_US(uint32_t us);
#define CPU_TS_Tmr_Delay_MS(ms)  CPU_TS_Tmr_Delay_US((ms) * 1000U)
#define HAL_Delay(ms)            CPU_TS_Tmr_Delay_MS(ms)
#define CPU_TS_Tmr_Delay_S(s)    CPU_TS_Tmr_Delay_MS((s) * 1000U)

#endif /* __CORE_DELAY_H */
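Wraparound of the 32-bit counter can also be handled without a branch, because subtraction on uint32_t is carried out modulo 2^32. The following is a minimal sketch under that assumption, not the original author's code. `cycles_elapsed`, `delay_cycles` and `cycle_reader_t` are hypothetical names, and the reader callback stands in for a function such as CPU_TS_TmrRd().

```c
#include <stdint.h>

/* Elapsed cycles between two counter readings. The result stays correct
 * across one wraparound because uint32_t subtraction is modulo 2^32. */
static inline uint32_t cycles_elapsed(uint32_t start, uint32_t now)
{
    return now - start;
}

/* Hypothetical reader type: returns the current cycle counter value. */
typedef uint32_t (*cycle_reader_t)(void);

/* Busy-wait until at least `ticks` cycles have passed since entry.
 * Only valid for delays shorter than 2^32 cycles. */
static inline void delay_cycles(cycle_reader_t read_cycles, uint32_t ticks)
{
    uint32_t start = read_cycles();

    while (cycles_elapsed(start, read_cycles()) < ticks) {
        /* spin */
    }
}
```

Comparing against a fixed start value means the loop needs no running total and no special case for the counter wrapping from UINT32_MAX to 0.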
Notes:
If you are not using the HAL library, comment out:
uint32_t HAL_GetTick(void)
{
    return (uint32_t)(DWT_CYCCNT / (SysClockFreq / 1000U));
}
It is also recommended to rename the HAL_InitTick() function.
Rewrite the following macro definitions according to your platform:
/* Get core clock frequency */
#define GET_CPU_ClkFreq()       HAL_RCC_GetSysClockFreq()
#define SysClockFreq            (218000000)
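Whatever frequency is configured, the order of integer operations matters when converting between cycles and time units. The sketch below uses hypothetical helper names and assumes the core frequency is a whole number of MHz.

```c
#include <stdint.h>

/* Convert a cycle count to whole milliseconds. Dividing by (core_hz / 1000)
 * keeps millisecond resolution, whereas cycles / core_hz * 1000 would
 * truncate to whole seconds first. Assumes core_hz >= 1000. */
static inline uint32_t cycles_to_ms(uint32_t cycles, uint32_t core_hz)
{
    return cycles / (core_hz / 1000u);
}

/* Convert microseconds to cycles. Assumes core_hz is a multiple of 1 MHz
 * and that the product fits in 32 bits. */
static inline uint32_t us_to_cycles(uint32_t us, uint32_t core_hz)
{
    return us * (core_hz / 1000000u);
}
```

With a 218 MHz clock, 109000000 cycles convert to 500 ms this way, while 109000000 / 218000000 * 1000 evaluates to 0 in integer arithmetic.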
Epilogue
In fact, the uC/OS-III source code includes a feature that measures the time during which interrupts are disabled. It uses the STM32 time stamp to record specific moments while the program runs: if the time stamp is recorded before and after a section of code, the run time of that section can be calculated.
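A minimal sketch of that two-time-stamp measurement is shown below. `elapsed_us` is a hypothetical helper, and the sketch assumes the measured section runs for fewer than 2^32 cycles and that the core frequency is a whole number of MHz.

```c
#include <stdint.h>

/* Microseconds between two cycle counter readings taken before and after
 * a section of code. The unsigned subtraction is modulo 2^32, so a single
 * counter wraparound between the readings is handled. */
static inline uint32_t elapsed_us(uint32_t t_start, uint32_t t_end,
                                  uint32_t core_hz)
{
    uint32_t cycles = t_end - t_start;

    return cycles / (core_hz / 1000000u);
}
```

On the target, the two readings would come from CPU_TS_TmrRd() called immediately before and after the code being timed, with GET_CPU_ClkFreq() supplying the frequency.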
However, there is very little documentation describing the core registers. Fortunately, we found an ARM manual that describes them in detail; the time stamp related registers are covered in Chapter 10 and Chapter 11.