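Keywords: ARMv8-M, HardFault, MemManageFault, BusFault, UsageFault, Stack, XPSR, LR, addr2line, etc.
1. ARMv8-M exception types and detailed explanations
ARMv8-M exceptions fall into two classes: predefined system exceptions (numbers 1 to 15; entry 0 of the vector table holds the initial stack pointer) and external interrupts (16 to 16+N).
The state of each fault can be read from its status bits, which give more detailed information about the cause of the exception.
CFSR is made up of UFSR (bits [31:16]), BFSR (bits [15:8]), and MMFSR (bits [7:0]).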
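As a minimal sketch of how the three sub-registers sit inside CFSR, the helper below splits a raw 32-bit value into MMFSR (bits [7:0]), BFSR (bits [15:8]), and UFSR (bits [31:16]). The names `CfsrParts` and `split_cfsr` are hypothetical and exist only for this illustration; on a target the input would typically be the value read from `SCB->CFSR`.

```c
#include <stdint.h>

/* Sub-registers packed into the 32-bit CFSR (hypothetical helper type). */
typedef struct {
    uint8_t  mmfsr; /* CFSR[7:0]   MemManage Fault Status Register */
    uint8_t  bfsr;  /* CFSR[15:8]  BusFault Status Register */
    uint16_t ufsr;  /* CFSR[31:16] UsageFault Status Register */
} CfsrParts;

/* Split a raw CFSR value into its three parts. */
static CfsrParts split_cfsr(uint32_t cfsr)
{
    CfsrParts p;
    p.mmfsr = (uint8_t)(cfsr & 0xFFu);
    p.bfsr  = (uint8_t)((cfsr >> 8) & 0xFFu);
    p.ufsr  = (uint16_t)(cfsr >> 16);
    return p;
}
```

Taking the value as a parameter instead of reading the register inside the helper keeps the decoding testable on a host PC.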
The HFSR, MMFSR, BFSR, and UFSR bit fields are described in detail below.
1.1 HFSR
DEBUGEVT, bit [31] Debug event. Indicates when a debug event has occurred.
The possible values of this bit are:
0 No debug event has occurred.
1 Debug event has occurred. The Debug Fault Status Register has been updated.
FORCED, bit [30] Forced. Indicates that a fault with configurable priority has been escalated to a HardFault exception, because it could not be made active, because of priority, or because it was disabled.
The possible values of this bit are:
0 No priority escalation has occurred.
1 Processor has escalated a configurable-priority exception to HardFault.
VECTTBL, bit [1] Vector table. Indicates when a fault has occurred because of a vector table read error on exception processing.
The possible values of this bit are:
0 No vector table read fault has occurred.
1 Vector table read fault has occurred.
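The three HFSR bits listed above can be decoded with a few masks. The sketch below is an illustration only: `HfsrFlags`, `decode_hfsr`, and the `HFSR_*` macros are names made up for this example (on a target, CMSIS provides equivalent `SCB_HFSR_*_Msk` definitions).

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the HFSR description above. */
#define HFSR_DEBUGEVT (1u << 31)
#define HFSR_FORCED   (1u << 30)
#define HFSR_VECTTBL  (1u << 1)

typedef struct {
    bool debug_event;  /* DEBUGEVT: a debug event occurred */
    bool forced;       /* FORCED: a configurable fault was escalated */
    bool vector_table; /* VECTTBL: vector table read failed */
} HfsrFlags;

/* Decode a raw HFSR value into individual flags. */
static HfsrFlags decode_hfsr(uint32_t hfsr)
{
    HfsrFlags f;
    f.debug_event  = (hfsr & HFSR_DEBUGEVT) != 0u;
    f.forced       = (hfsr & HFSR_FORCED) != 0u;
    f.vector_table = (hfsr & HFSR_VECTTBL) != 0u;
    return f;
}
```

When `forced` is set, the original cause is recorded in CFSR, so MMFSR, BFSR, and UFSR are the next registers to inspect.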
1.2 MMFSR
MMARVALID, bit [7] MMFAR valid flag. Indicates validity of the MMFAR register.
The possible values of this bit are:
0 MMFAR content not valid.
1 MMFAR content valid.
MLSPERR, bit [5] MemManage lazy state preservation error flag. Records whether a MemManage fault occurred during FP lazy state preservation.
The possible values of this bit are:
0 No MemManage occurred.
1 MemManage occurred.
MSTKERR, bit [4] MemManage stacking error flag. Records whether a derived MemManage fault occurred during exception entry stacking.
The possible values of this bit are:
0 No derived MemManage occurred.
1 Derived MemManage occurred during exception entry.
MUNSTKERR, bit [3] MemManage unstacking error flag. Records whether a derived MemManage fault occurred during exception return unstacking.
The possible values of this bit are:
0 No derived MemManage fault occurred.
1 Derived MemManage fault occurred during exception return.
DACCVIOL, bit [1] Data access violation flag. Records whether a data access violation has occurred.
The possible values of this bit are:
0 No MemManage fault on data access has occurred.
1 MemManage fault on data access has occurred.
IACCVIOL, bit [0] Instruction access violation. Records whether an instruction related memory access violation has occurred.
The possible values of this bit are:
0 No MemManage fault on instruction access has occurred.
1 MemManage fault on instruction access has occurred.
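MMFAR is only meaningful while MMARVALID is set, so a fault report should check that flag before printing an address. A minimal sketch of that check follows; `mmfsr_fault_address` and the `MMFSR_*` macros are hypothetical names for this example, and the register values are passed in as parameters.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the MMFSR description above. */
#define MMFSR_MMARVALID (1u << 7)
#define MMFSR_MLSPERR   (1u << 5)
#define MMFSR_MSTKERR   (1u << 4)
#define MMFSR_MUNSTKERR (1u << 3)
#define MMFSR_DACCVIOL  (1u << 1)
#define MMFSR_IACCVIOL  (1u << 0)

/*
 * Store the faulting address in *address and return true only when
 * MMARVALID says that MMFAR holds a valid address; otherwise leave
 * *address untouched and return false.
 */
static bool mmfsr_fault_address(uint8_t mmfsr, uint32_t mmfar, uint32_t *address)
{
    if ((mmfsr & MMFSR_MMARVALID) == 0u) {
        return false;
    }
    *address = mmfar;
    return true;
}
```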
1.3 BFSR
BFARVALID, bit [7] BFAR valid. Indicates validity of the contents of the BFAR register.
The possible values of this bit are:
0 BFAR content not valid.
1 BFAR content valid.
LSPERR, bit [5] Lazy state preservation error. Records whether a precise BusFault occurred during FP lazy state preservation.
The possible values of this bit are:
0 No BusFault occurred.
1 BusFault occurred.
STKERR, bit [4] Stack error. Records whether a precise derived BusFault occurred during exception entry stacking.
The possible values of this bit are:
0 No derived BusFault occurred.
1 Derived BusFault occurred during exception entry.
UNSTKERR, bit [3] Unstack error. Records whether a precise derived BusFault occurred during exception return unstacking.
The possible values of this bit are:
0 No derived BusFault occurred.
1 Derived BusFault occurred during exception return.
IMPRECISERR, bit [2] Imprecise error. Records whether an imprecise data access error has occurred.
The possible values of this bit are:
0 No imprecise data access error has occurred.
1 Imprecise data access error has occurred.
PRECISERR, bit [1] Precise error. Records whether a precise data access error has occurred.
The possible values of this bit are:
0 No precise data access error has occurred.
1 Precise data access error has occurred.
IBUSERR, bit [0] Instruction bus error. Records whether a precise BusFault on an instruction prefetch has occurred.
The possible values of this bit are:
0 No BusFault on instruction prefetch has occurred.
1 A BusFault on an instruction prefetch has occurred.
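Several BFSR bits can be set at once, so a table-driven decoder that lists every set flag by name is convenient in a fault log. The sketch below is one possible shape; `FlagName`, `bfsr_flags`, and `bfsr_describe` are hypothetical names, and the bit positions come from the BFSR description above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t     mask;
    const char *name;
} FlagName;

/* BFSR flags, highest bit first. */
static const FlagName bfsr_flags[] = {
    { 1u << 7, "BFARVALID"   },
    { 1u << 5, "LSPERR"      },
    { 1u << 4, "STKERR"      },
    { 1u << 3, "UNSTKERR"    },
    { 1u << 2, "IMPRECISERR" },
    { 1u << 1, "PRECISERR"   },
    { 1u << 0, "IBUSERR"     },
};

/*
 * Write the names of all set flags into out, separated by spaces, and
 * return how many flags are set. The text is truncated if out is too
 * small; with out_size == 0 nothing is written and 0 is returned.
 */
static int bfsr_describe(uint8_t bfsr, char *out, size_t out_size)
{
    int count = 0;

    if (out_size == 0u) {
        return 0;
    }
    out[0] = '\0';
    for (size_t i = 0; i < sizeof bfsr_flags / sizeof bfsr_flags[0]; i++) {
        if ((bfsr & bfsr_flags[i].mask) != 0) {
            if (count > 0) {
                strncat(out, " ", out_size - strlen(out) - 1u);
            }
            strncat(out, bfsr_flags[i].name, out_size - strlen(out) - 1u);
            count++;
        }
    }
    return count;
}
```

The same table layout would work for MMFSR and UFSR by swapping in their masks and names.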
1.4 UFSR
DIVBYZERO, bit [9] Divide by zero flag. Sticky flag indicating whether an integer division by zero error has occurred.
The possible values of this bit are:
0 Error has not occurred.
1 Error has occurred.
UNALIGNED, bit [8] Unaligned access flag. Sticky flag indicating whether an unaligned access error has occurred.
The possible values of this bit are:
0 Error has not occurred.
1 Error has occurred.
STKOF, bit [4] Stack overflow flag. Sticky flag indicating whether a stack overflow error has occurred.
The possible values of this bit are:
0 Error has not occurred.
1 Error has occurred.
NOCP, bit [3] No coprocessor flag. Sticky flag indicating whether a coprocessor disabled or not present error has occurred.
The possible values of this bit are:
0 Error has not occurred.
1 Error has occurred.
INVPC, bit [2] Invalid PC flag. Sticky flag indicating whether an integrity check error has occurred.
The possible values of this bit are:
0 Error has not occurred.
1 Error has occurred.
INVSTATE, bit [1] Invalid state flag. Sticky flag indicating whether an EPSR.T, EPSR.IT, or FPSCR.LTPSIZE validity error has occurred.
The possible values of this bit are:
0 Error has not occurred.
1 Error has occurred.
UNDEFINSTR, bit [0] UNDEFINED instruction flag. Sticky flag indicating whether an UNDEFINED instruction error has occurred.
The possible values of this bit are:
0 Error has not occurred.
1 Error has occurred.
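"Sticky" means a UFSR flag stays set until software clears it, and the CFSR bits are cleared by writing 1 to them. The sketch below models that write-one-to-clear behavior on a plain variable so it can run on a host; `w1c_write` and the `UFSR_*` macros are hypothetical names for this example, not part of any library.

```c
#include <stdint.h>

/* Bit positions taken from the UFSR description above. */
#define UFSR_DIVBYZERO  (1u << 9)
#define UFSR_UNALIGNED  (1u << 8)
#define UFSR_STKOF      (1u << 4)
#define UFSR_NOCP       (1u << 3)
#define UFSR_INVPC      (1u << 2)
#define UFSR_INVSTATE   (1u << 1)
#define UFSR_UNDEFINSTR (1u << 0)

/*
 * Software model of a write-one-to-clear register: every bit written
 * as 1 is cleared, every bit written as 0 keeps its current value.
 * Returns the register value after the write.
 */
static uint16_t w1c_write(uint16_t reg, uint16_t written)
{
    return (uint16_t)(reg & ~written);
}
```

On a target, a handler that has finished logging would typically write the value it read back to the register, which clears exactly the flags it saw and prevents stale flags from confusing the next fault report.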
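2 Exception entry handling and stacking in the ARMv8-M ARM
The ARMv8-M ARM (Architecture Reference Manual) describes the sequence of operations that the hardware performs when an exception is taken.
According to that description, R0-R3, R12, LR, XPSR, and ReturnAddress are pushed onto the stack, and finally the PC is set to the exception handler.
When an exception is taken, the stacked contents and their layout are fixed. From the highest address down to the lowest: XPSR -> ReturnAddress -> LR -> R12 -> R3 -> R2 -> R1 -> R0, so R0 sits at the address that the stack pointer holds on handler entry.
Here LR is the R14 value of the interrupted code at the moment the exception was taken, which normally holds the return address of the most recent call; it helps locate the scene just before the crash. ReturnAddress is the address at which the processor resumes after the exception returns; for a synchronous fault it usually points at the instruction that faulted.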
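Because the layout is fixed, the basic 8-word frame maps directly onto a C struct with R0 at the lowest address. The sketch below assumes the basic frame without floating-point state; `BasicFrame` and `frame_load` are hypothetical names for this illustration and are not the `sContextStateFrame` type used by the handler code later in this article.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Basic exception stack frame, lowest address first. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;             /* R14 of the interrupted code */
    uint32_t return_address; /* where execution resumes after the exception */
    uint32_t xpsr;           /* RETPSR */
} BasicFrame;

/* Copy the eight stacked words starting at sp into a BasicFrame. */
static BasicFrame frame_load(const uint32_t *sp)
{
    BasicFrame f;
    memcpy(&f, sp, sizeof f);
    return f;
}
```

Copying with `memcpy` sidesteps pointer-cast aliasing concerns; a handler on the target could equally cast the stack pointer to a frame pointer, as the code later in this article does.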
EXC_RETURN
EXC_RETURN is the value that the processor loads into LR on exception entry.
The ARMv8-M specification defines the EXC_RETURN fields as follows:
PREFIX, bits [31:24] Prefix. Indicates that this is an EXC_RETURN value. This field reads as 0b11111111.
S, bit [6] Secure or Non-secure stack.
DCRS, bit [5] Default callee register stacking.
FType, bit [4] Stack frame type. 0 Extended stack frame. 1 Standard stack frame.
Mode, bit [3] Mode. Indicates the Mode that was stacked from. 0 Handler mode. 1 Thread mode.
SPSEL, bit [2] Stack pointer selection. 0 Main stack pointer. 1 Process stack pointer.
ES, bit [0] Exception Secure. 0 Non-secure. 1 Secure.
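The EXC_RETURN fields listed above can be unpacked with shifts and masks. This is a sketch for illustration; `ExcReturnFields` and `decode_exc_return` are hypothetical names, and the value would normally be captured from LR at handler entry.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool valid_prefix;     /* PREFIX, bits [31:24], reads as 0xFF */
    bool secure_stack;     /* S, bit [6] */
    bool dcrs;             /* DCRS, bit [5] */
    bool standard_frame;   /* FType, bit [4]: 1 = standard, 0 = extended */
    bool thread_mode;      /* Mode, bit [3]: 1 = Thread, 0 = Handler */
    bool process_stack;    /* SPSEL, bit [2]: 1 = PSP, 0 = MSP */
    bool exception_secure; /* ES, bit [0] */
} ExcReturnFields;

/* Decode an EXC_RETURN value into its fields. */
static ExcReturnFields decode_exc_return(uint32_t exc_return)
{
    ExcReturnFields f;
    f.valid_prefix     = (exc_return >> 24) == 0xFFu;
    f.secure_stack     = ((exc_return >> 6) & 1u) != 0u;
    f.dcrs             = ((exc_return >> 5) & 1u) != 0u;
    f.standard_frame   = ((exc_return >> 4) & 1u) != 0u;
    f.thread_mode      = ((exc_return >> 3) & 1u) != 0u;
    f.process_stack    = ((exc_return >> 2) & 1u) != 0u;
    f.exception_secure = ((exc_return >> 0) & 1u) != 0u;
    return f;
}
```

For example, 0xFFFFFFBC decodes as Thread mode, process stack, standard frame, with S and ES clear.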
RETPSR
On exception entry, the RETPSR value is pushed onto the stack.
N, bit [31] Negative condition flag. 0 Result is positive or zero. 1 Result is negative.
Z, bit [30] Zero condition flag. 0 Result is nonzero. 1 Result is zero.
C, bit [29] Carry condition flag. 0 No carry occurred, or last bit shifted was clear. 1 Carry occurred, or last bit shifted was set.
V, bit [28] Overflow condition flag. 0 Signed overflow did not occur. 1 Signed overflow occurred.
Q, bit [27] Sticky saturation flag. 0 Saturation or overflow has not occurred since bit was last cleared. 1 Saturation or overflow has occurred since bit was last cleared.
T, bit [24] T32 state. 0 Execution of any instruction generates an INVSTATE UsageFault. 1 Instructions decoded as T32 instructions.
SFPA, bit [20] Secure Floating-point active.
GE, bits [19:16] Greater than or equal flags.
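A stacked RETPSR word can be decoded the same way. The sketch below covers the fields listed above plus the exception number in bits [8:0]; `RetpsrFields` and `decode_retpsr` are hypothetical names for this example.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     n;         /* bit [31] */
    bool     z;         /* bit [30] */
    bool     c;         /* bit [29] */
    bool     v;         /* bit [28] */
    bool     q;         /* bit [27] */
    bool     t;         /* bit [24] */
    bool     sfpa;      /* bit [20] */
    uint8_t  ge;        /* bits [19:16] */
    uint16_t exception; /* bits [8:0], exception number of the stacked context */
} RetpsrFields;

/* Decode a stacked RETPSR value into its fields. */
static RetpsrFields decode_retpsr(uint32_t retpsr)
{
    RetpsrFields f;
    f.n         = ((retpsr >> 31) & 1u) != 0u;
    f.z         = ((retpsr >> 30) & 1u) != 0u;
    f.c         = ((retpsr >> 29) & 1u) != 0u;
    f.v         = ((retpsr >> 28) & 1u) != 0u;
    f.q         = ((retpsr >> 27) & 1u) != 0u;
    f.t         = ((retpsr >> 24) & 1u) != 0u;
    f.sfpa      = ((retpsr >> 20) & 1u) != 0u;
    f.ge        = (uint8_t)((retpsr >> 16) & 0xFu);
    f.exception = (uint16_t)(retpsr & 0x1FFu);
    return f;
}
```

An exception number of 0 means the fault interrupted Thread mode code rather than another handler.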
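3 Exception handlers and analysis
Exception handling starts at the exception vector table, which maps each exception number to its handler function: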
    __isr_vector:
        .long __StackTop            /* Top of Stack */
        .long Reset_Handler         /* 1. Reset Handler */
        .long NMI_Handler           /* 2. NMI Handler */
        .long HardFault_Handler     /* 3. Hard Fault Handler */
        .long MemManage_Handler     /* 4. MPU Fault Handler */
        .long BusFault_Handler      /* 5. Bus Fault Handler */
        .long UsageFault_Handler    /* 6. Usage Fault Handler */
        .long 0                     /* 7. Reserved */
        .long 0                     /* 8. Reserved */
        .long 0                     /* 9. Reserved */
        .long 0                     /* 10. Reserved */
        .long SVC_Handler           /* 11. SVCall Handler */
        .long DebugMon_Handler      /* 12. Debug Monitor Handler */
        .long 0                     /* 13. Reserved */
        .long PendSV_Handler        /* 14. PendSV Handler */
        .long SysTick_Handler       /* 15. SysTick Handler */

        /* External interrupts */
        /* The interrupts 0 to 31 */
        .long Default_IRQHandler    /* 16. External Interrupt 0 */
        .long Default_IRQHandler
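On entry to a handler, the top of the exception stack holds the contents of R0-R3, R12, LR, ReturnAddress, and RETPSR.
The code below tests EXC_RETURN[2] to decide whether the frame is on MSP or PSP:
    asm volatile(
        " tst lr, #4 \n"          /* Test EXC_RETURN[2]: was the frame stacked on MSP (0) or PSP (1)? */
        " ite eq \n"              /* EXC_RETURN[2] == 0 gives Z = 1 (EQ); EXC_RETURN[2] == 1 gives Z = 0 (NE). */
        " mrseq r0, msp \n"       /* EXC_RETURN[2] == 0: copy MSP into r0. */
        " mrsne r0, psp \n"       /* EXC_RETURN[2] == 1: copy PSP into r0. */
        " b common_handler_c \n"  /* Branch to the C handler; r0 is its first argument. */
        : /* no output */
        : /* no input */
        : "r0" /* clobber */
    );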
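The difference between B and BL used here:
B Label ; Branch unconditionally to Label.
BL Label ; Branch unconditionally to Label and save the return address (the address of the instruction after the BL) in R14 (LR). The L suffix marks a branch with link, that is, a branch the callee can return from. The handler uses B so that LR still holds EXC_RETURN when the C function returns.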
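The MSP/PSP selection performed by the assembly stub can also be written as a small C function, which is handy for checking the logic on a host. `select_frame_address` is a hypothetical name, and the sketch looks only at SPSEL (bit [2]); on a core with the Security Extension the S bit additionally selects between the Secure and Non-secure banked stack pointers, which this sketch ignores.

```c
#include <stdint.h>

#define EXC_RETURN_SPSEL (1u << 2)

/*
 * Return the address of the exception stack frame: PSP when
 * EXC_RETURN.SPSEL is 1, MSP when it is 0. The current MSP and PSP
 * values are passed in by the caller.
 */
static uint32_t select_frame_address(uint32_t exc_return, uint32_t msp, uint32_t psp)
{
    return ((exc_return & EXC_RETURN_SPSEL) != 0u) ? psp : msp;
}
```

On the target this has to stay in assembly at the very start of the handler, because LR and the stack pointer must be sampled before any compiler-generated code changes them.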
The following uses HardFault as an example to walk through the code and the analysis flow.
    void HardFault_Handler(void)
    {
        /* Relies on the compiler emitting no prologue here, so SP and LR are still untouched. */
        asm volatile(
            " tst lr, #4 \n"
            " ite eq \n"
            " mrseq r0, msp \n"
            " mrsne r0, psp \n"
            " b hardfault_handler_c \n"
            : /* no output */
            : /* no input */
            : "r0" /* clobber */
        );
    }

    /* regs is the MSP or PSP value selected above, i.e. the address of the exception stack frame. */
    void hardfault_handler_c(sContextStateFrame* regs)
    {
        unsigned int hfsr = SCB->HFSR;

        star_stack_dump(regs);

        MSG("Cause of Hard Fault:\n");
        if(hfsr & SCB_HFSR_DEBUGEVT_Msk) {
            MSG("Debug event has occurred, ");
            unsigned dfsr = SCB->DFSR;
            if(dfsr & SCB_DFSR_PMU_Msk)
                MSG("PMU event.\n");
            if(dfsr & SCB_DFSR_EXTERNAL_Msk)
                MSG("External event.\n");
            if(dfsr & SCB_DFSR_VCATCH_Msk)
                MSG("Vector Catch event.\n");
            if(dfsr & SCB_DFSR_DWTTRAP_Msk)
                MSG("Watchpoint event.\n");
            if(dfsr & SCB_DFSR_BKPT_Msk)
                MSG("Breakpoint event.\n");
            if(dfsr & SCB_DFSR_HALTED_Msk)
                MSG("Halt or step event.\n");
        }
        if(hfsr & SCB_HFSR_FORCED_Msk) {
            MSG("Processor has escalated a configurable-priority exception to HardFault.\n");
            aon_system_reset();
        }
        if(hfsr & SCB_HFSR_VECTTBL_Msk) {
            MSG("Vector table read fault has occurred.\n");
            aon_system_reset();
        }
    }

    void star_stack_dump(sContextStateFrame* regs)
    {
        unsigned int *stackPtr = NULL;

        /* Print the stacked exception frame: R0-R3, R12, LR, ReturnAddress, XPSR. */
        MSG("ExceptionStack(%08x):\n", regs);
        MSG("R0 = %08x\n", regs->r0);
        MSG("R1 = %08x\n", regs->r1);
        MSG("R2 = %08x\n", regs->r2);
        MSG("R3 = %08x\n", regs->r3);
        MSG("R12 = %08x\n", regs->r12);
        MSG("LR = %08x\n", regs->lr);
        MSG("ReturnAddr = %08x\n", regs->return_address);
        /* The individual RETPSR cases are not analyzed separately yet. */
        MSG("PSR = %08x: N(%u) Z(%u) C(%u) V(%u) Q(%u) IT(%u) T(%u) SFPA(%u) GE(%u) SPRealign(%u) ISR(%u) \n",
            regs->xpsr.w,
            regs->xpsr.b.N, regs->xpsr.b.Z, regs->xpsr.b.C, regs->xpsr.b.V,
            regs->xpsr.b.Q, regs->xpsr.b.IT, regs->xpsr.b.T, regs->xpsr.b.SFPA,
            regs->xpsr.b.GE, regs->xpsr.b.SPREALIGN, regs->xpsr.b.ISR );

        MSG("\nStack from 0x%08x in [StackTop(0x%08x), MSPLIM(0x%08x)]:\n", regs, &__StackTop, __get_MSPLIM());
        /* Walk and print the stack contents for later analysis. */
        for(stackPtr = (unsigned int *)regs; stackPtr < &__StackTop; stackPtr++ ) {
            MSG("0x%08x %08x\n", stackPtr, *stackPtr);
        }
    }
Triggering the fault:
    void fault_test_by_trigger(void)
    {
        MSG("%s\n", __func__);
        SCB->SHCSR |= SCB_SHCSR_HARDFAULTPENDED_Msk;
        // SCB->SHCSR |= SCB_SHCSR_BUSFAULTPENDED_Msk;
        // SCB->SHCSR |= SCB_SHCSR_MEMFAULTPENDED_Msk;
        // SCB->SHCSR |= SCB_SHCSR_USGFAULTPENDED_Msk;
    }
The result is as follows:
    fault_test_by_trigger
    ExceptionStack(2003FFA8):
    R0 = 00000007
    R1 = 0000000A
    R2 = E000ED00
    R3 = 00270000
    R12 = 00000000
    LR = 000002E7
    ReturnAddr = 000002F2
    PSR = 69000000: N(0) Z(1) C(1) V(0) Q(1) IT(0) T(1) SFPA(0) GE(0) SPRealign(0) ISR(0)

    Stack from 0x2003FFA8 in [StackTop(0x2003FFF0), MSPLIM(0x2003F3F0)]:
    0x2003FFA8 00000007
    0x2003FFAC 0000000A
    0x2003FFB0 E000ED00
    0x2003FFB4 00270000
    0x2003FFB8 00000000
    0x2003FFBC 000002E7
    0x2003FFC0 000002F2
    0x2003FFC4 69000000 -- everything up to here is the stacked exception frame.
    0x2003FFC8 0000E4E0
    0x2003FFCC 0000E4E0
    0x2003FFD0 0000E4E0
    0x2003FFD4 0000C0CC
    0x2003FFD8 00000000
    0x2003FFDC 00000000
    0x2003FFE0 00000000
    0x2003FFE4 00000000
    0x2003FFE8 00000000
    0x2003FFEC 0000B6C1
    Cause of Hard Fault:
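The log above yields three addresses of interest: 0x000002E7, 0x000002F2, and 0x0000B6C1.
Use the addr2line tool to look them up in the symbol table:
    arm-linux-gnueabihf-addr2line -e main.elf -a -f 0x000002E7 0x000002F2 0x0000B6C1
The result is as follows:
    0x000002e7
    fault_test_by_trigger -- LR from the exception frame (R14 of the interrupted code); it still points into the function that triggered the fault.
    xxx.c:34
    0x000002f2
    main -- ReturnAddress: where the PC points once the exception returns, i.e. the code that runs next.
    xxx.c:66
    0x0000b6c1
    Reset_Handler -- found by walking back through the rest of the stack.
    xxx.S:284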
This is enough to reconstruct the basic function call chain.
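Picking candidate return addresses out of a raw stack dump can be automated with a simple heuristic: a saved LR on a Thumb-only core has bit 0 set, and with that bit cleared it falls inside the code region. The sketch below applies this filter to an array of stack words. The function names are hypothetical, and the code range passed in is an assumption that has to come from the linker map of the actual image. The filter can report false positives (odd data values that happen to fall in the range), so each hit still needs to be checked with addr2line.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if word looks like a Thumb return address inside [text_start, text_end). */
static bool is_probable_return_address(uint32_t word, uint32_t text_start, uint32_t text_end)
{
    if ((word & 1u) == 0u) {
        return false; /* Thumb bit not set */
    }
    uint32_t addr = word & ~1u;
    return addr >= text_start && addr < text_end;
}

/*
 * Scan n_words stack words and copy up to out_cap candidate return
 * addresses into out, in stack order. Returns the number stored.
 */
static size_t collect_return_addresses(const uint32_t *stack, size_t n_words,
                                       uint32_t text_start, uint32_t text_end,
                                       uint32_t *out, size_t out_cap)
{
    size_t found = 0;

    for (size_t i = 0; i < n_words && found < out_cap; i++) {
        if (is_probable_return_address(stack[i], text_start, text_end)) {
            out[found++] = stack[i];
        }
    }
    return found;
}
```

Applied to the ten words that follow the exception frame in the log above, with an assumed code range of 0x00000000 to 0x00010000, only 0x0000B6C1 passes the filter.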
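References:
"ARM Cortex-M3/M4/M7 Hardfault异常分析" (ARM Cortex-M3/M4/M7 HardFault exception analysis): related registers, exceptions, the exception stack, and a worked HardFault example.
"How to debug a HardFault on an ARM Cortex-M MCU": HFSR, MMFSR, UFSR, and BFSR fault types, plus recovering stack information and automating the debugging.
"GitHub - armink/CmBacktrace: Advanced fault backtrace library for ARM Cortex-M series MCU": a fault backtrace library for ARM Cortex-M series MCUs.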