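In many cases, program crashes are related to heap corruption. Once you have determined that a crash is caused by heap corruption, enable debug page heap so that as much information about the heap as possible is captured at the moment the corruption is detected.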
1. Enable debug page heap with either of the following methods.
Method 1: pageheap.exe /enable <process name>
Method 2: gflags.exe /i <process name> +hpa
What these commands actually do is create the corresponding registry entry for the image under the Image File Execution Options key.
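To make the registry side of step 1 concrete, here is a minimal Python sketch that builds the key path and the flag value you would expect to find after running gflags with +hpa. The helper names are hypothetical, and the storage type of GlobalFlag (string or DWORD) is not checked here; treat it as a way to know where to look, not as a replacement for gflags.

```python
# Sketch only: computes where the page heap setting is expected to live.
# Helper names are made up for illustration; nothing is written to the registry.

IFEO_ROOT = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options"
FLG_HEAP_PAGE_ALLOCS = 0x02000000  # the "hpa" global flag bit


def page_heap_registry_entry(image_name):
    """Return (key path under HKLM, value name, expected flag bit) for an image."""
    key = IFEO_ROOT + "\\" + image_name
    return key, "GlobalFlag", FLG_HEAP_PAGE_ALLOCS


def reg_query_command(image_name):
    """Build a 'reg query' command line that displays the GlobalFlag value."""
    key, value_name, _ = page_heap_registry_entry(image_name)
    return 'reg query "HKLM\\{}" /v {}'.format(key, value_name)


if __name__ == "__main__":
    # "myapp.exe" is a placeholder image name.
    print(reg_query_command("myapp.exe"))
```

Running the printed command on the target machine should show a GlobalFlag value with the 0x02000000 bit set if page heap was enabled for that image.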
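2. With that in place, install a debugger such as WinDbg and use it to capture a crash dump when the problem occurs.
adplus.vbs -crash -pn <process name> -o c:\outpath -quiet
3. After the problem is reproduced, the corresponding crash dump is generated.
4. Below is a brief analysis of a dump taken after a heap corruption.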
0:025> kpL
ChildEBP RetAddr
2058f230 7c993319 ntdll!DbgBreakPoint(void)
2058f240 7c9a7979 ntdll!RtlpPageHeapStop(unsigned long Code = 0xf, char * Message = 0x7c9a7c90 "corrupted suffix pattern", unsigned long Param1 = 0x4671000, char * Description1 = 0x7c9a7c84 "Heap handle", unsigned long Param2 = 0x78d54690, char * Description2 = 0x7c9a7c78 "Heap block", unsigned long Param3 = 0x418, char * Description3 = 0x7c9a7c6c "Block size", unsigned long Param4 = 0x78d54aa8, char * Description4 = 0x7c9a7c58 "corruption address")+0x72
2058f2bc 7c9a8b43 ntdll!RtlpDphReportCorruptedBlock(void * Heap = 0x04671000, unsigned long Context = 4, void * Block = 0x78d54690, struct _DPH_VALIDATION_INFORMATION * ValidationInformation = 0x2058f2e0)+0x1cf
2058f2ec 7c9a8da4 ntdll!RtlpDphNormalHeapFree(struct _DPH_HEAP_ROOT * Heap = 0x04671000, void * NtHeap = 0x04770000, unsigned long Flags = 0x1001002, void * Block = 0x78d54690)+0x32
2058f344 7c9abc7b ntdll!RtlpDebugPageHeapFree(void * HeapHandle = 0x04670000, unsigned long Flags = 0x1001002, void * Address = 0x78d54690)+0x146
2058f3ac 7c98575a ntdll!RtlDebugFreeHeap(void * HeapHandle = 0x04670000, unsigned long Flags = 0x1001002, void * BaseAddress = 0x78d54690)+0x2c
2058f484 7c96e608 ntdll!RtlFreeHeapSlowly(void * HeapHandle = 0x04670000, unsigned long Flags = 0x1001002, void * BaseAddress = 0x78d54690)+0x37
2058f568 78134c39 ntdll!RtlFreeHeap(void * HeapHandle = 0x04670000, unsigned long Flags = 0x1001002, void * BaseAddress = 0x78d54690)+0x11a
2058f5b4 637150f1 msvcr80!free(void * pBlock = 0x78d54690)+0xcd
WARNING: Stack unwind information not available. Following frames may be wrong.
2058f600 774fa4a2 CustomerModule!DllUnregisterServer+0xf31
2058f624 774e3427 ole32!CStdMarshal::Disconnect(unsigned long dwType = 1)+0x26c
2058f634 774e33f9 ole32!CStdMarshal::HandlePendingDisconnect(HRESULT hr = 0x00000000)+0x2b
2058f684 774e3294 ole32!CRemoteUnknown::RemReleaseWorker(unsigned short cInterfaceRefs = 2, struct tagREMINTERFACEREF * InterfaceRefs = 0x001abbe0, int fTopLevel = 1)+0x1bd
2058f698 77c50193 ole32!CRemoteUnknown::RemRelease(unsigned short cInterfaceRefs = 2, struct tagREMINTERFACEREF * InterfaceRefs = 0x001abbe0)+0x15
2058f6b8 77cb33e1 rpcrt4!Invoke(void)+0x30
2058fab8 77cb2ed5 rpcrt4!NdrStubCall2(struct IRpcStubBuffer * pThis = 0x1ff70fe0, struct IRpcChannelBuffer * pChannel = 0x066bee5c, struct _RPC_MESSAGE * pRpcMsg = 0x74d5dc38, unsigned long * pdwStubPhase = 0x2058faf4)+0x299
2058fb10 775cd01b rpcrt4!CStdStubBuffer_Invoke(struct IRpcStubBuffer * This = 0x1ff70fe0, struct tagRPCOLEMESSAGE * prpcmsg = 0x74d5dc38, struct IRpcChannelBuffer * pRpcChannelBuffer = 0x066bee5c)+0xc6
2058fb54 775ccfc8 ole32!SyncStubInvoke(struct tagRPCOLEMESSAGE * pMsg = 0x74d5dc38, struct _GUID * riid = 0x0663afbc, class CIDObject * pID = 0x00000000, struct IRpcChannelBuffer * pChnl = 0x066bee5c, struct IRpcStubBuffer * pStub = 0x1ff70fe0, unsigned long * pdwFault = 0x2058fcfc)+0x37
2058fb9c 7750120b ole32!StubInvoke(struct tagRPCOLEMESSAGE * pMsg = 0x74d5dc38, class CStdIdentity * pStdID = 0x2002cf20, struct IRpcStubBuffer * pStub = 0x1ff70fe0, struct IRpcChannelBuffer * pChnl = 0x066bee5c, struct tagIPIDEntry * pIPIDEntry = 0x80004021, unsigned long * pdwFault = 0x2058fcfc)+0xa7
2058fc78 77500bf5 ole32!CCtxComChnl::ContextInvoke(struct tagRPCOLEMESSAGE * pMessage = 0x00000000, struct IRpcStubBuffer * pStub = 0x1ff70fe0, struct tagIPIDEntry * pIPIDEntry = 0x0392a330, unsigned long * pdwFault = 0x2058fcfc)+0xec
0:025> dt _DPH_VALIDATION_INFORMATION 0x2058f2e0
+0x000 ReasonCode : 0x10
+0x004 ExceptionCode : 0x4675000
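+0x008 CorruptionLocation : 0x78d54aa8 (this is the address that was overwritten; with page heap enabled, the fence value should have been written there)
// Note: in the debug page heap layout, the user data area is stored immediately after _DPH_BLOCK_INFORMATION. The size of _DPH_BLOCK_INFORMATION is 0x20 (32 bytes), and its layout is shown below: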
0:025> dt _dph_block_information
ntdll!_DPH_BLOCK_INFORMATION
+0x000 StartStamp : Uint4B
+0x004 Heap : Ptr32 Void
+0x008 RequestedSize : Uint4B
+0x00c ActualSize : Uint4B
+0x010 FreeQueue : _LIST_ENTRY
+0x010 FreePushList : _SINGLE_LIST_ENTRY
+0x010 TraceIndex : Uint2B
+0x018 StackTrace : Ptr32 Void
+0x01c EndStamp : Uint4B
Therefore, subtracting sizeof(_DPH_BLOCK_INFORMATION) from the user data address gives the address of the _DPH_BLOCK_INFORMATION.
0:025> dt 0x78d54690-0x20 _dph_block_information
ntdll!_DPH_BLOCK_INFORMATION
+0x000 StartStamp : 0xabcdaaaa
+0x004 Heap : 0x84671000
+0x008 RequestedSize : 0x418
+0x00c ActualSize : 0x440
+0x010 FreeQueue : _LIST_ENTRY [ 0x0 - 0x0 ]
+0x010 FreePushList : _SINGLE_LIST_ENTRY
+0x010 TraceIndex : 0
+0x018 StackTrace : (null)
+0x01c EndStamp : 0xdcbaaaaa
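The header arithmetic above can be sketched in Python. This is only an illustration built from the values in this dump: the 32-bit layout is taken from the dt output, the function names are hypothetical, and the stamp values are simply the ones seen here, assumed to mark an intact allocated block.

```python
import struct

HEADER_SIZE = 0x20  # sizeof(_DPH_BLOCK_INFORMATION) on 32-bit, per the dt output

# Stamp values as they appear in this dump (assumed to mean "intact, allocated").
START_STAMP = 0xABCDAAAA
END_STAMP = 0xDCBAAAAA


def header_address(user_ptr):
    """The block header sits immediately before the user data area."""
    return user_ptr - HEADER_SIZE


def parse_header(raw):
    """Decode 32 raw little-endian bytes into the fields shown by dt."""
    start, heap, requested, actual, flink, blink, trace, end = struct.unpack("<8I", raw)
    return {
        "StartStamp": start,
        "Heap": heap,
        "RequestedSize": requested,
        "ActualSize": actual,
        "FreeQueue": (flink, blink),  # union with FreePushList / TraceIndex
        "StackTrace": trace,
        "EndStamp": end,
    }


def stamps_intact(header):
    """True when both stamps match the values expected for this dump."""
    return header["StartStamp"] == START_STAMP and header["EndStamp"] == END_STAMP


# Rebuild the header bytes from the dt output above, then decode them again.
raw = struct.pack("<8I", 0xABCDAAAA, 0x84671000, 0x418, 0x440, 0, 0, 0, 0xDCBAAAAA)
hdr = parse_header(raw)

if __name__ == "__main__":
    print(hex(header_address(0x78D54690)))   # 0x78d54670
    print(hex(hdr["RequestedSize"]), stamps_intact(hdr))
```

Since both stamps are intact, the header itself was not damaged, which points the search toward the area after the user data.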
Next, check whether the fence value filled in after the user data area is intact.
0:025> dd 0x78d54690+0x418 L4
78d54aa8 00000000 a0a0a0a0 00000000 00000000
As you can see, the fence value that should be a0a0a0a0 has been overwritten, which shows that heap corruption occurred.
0:025> ?0x78d54690+0x418
Evaluate expression: 2027244200 = 78d54aa8
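In fact, the corruption address reported by RtlpPageHeapStop (0x78d54aa8) is exactly the fence address immediately following the user-requested data area.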
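The fence check can be reproduced offline as a small Python sketch. It uses the dwords from the dd output and assumes the suffix is 8 bytes long, since ActualSize (0x440) minus the header (0x20) minus RequestedSize (0x418) leaves 8. That length and the helper name are assumptions made for illustration only.

```python
import struct

FILL_BYTE = 0xA0          # fence byte expected after the user data
BLOCK = 0x78D54690        # user data address from the call stack
REQUESTED_SIZE = 0x418    # RequestedSize from the block header
HEADER_SIZE = 0x20
ACTUAL_SIZE = 0x440

# Assumption: whatever is left after header and user data is the suffix fence.
SUFFIX_LEN = ACTUAL_SIZE - HEADER_SIZE - REQUESTED_SIZE


def first_corrupted_offset(suffix, fill=FILL_BYTE):
    """Return the offset of the first byte that differs from the fill, or None."""
    for offset, value in enumerate(suffix):
        if value != fill:
            return offset
    return None


suffix_address = BLOCK + REQUESTED_SIZE

# First two dwords printed by "dd 0x78d54690+0x418 L4", in little-endian order.
suffix = struct.pack("<2I", 0x00000000, 0xA0A0A0A0)

offset = first_corrupted_offset(suffix)
corruption_address = None if offset is None else suffix_address + offset

if __name__ == "__main__":
    print(SUFFIX_LEN, hex(suffix_address), hex(corruption_address))
```

Under these assumptions the first damaged byte is at offset 0 of the suffix, so the computed address matches the "corruption address" parameter in the stack.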
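OK, now we know the heap was corrupted, so the next step is to find out how it was corrupted.
For that, we need the symbol files for the modules that appear in the call stack, so that we can pinpoint which function caused the heap corruption.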
This write-up is rather rough. I'll polish it when I have time; please bear with it for now.