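Memory Troubleshooting Tools: Valgrind
1 Overview
Three kinds of memory problems come up again and again when programming in C/C++:
- Memory leaks
- Dangling pointers
- Freeing the same block of memory more than once
This series of articles briefly introduces tools and methods for tracking down these three problems, starting with Valgrind.
2 Valgrind
Valgrind is a tool that monitors memory usage and detects memory leaks. For applications that are not too large, it is an excellent tool.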
3 Detecting memory leaks
3.1 Sample code
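#include <stdio.h>
#include <stdlib.h>
int main()
{
    char *p = malloc(sizeof(char) * 10);   /* never freed: this is the leak */
    if (p == NULL) {
        return 0;
    }

    *p++ = 'a';
    *p++ = 'b';

    printf("%s\n", *p);   /* bug kept on purpose: passes an uninitialised char where %s expects a char * */

    return 0;
}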
3.2 Compiling
gcc -g -o core1 core1.c
3.3 Checking the process for memory leaks with Valgrind
valgrind --leak-check=yes --show-reachable=yes ./core1
Valgrind prints:
==25500== Memcheck, a memory error detector
==25500== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==25500== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==25500== Command: ./core1
==25500==
==25500== Conditional jump or move depends on uninitialised value(s)
==25500==    at 0x36A104546A: vfprintf (in /lib64/libc-2.12.so)
==25500==    by 0x36A104EAC9: printf (in /lib64/libc-2.12.so)
==25500==    by 0x40055D: main (core1.c:13)
==25500==
(null)
==25500==
==25500== HEAP SUMMARY:
==25500==     in use at exit: 10 bytes in 1 blocks
==25500==   total heap usage: 1 allocs, 0 frees, 10 bytes allocated
==25500==
==25500== 10 bytes in 1 blocks are definitely lost in loss record 1 of 1
==25500==    at 0x4A0515D: malloc (vg_replace_malloc.c:195)
==25500==    by 0x400515: main (core1.c:5)
==25500==
==25500== LEAK SUMMARY:
==25500==    definitely lost: 10 bytes in 1 blocks
==25500==    indirectly lost: 0 bytes in 0 blocks
==25500==      possibly lost: 0 bytes in 0 blocks
==25500==    still reachable: 0 bytes in 0 blocks
==25500==         suppressed: 0 bytes in 0 blocks
==25500==
==25500== For counts of detected and suppressed errors, rerun with: -v
==25500== Use --track-origins=yes to see where uninitialised values come from
==25500== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 6 from 6)
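As the output shows, Valgrind reports that the 10 bytes allocated at line 5 of core1.c (the malloc call) were never freed. It also flags the printf call at line 13 for depending on an uninitialised value.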
4 Dangling pointers
4.1 Sample code
/* core2.c: reads through a pointer after the block it points into has been freed */
#include <stdio.h>
#include <stdlib.h>

struct elem {
    int a;
    double b;
};

int main()
{
    struct elem *e = malloc(sizeof(struct elem));
    if (e == NULL) {
        return 0;
    }

    e->a = 10;
    e->b = 10.10;

    double *xx = &e->b;   /* xx points into the block owned by e */

    printf("%f\n", *xx);

    free(e);              /* from here on xx is dangling */

    printf("%f\n", *xx);  /* invalid read, kept on purpose */

    return 0;
}
4.2 Valgrind results
Compiled with -g as before and run under Valgrind, the program produces:
[cobbliu@MacBook]$ valgrind --leak-check=yes --show-reachable=yes ./core2
==26148== Memcheck, a memory error detector
==26148== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==26148== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==26148== Command: ./core2
==26148==
10.100000
==26148== Invalid read of size 8
==26148==    at 0x4005CA: main (core2.c:26)
==26148==  Address 0x502a048 is 8 bytes inside a block of size 16 free'd
==26148==    at 0x4A04D72: free (vg_replace_malloc.c:325)
==26148==    by 0x4005C5: main (core2.c:24)
==26148==
10.100000
==26148==
==26148== HEAP SUMMARY:
==26148==     in use at exit: 0 bytes in 0 blocks
==26148==   total heap usage: 1 allocs, 1 frees, 16 bytes allocated
==26148==
==26148== All heap blocks were freed -- no leaks are possible
==26148==
==26148== For counts of detected and suppressed errors, rerun with: -v
==26148== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 6 from 6)
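After free(e), xx is a dangling pointer. A later read through xx does not necessarily crash the process: as long as glibc has not yet reclaimed the memory xx points to, the read goes through silently, which is why 10.100000 is printed a second time. Valgrind nevertheless reports an Invalid read through xx at line 26 of core2.c, inside the block freed at line 24.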
5 Freeing the same pointer more than once
5.1 Sample code
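#include <stdio.h>
#include <stdlib.h>
int main()
{
    char *p = malloc(sizeof(char) * 10);
    if (p == NULL) {
        return 0;
    }

    char *q = p;          /* q keeps the start of the block */

    *p++ = 'a';
    *p++ = 'b';

    printf("%s\n", *p);   /* bug kept on purpose: passes an uninitialised char where %s expects a char * */

    free(p);              /* p now points 2 bytes into the block: invalid free */
    free(q);              /* second free of the same block, this time with the correct address */
    return 0;
}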
5.2 Checking with Valgrind
[cobbliu@MacBook]$ valgrind --leak-check=yes --show-reachable=yes ./core1
==26874== Memcheck, a memory error detector
==26874== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==26874== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==26874== Command: ./core1
==26874==
==26874== Conditional jump or move depends on uninitialised value(s)
==26874==    at 0x36A104546A: vfprintf (in /lib64/libc-2.12.so)
==26874==    by 0x36A104EAC9: printf (in /lib64/libc-2.12.so)
==26874==    by 0x4005B5: main (core1.c:15)
==26874==
(null)
==26874== Invalid free() / delete / delete[]
==26874==    at 0x4A04D72: free (vg_replace_malloc.c:325)
==26874==    by 0x4005C1: main (core1.c:17)
==26874==  Address 0x502a042 is 2 bytes inside a block of size 10 alloc'd
==26874==    at 0x4A0515D: malloc (vg_replace_malloc.c:195)
==26874==    by 0x400565: main (core1.c:5)
==26874==
==26874==
==26874== HEAP SUMMARY:
==26874==     in use at exit: 0 bytes in 0 blocks
==26874==   total heap usage: 1 allocs, 2 frees, 10 bytes allocated
==26874==
==26874== All heap blocks were freed -- no leaks are possible
==26874==
==26874== For counts of detected and suppressed errors, rerun with: -v
==26874== Use --track-origins=yes to see where uninitialised values come from
==26874== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 6 from 6)
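As the output shows, Valgrind reports an Invalid free() at line 17 of core1.c: the address passed to free is 2 bytes inside the 10-byte block allocated at line 5. The heap summary also shows 2 frees for a single allocation.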
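6 Strengths and weaknesses of Valgrind
By default Valgrind uses the Memcheck tool for memory checking.
6.1 Advantages
Checking for memory leaks with Valgrind requires no recompiling, no relinking, and no changes to the application. To get detailed error locations, you only need to compile with the -g option.
6.2 Disadvantages
First, when Valgrind checks memory with Memcheck, it takes over the application and reads the debug information in the executable and its libraries in order to show detailed error locations. Once Valgrind has started, the application actually executes inside Valgrind's virtual environment: Valgrind hands the program's code to Memcheck, Memcheck adds its own instrumentation, and only the combined code is actually run. Memcheck inserts extra bookkeeping code at every memory access and every variable assignment, so a program typically runs about 10 to 50 times slower under Memcheck than natively.
Second, Valgrind is of little help with memory problems in server programs that run without ever stopping. This is partly because Valgrind cannot make such a process stop, and partly because a server may be designed to allocate memory whose lifetime equals that of the process and never free it; Valgrind reports that memory as leaked.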
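Third, Valgrind does not support multithreaded programs well. While a multithreaded program is running, Valgrind lets only one of its threads execute at any given moment and does not take advantage of multiple cores. A multithreaded program may therefore behave very differently under Valgrind than it does when run natively.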
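7 Other Valgrind tools
Besides Memcheck, the Valgrind suite includes several other useful tools.
7.1 Cachegrind
This tool simulates the CPU's first-level caches (I1 and D1) and the second-level L2 cache, and pinpoints the cache misses and hits in a program. On request it can also report the number of cache misses, the number of memory references, and the number of instructions executed for each line of code, each function, each module, and the whole program. This is a great help when optimizing a program.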
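7.2 Callgrind
Callgrind collects data while the program runs, including the function call relationships, and can optionally perform cache simulation as well. When the run ends it writes the profile data to a file; callgrind_annotate converts the contents of that file into a readable form.
7.3 Helgrind
Helgrind is mainly used to find synchronization problems in multithreaded C/C++ programs that use POSIX threads. It looks for memory locations that are accessed by more than one thread without consistent locking. Such locations are often where threads fall out of sync, and they lead to bugs that are hard to track down. Helgrind implements the race-detection algorithm known as "Eraser", with further improvements that reduce the number of false reports.
7.4 DRD
DRD is also a tool for checking multithreaded programs; it provides richer diagnostic information than Helgrind.
7.5 Massif
Massif is a heap and stack profiler: it measures how much memory a program uses on the heap and the stack, and reports the sizes of heap blocks, heap administration blocks, and the stack. Memcheck cannot detect memory that the application has freed but that has not yet been handed back to the operating system, whereas Massif can detect this kind of memory effectively.
7.6 DHAT
This tool shows in detail how the application uses the heap, helping users evaluate the design of their program.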