• Analyzing an out-of-memory problem in a .NET WPF application at a maternity hospital


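    I. Background

    1. The story

    Last month a friend reached out to me through a private message on cnblogs (博客园). He said his program was running out of memory and asked how to fix it.

    To solve it, we have to analyze a dump with WinDbg.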

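    II. WinDbg analysis

    1. Why does the program run out of memory?

    As everyone knows, running out of memory corresponds to OutOfMemoryException in .NET. This exception may be thrown manually by managed code, or it may be raised at the CLR level, which means there are two ways to investigate.

    • Is any managed thread carrying an exception? (!t is short for !threads)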
    
    0:000> !t
    ThreadCount:      23
    UnstartedThread:  0
    BackgroundThread: 5
    PendingThread:    0
    DeadThread:       17
    Hosted Runtime:   no
                                                                             Lock  
           ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt Exception
       0    1 362c 00fac868     26020 Preemptive  7ED701A0:00000000 00fa6b60 0     STA 
       5    2 2d70 00fbeba0     2b220 Preemptive  7EBA7AC0:00000000 00fa6b60 0     MTA (Finalizer) 
       7    3 3264 061c8890   102a220 Preemptive  00000000:00000000 00fa6b60 0     MTA (Threadpool Worker) 
      17   15 3f98 19682b90   202b220 Preemptive  7EBB0830:00000000 00fa6b60 0     MTA 
    XXXX   16    0 2845fb00     35820 Preemptive  00000000:00000000 00fa6b60 0     Ukn 
      18   14  a7c 2842b1c8   202b220 Preemptive  00000000:00000000 00fa6b60 0     MTA 
    XXXX    6    0 2c9b3778   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
    XXXX   18    0 288a1318   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
    XXXX   23    0 288a22f0   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
    XXXX   10    0 2ccf3550   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
    XXXX   21    0 288a1860   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
    XXXX   12    0 288a1da8   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
    XXXX   11    0 2c993640   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
    XXXX    8    0 2ccf3a98     35820 Preemptive  00000000:00000000 00fa6b60 0     Ukn 
    XXXX    9    0 2ccf2030   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
    XXXX    7    0 2c9aed88   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
    XXXX   26    0 28898308   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
    XXXX   25    0 2c492c68   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
    XXXX    4    0 2c993b88   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
    XXXX   20    0 2c9af2d0   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
    XXXX   17    0 2c9afd60   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
    XXXX   24    0 2c9b1280   1039820 Preemptive  00000000:00000000 00fa6b60 0     Ukn (Threadpool Worker) 
      23   22 2658 2c9b02a8   1029220 Preemptive  7ED5BFF8:00000000 00fa6b60 0     MTA (Threadpool Worker) 
    
    

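    From the output, no thread has a managed exception attached: the Exception column is empty for every one of them. Hmm...

    • Was it raised at the CLR level?

    This mainly covers memory shortages caused by allocations on the managed heap or by GC activity; the !ao command (short for !analyzeoom) checks for them.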

    
    0:000> !ao
    There was no managed OOM due to allocations on the GC heap
    
    

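    This output shows nothing abnormal either. Awkward... so what on earth is the cause?

    2. Exploring the cause

    In this awkward situation I can only suspect that the dump was not captured at the moment of the failure, or that I have hit the limit of my own knowledge. Still, there is always a way out: even if the dump missed the exact moment, it was surely taken close to it. So next, use !address -summary to see memory usage broken down by category.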

    
    0:000> !address -summary
    
    --- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
    <unknown>                              1520          4c185000 (   1.189 GB)  65.57%   59.45%
    Image                                  4306          1f140000 ( 497.250 MB)  26.78%   24.28%
    Free                                   1133           bf17000 ( 191.090 MB)            9.33%
    Heap                                    617           7626000 ( 118.148 MB)   6.36%    5.77%
    Stack                                    72           1740000 (  23.250 MB)   1.25%    1.14%
    Other                                    34             7b000 ( 492.000 kB)   0.03%    0.02%
    TEB                                      24             30000 ( 192.000 kB)   0.01%    0.01%
    PEB                                       1              3000 (  12.000 kB)   0.00%    0.00%
    
    --- Type Summary (for busy) ------ RgnCount ----------- Total Size -------- %ofBusy %ofTotal
    MEM_MAPPED                              549          34b60000 ( 843.375 MB)  45.42%   41.18%
    MEM_PRIVATE                            1718          20424000 ( 516.141 MB)  27.80%   25.20%
    MEM_IMAGE                              4307          1f155000 ( 497.332 MB)  26.78%   24.28%
    
    --- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
    MEM_COMMIT                             4904          66ddd000 (   1.607 GB)  88.64%   80.37%
    MEM_RESERVE                            1670           d2fc000 ( 210.984 MB)  11.36%   10.30%
    MEM_FREE                               1133           bf17000 ( 191.090 MB)            9.33%
    
    --- Protect Summary (for commit) - RgnCount ----------- Total Size -------- %ofBusy %ofTotal
    PAGE_READONLY                          2272          382cf000 ( 898.809 MB)  48.41%   43.89%
    PAGE_READWRITE                         1572          1eead000 ( 494.676 MB)  26.64%   24.15%
    PAGE_EXECUTE_READ                       218           dd59000 ( 221.348 MB)  11.92%   10.81%
    PAGE_WRITECOPY                          449           133e000 (  19.242 MB)   1.04%    0.94%
    PAGE_EXECUTE_READWRITE                  188            ab4000 (  10.703 MB)   0.58%    0.52%
    PAGE_NOACCESS                           156             9c000 ( 624.000 kB)   0.03%    0.03%
    PAGE_READWRITE | PAGE_GUARD              48             78000 ( 480.000 kB)   0.03%    0.02%
    PAGE_READWRITE | PAGE_WRITECOMBINE        1              2000 (   8.000 kB)   0.00%    0.00%
    
    --- Largest Region by Usage ----------- Base Address -------- Region Size ----------
    <unknown>                                   1d200000           a001000 ( 160.004 MB)
    Image                                        fed1000           36e4000 (  54.891 MB)
    Free                                        33dfe000           1082000 (  16.508 MB)
    Heap                                        3da84000            a1b000 (  10.105 MB)
    Stack                                        1a10000             fd000 (1012.000 kB)
    Other                                       7fa40000             33000 ( 204.000 kB)
    TEB                                           a4c000              3000 (  12.000 kB)
    PEB                                           a3d000              3000 (  12.000 kB)
    
    

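    From the line MEM_COMMIT = 1.607 GB, 80.37% above, the process has committed 1.6 GB, which is 80.37% of its total address space. That puts the total at about 2 GB (1.607 GB / 0.8037), so the process is running under a 2 GB limit. The addresses in the !t output are also only 8 hex digits wide, so this is a 32-bit program. In other words, this is the classic problem of a 32-bit program on a 64-bit system being capped at 2 GB of memory.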
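The 2 GB figure can be cross-checked by adding up the three rows of the State Summary. The snippet below is only a small arithmetic sketch: the class and member names are made up for illustration, and the three constants are the hex sizes copied from the !address -summary output above.

```csharp
using System;

public static class AddressSpaceCheck
{
    // Region sizes in bytes, copied from the State Summary of !address -summary.
    public const long Commit  = 0x66ddd000; // MEM_COMMIT
    public const long Reserve = 0x0d2fc000; // MEM_RESERVE
    public const long Free    = 0x0bf17000; // MEM_FREE

    // Committed + reserved + free covers the whole user-mode address space.
    public static long Total() => Commit + Reserve + Free;

    public static void Main()
    {
        long total = Total();
        const long twoGiB = 2L * 1024 * 1024 * 1024;

        // The sum is 0x7FFF0000 bytes, i.e. 64 KB short of 2 GiB.
        Console.WriteLine($"total address space = 0x{total:X}");
        Console.WriteLine($"shortfall vs 2 GiB  = {(twoGiB - total) / 1024} KB");
        Console.WriteLine($"commit share        = {100.0 * Commit / total:F2}%");
    }
}
```

The three regions add up to 0x7FFF0000 bytes, just 64 KB below 2 GiB, which is consistent with a 32-bit process that does not have the large-address-aware flag set.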

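    3. How to break through the 2 GB limit

    For the answer, go to the most authoritative source, MSDN: https://docs.microsoft.com/en-us/windows/win32/memory/memory-limits-for-windows-releases?redirectedfrom=MSDN

    The way out is to set the IMAGE_FILE_LARGE_ADDRESS_AWARE flag on the program. According to that page, a 32-bit process on 64-bit Windows gets 2 GB of user-mode address space by default and 4 GB once the flag is set.

    As for how to set it, I found three methods.

    • Use the LargeAddressAware package

    See GitHub: https://github.com/KirillOsenkov/LargeAddressAware

    • Use editbin

    In Visual Studio, add this command to the post-build event (the quotes protect paths that contain spaces): editbin /largeaddressaware "$(TargetPath)"

    • Use code

    This approach sets the LargeAddressAware flag directly on an already built exe. Besides setting the flag, it can also check whether the flag is present: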

    
    using System;
    using System.IO;
    
    namespace PEFile
    {
        public class LargeAddressAware
        {
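            // Returns true when IMAGE_FILE_LARGE_ADDRESS_AWARE (0x20) is set in the
            // Characteristics field of the COFF header; false if unset or not a PE file.
            public static bool IsLargeAddressAware(string filePath)
            {
                bool isLargeAddressAware = false;
                PrepareStream(filePath, (stream, binaryReader) => isLargeAddressAware = (binaryReader.ReadInt16() & 0x20) != 0);
                return isLargeAddressAware;
            }
    
            // Sets the flag in place if it is not already set; does nothing for non-PE files.
            public static void SetLargeAddressAware(string filePath)
            {
                PrepareStream(filePath, (stream, binaryReader) =>
                {
                    var value = binaryReader.ReadInt16();
                    if ((value & 0x20) == 0)
                    {
                        value = (short)(value | 0x20);
                        // Step back over the 2 bytes just read and overwrite them.
                        stream.Position -= 2;
                        var binaryWriter = new BinaryWriter(stream);
                        binaryWriter.Write(value);
                        binaryWriter.Flush();
                    }
                });
            }
    
            // Positions the stream on the Characteristics field, then runs the action.
            private static void PrepareStream(string filePath, Action<Stream, BinaryReader> action)
            {
                using (var stream = new FileStream(filePath, FileMode.Open, FileAccess.ReadWrite, FileShare.Read))
                {
                    // The 4-byte PE header offset lives at 0x3C, so at least 0x40 bytes are needed.
                    if (stream.Length < 0x40)
                    {
                        return;
                    }
    
                    var binaryReader = new BinaryReader(stream);
    
                    // MZ header
                    if (binaryReader.ReadInt16() != 0x5A4D)
                    {
                        return;
                    }
    
                    stream.Position = 0x3C;
                    var peHeaderLocation = binaryReader.ReadInt32();
    
                    // The PE signature (4 bytes) plus the COFF header (20 bytes) must fit in the file.
                    if (peHeaderLocation < 0 || (long)peHeaderLocation + 0x18 > stream.Length)
                    {
                        return;
                    }
    
                    stream.Position = peHeaderLocation;
    
                    // PE header
                    if (binaryReader.ReadInt32() != 0x4550)
                    {
                        return;
                    }
    
                    // Characteristics sits at offset 0x12 within the COFF header.
                    stream.Position += 0x12;
    
                    action(stream, binaryReader);
                }
            }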
        }
    }
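For a read-only check, the same flag can be read without hand-computed offsets. The sketch below is not from the original article: it assumes the System.Reflection.PortableExecutable types are available (they are built into modern .NET; on .NET Framework they come from the System.Reflection.Metadata NuGet package), and the class name LaaCheck is made up for illustration.

```csharp
using System;
using System.IO;
using System.Reflection.PortableExecutable;

public static class LaaCheck
{
    // Returns true when the COFF header of the PE file carries the
    // IMAGE_FILE_LARGE_ADDRESS_AWARE (0x20) characteristic.
    // PEReader throws BadImageFormatException if the file is not a valid PE image.
    public static bool IsLargeAddressAware(string filePath)
    {
        using (var stream = File.OpenRead(filePath))
        using (var peReader = new PEReader(stream))
        {
            Characteristics characteristics = peReader.PEHeaders.CoffHeader.Characteristics;
            return (characteristics & Characteristics.LargeAddressAware) != 0;
        }
    }

    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: LaaCheck <path-to-exe>");
            return;
        }

        Console.WriteLine(IsLargeAddressAware(args[0])
            ? "large address aware: yes"
            : "large address aware: no");
    }
}
```

Running it against the exe before and after editbin (or after SetLargeAddressAware) is a quick way to confirm that the flag actually changed.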
    
    

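    For more approaches, see: https://stackoverflow.com/questions/639540/how-much-memory-can-a-32-bit-process-access-on-a-64-bit-operating-system

    III. Summary

    Overall, the 2 GB memory limit is something every 32-bit program has to face, and once you know about it, it is easy to deal with. One last point needs explaining: why was committed memory as high as 1.6 GB? Medical software mostly relies on the classic heavyweight combination of FastReport + DevExpress, and together with a large number of image resources this takes up a lot of native memory.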
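To watch this split from inside a running process, one rough option is to compare the size of the managed heap with the private bytes of the whole process. This is only a sketch under assumptions: the class name is invented, and the difference between the two numbers also includes runtime overhead, loaded code, and stacks, so it is an upper bound on native allocations rather than an exact figure.

```csharp
using System;
using System.Diagnostics;

public static class MemorySnapshot
{
    public static void Main()
    {
        using (var process = Process.GetCurrentProcess())
        {
            // Bytes currently thought to be allocated on the managed (GC) heap.
            long managedBytes = GC.GetTotalMemory(forceFullCollection: false);

            // Private memory of the whole process, managed and native together.
            long privateBytes = process.PrivateMemorySize64;

            // Virtual address space in use; this is what hits the 2 GB ceiling.
            long virtualBytes = process.VirtualMemorySize64;

            const double MB = 1024.0 * 1024.0;
            Console.WriteLine($"managed heap  : {managedBytes / MB:F1} MB");
            Console.WriteLine($"private bytes : {privateBytes / MB:F1} MB");
            Console.WriteLine($"virtual size  : {virtualBytes / MB:F1} MB");
            Console.WriteLine($"non-managed (rough upper bound): {(privateBytes - managedBytes) / MB:F1} MB");
        }
    }
}
```

If the managed heap stays small while private bytes and virtual size keep climbing, the pressure is coming from outside the GC heap, which matches what !ao reported for this dump.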

  • Original article: https://www.cnblogs.com/huangxincheng/p/15671957.html