• Another Look at the Performance of string.Join vs. StringBuilder


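    A few days ago I came across an article on cnblogs about StringBuilder performance ( https://www.cnblogs.com/wanghao72214/p/15571181.html ). It presents a benchmark comparing StringBuilder.AppendJoin with String.Join. According to its results, "for this operation the two methods are close in speed, but StringBuilder.AppendJoin uses noticeably less memory." On that basis the article states flatly that one should use "StringBuilder.AppendJoin rather than String.Join."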

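    Is that really the case?

    A quick search shows that StringBuilder preallocates a buffer and then copies the strings to be concatenated directly into it. This approach is indeed efficient: it avoids the time and memory cost of intermediate results.

    So is string.Join really that bad?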

    First, look at the code of string.Join (decompiled from mscorlib):

    public static string Join(string separator, params string[] value)
    {
        if (value == null)
        {
            throw new ArgumentNullException("value");
        }
        return string.Join(separator, value, 0, value.Length);
    }
    
    public unsafe static string Join(string separator, string[] value, int startIndex, int count)
    {
        if (value == null)
        {
            throw new ArgumentNullException("value");
        }
        if (startIndex < 0)
        {
            throw new ArgumentOutOfRangeException("startIndex", Environment.GetResourceString("ArgumentOutOfRange_StartIndex"));
        }
        if (count < 0)
        {
            throw new ArgumentOutOfRangeException("count", Environment.GetResourceString("ArgumentOutOfRange_NegativeCount"));
        }
        if (startIndex > value.Length - count)
        {
            throw new ArgumentOutOfRangeException("startIndex", Environment.GetResourceString("ArgumentOutOfRange_IndexCountBuffer"));
        }
        if (separator == null)
        {
            separator = string.Empty;
        }
        if (count == 0)
        {
            return string.Empty;
        }
        int num = 0;
        int num2 = startIndex + count - 1;
        for (int i = startIndex; i <= num2; i++)
        {
            if (value[i] != null)
            {
                num += value[i].Length;
            }
        }
        num += (count - 1) * separator.Length;
        if (num < 0 || num + 1 < 0)
        {
            throw new OutOfMemoryException();
        }
        if (num == 0)
        {
            return string.Empty;
        }
        string text = string.FastAllocateString(num);
        fixed (char* ptr = &text.m_firstChar)
        {
            UnSafeCharBuffer unSafeCharBuffer = new UnSafeCharBuffer(ptr, num);
            unSafeCharBuffer.AppendString(value[startIndex]);
            for (int j = startIndex + 1; j <= num2; j++)
            {
                unSafeCharBuffer.AppendString(separator);
                unSafeCharBuffer.AppendString(value[j]);
            }
        }
        return text;
    }

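    As the code shows, string.Join first computes the size of the final result, then calls string.FastAllocateString to allocate the space, and finally copies the data directly into the allocated buffer. Clearly, this is the same strategy StringBuilder uses.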
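    The two-pass strategy described above can be sketched in a few lines. This is only an illustration, not the framework's implementation: `JoinSketch` and `TwoPassJoin` are hypothetical names, and the sketch assumes a runtime that provides `string.Create` (.NET Core 2.1 or later) and a compiler that supports C# 8.

```csharp
using System;

public static class JoinSketch
{
    // Hypothetical helper that mirrors the idea in string.Join:
    // measure first, allocate once, then copy every piece into place.
    public static string TwoPassJoin(string separator, string[] values)
    {
        if (values == null) throw new ArgumentNullException(nameof(values));
        separator ??= string.Empty;
        if (values.Length == 0) return string.Empty;

        // Pass 1: compute the exact length of the result.
        // checked() makes an overflowing total throw instead of wrapping.
        int total = checked(separator.Length * (values.Length - 1));
        foreach (string v in values)
        {
            total = checked(total + (v?.Length ?? 0));
        }
        if (total == 0) return string.Empty;

        // Pass 2: one allocation of exactly 'total' chars, filled in place.
        return string.Create(total, (separator, values), (span, state) =>
        {
            int pos = 0;
            for (int i = 0; i < state.values.Length; i++)
            {
                if (i > 0)
                {
                    state.separator.AsSpan().CopyTo(span.Slice(pos));
                    pos += state.separator.Length;
                }
                string v = state.values[i];
                if (v != null) // null entries contribute nothing, as in string.Join
                {
                    v.AsSpan().CopyTo(span.Slice(pos));
                    pos += v.Length;
                }
            }
        });
    }
}
```

    For the array used later in this post, `JoinSketch.TwoPassJoin(",", new[] { "zfsdfsd", "sdfsdf" })` should produce the same string as `string.Join`, with no intermediate strings created along the way.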

    But the benchmark results are what they are, so where is the problem?

    Look at that article's test cases:

    [Benchmark]
    public string UsingStringJoin()
    {
        var list = new List<string> { "A", "B", "C", "D", "E" };
        var stringBuilder = new StringBuilder();
        for (int i = 0; i < 10000; i++)
        {
            stringBuilder.Append(string.Join(' ', list));
        }
        return stringBuilder.ToString();
    }

    [Benchmark]
    public string UsingAppendJoin()
    {
        var list = new List<string> { "A", "B", "C", "D", "E" };
        var stringBuilder = new StringBuilder();
        for (int i = 0; i < 10000; i++)
        {
            stringBuilder.AppendJoin(' ', list);
        }
        return stringBuilder.ToString();
    }

    The problem lies in this one statement:

    stringBuilder.Append(string.Join(' ', list));

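    This statement first uses string.Join to assemble the list into a string, and then calls stringBuilder.Append to append that result. string.Join allocates memory once for its result, and the StringBuilder then needs its own memory to hold a copy of the same characters. How could the memory usage not be higher?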
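    To see the extra temporary string for yourself, the two patterns can be placed side by side with a rough allocation counter. This is a sketch under assumptions, not the original benchmark: the class and method names are made up, and `GC.GetAllocatedBytesForCurrentThread` is only available on runtimes that expose it (for example .NET Core 3.0 or later). The byte counts vary by runtime version, so none are quoted here.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class IntermediateAllocationDemo
{
    // Pattern from the benchmark: Join builds a temporary string,
    // which Append then copies into the builder.
    public static string ViaJoinThenAppend(List<string> list, int rounds)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < rounds; i++)
        {
            string temp = string.Join(' ', list); // temporary string, one per round
            sb.Append(temp);                      // same characters copied again
        }
        return sb.ToString();
    }

    // AppendJoin writes the pieces into the builder without a temporary string.
    public static string ViaAppendJoin(List<string> list, int rounds)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < rounds; i++)
        {
            sb.AppendJoin(' ', list);
        }
        return sb.ToString();
    }

    // Rough measurement: bytes allocated on this thread while 'action' runs.
    public static long AllocatedBytes(Func<string> action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }
}
```

    Calling `AllocatedBytes(() => ViaJoinThenAppend(list, 10000))` and `AllocatedBytes(() => ViaAppendJoin(list, 10000))` gives a quick comparison. Both methods return the same string, so any difference in allocated bytes comes from how the text is assembled, not from what is assembled.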

    Of course, there is no truth without code, so let the compiled IL do the talking. Here is a small test written in the same style:

    private void BtnStartClick(object sender, EventArgs e)
    {
        string[] dummy = new string[]
        {
            "zfsdfsd",
            "sdfsdf"
        };
        StringBuilder sb = new StringBuilder();
        sb.Append(string.Join(",", dummy));
        string s = sb.ToString();
        Console.WriteLine(s);
    }

    And the IL:

     1 .method private hidebysig 
     2     instance void BtnStartClick (
     3         object sender,
     4         class [mscorlib]System.EventArgs e
     5     ) cil managed 
     6 {
     7     // Header Size: 12 bytes
     8     // Code Size: 65 (0x41) bytes
     9     // LocalVarSig Token: 0x11000004 RID: 4
    10     .maxstack 3
    11     .locals init (
    12         [0] string[] dummy,
    13         [1] class [mscorlib]System.Text.StringBuilder sb,
    14         [2] string s,
    15         [3] string[] CS$0$0000
    16     )
    17 
    18     /* (34,3)-(34,4) d:\Work_Private\IoT\ClientSimulator\MainForm.cs */
    19     /* 0x00000340 00           */ IL_0000: nop
    20     /* (35,4)-(35,52) d:\Work_Private\IoT\ClientSimulator\MainForm.cs */
    21     /* 0x00000341 18           */ IL_0001: ldc.i4.2
    22     /* 0x00000342 8D1D000001   */ IL_0002: newarr    [mscorlib]System.String
    23     /* 0x00000347 0D           */ IL_0007: stloc.3
    24     /* 0x00000348 09           */ IL_0008: ldloc.3
    25     /* 0x00000349 16           */ IL_0009: ldc.i4.0
    26     /* 0x0000034A 7201000070   */ IL_000A: ldstr     "zfsdfsd"
    27     /* 0x0000034F A2           */ IL_000F: stelem.ref
    28     /* 0x00000350 09           */ IL_0010: ldloc.3
    29     /* 0x00000351 17           */ IL_0011: ldc.i4.1
    30     /* 0x00000352 7211000070   */ IL_0012: ldstr     "sdfsdf"
    31     /* 0x00000357 A2           */ IL_0017: stelem.ref
    32     /* 0x00000358 09           */ IL_0018: ldloc.3
    33     /* 0x00000359 0A           */ IL_0019: stloc.0
    34     /* (37,4)-(37,41) d:\Work_Private\IoT\ClientSimulator\MainForm.cs */
    35     /* 0x0000035A 731600000A   */ IL_001A: newobj    instance void [mscorlib]System.Text.StringBuilder::.ctor()
    36     /* 0x0000035F 0B           */ IL_001F: stloc.1
    37     /* (38,4)-(38,38) d:\Work_Private\IoT\ClientSimulator\MainForm.cs */
    38     /* 0x00000360 07           */ IL_0020: ldloc.1
    39     /* 0x00000361 721F000070   */ IL_0021: ldstr     ","
    40     /* 0x00000366 06           */ IL_0026: ldloc.0
    41     /* 0x00000367 281700000A   */ IL_0027: call      string [mscorlib]System.String::Join(string, string[])
    42     /* 0x0000036C 6F1800000A   */ IL_002C: callvirt  instance class [mscorlib]System.Text.StringBuilder [mscorlib]System.Text.StringBuilder::Append(string)
    43     /* 0x00000371 26           */ IL_0031: pop
    44     /* (40,4)-(40,27) d:\Work_Private\IoT\ClientSimulator\MainForm.cs */
    45     /* 0x00000372 07           */ IL_0032: ldloc.1
    46     /* 0x00000373 6F1900000A   */ IL_0033: callvirt  instance string [mscorlib]System.Object::ToString()
    47     /* 0x00000378 0C           */ IL_0038: stloc.2
    48     /* (42,4)-(42,25) d:\Work_Private\IoT\ClientSimulator\MainForm.cs */
    49     /* 0x00000379 08           */ IL_0039: ldloc.2
    50     /* 0x0000037A 281A00000A   */ IL_003A: call      void [mscorlib]System.Console::WriteLine(string)
    51     /* 0x0000037F 00           */ IL_003F: nop
    52     /* (45,3)-(45,4) d:\Work_Private\IoT\ClientSimulator\MainForm.cs */
    53     /* 0x00000380 2A           */ IL_0040: ret
    54 } // end of method MainForm::BtnStartClick

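    Lines 41 and 42 of the listing (IL_0027 and IL_002C) show clearly that the code calls String.Join first and then StringBuilder.Append.

    So the matter is clear. Flawed test code produced a result that does not measure what was intended, and the conclusion drawn from that result is naturally wrong as well.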

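    In fact, Microsoft's documentation says: "When you modify the StringBuilder, it does not reallocate size for itself until the capacity is reached. When this occurs, the new space is allocated automatically and the capacity is doubled." It follows that in boundary cases, where the content only just exceeds the current capacity, StringBuilder actually consumes more space than Join. Of course, Join also wastes a little memory because of alignment, but that is negligible.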
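    The boundary case can be observed by comparing a builder's Length with its Capacity right after the content crosses a capacity threshold. The sketch below assumes nothing beyond the public `StringBuilder` API; `CapacityDemo` and `Measure` are hypothetical names, and the exact capacities depend on the runtime, so none are quoted here.

```csharp
using System;
using System.Text;

public static class CapacityDemo
{
    // Appends 'charCount' characters to a default-constructed builder and
    // reports how much is used (Length) versus how much is reserved (Capacity).
    public static (int Length, int Capacity) Measure(int charCount)
    {
        var sb = new StringBuilder();
        sb.Append('x', charCount);
        return (sb.Length, sb.Capacity);
    }

    // For comparison: a joined string holds exactly the characters it needs.
    public static int JoinedLength(string separator, string[] values)
    {
        return string.Join(separator, values).Length;
    }
}
```

    Printing `CapacityDemo.Measure(n)` for values of n on either side of the builder's initial capacity shows the gap between Capacity and Length opening up as soon as the threshold is crossed, which is the over-allocation discussed above.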

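    So, should you use StringBuilder or Join?

    It is simple: decide case by case. If what you need to concatenate is a ready-made array of strings, Join is the natural choice. Otherwise, StringBuilder is the more convenient option.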
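    As a small illustration of that rule of thumb, here are the two situations side by side. The names `ChoiceDemo`, `FromReadyMadeArray`, and `FromIncrementalWork` are made up for this sketch.

```csharp
using System;
using System.Text;

public static class ChoiceDemo
{
    // The pieces already exist as an array: a single Join call is enough.
    public static string FromReadyMadeArray(string[] parts)
    {
        return string.Join(",", parts);
    }

    // The pieces are produced one at a time: a StringBuilder avoids
    // having to collect them into an array first.
    public static string FromIncrementalWork(int count)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            if (i > 0) sb.Append(',');
            sb.Append(i * i); // each value is computed on the fly
        }
        return sb.ToString();
    }
}
```

    In the first method the array is the input, so Join can measure it and allocate once. In the second, the final size is not known up front, which is the situation StringBuilder was designed for.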

  • Original article: https://www.cnblogs.com/firstrose/p/15606998.html