• It looks like StringTokenizer's performance really is nothing to worry about


    I have always used StringTokenizer to parse HTTP headers, but then I read the description of StringTokenizer in the JDK documentation:

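    StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.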

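    At first I assumed this meant StringTokenizer was lacking in features or performance. But after half a day of testing, comparing String.split(), StringUtils.split(), mySplit (my own custom implementation), and StringTokenizer, the results were as follows: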

    The tests show that StringTokenizer is the fastest at reading a string token by token.
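The timing code behind that comparison is not shown, so the following is only a rough sketch of how such a test might be set up with the standard library. The class name `SplitTimingSketch`, the sample header value, and the `withIndexOf` helper are assumptions made for illustration (it is not the `mySplit` mentioned above), and `StringUtils.split()` is left out to avoid a third-party dependency. Wall-clock loops like this are noisy; a harness such as JMH would give more trustworthy numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Rough timing sketch only; not the benchmark used in the post.
class SplitTimingSketch {

    // Tokenize with StringTokenizer and collect the tokens.
    static List<String> withTokenizer(String s, String delim) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(s, delim);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    // Tokenize with String.split(); the delimiter is treated as a regex.
    static String[] withSplit(String s, String delim) {
        return s.split(delim);
    }

    // Hypothetical hand-written splitter based on indexOf (skips empty pieces).
    static List<String> withIndexOf(String s, char delim) {
        List<String> out = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = s.indexOf(delim, start)) != -1) {
            if (end > start) {
                out.add(s.substring(start, end));
            }
            start = end + 1;
        }
        if (start < s.length()) {
            out.add(s.substring(start));
        }
        return out;
    }

    public static void main(String[] args) {
        String header = "text/html,application/xhtml+xml,application/xml,image/webp";
        int rounds = 200_000;
        long sink = 0; // consumed below so the loops cannot be optimized away entirely

        long t0 = System.nanoTime();
        for (int i = 0; i < rounds; i++) sink += withTokenizer(header, ",").size();
        long t1 = System.nanoTime();
        for (int i = 0; i < rounds; i++) sink += withSplit(header, ",").length;
        long t2 = System.nanoTime();
        for (int i = 0; i < rounds; i++) sink += withIndexOf(header, ',').size();
        long t3 = System.nanoTime();

        System.out.println("StringTokenizer: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("String.split:    " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("indexOf loop:    " + (t3 - t2) / 1_000_000 + " ms");
        System.out.println("(sink=" + sink + ")");
    }
}
```

Running each variant several times and discarding the first passes (JIT warm-up) makes the comparison somewhat fairer.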

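    Looking at the JDK sources, StringTokenizer.java and the split() method in String.java, shows why: when StringTokenizer reads the data piece by piece, it works purely from the current index and the next index: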

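    public class StringTokenizer implements Enumeration<Object> {
        private int currentPosition;   // index where the next scan starts
        private int newPosition;       // position cached by hasMoreTokens() for nextToken()
        private int maxPosition;       // length of str
        private String str;            // the string being tokenized
        private String delimiters;     // set of delimiter characters
        private boolean retDelims;     // whether delimiters are returned as tokens
        private boolean delimsChanged; // set when the delimiter set is replaced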

    ................
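To make the index-based approach concrete, here is a simplified sketch of a tokenizer that only moves a position through the string. It is not the JDK implementation: the class name `PositionScannerSketch` is made up, and features such as returning delimiters, changing delimiters between calls, and surrogate handling are omitted.

```java
import java.util.NoSuchElementException;

// Simplified sketch of position-based tokenizing; NOT the JDK source.
class PositionScannerSketch {
    private final String str;
    private final String delimiters;
    private final int maxPosition;
    private int currentPosition = 0;

    PositionScannerSketch(String str, String delimiters) {
        this.str = str;
        this.delimiters = delimiters;
        this.maxPosition = str.length();
    }

    // Advance from startPos past any run of delimiter characters.
    private int skipDelimiters(int startPos) {
        int pos = startPos;
        while (pos < maxPosition && delimiters.indexOf(str.charAt(pos)) >= 0) {
            pos++;
        }
        return pos;
    }

    // Advance from startPos to the end of the token (next delimiter or end of string).
    private int scanToken(int startPos) {
        int pos = startPos;
        while (pos < maxPosition && delimiters.indexOf(str.charAt(pos)) < 0) {
            pos++;
        }
        return pos;
    }

    boolean hasMoreTokens() {
        return skipDelimiters(currentPosition) < maxPosition;
    }

    String nextToken() {
        int start = skipDelimiters(currentPosition);
        if (start >= maxPosition) {
            throw new NoSuchElementException();
        }
        currentPosition = scanToken(start);
        // The only allocation per token is this substring.
        return str.substring(start, currentPosition);
    }
}
```

No regex is compiled and no intermediate list is built; each call just scans forward from the saved position, which is consistent with the speed observed above.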

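    String.split(), on the other hand, accepts a regular expression (which is expensive). Even on the fast path for a simple delimiter, it first cuts the string into pieces, stores them in an ArrayList, and then converts that list into an array: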

     public String[] split(String regex, int limit) {
            /* fastpath if the regex is a
               (1)one-char String and this character is not one of the
                  RegEx's meta characters ".$|()[{^?*+\\", or
               (2)two-char String and the first char is the backslash and
                  the second is not the ascii digit or ascii letter.
            */
            char ch = 0;
            if (((regex.count == 1 &&
                 ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
                 (regex.length() == 2 &&
                  regex.charAt(0) == '\\' &&
                  (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
                  ((ch-'a')|('z'-ch)) < 0 &&
                  ((ch-'A')|('Z'-ch)) < 0)) &&
                (ch < Character.MIN_HIGH_SURROGATE ||
                 ch > Character.MAX_LOW_SURROGATE))
            {
                int off = 0;
                int next = 0;
                boolean limited = limit > 0;
                ArrayList<String> list = new ArrayList<>();
                while ((next = indexOf(ch, off)) != -1) {
                    if (!limited || list.size() < limit - 1) {
                        list.add(substring(off, next));
                        off = next + 1;
                    } else {    // last one
                        //assert (list.size() == limit - 1);
                        list.add(substring(off, count));
                        off = count;
                        break;
                    }
                }
                // If no match was found, return this
                if (off == 0)
                    return new String[] { this };

                // Add remaining segment
                if (!limited || list.size() < limit)
                    list.add(substring(off, count));

                // Construct result
                int resultSize = list.size();
                if (limit == 0)
                    while (resultSize > 0 && list.get(resultSize-1).length() == 0)
                        resultSize--;
                String[] result = new String[resultSize];
                return list.subList(0, resultSize).toArray(result);
            }
            return Pattern.compile(regex).split(this, limit);
        }
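The `limit` handling in the listing above also means that split() and StringTokenizer do not return the same pieces for the same input, which matters when swapping one for the other. The small sketch below illustrates this; the class name `SplitSemanticsSketch` and the sample string are assumptions chosen for the example.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

class SplitSemanticsSketch {

    // Count tokens the way StringTokenizer sees them (empty pieces never appear).
    static int tokenizerCount(String s, String delim) {
        return new StringTokenizer(s, delim).countTokens();
    }

    public static void main(String[] args) {
        String s = "a,,b,,";
        // limit == 0 (the default): interior empty strings kept, trailing ones dropped
        System.out.println(Arrays.toString(s.split(",")));
        // negative limit: trailing empty strings are kept as well
        System.out.println(Arrays.toString(s.split(",", -1)));
        // positive limit: at most that many pieces, the last holds the remainder
        System.out.println(Arrays.toString(s.split(",", 2)));
        // StringTokenizer reports only the non-empty tokens
        System.out.println(tokenizerCount(s, ","));
    }
}
```

For header parsing, where empty fields usually carry no meaning, the StringTokenizer behavior is often what is wanted anyway.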

    So String.split() cannot be all that fast.

    2012-02-29

     

  • Original article: https://www.cnblogs.com/personnel/p/4583217.html