This post looks at how String's replace and replaceAll methods differ and how they work internally.
Definitions of the replace methods
I. The replaceFirst method

public String replaceFirst(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceFirst(replacement);
}

II. The replace method

public String replace(CharSequence target, CharSequence replacement) {
    return Pattern.compile(target.toString(), Pattern.LITERAL)
            .matcher(this)
            .replaceAll(Matcher.quoteReplacement(replacement.toString()));
}

III. The replaceAll method

public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}
How the replace methods work
The following example is used to analyze their behavior.

import org.junit.Test;

public class StringReplaceTest {
    @Test
    public void stringReplace() {
        // The Java literal "\\d{2}" is the five-character text \d{2}
        printReplacements("year = 1929. month=07, day=29, other=\\d{2}");
    }

    public void printReplacements(String string) {
        System.out.println(string.replaceFirst("\\d{2}", "--")); // regex, first match only
        System.out.println(string.replace("\\d{2}", "--"));      // literal text \d{2}
        System.out.println(string.replace("29", "--"));          // literal text 29
        System.out.println(string.replaceAll("\\d{2}", "--"));   // regex, every match
    }
}

// year = --29. month=07, day=29, other=\d{2}
// year = 1929. month=07, day=29, other=--
// year = 19--. month=07, day=--, other=\d{2}
// year = ----. month=--, day=--, other=\d{2}
I. First, compare replaceFirst and replaceAll. Both compile the Pattern the same way; they differ only in which Matcher method is called afterward: replaceFirst in one case, replaceAll in the other. Each is examined below.
1. Matcher.replaceFirst: after a single successful find(), it calls appendReplacement once and then appendTail. The appendReplacement implementation is fairly involved, so its source is shown later.

public String replaceFirst(String replacement) {
    if (replacement == null)
        throw new NullPointerException("replacement");
    reset();
    if (!find())
        return text.toString();
    StringBuffer sb = new StringBuffer();
    appendReplacement(sb, replacement);
    appendTail(sb);
    return sb.toString();
}

2. Matcher.replaceAll works like replaceFirst, except that it calls appendReplacement once per match, looping until find() reports no further match.

public String replaceAll(String replacement) {
    reset();
    boolean result = find();
    if (result) {
        StringBuffer sb = new StringBuffer();
        do {
            appendReplacement(sb, replacement);
            result = find();
        } while (result);
        appendTail(sb);
        return sb.toString();
    }
    return text.toString();
}
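The two loops above can be reproduced with Matcher's public API, which makes the find / appendReplacement / appendTail sequence easier to follow. The sketch below is only an illustration: the class name ManualReplaceDemo and its helper methods are made up for this example and are not part of the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: re-creates replaceFirst/replaceAll by hand.
class ManualReplaceDemo {

    // Same shape as Matcher.replaceAll: one appendReplacement per match,
    // then appendTail for whatever follows the last match.
    static String manualReplaceAll(String input, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, replacement);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Same shape as Matcher.replaceFirst: at most one appendReplacement.
    static String manualReplaceFirst(String input, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return input;
        }
        StringBuffer sb = new StringBuffer();
        m.appendReplacement(sb, replacement);
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "year = 1929. month=07";
        System.out.println(manualReplaceAll(s, "\\d{2}", "--"));   // year = ----. month=--
        System.out.println(manualReplaceFirst(s, "\\d{2}", "--")); // year = --29. month=07
    }
}
```

If no match is found, manualReplaceAll still returns the input unchanged, because appendTail copies the whole text when nothing has been appended yet.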
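II. The replace method differs from replaceAll in two main ways.
1. Pattern.compile is called with the Pattern.LITERAL flag, so the pattern treats its input as plain text. For example, \d{2} is not interpreted as "two digits 0-9" but as the literal text \d{2}.
2. The replacement is first passed through Matcher.quoteReplacement(replacement.toString()), which escapes the special characters \ and $ by prefixing each with a backslash, so they are taken literally rather than interpreted.

public static String quoteReplacement(String s) {
    // Nothing to escape: return the input as-is
    if ((s.indexOf('\\') == -1) && (s.indexOf('$') == -1))
        return s;
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        // Prefix each \ or $ with a backslash
        if (c == '\\' || c == '$') {
            sb.append('\\');
        }
        sb.append(c);
    }
    return sb.toString();
}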
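To see both differences in action, here is a small sketch that rebuilds replace from Pattern.LITERAL and Matcher.quoteReplacement, following the definition quoted earlier. The class LiteralReplaceDemo and its methods are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: literal target plus quoted replacement.
class LiteralReplaceDemo {

    // Follows the shape of the replace definition quoted in this post.
    static String literalReplace(String input, String target, String replacement) {
        return Pattern.compile(target, Pattern.LITERAL)
                .matcher(input)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    // Without quoting, "$5" is read as a reference to group 5, which
    // does not exist here, so the call fails instead of inserting "$5".
    static boolean unquotedDollarFails(String input, String regex) {
        try {
            input.replaceAll(regex, "$5");
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = "price: \\d{2}"; // the text is: price: \d{2}
        System.out.println(literalReplace(s, "\\d{2}", "$5")); // price: $5
        System.out.println(Matcher.quoteReplacement("$5"));    // \$5
        System.out.println(unquotedDollarFails("price: 10", "\\d{2}")); // true
    }
}
```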
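But why are only \ and $ handled? Because they are the only two characters that appendReplacement interprets specially in a replacement string: \ escapes the character that follows it, and $ introduces a group reference.
III. The key method: appendReplacement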
public Matcher appendReplacement(StringBuffer sb, String replacement) {

    // If no match, return error
    if (first < 0)
        throw new IllegalStateException("No match available");

    // Process substitution string to replace group references with groups
    int cursor = 0;
    StringBuilder result = new StringBuilder();

    while (cursor < replacement.length()) {
        char nextChar = replacement.charAt(cursor);
        if (nextChar == '\\') {
            cursor++;
            if (cursor == replacement.length())
                throw new IllegalArgumentException(
                    "character to be escaped is missing");
            nextChar = replacement.charAt(cursor);
            result.append(nextChar);
            cursor++;
        } else if (nextChar == '$') {
            // Skip past $
            cursor++;
            // Throw IAE if this "$" is the last character in replacement
            if (cursor == replacement.length())
                throw new IllegalArgumentException(
                    "Illegal group reference: group index is missing");
            nextChar = replacement.charAt(cursor);
            int refNum = -1;
            if (nextChar == '{') {
                cursor++;
                StringBuilder gsb = new StringBuilder();
                while (cursor < replacement.length()) {
                    nextChar = replacement.charAt(cursor);
                    if (ASCII.isLower(nextChar) ||
                        ASCII.isUpper(nextChar) ||
                        ASCII.isDigit(nextChar)) {
                        gsb.append(nextChar);
                        cursor++;
                    } else {
                        break;
                    }
                }
                if (gsb.length() == 0)
                    throw new IllegalArgumentException(
                        "named capturing group has 0 length name");
                if (nextChar != '}')
                    throw new IllegalArgumentException(
                        "named capturing group is missing trailing '}'");
                String gname = gsb.toString();
                if (ASCII.isDigit(gname.charAt(0)))
                    throw new IllegalArgumentException(
                        "capturing group name {" + gname +
                        "} starts with digit character");
                if (!parentPattern.namedGroups().containsKey(gname))
                    throw new IllegalArgumentException(
                        "No group with name {" + gname + "}");
                refNum = parentPattern.namedGroups().get(gname);
                cursor++;
            } else {
                // The first number is always a group
                refNum = (int)nextChar - '0';
                if ((refNum < 0)||(refNum > 9))
                    throw new IllegalArgumentException(
                        "Illegal group reference");
                cursor++;
                // Capture the largest legal group string
                boolean done = false;
                while (!done) {
                    if (cursor >= replacement.length()) {
                        break;
                    }
                    int nextDigit = replacement.charAt(cursor) - '0';
                    if ((nextDigit < 0)||(nextDigit > 9)) { // not a number
                        break;
                    }
                    int newRefNum = (refNum * 10) + nextDigit;
                    if (groupCount() < newRefNum) {
                        done = true;
                    } else {
                        refNum = newRefNum;
                        cursor++;
                    }
                }
            }
            // Append group
            if (start(refNum) != -1 && end(refNum) != -1)
                result.append(text, start(refNum), end(refNum));
        } else {
            result.append(nextChar);
            cursor++;
        }
    }
    // Append the intervening text
    sb.append(text, lastAppendPosition, first);
    // Append the match substitution
    sb.append(result);

    lastAppendPosition = last;
    return this;
}
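The branches in appendReplacement correspond to the replacement syntax that replaceAll accepts: numbered references such as $1, named references such as ${name}, and backslash escapes. The sketch below exercises each branch; GroupReferenceDemo and its method names are hypothetical and exist only for this example.

```java
// Hypothetical demo class: one method per branch of appendReplacement.
class GroupReferenceDemo {

    // Numbered references: $1, $2, $3 pick up the captured groups.
    static String numbered(String date) {
        return date.replaceAll("(\\d{4})-(\\d{2})-(\\d{2})", "$3/$2/$1");
    }

    // Named references: ${name} is resolved through the pattern's named groups.
    static String named(String date) {
        return date.replaceAll(
                "(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})",
                "${day}.${month}.${year}");
    }

    // Escape branch: \$ yields a literal dollar sign, then $1 is group 1.
    static String escapedDollar(String text) {
        return text.replaceAll("(\\d+)", "\\$$1");
    }

    // "Largest legal group": with one group, $10 is group 1 followed by "0".
    static String largestLegalGroup(String text) {
        return text.replaceAll("(a)", "$10");
    }

    // A trailing "$" has no group index, so an exception is thrown.
    static boolean trailingDollarFails(String text) {
        try {
            text.replaceAll("x", "$");
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(numbered("1929-07-29"));     // 29/07/1929
        System.out.println(named("1929-07-29"));        // 29.07.1929
        System.out.println(escapedDollar("cost 10"));   // cost $10
        System.out.println(largestLegalGroup("ab"));    // a0b
        System.out.println(trailingDollarFails("x"));   // true
    }
}
```

The exception type for a trailing $ matches the appendReplacement source quoted above; a JDK with a different implementation may behave differently in that edge case.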
IV. The appendTail method, which copies the text remaining after the last match:

public StringBuffer appendTail(StringBuffer sb) {
    sb.append(text, lastAppendPosition, getTextLength());
    return sb;
}