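Regular Expressions (Java)

Concept:

A regular expression (often abbreviated in code as regex, regexp, or RE) is a concept from computer science.

Regular expressions are typically used to search for and replace text that matches a pattern (rule).

Uses:

Commonly used in conditional checks to test whether a string conforms to a given format (matching), and for searching and replacing within strings.

A regular expression is a string that contains characters with special meanings; these special characters are called the metacharacters of the regular expression.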

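Classes involved

java.lang.String

java.util.regex.Pattern --- the compiled pattern

java.util.regex.Matcher --- the result of matching a pattern against input

Example: "." stands for any single character, so "abc" is matched by "...".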

public class RegExp {
    public static void main(String[] args){
        // A quick introduction to regular expressions
        System.out.println("abc".matches("..."));
    }
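}

"\d" --- any single digit 0-9. In a Java string literal the backslash is itself an escape character, so it has to be doubled: the regex \d is written as "\\d".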

public class RegExp {
    public static void main(String[] args){
        // A quick introduction to regular expressions
        p("abc".matches("..."));// match
        // "\\d" matches a digit
        p("d1234w".replaceAll("\\d", "-"));// replace; note the doubled backslash
    }
    public static void p(Object o){
        System.out.println(o);
    }
}
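
One difference between the two calls above is worth spelling out: String.matches succeeds only if the whole string matches, while replaceAll acts on every matching substring. A minimal sketch follows; the class name MatchesVsReplace and its two helper methods are made up for illustration.

```java
public class MatchesVsReplace {
    // Hypothetical helper: true only if the ENTIRE input is exactly three characters.
    static boolean isThreeChars(String input) {
        return input.matches("...");
    }

    // Hypothetical helper: replaces every digit with '-'.
    // The regex \d is written "\\d" in Java source.
    static String maskDigits(String input) {
        return input.replaceAll("\\d", "-");
    }

    public static void main(String[] args) {
        System.out.println(isThreeChars("abc"));   // true
        System.out.println(isThreeChars("abcd"));  // false: a matching prefix is not enough
        System.out.println(maskDigits("d1234w"));  // d----w
    }
}
```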

Class overview:

Pattern

Definition:

A compiled representation of a regular expression.

A regular expression, specified as a string, must first be compiled into an instance of this class. The resulting pattern can then be used to create a Matcher object that can match arbitrary character sequences against the regular expression. All of the state involved in performing a match resides in the matcher, so many matchers can share the same pattern.

A typical invocation sequence is thus

 Pattern p = Pattern.compile("a*b");
 Matcher m = p.matcher("aaaaab");
 boolean b = m.matches();

A matches method is defined by this class as a convenience for when a regular expression is used just once. This method compiles an expression and matches an input sequence against it in a single invocation. The statement

 boolean b = Pattern.matches("a*b", "aaaaab");

is equivalent to the three statements above, though for repeated matches it is less efficient since it does not allow the compiled pattern to be reused.

The form below is more efficient when a pattern is used repeatedly, and Pattern and Matcher also offer many more methods.

 Pattern p = Pattern.compile("a*b");
 Matcher m = p.matcher("aaaaab");
 boolean b = m.matches();
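
To make the reuse point concrete, here is a small sketch that compiles the pattern once and matches it against several inputs. The class ReusePattern, the constant A_STAR_B, and the method countMatches are illustrative names, not part of the Java API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ReusePattern {
    // Compiled once and shared; all match state lives in the Matcher, not the Pattern.
    private static final Pattern A_STAR_B = Pattern.compile("a*b");

    // Hypothetical helper: counts how many inputs match the pattern in full.
    static int countMatches(List<String> inputs) {
        int count = 0;
        for (String input : inputs) {
            // A fresh Matcher per input; the compiled Pattern is reused.
            if (A_STAR_B.matcher(input).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // "aaaaab", "b" and "ab" match; "ba" and "aaa" do not.
        System.out.println(countMatches(Arrays.asList("aaaaab", "b", "ab", "ba", "aaa"))); // 3
    }
}
```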

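[a-z] stands for one letter in the range a-z.

[] denotes a set or range of characters.

Quantifiers

? --- zero or one time

* --- zero or more times

+ --- one or more times

{n} --- exactly n times

{n,} --- at least n times

{n,m} --- at least n but no more than m times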
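
Each quantifier in the list can be checked with a one-line match. The sketch below does that; QuantifierDemo and fullMatch are hypothetical names used only here.

```java
public class QuantifierDemo {
    // Hypothetical wrapper: does the whole input match the regex?
    static boolean fullMatch(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("", "a?"));        // true:  zero occurrences allowed
        System.out.println(fullMatch("aa", "a?"));      // false: at most one
        System.out.println(fullMatch("", "a*"));        // true:  zero or more
        System.out.println(fullMatch("aaaa", "a*"));    // true
        System.out.println(fullMatch("", "a+"));        // false: needs at least one
        System.out.println(fullMatch("aaa", "a{3}"));   // true:  exactly three
        System.out.println(fullMatch("aa", "a{3,}"));   // false: fewer than three
        System.out.println(fullMatch("aaaa", "a{2,3}"));// false: more than three
    }
}
```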

// Ranges

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExp {
    public static void main(String[] args){
        
        // Ranges
        p("a".matches("[abc]"));
        p("a".matches("[^abc]"));// anything except a, b, or c
        p("A".matches("[a-zA-Z]"));// any letter
        p("A".matches("[a-z]|[A-Z]"));// a-z or A-Z, so any letter
        p("A".matches("[a-z[A-Z]]"));// same thing, written as a union
        p("A".matches("[A-Z&&[REG]]"));// intersection: in A-Z and also one of R, E, G (false for "A")
        
    }
    public static void p(Object o){
        System.out.println(o);
    }
}
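
The intersection operator && only has its special meaning inside the square brackets; outside them it is just two literal ampersands. A short sketch of both cases follows (CharClassDemo and its helpers are illustrative names).

```java
public class CharClassDemo {
    // Intersection with a negated class: lowercase letters that are not vowels.
    static boolean isLowercaseConsonant(String s) {
        return s.matches("[a-z&&[^aeiou]]");
    }

    // Intersection: an uppercase letter that is also one of R, E, G.
    static boolean isOneOfREG(String s) {
        return s.matches("[A-Z&&[REG]]");
    }

    public static void main(String[] args) {
        System.out.println(isLowercaseConsonant("b")); // true
        System.out.println(isLowercaseConsonant("e")); // false
        System.out.println(isOneOfREG("R"));           // true
        System.out.println(isOneOfREG("A"));           // false
        // Outside a character class, "&&" is matched literally:
        System.out.println("A&&R".matches("[A-Z]&&[REG]")); // true
        System.out.println("R".matches("[A-Z]&&[REG]"));    // false
    }
}
```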

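//Predefined character classes

"\\".matches("\\\\") --- matching a single backslash takes four backslashes in the regex literal. The Java compiler first turns each "\\" in a string literal into one backslash, so "\\\\" reaches the regex engine as the two characters \\, which the engine in turn reads as one literal backslash. With one or three backslashes the last backslash escapes the closing quote, so the string literal does not compile; with two, the regex is a lone \, which throws a PatternSyntaxException at run time.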

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExp {
    public static void main(String[] args){
    
        // Getting to know \s, \w, and \d
        p(" \n\r\t".matches("\\s{4}"));
        p(" ".matches("\\S"));
        p("a_8".matches("\\w{3}"));
        p("abc888&^%".matches("[a-z]{1,3}\\d+[&^#%]+"));
        p("\\".matches("\\\\"));
        
    }
    public static void p(Object o){
        System.out.println(o);
    }
}
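
The two layers of backslash escaping (Java string literal first, regex engine second) can be observed directly by checking string lengths. The sketch below also shows Pattern.quote, a standard method that builds a literal-matching regex and avoids counting backslashes by hand. BackslashDemo and isSingleBackslash are hypothetical names.

```java
import java.util.regex.Pattern;

public class BackslashDemo {
    // Hypothetical helper: true if the input is exactly one backslash character.
    static boolean isSingleBackslash(String input) {
        // Source "\\\\" -> string of two backslashes -> regex for one literal backslash.
        return input.matches("\\\\");
    }

    public static void main(String[] args) {
        System.out.println("\\".length());    // 1: the literal "\\" is a single character
        System.out.println("\\\\".length());  // 2: this is what the regex engine receives
        System.out.println(isSingleBackslash("\\"));   // true
        System.out.println(isSingleBackslash("\\\\")); // false: two backslashes
        // Pattern.quote escapes its argument so it is matched literally.
        System.out.println("\\".matches(Pattern.quote("\\"))); // true
    }
}
```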

Predefined character classes
`.`	Any character (may or may not match line terminators)
`\d`	A digit: `[0-9]`
`\D`	A non-digit: `[^0-9]`
`\h`	A horizontal whitespace character: `[ \t\xA0\u1680\u180e\u2000-\u200a\u202f\u205f\u3000]`
`\H`	A non-horizontal whitespace character: `[^\h]`
`\s`	A whitespace character: `[ \t\n\x0B\f\r]`
`\S`	A non-whitespace character: `[^\s]`
`\v`	A vertical whitespace character: `[\n\x0B\f\r\x85\u2028\u2029]`
`\V`	A non-vertical whitespace character: `[^\v]`
`\w`	A word character: `[a-zA-Z_0-9]`
`\W`	A non-word character: `[^\w]`
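
As a quick way to see the predefined classes in action, the sketch below sorts single characters into digit, whitespace, word, or other. The class PredefinedDemo and the method classify are made up for this example; the check order matters because every digit is also a word character.

```java
public class PredefinedDemo {
    // Hypothetical helper: names the first predefined class that the character belongs to.
    static String classify(char c) {
        String s = String.valueOf(c);
        if (s.matches("\\d")) return "digit";       // [0-9]
        if (s.matches("\\s")) return "whitespace";  // space, tab, newline, ...
        if (s.matches("\\w")) return "word";        // [a-zA-Z_0-9], digits already handled
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(classify('7'));   // digit
        System.out.println(classify('\t'));  // whitespace
        System.out.println(classify('_'));   // word
        System.out.println(classify('&'));   // other
    }
}
```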

find()

Attempts to find the next subsequence of the input sequence that matches the pattern.

reset()

Resetting a matcher discards all of its explicit state information and sets its append position to zero.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExp {
    public static void main(String[] args){
        
        //matches find looking
        Pattern p = Pattern.compile("\\d{3,5}");
        String s = "123-45623-789-00";
        Matcher m = p.matcher(s);
        p(m.matches());
        m.reset();// matches() and find() can interfere with each other, so remember to call reset()
        p(m.find());
        p(m.start()+"-"+ m.end());
        p(m.find());
        p(m.start()+"-"+ m.end());
        p(m.find());
        p(m.start()+"-"+ m.end());
        p(m.lookingAt());
        p(m.lookingAt());
        p(m.lookingAt());
        p(m.lookingAt());
        
        
    }
    public static void p(Object o){
        System.out.println(o);
    }
}
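
The three methods differ in scope: matches() requires the whole input to match, lookingAt() requires a match starting at the beginning of the input, and find() scans for the next match anywhere. A minimal sketch that uses a fresh Matcher for each call, so no reset() is needed (FindAllDemo and findAll are illustrative names):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindAllDemo {
    // Hypothetical helper: collects every match as "start-end:text".
    static List<String> findAll(Pattern pattern, String input) {
        List<String> spans = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            spans.add(m.start() + "-" + m.end() + ":" + m.group());
        }
        return spans;
    }

    public static void main(String[] args) {
        Pattern digits = Pattern.compile("\\d{3,5}");
        String s = "123-45623-789-00";
        System.out.println(digits.matcher(s).matches());   // false: the whole input is not 3-5 digits
        System.out.println(digits.matcher(s).lookingAt()); // true: "123" matches at the start
        System.out.println(findAll(digits, s));            // [0-3:123, 4-9:45623, 10-13:789]
    }
}
```

The trailing "00" is not reported because it has only two digits.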

Find and replace

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExp {
    public static void main(String[] args){
        
        // replacement: see the description of appendReplacement() in the API documentation
        Pattern p = Pattern.compile("java",Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher("java Java Java I love Java  u hate JAVA sfarwwfr");
        // p(m.replaceAll("JAVA"));// replace every match with JAVA
        StringBuffer buf = new StringBuffer();
        int i = 0;
        while(m.find()){  // look for the next match
            i++;
            if (i%2 == 0) { // even-numbered matches become "java", odd-numbered ones "JAVA"
                m.appendReplacement(buf, "java");
            } else {
                m.appendReplacement(buf, "JAVA");
            }
        }
        m.appendTail(buf);// after the appendReplacement() calls, append the remaining tail
        p(buf);
    }
    public static void p(Object o){
        System.out.println(o);
    }
}
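
The same find / appendReplacement / appendTail loop works whenever the replacement has to be computed per match. The sketch below numbers each occurrence; NumberMatches and numberMatches are hypothetical names. Matcher.quoteReplacement is used because "$" and "\" have special meaning in replacement strings, which is unnecessary for this particular text but a safe habit.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumberMatches {
    private static final Pattern JAVA = Pattern.compile("java", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: replaces the n-th case-insensitive "java" with "Java#n".
    static String numberMatches(String input) {
        Matcher m = JAVA.matcher(input);
        StringBuffer buf = new StringBuffer();
        int n = 0;
        while (m.find()) {
            n++;
            // Copies the text before this match, then the computed replacement.
            m.appendReplacement(buf, Matcher.quoteReplacement("Java#" + n));
        }
        m.appendTail(buf); // copies whatever follows the last match
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(numberMatches("java JAVA and jAvA")); // Java#1 Java#2 and Java#3
    }
}
```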

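Groups

Matcher.group() --- Returns the input subsequence matched by the previous match.

Capturing groups are numbered by counting their opening parentheses from left to right. In the expression ((A)(B(C))), for example, there are four such groups:

1 ((A)(B(C)))
2 (A)
3 (B(C))
4 (C)

Parentheses define groups, and each group can be retrieved by its number, e.g. group(1), group(2).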

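import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExp {
    public static void main(String[] args){

        // groups in a regex
        Pattern p = Pattern.compile("(\\d{3,5})|([a-z]{2})");
        String s = "123aa-34345bb-234cc-00";
        Matcher m = p.matcher(s);
        while (m.find()) {
            p(m.group(2));// null whenever the digit alternative (group 1) matched instead
        }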
    }
    public static void p(Object o){
        System.out.println(o);
    }
}
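
The left-to-right numbering of nested groups can be verified on the expression ((A)(B(C))) itself. In the sketch below, GroupDemo and groups are illustrative names; group(0) is always the entire match.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Hypothetical helper: returns group(0)..group(groupCount()) for a full match,
    // or an empty array if the input does not match.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            out[i] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        // Index 0 is the whole match; 1-4 follow the opening parentheses left to right.
        System.out.println(Arrays.toString(groups("((A)(B(C)))", "ABC"))); // [ABC, ABC, A, BC, C]
    }
}
```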

Summary of a few key points:

Original article: https://www.cnblogs.com/limingxian537423/p/6995025.html