• Regular Expressions (Java)


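    Concept:

    A regular expression (often abbreviated in code as regex, regexp, or RE) is a concept from computer science.

    Regular expressions are typically used to search for and replace text that matches a given pattern (rule).

    Uses:

    They usually appear in conditional checks that test whether a string conforms to some format (matching), and in string search and replace operations.

    A regular expression is a string that contains characters with special meaning; these special characters are called the metacharacters of the regular expression.

    Classes involved

    java.lang.String

    java.util.regex.Pattern---the compiled pattern

    java.util.regex.Matcher---the match result

    Example: "." stands for any single character, so "abc" is matched by "...".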

    public class RegExp {
        public static void main(String[] args){
            // A first look at regular expressions
            System.out.println("abc".matches("..."));
        }
    }
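
    "\d" matches any digit 0-9. Inside a Java string literal the backslash is itself an escape character, so the regex \d has to be written as "\\d".

    public class RegExp {
        public static void main(String[] args){
            // A first look at regular expressions
            p("abc".matches("..."));// whole-string match
            // "\\d" matches a digit
            p("d1234w".replaceAll("\\d", "-"));// replace each digit; note the doubled backslash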
        }
        public static void p(Object o){
            System.out.println(o);
        }
    }
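
In Java source the regex \d is written as the string literal "\\d". The sketch below makes that visible; it is only an illustration, and the class name EscapeDemo and the helper maskDigits are hypothetical names, not part of the original article.

```java
public class EscapeDemo {
    // Replaces every decimal digit in the input with '-'.
    static String maskDigits(String input) {
        // "\\d" in Java source is the two characters '\' and 'd', i.e. the regex \d.
        return input.replaceAll("\\d", "-");
    }

    public static void main(String[] args) {
        System.out.println("\\d".length());       // the literal holds 2 characters
        System.out.println(maskDigits("d1234w")); // d----w
    }
}
```

The length check shows that the compiler has already consumed one level of backslashes before the regex engine sees the pattern.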

    Class overview:

    Pattern

    Definition:

    A compiled representation of a regular expression.

    A regular expression, specified as a string, must first be compiled into an instance of this class. The resulting pattern can then be used to create a Matcher object that can match arbitrary character sequences against the regular expression. All of the state involved in performing a match resides in the matcher, so many matchers can share the same pattern.

    A typical invocation sequence is thus

     Pattern p = Pattern.compile("a*b");
     Matcher m = p.matcher("aaaaab");
     boolean b = m.matches();

    A matches method is defined by this class as a convenience for when a regular expression is used just once. This method compiles an expression and matches an input sequence against it in a single invocation. The statement

     boolean b = Pattern.matches("a*b", "aaaaab");

    is equivalent to the three statements above, though for repeated matches it is less efficient since it does not allow the compiled pattern to be reused.

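    The three-statement form below is more efficient when the same expression is matched repeatedly, and Pattern and Matcher also provide more methods than the one-shot call.

    Pattern p = Pattern.compile("a*b");
    Matcher m = p.matcher("aaaaab");
    boolean b = m.matches();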
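
A minimal sketch of reusing one compiled pattern for many inputs, assuming the pattern is known up front. The class name ReuseDemo, the constant A_STAR_B and the helper countMatches are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ReuseDemo {
    // Compiled once; Pattern instances are immutable and can be shared.
    private static final Pattern A_STAR_B = Pattern.compile("a*b");

    // Counts the inputs that match the whole pattern, creating one Matcher per input.
    static long countMatches(List<String> inputs) {
        long count = 0;
        for (String s : inputs) {
            if (A_STAR_B.matcher(s).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // "aaaaab" and "b" match; "abc" and "ba" do not.
        System.out.println(countMatches(Arrays.asList("aaaaab", "b", "abc", "ba")));
    }
}
```

Only the Matcher holds per-match state, which is why a new one is created for each input while the Pattern is shared.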

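    [a-z] stands for one letter in the range a-z.

    [] denotes a character class, i.e. a set or range of characters.

    Quantifiers

    ?---zero or one time

    *---zero or more times

    +---one or more times

    {n}---exactly n times

    {n,}---at least n times

    {n,m}---between n and m times (inclusive)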
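
The quantifiers can be checked one by one with String.matches, which requires the whole string to match. This is a small sketch; the class name QuantifierDemo and the helper m are hypothetical.

```java
public class QuantifierDemo {
    // Shorthand for a whole-string match.
    static boolean m(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(m("", "a?"));         // true: zero occurrences is allowed
        System.out.println(m("aa", "a?"));       // false: more than one occurrence
        System.out.println(m("", "a*"));         // true: zero or more
        System.out.println(m("aaaa", "a*"));     // true
        System.out.println(m("", "a+"));         // false: at least one is required
        System.out.println(m("aaa", "a{3}"));    // true: exactly three
        System.out.println(m("aa", "a{3,}"));    // false: fewer than three
        System.out.println(m("aaaa", "a{2,3}")); // false: more than three
    }
}
```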

    // Character ranges

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class RegExp {
        public static void main(String[] args){
            
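            // Character ranges
            p("a".matches("[abc]"));// true: a is one of a, b, c
            p("a".matches("[^abc]"));// false: anything except a, b, c is accepted
            p("A".matches("[a-zA-Z]"));// any ASCII letter
            p("A".matches("[a-z]|[A-Z]"));// a-z or A-Z, i.e. any ASCII letter
            p("A".matches("[a-z[A-Z]]"));// union, same as above
            p("A".matches("[A-Z&&[REG]]"));// intersection: in A-Z and also one of R, E, G (false for A)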
            
        }
        public static void p(Object o){
            System.out.println(o);
        }
    }

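    // Predefined character classes

    "\\".matches("\\\\")---matching a single backslash takes four backslashes in the pattern literal. The Java compiler first turns "\\\\" into the two-character string \\, and the regex engine then reads \\ as one literal backslash. One or three backslashes leave the Java string literal unterminated (a compile error), and two produce the regex \, a dangling escape that throws PatternSyntaxException.

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class RegExp {
        public static void main(String[] args){
        
            // Getting to know \s, \w and \d
            p(" \n\r\t".matches("\\s{4}"));
            p(" ".matches("\\S"));
            p("a_8".matches("\\w{3}"));
            p("abc888&^%".matches("[a-z]{1,3}\\d+[&^#%]+"));
            p("\\".matches("\\\\"));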
            
        }
        public static void p(Object o){
            System.out.println(o);
        }
    }
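
The two levels of escaping for a literal backslash can be inspected with a short sketch. The class name BackslashDemo and both helper names are hypothetical; Pattern.quote is shown as one alternative to writing the escapes by hand.

```java
import java.util.regex.Pattern;

public class BackslashDemo {
    // True when the input is exactly one backslash character.
    static boolean isSingleBackslash(String input) {
        // Source "\\\\" -> a string of two backslashes -> a regex for one literal backslash.
        return input.matches("\\\\");
    }

    // Same check, letting Pattern.quote build the escaped pattern.
    static boolean isSingleBackslashQuoted(String input) {
        return input.matches(Pattern.quote("\\"));
    }

    public static void main(String[] args) {
        System.out.println("\\".length());                 // 1: one backslash in the string
        System.out.println("\\\\".length());               // 2: two backslashes in the string
        System.out.println(isSingleBackslash("\\"));       // true
        System.out.println(isSingleBackslashQuoted("\\")); // true
    }
}
```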

    Predefined character classes
    .   Any character (may or may not match line terminators)
    \d  A digit: [0-9]
    \D  A non-digit: [^0-9]
    \h  A horizontal whitespace character: [ \t\xA0\u1680\u180e\u2000-\u200a\u202f\u205f\u3000]
    \H  A non-horizontal whitespace character: [^\h]
    \s  A whitespace character: [ \t\n\x0B\f\r]
    \S  A non-whitespace character: [^\s]
    \v  A vertical whitespace character: [\n\x0B\f\r\x85\u2028\u2029]
    \V  A non-vertical whitespace character: [^\v]
    \w  A word character: [a-zA-Z_0-9]
    \W  A non-word character: [^\w]
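
A few of the predefined classes in use, as a rough sketch. The class name PredefinedDemo and the three helpers are hypothetical and only illustrate \w, \s and \D written as Java string literals.

```java
public class PredefinedDemo {
    // True when the input consists only of word characters [a-zA-Z_0-9].
    static boolean isWord(String s) {
        return s.matches("\\w+");
    }

    // Trims the input and collapses each run of whitespace into one space.
    static String collapseWhitespace(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    // Removes every character that is not a digit.
    static String digitsOnly(String s) {
        return s.replaceAll("\\D", "");
    }

    public static void main(String[] args) {
        System.out.println(isWord("a_8"));                     // true
        System.out.println(isWord("a-8"));                     // false: '-' is not a word character
        System.out.println(collapseWhitespace(" a \t\n b "));  // a b
        System.out.println(digitsOnly("tel: 555-0100"));       // 5550100
    }
}
```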


    find()

    Attempts to find the next subsequence of the input sequence that matches the pattern.

    reset()

    Resetting a matcher discards all of its explicit state information and sets its append position to zero.

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class RegExp {
        public static void main(String[] args){
            
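            // matches, find and lookingAt
            Pattern p = Pattern.compile("\\d{3,5}");
            String s = "123-45623-789-00";
            Matcher m = p.matcher(s);
            p(m.matches());// false: matches() requires the whole input to match
            m.reset();// matches() and find() share the matcher's state, so call reset() between them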
            p(m.find());
            p(m.start()+"-"+ m.end());
            p(m.find());
            p(m.start()+"-"+ m.end());
            p(m.find());
            p(m.start()+"-"+ m.end());
            p(m.lookingAt());// lookingAt() always starts at the beginning of the input
            p(m.lookingAt());
            p(m.lookingAt());
            p(m.lookingAt());
            
            
        }
        public static void p(Object o){
            System.out.println(o);
        }
    }
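
A common way to use find() is a loop that collects every hit together with its position. The sketch below assumes that use case; the class name FindAllDemo and the helper findAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindAllDemo {
    // Returns every match as "text@start-end", in the order find() reports them.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            // start() is inclusive, end() is exclusive.
            hits.add(m.group() + "@" + m.start() + "-" + m.end());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Prints [123@0-3, 45623@4-9, 789@10-13]; "00" is too short for \d{3,5}.
        System.out.println(findAll("\\d{3,5}", "123-45623-789-00"));
    }
}
```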

    Find and replace

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class RegExp {
        public static void main(String[] args){
            
            // replacement: see the description of appendReplacement() in the API documentation
            Pattern p = Pattern.compile("java",Pattern.CASE_INSENSITIVE);
            Matcher m = p.matcher("java Java Java I love Java  u hate JAVA sfarwwfr");
            // p(m.replaceAll("JAVA"));// would replace every match with JAVA
            StringBuffer buf = new StringBuffer();
            int i = 0;
            while(m.find()){  // find the next match
                i++;
                if (i%2 == 0) { // even-numbered matches become java, odd-numbered ones become JAVA
                    m.appendReplacement(buf, "java");
                } else {
                    m.appendReplacement(buf, "JAVA");
                }
            }
            m.appendTail(buf);// after the appendReplacement() calls, append the rest of the input
            p(buf);
        }
        public static void p(Object o){
            System.out.println(o);
        }
    }
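
When the replacement is computed from the matched text rather than fixed, the same appendReplacement/appendTail loop applies. This is a sketch under that assumption; the class name AlternateCaseDemo and the helper alternateCase are hypothetical, and Matcher.quoteReplacement is used so that any '$' or '\' in a computed replacement would be taken literally.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlternateCaseDemo {
    private static final Pattern JAVA = Pattern.compile("java", Pattern.CASE_INSENSITIVE);

    // Upper-cases odd-numbered matches and lower-cases even-numbered ones.
    static String alternateCase(String input) {
        Matcher m = JAVA.matcher(input);
        StringBuffer out = new StringBuffer();
        int count = 0;
        while (m.find()) {
            count++;
            String found = m.group();
            String replacement = (count % 2 == 0)
                    ? found.toLowerCase(Locale.ROOT)
                    : found.toUpperCase(Locale.ROOT);
            // Copies the text before the match, then the replacement.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        // Copies whatever follows the last match.
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(alternateCase("java Java Java I love Java")); // JAVA java JAVA I love java
    }
}
```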

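    Groups

    Matcher.group()---Returns the input subsequence matched by the previous match.

    Capturing groups are numbered by counting their opening parentheses from left to right. In ((A)(B(C))) there are four groups:

    1 ((A)(B(C)))
    2 (A)
    3 (B(C))
    4 (C)

    Parentheses in the pattern create the groups, and each one can be read by its number, e.g. group(1), group(2).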

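    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RegExp {
        public static void main(String[] args){

            // Groups
            Pattern p = Pattern.compile("(\\d{3,5})|([a-z]{2})");
            String s = "123aa-34345bb-234cc-00";
            Matcher m = p.matcher(s);
            while (m.find()) {
                p(m.group(2));// the two letters; null when the digit alternative matched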
            }
        }
        public static void p(Object o){
            System.out.println(o);
        }
    }
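
Group numbering can also be explored with two adjacent groups instead of an alternation, so that both groups are filled by the same match. This is a sketch only; the class name GroupDemo and the helpers firstPair and groupCount are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Returns {whole match, digits, letters} for the first hit, or null when nothing matches.
    static String[] firstPair(String input) {
        Matcher m = Pattern.compile("(\\d{3,5})([a-z]{2})").matcher(input);
        if (!m.find()) {
            return null;
        }
        // group(0) is the whole match; group(1) and group(2) follow the opening parentheses.
        return new String[] { m.group(0), m.group(1), m.group(2) };
    }

    // Number of capturing groups declared in the pattern.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", firstPair("123aa-34345bb-234cc-00"))); // 123aa,123,aa
        System.out.println(groupCount("((A)(B(C)))"));                              // 4
    }
}
```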

    Summary of key points:

  • Original article: https://www.cnblogs.com/limingxian537423/p/6995025.html