• Using regular expressions in C++11


    Regular Expression Special Characters

    "."---Any single character(a "wildcard")

    "["---Begin character class

    "]"---End character class

    "{"---Begin count

    "}"---End count

    "("---Begin grouping

    ")"---End grouping

    "\"---Next character has a special meaning

    "*"---Zero or more

    "+"---One or more

    "?"---Optional(zero or one)

    "|"---Alternative(or)

    "^"---Start of line; negation

    "$"---End of line


    Example:

    case 1:

            ^A*B+C?$

    explain 1:

            Zero or more A's at the start of the line, then one or more B's, then an optional C, then the end of the line.



    A pattern can be made optional or repeated (the default is exactly once) by adding a suffix:


    Repetition

    {n}---Exactly n times

    {n,}---n or more times

    {n,m}---At least n and at most m times

    *---Zero or more, that is, {0,}

    +---One or more, that is, {1,}

    ?---Optional (zero or one), that is, {0,1}


    Example:

    case 1:

            A{3}B{2,4}C*

    explain 1:

            Matches, for example, AAABBC or AAABBB (exactly three A's, two to four B's, any number of C's)
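
    The counted repetition above can be checked with a short sketch. It assumes std::regex with its default (ECMAScript) grammar; the helper name matches_repetition is hypothetical and used only for illustration.

```cpp
#include <regex>
#include <string>

// Returns true when the whole of `text` matches A{3}B{2,4}C*.
// `matches_repetition` is a hypothetical helper name for this sketch.
bool matches_repetition(const std::string& text)
{
    static const std::regex pattern("A{3}B{2,4}C*");
    // regex_match succeeds only if the entire string matches the pattern.
    return std::regex_match(text, pattern);
}
```

    Calling matches_repetition("AAABBC") or matches_repetition("AAABBB") should return true, while a string with only two A's or with five B's should be rejected.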


    A suffix ? after any of the repetition notations makes the pattern matcher "lazy" or "non-greedy".

    That is, when looking for a pattern, it will look for the shortest match rather than the longest.

    By default, the pattern matcher always looks for the longest match (similar to C++'s Max Munch rule).

    Consider:

        ababab

    The pattern (ab)+ matches all of "ababab". However, (ab)+? matches only the first "ab".
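
    The difference between greedy and lazy matching can be observed with a small sketch. It uses the + form of the pattern, (ab)+ and (ab)+?, and assumes the default ECMAScript grammar; first_match is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// Returns the text of the first match of `pattern` in `text`,
// or an empty string when nothing matches.
// `first_match` is a hypothetical helper name for this sketch.
std::string first_match(const std::string& text, const std::string& pattern)
{
    const std::regex rgx(pattern);
    std::smatch m;
    if (std::regex_search(text, m, rgx))
        return m.str(0);  // sub-match 0 is the whole match
    return "";
}
```

    With the target "ababab", first_match(target, "(ab)+") should yield the whole string, and first_match(target, "(ab)+?") should yield only "ab".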

    The most common character classifications have names:

    Character Classes

    alnum --- Any alphanumeric character

    alpha --- Any alphabetic character

    blank --- Any whitespace character that is not a line separator

    cntrl --- Any control character

    d --- Any decimal digit

    digit --- Any decimal digit

    graph --- Any graphical character

    lower --- Any lowercase character

    print --- Any printable character

    punct --- Any punctuation character

    s --- Any whitespace character

    space --- Any whitespace character

    upper --- Any uppercase character

    w --- Any word character (alphanumeric characters plus the underscore)

    xdigit --- Any hexadecimal digit character
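
    Named classes are written inside a bracket expression as [:name:]. As a minimal sketch, assuming the default ECMAScript grammar of std::regex, the following checks whether a string looks like an identifier; is_identifier is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// True when `text` is a letter or underscore followed by any number of
// letters, digits, or underscores.
// `is_identifier` is a hypothetical helper name for this sketch.
bool is_identifier(const std::string& text)
{
    // [:alpha:] and [:alnum:] are the named classes from the table above.
    static const std::regex pattern("[_[:alpha:]][_[:alnum:]]*");
    return std::regex_match(text, pattern);
}
```

    For instance, "foo_1" and "_x" should be accepted, while "1abc" and "a-b" should not.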


    Several character classes are supported by shorthand notation:

    Character Class Abbreviations
    \d --- A decimal digit --- [[:digit:]]

    \s --- A space (space, tab, ...) --- [[:space:]]

    \w --- A letter (a-z) or digit (0-9) or underscore (_) --- [_[:alnum:]]

    \D --- Not \d --- [^[:digit:]]

    \S --- Not \s --- [^[:space:]]

    \W --- Not \w --- [^_[:alnum:]]

    In addition, languages supporting regular expressions often provide:

    Nonstandard (but Common) Character Class Abbreviations

    \l --- A lowercase character --- [[:lower:]]

    \u --- An uppercase character --- [[:upper:]]

    \L --- Not \l --- [^[:lower:]]

    \U --- Not \u --- [^[:upper:]]


    Note that a backslash must be doubled to include it in an ordinary string literal: the pattern \d is written "\\d" in C++ source.
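
    As a sketch of the backslash doubling, the same pattern can be written either as an ordinary literal with doubled backslashes or as a C++11 raw string literal with single ones. The two helper names below are hypothetical.

```cpp
#include <regex>
#include <string>

// Ordinary string literal: each backslash in the pattern is doubled.
// `all_digits_escaped` is a hypothetical helper name for this sketch.
bool all_digits_escaped(const std::string& text)
{
    static const std::regex pattern("\\d+");
    return std::regex_match(text, pattern);
}

// Raw string literal: the pattern \d+ is written exactly as it reads.
// `all_digits_raw` is a hypothetical helper name for this sketch.
bool all_digits_raw(const std::string& text)
{
    static const std::regex pattern(R"(\d+)");
    return std::regex_match(text, pattern);
}
```

    Both functions build the same regular expression, so they should agree on every input; the raw form is usually easier to read once a pattern contains several backslashes.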

    As usual, backslashes can denote special characters:

    Special Characters

    \n --- Newline

    \t --- Tab

    \\ --- One backslash

    \xhh --- Unicode characters expressed using two hexadecimal digits

    \uhhhh --- Unicode characters expressed using four hexadecimal digits


    To add to the opportunities for confusion, two further logically different uses of the backslash are provided:

    Special Characters

    \b --- The first or last character of a word (a "boundary character")

    \B --- Not a \b

    \i --- The ith sub_match in this pattern
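
    The word boundary and the back-reference can be combined, for example to find a word that is immediately repeated. This is only a sketch under the default ECMAScript grammar; first_repeated_word is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// Returns the first word that is immediately repeated (as in "is is"),
// or an empty string when there is none.
// `first_repeated_word` is a hypothetical helper name for this sketch.
std::string first_repeated_word(const std::string& text)
{
    // \b    word boundary, so the match starts and ends on whole words
    // (\w+) sub-match 1: the word itself
    // \1    back-reference: the same text as sub-match 1 again
    static const std::regex pattern(R"(\b(\w+)\s+\1\b)");
    std::smatch m;
    if (std::regex_search(text, m, pattern))
        return m.str(1);
    return "";
}
```

    The trailing \b keeps "the theory" from being reported as a repeated "the", and the leading \b keeps the "is" inside "this" from pairing with a following "is".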


    Here are some examples of patterns:

    Ax*                //A,Ax,Axxxx

    Ax+                //Ax,Axxx not A

    \d-?\d             //1-2,12 not 1--2

    \w{2}-\d{4,5}      //Ab-1234,XX-54321,22-5432

    (\d*:)?(\d+)       //12:3, 1:23, 123, :123 Not 123:

    (bs|BS)            //bs, BS Not bS

    [aeiouy]           //a,o,u    An English vowel, not x

    [^aeiouy]          //x,k     Not an English vowel, not e

    [a^eiouy]          //a,^,o,u   An English vowel or ^
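
    The example patterns can be tried out with a small helper. This is a sketch that assumes the default ECMAScript grammar and writes the patterns with their backslashes (\d, \w) in raw string literals; full_match is a hypothetical helper name.

```cpp
#include <regex>
#include <string>

// True when the whole of `text` matches `pattern`.
// `full_match` is a hypothetical helper name for this sketch.
bool full_match(const std::string& pattern, const std::string& text)
{
    // The regex is compiled on every call, which is fine for experiments
    // but would be wasteful in a hot loop.
    return std::regex_match(text, std::regex(pattern));
}
```

    For example, full_match(R"(\d-?\d)", "1-2") should be true and full_match(R"(\d-?\d)", "1--2") should be false; likewise full_match(R"((\d*:)?(\d+))", "123:") should be false because no digits follow the colon.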


    Here is the test code:

    #include <iostream>
    #include <regex>
    #include <string>

    using namespace std;

    int main()
    {
        // Zero or more A's, one or more B's, an optional C, anchored at both ends.
        const char* reg_esp = "^A*B+C?$";
        regex rgx(reg_esp);
        cmatch match;
        const char* target = "AAAAAAAAABBBBBBBBC";

        if (regex_search(target, match, rgx)) {
            // match[0] is the whole match; later entries are sub-matches.
            for (size_t a = 0; a < match.size(); a++)
                cout << string(match[a].first, match[a].second) << endl;
        } else {
            cout << "No Match Case !" << endl;
        }
        return 0;
    }


     


  • Original article: https://www.cnblogs.com/mengfanrong/p/5077825.html