• .NET Regular Expression Classes


    .NET Framework Developer's Guide

    The following sections describe the .NET Framework regular expression classes.

    Regex

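    The Regex class represents an immutable (read-only) regular expression. It also contains static methods that allow the other regular expression classes to be used without explicitly creating instances of them.

    The following code example creates an instance of the Regex class and defines a simple regular expression when the object is initialized. Note that the C# version uses an additional backslash as an escape character, so that the backslash in the \s character class reaches the regular expression engine as a literal character. Visual Basic string literals need no such escaping.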

    [Visual Basic]
        ' Declare object variable of type Regex.
        Dim r As Regex 
        ' Create a Regex object and define its regular expression.
        r = New Regex("\s2000")
    [C#]
        // Declare object variable of type Regex.
        Regex r; 
        // Create a Regex object and define its regular expression.
        r = new Regex("\\s2000"); 
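    As a minimal sketch of the static methods mentioned above, the following uses Regex.IsMatch without creating a Regex instance. The class name StaticRegexSketch and the method ContainsYear are hypothetical names chosen for illustration. The verbatim string literal @"\s2000" is an alternative to doubling the backslash.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StaticRegexSketch
{
    // Hypothetical helper: true when the input contains a whitespace
    // character immediately followed by "2000".
    public static bool ContainsYear(string input)
    {
        // @"\s2000" is the same pattern as "\\s2000" in a regular literal.
        return Regex.IsMatch(input, @"\s2000");
    }

    public static void Main()
    {
        Console.WriteLine(ContainsYear("Windows 2000")); // True
        Console.WriteLine(ContainsYear("Windows2000"));  // False
    }
}
```

    The static form is convenient for one-off checks; creating a Regex instance is still the natural choice when the same pattern is applied many times.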

    Match

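    The Match class represents the result of a regular expression matching operation. The following example uses the Match method of the Regex class to return an object of type Match in order to find the first match in the input string. The example uses the Match.Success property to determine whether a match was found.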

    [Visual Basic]
        ' Create a new Regex object.
        Dim r As New Regex("abc") 
        ' Find a single match in the input string.
        Dim m As Match = r.Match("123abc456") 
        If m.Success Then
            ' Print out the character position where a match was found. 
            ' (Character position 3 in this case.)
            Console.WriteLine("Found match at position " & m.Index.ToString())
        End If
    [C#]
        // Create a new Regex object.
        Regex r = new Regex("abc"); 
        // Find a single match in the string.
        Match m = r.Match("123abc456"); 
        if (m.Success) 
        {
            // Print out the character position where a match was found. 
            // (Character position 3 in this case.)
            Console.WriteLine("Found match at position " + m.Index);
        }
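    The example above finds only the first match. As a sketch of how later matches can be reached from a Match object, the following walks the chain with Match.NextMatch until Success is false. The names NextMatchSketch and FindPositions are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NextMatchSketch
{
    // Hypothetical helper: returns the start index of every match.
    public static List<int> FindPositions(string pattern, string input)
    {
        var positions = new List<int>();
        Match m = Regex.Match(input, pattern);
        while (m.Success)
        {
            positions.Add(m.Index);
            // NextMatch resumes the search after the current match.
            m = m.NextMatch();
        }
        return positions;
    }

    public static void Main()
    {
        foreach (int position in FindPositions("abc", "123abc4abcd"))
        {
            Console.WriteLine("Found match at position " + position);
        }
        // Prints positions 3 and 7.
    }
}
```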

    MatchCollection

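    The MatchCollection class represents a sequence of successful non-overlapping matches. The collection is immutable (read-only) and has no public constructor. Instances of MatchCollection are returned by the Regex.Matches method.

    The following example uses the Matches method of the Regex class to fill a MatchCollection with all the matches found in the input string. The example copies the collection into a string array that holds each match and an integer array that records the position of each match.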

    [Visual Basic]
        Dim mc As MatchCollection
        Dim results(20) As String
        Dim matchposition(20) As Integer
    
        ' Create a new Regex object and define the regular expression.
        Dim r As New Regex("abc")
        ' Use the Matches method to find all matches in the input string.
        mc = r.Matches("123abc4abcd")
        ' Loop through the match collection to retrieve all 
        ' matches and positions.
        Dim i As Integer
        For i = 0 To mc.Count - 1
            ' Add the match string to the string array.
            results(i) = mc(i).Value
            ' Record the character position where the match was found.
            matchposition(i) = mc(i).Index
        Next i
    [C#]
        MatchCollection mc;
        String[] results = new String[20];
        int[] matchposition = new int[20];
        
        // Create a new Regex object and define the regular expression.
        Regex r = new Regex("abc"); 
        // Use the Matches method to find all matches in the input string.
        mc = r.Matches("123abc4abcd");
        // Loop through the match collection to retrieve all 
        // matches and positions.
        for (int i = 0; i < mc.Count; i++) 
        {
            // Add the match string to the string array.   
            results[i] = mc[i].Value;
            // Record the character position where the match was found.
            matchposition[i] = mc[i].Index;   
        }
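    The fixed-size arrays above assume there are no more than 20 matches. As an alternative sketch, a MatchCollection can be enumerated with foreach and the results stored in a growable list. The names MatchCollectionSketch and Describe are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchCollectionSketch
{
    // Hypothetical helper: one "value at index" entry per match.
    public static List<string> Describe(string pattern, string input)
    {
        var lines = new List<string>();
        // The element type is stated explicitly because MatchCollection
        // also supports the non-generic enumeration interface.
        foreach (Match match in Regex.Matches(input, pattern))
        {
            lines.Add(match.Value + " at " + match.Index);
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in Describe("abc", "123abc4abcd"))
        {
            Console.WriteLine(line);
        }
        // Prints "abc at 3" and "abc at 7".
    }
}
```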

    GroupCollection

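    The GroupCollection class represents the collection of groups captured in a single match. The collection is immutable (read-only) and has no public constructor. Instances of GroupCollection are returned by the Match.Groups property.

    The following console application example finds and prints the number of groups captured by a regular expression. For an example of how to extract the individual captures in each member of a group collection, see the CaptureCollection example in the next section.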

    [Visual Basic]
        Imports System
        Imports System.Text.RegularExpressions
    
        Public Class RegexTest
            Public Shared Sub RunTest()
                ' Define groups "abc", "ab", and "b".
                Dim r As New Regex("(a(b))c") 
                Dim m As Match = r.Match("abdabc")
                Console.WriteLine("Number of groups found = " _
                & m.Groups.Count.ToString())
            End Sub    
        
            Public Shared Sub Main()
                RunTest()
            End Sub
        End Class
    [C#]
        using System;
        using System.Text.RegularExpressions;
    
        public class RegexTest 
        {
            public static void RunTest() 
            {
                // Define groups "abc", "ab", and "b".
                Regex r = new Regex("(a(b))c"); 
                Match m = r.Match("abdabc");
                Console.WriteLine("Number of groups found = " + m.Groups.Count);
            }
            public static void Main() 
            {
                RunTest();
            }
        }

    This example produces the following output.

    [Visual Basic]
        Number of groups found = 3
    [C#]
        Number of groups found = 3
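    The count of 3 includes group 0, which always holds the entire match. As a sketch, the following lists the value of each group for the same pattern and input. The names GroupCollectionSketch and GroupValues are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupCollectionSketch
{
    // Hypothetical helper: the value of every group in the first match.
    public static string[] GroupValues(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        var values = new string[m.Groups.Count];
        for (int i = 0; i < m.Groups.Count; i++)
        {
            values[i] = m.Groups[i].Value;
        }
        return values;
    }

    public static void Main()
    {
        string[] values = GroupValues("(a(b))c", "abdabc");
        for (int i = 0; i < values.Length; i++)
        {
            Console.WriteLine("Group " + i + " = " + values[i]);
        }
        // Prints: Group 0 = abc, Group 1 = ab, Group 2 = b
    }
}
```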

    CaptureCollection

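    The CaptureCollection class represents a sequence of captured substrings and returns the set of captures made by a single capturing group. Because of quantifiers, a capturing group can capture more than one string in a single match. The Captures property, an object of the CaptureCollection class, is provided as a member of the Match and Group classes to give access to the set of captured substrings.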

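    For example, if the regular expression ((a(b))c)+ (where the + quantifier specifies one or more matches) is used to capture matches from the string "abcabcabc", the CaptureCollection of each capturing group contains three members. Group 0, which represents the whole match, contains a single capture.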
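    A short sketch that checks this claim by counting the captures in every group of the match. The names CaptureCountSketch and CaptureCounts are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureCountSketch
{
    // Hypothetical helper: number of captures per group, indexed by group number.
    public static int[] CaptureCounts(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        var counts = new int[m.Groups.Count];
        for (int i = 0; i < m.Groups.Count; i++)
        {
            counts[i] = m.Groups[i].Captures.Count;
        }
        return counts;
    }

    public static void Main()
    {
        int[] counts = CaptureCounts("((a(b))c)+", "abcabcabc");
        for (int i = 0; i < counts.Length; i++)
        {
            Console.WriteLine("Group " + i + " captures = " + counts[i]);
        }
        // Group 0 has 1 capture; groups 1, 2, and 3 have 3 captures each.
    }
}
```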

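    The following console application example uses the regular expression (Abc)+ to find one or more matches in the string "XYZAbcAbcAbcXYZAbcAb". The example illustrates the use of the Captures property to return multiple groups of captured substrings.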

    [Visual Basic]
        Imports System
        Imports System.Text.RegularExpressions
    
        Public Class RegexTest
            Public Shared Sub RunTest()
                Dim counter As Integer
                Dim m As Match
                Dim cc As CaptureCollection
                Dim gc As GroupCollection
                ' Look for groupings of "Abc".
                Dim r As New Regex("(Abc)+") 
                ' Define the string to search.
                m = r.Match("XYZAbcAbcAbcXYZAbcAb")
                gc = m.Groups
                
                ' Print the number of groups.
                Console.WriteLine("Captured groups = " & gc.Count.ToString())
                
                ' Loop through each group.
                Dim i, ii As Integer
                For i = 0 To gc.Count - 1
                    cc = gc(i).Captures
                    counter = cc.Count
                    
                    ' Print number of captures in this group.
                    Console.WriteLine("Captures count = " & counter.ToString())
                    
                    ' Loop through each capture in group.            
                    For ii = 0 To counter - 1
                        ' Print capture and position.
                        Console.WriteLine(cc(ii).ToString() _
                            & "   Starts at character " & cc(ii).Index.ToString())
                    Next ii
                Next i
            End Sub
        
            Public Shared Sub Main()
                RunTest()
            End Sub
        End Class
    [C#]
        using System;
        using System.Text.RegularExpressions;
    
        public class RegexTest 
        {
            public static void RunTest() 
            {
                int counter;
                Match m;
                CaptureCollection cc;
                GroupCollection gc;
    
                // Look for groupings of "Abc".
                Regex r = new Regex("(Abc)+"); 
                // Define the string to search.
                m = r.Match("XYZAbcAbcAbcXYZAbcAb"); 
                gc = m.Groups;
    
                // Print the number of groups.
                Console.WriteLine("Captured groups = " + gc.Count.ToString());
    
                // Loop through each group.
                for (int i=0; i < gc.Count; i++) 
                {
                    cc = gc[i].Captures;
                    counter = cc.Count;
                    
                    // Print number of captures in this group.
                    Console.WriteLine("Captures count = " + counter.ToString());
                    
                    // Loop through each capture in group.
                    for (int ii = 0; ii < counter; ii++) 
                    {
                        // Print capture and position.
                        Console.WriteLine(cc[ii] + "   Starts at character " + 
                            cc[ii].Index);
                    }
                }
            }
    
            public static void Main() {
                RunTest();
            }
        }

    This example produces the following output.

    [Visual Basic]
        Captured groups = 2
        Captures count = 1
        AbcAbcAbc   Starts at character 3
        Captures count = 3
        Abc   Starts at character 3
        Abc   Starts at character 6
        Abc   Starts at character 9
    [C#]
        Captured groups = 2
        Captures count = 1
        AbcAbcAbc   Starts at character 3
        Captures count = 3
        Abc   Starts at character 3
        Abc   Starts at character 6
        Abc   Starts at character 9

    Group

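    The Group class represents the results from a single capturing group. Because a Group can capture zero, one, or more strings in a single match (using quantifiers), it contains a collection of Capture objects. Because Group inherits from Capture, the last captured substring can be accessed directly (the Group instance itself is equivalent to the last item of the collection returned by the Captures property).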
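    As a sketch of the relationship between a Group and its last capture, the following compares the two for a quantified group. The pattern (\d)+ and the input "a123" are illustrative choices, and the names GroupLastCaptureSketch and GroupEqualsLastCapture are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupLastCaptureSketch
{
    // Hypothetical helper: true when group 1 reports the same value and
    // position as the last item in its Captures collection.
    public static bool GroupEqualsLastCapture(string pattern, string input)
    {
        Group g = Regex.Match(input, pattern).Groups[1];
        if (!g.Success)
        {
            return false;
        }
        Capture last = g.Captures[g.Captures.Count - 1];
        return g.Value == last.Value && g.Index == last.Index;
    }

    public static void Main()
    {
        Group g = Regex.Match("a123", @"(\d)+").Groups[1];
        var captured = new List<string>();
        foreach (Capture c in g.Captures)
        {
            captured.Add(c.Value);
        }
        Console.WriteLine("Captures = " + string.Join(", ", captured.ToArray())); // 1, 2, 3
        Console.WriteLine("Group value = " + g.Value);                            // 3
        Console.WriteLine(GroupEqualsLastCapture(@"(\d)+", "a123"));              // True
    }
}
```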

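    Instances of Group are returned by the Match.Groups(groupnum) property or, when the "(?<groupname>)" grouping construct is used, by Match.Groups("groupname").

    The following code example uses nested grouping constructs to capture substrings into groups.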

    [Visual Basic]
        Dim matchposition(20) As Integer
        Dim results(20) As String
        ' Define substrings abc, ab, b.
        Dim r As New Regex("(a(b))c") 
        Dim m As Match = r.Match("abdabc")
        Dim i As Integer = 0
        While Not (m.Groups(i).Value = "")    
            ' Copy groups to string array.
            results(i) = m.Groups(i).Value
            ' Record character position.
            matchposition(i) = m.Groups(i).Index
            i = i + 1
        End While
    [C#]
        int[] matchposition = new int[20];
        String[] results = new String[20];
        // Define substrings abc, ab, b.
        Regex r = new Regex("(a(b))c"); 
        Match m = r.Match("abdabc");
        for (int i = 0; m.Groups[i].Value != ""; i++) 
        {
            // Copy groups to string array.
            results[i]=m.Groups[i].Value; 
            // Record character position.
            matchposition[i] = m.Groups[i].Index; 
        }

    After this code runs, the arrays hold the following values.

    [Visual Basic]
        results(0) = "abc"   matchposition(0) = 3
        results(1) = "ab"    matchposition(1) = 3
        results(2) = "b"     matchposition(2) = 4
    [C#]
        results[0] = "abc"   matchposition[0] = 3
        results[1] = "ab"    matchposition[1] = 3
        results[2] = "b"     matchposition[2] = 4

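    The following code example uses named grouping constructs to capture substrings from a string that contains data in "DATANAME:VALUE" format. The regular expression splits the data at the colon (:).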

    [Visual Basic]
        Dim r As New Regex("^(?<name>\w+):(?<value>\w+)")
        Dim m As Match = r.Match("Section1:119900")
    [C#]
        Regex r = new Regex("^(?<name>\\w+):(?<value>\\w+)");
        Match m = r.Match("Section1:119900");

    This regular expression produces the following group values.

    [Visual Basic]
        m.Groups("name").Value = "Section1"
        m.Groups("value").Value = "119900"
    [C#]
        m.Groups["name"].Value = "Section1"
        m.Groups["value"].Value = "119900"
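    A sketch of how the named groups might be read in practice, checking Match.Success before using the values. The names NamedGroupSketch and TryParsePair are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupSketch
{
    // Hypothetical helper: splits "DATANAME:VALUE" into its two parts.
    // Returns false when the line does not have that shape.
    public static bool TryParsePair(string line, out string name, out string value)
    {
        Match m = Regex.Match(line, @"^(?<name>\w+):(?<value>\w+)");
        name = m.Success ? m.Groups["name"].Value : null;
        value = m.Success ? m.Groups["value"].Value : null;
        return m.Success;
    }

    public static void Main()
    {
        string name, value;
        if (TryParsePair("Section1:119900", out name, out value))
        {
            Console.WriteLine(name + " -> " + value); // Section1 -> 119900
        }
    }
}
```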

    Capture

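    The Capture class contains the results from a single subexpression capture.

    The following example loops through a Group collection, extracts the Capture collection from each member of Group, and assigns the variables posn and length to the character position in the original string where each string was found and to the length of each string, respectively.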

    [Visual Basic]
        Dim r As Regex
        Dim m As Match
        Dim cc As CaptureCollection
        Dim posn, length As Integer
    
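        ' Use + rather than *: (abc)* would match the empty string at position 0,
        ' so m.Groups(0).Value would be "" and the loop below would never run.
        r = New Regex("(abc)+")
        m = r.Match("bcabcabc")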
        Dim i, j As Integer
        i = 0
        While m.Groups(i).Value <> ""
            ' Grab the Collection for Group(i).
            cc = m.Groups(i).Captures
            For j = 0 To cc.Count - 1
                ' Position of Capture object.
                posn = cc(j).Index
                ' Length of Capture object.
                length = cc(j).Length
            Next j
            i += 1
        End While
    [C#]
        Regex r;
        Match m;
        CaptureCollection cc;
        int posn, length;
    
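        // Use + rather than *: (abc)* would match the empty string at position 0,
        // so m.Groups[0].Value would be "" and the loop below would never run.
        r = new Regex("(abc)+");
        m = r.Match("bcabcabc");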
        for (int i=0; m.Groups[i].Value != ""; i++) 
        {
            // Capture the Collection for Group(i).
            cc = m.Groups[i].Captures; 
            for (int j = 0; j < cc.Count; j++) 
            {
                // Position of Capture object.
                posn = cc[j].Index; 
                // Length of Capture object.
                length = cc[j].Length; 
            }
        }
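    The following sketch makes the position and length of each capture visible by collecting them as text. Note that against "bcabcabc" the pattern (abc)* first matches the empty string at position 0, so this sketch uses (abc)+ and loops up to Groups.Count instead of testing for an empty value. The names CaptureSketch and DescribeCaptures are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureSketch
{
    // Hypothetical helper: one line per capture with its index and length.
    public static List<string> DescribeCaptures(string pattern, string input)
    {
        var lines = new List<string>();
        Match m = Regex.Match(input, pattern);
        for (int i = 0; i < m.Groups.Count; i++)
        {
            CaptureCollection cc = m.Groups[i].Captures;
            for (int j = 0; j < cc.Count; j++)
            {
                lines.Add("Group " + i + ", capture " + j +
                    ": index " + cc[j].Index + ", length " + cc[j].Length);
            }
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in DescribeCaptures("(abc)+", "bcabcabc"))
        {
            Console.WriteLine(line);
        }
        // Group 0, capture 0: index 2, length 6
        // Group 1, capture 0: index 2, length 3
        // Group 1, capture 1: index 5, length 3
    }
}
```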

    See Also

    .NET Framework Regular Expressions | System.Text.RegularExpressions




    Related
    Understanding regular expressions in C#
    http://www.knowsky.com/4202.html

  • Original article: https://www.cnblogs.com/runfeng/p/215734.html