Understanding the Match and Group Classes in C# Regular Expressions

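A regular expression can be seen as a small, special-purpose programming language for locating substrings in a piece of text. With regular expressions you can quickly scan large amounts of text for a specific character pattern; extract, edit, replace, or delete substrings; or add the extracted strings to a collection. For the basic syntax, see the articles 深入浅出之正则表达式（一） and 深入浅出之正则表达式（二） (Regular Expressions Made Simple, Parts 1 and 2).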

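The System.Text.RegularExpressions namespace in C# provides the classes that support regular expression operations. The main ones are Regex, MatchCollection, Match, GroupCollection, Group, CaptureCollection, and Capture. The figure below shows how these classes relate to one another.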
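The relationship between these classes can also be walked through in code. The following is a minimal sketch, not taken from the original article: the input string, the pattern, and the class name RegexHierarchyDemo are assumptions chosen so that one group captures more than once. It goes from Regex to MatchCollection, then to each Match, its GroupCollection, each Group, and finally each Capture in the group's CaptureCollection.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the name and the sample data are illustrative only.
public static class RegexHierarchyDemo
{
    public static void Main()
    {
        string text = "abc de";          // assumed sample input
        string pattern = @"([a-z])+";    // group 1 is repeated, so it captures several times

        // Regex.Matches returns a MatchCollection, one Match per non-overlapping hit.
        MatchCollection matches = Regex.Matches(text, pattern);

        foreach (Match match in matches)
        {
            Console.WriteLine($"Match '{match.Value}' at index {match.Index}");

            // Each Match exposes a GroupCollection; index 0 is the whole match.
            GroupCollection groups = match.Groups;
            for (int i = 0; i < groups.Count; i++)
            {
                Group group = groups[i];
                Console.WriteLine($"  Group {i}: '{group.Value}'");

                // Each Group keeps every capture it made in a CaptureCollection.
                foreach (Capture capture in group.Captures)
                {
                    Console.WriteLine($"    Capture '{capture.Value}' at index {capture.Index}");
                }
            }
        }
    }
}
/*
 * Output:
 * Match 'abc' at index 0
 *   Group 0: 'abc'
 *     Capture 'abc' at index 0
 *   Group 1: 'c'
 *     Capture 'a' at index 0
 *     Capture 'b' at index 1
 *     Capture 'c' at index 2
 * Match 'de' at index 4
 *   Group 0: 'de'
 *     Capture 'de' at index 4
 *   Group 1: 'e'
 *     Capture 'd' at index 4
 *     Capture 'e' at index 5
 */
```

Note that Group.Value returns only the last capture of a repeated group, while Group.Captures keeps all of them. In the .NET class library Match derives from Group and Group derives from Capture, which is why all three expose Value and Index.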

The following example finds all the URLs in a piece of text.

    using System;
    using System.Text.RegularExpressions;

    public class Program
    {
        public static void Main(string[] args)
        {
            string text = "I've found this amazing URL at http://www.sohu.com ,and then find ftp://ftp.sohu.com is better.";
            string pattern = @"\b(\S+)://(\S+)\b";  // pattern that matches a URL

            MatchCollection mc = Regex.Matches(text, pattern); // all matches of the pattern in the text
            Console.WriteLine("URLs found in the text:");
            foreach (Match match in mc)
            {
                Console.WriteLine(match.ToString());
            }
            Console.Read(); // keep the console window open
        }
    }
    /*
     * Output:
     * URLs found in the text:
     * http://www.sohu.com
     * ftp://ftp.sohu.com
     */

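Now suppose the requirement changes: besides finding each URL, we also need its protocol and its domain address. This is where the grouping feature of regular expressions comes in. Grouping means enclosing parts of the pattern in parentheses so that the pattern is split into separate groups. For example, the pattern in the example above can be changed to: string pattern = @"\b(?<protocol>\S+)://(?<address>\S+)\b"; The parentheses split the pattern into two groups (three, strictly speaking, because the whole match counts as a group of its own, group 0). "?<protocol>" and "?<address>" give the groups the names protocol and address. Naming is not required; it just makes it easier to retrieve the group we want. The example code is as follows: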

    using System;
    using System.Text.RegularExpressions;

    public class Program
    {
        public static void Main(string[] args)
        {
            string text = "I've found this amazing URL at http://www.sohu.com ,and then find ftp://ftp.sohu.com is better.";
            string pattern = @"\b(?<protocol>\S+)://(?<address>\S+)\b"; // pattern that matches a URL, split into named groups

            MatchCollection mc = Regex.Matches(text, pattern); // all matches of the pattern in the text
            Console.WriteLine("URLs found in the text:");
            foreach (Match match in mc)
            {
                GroupCollection gc = match.Groups; // the groups of this match, accessible by name
                string outputText = "URL:" + match.Value + "; Protocol:" + gc["protocol"].Value + "; Address:" + gc["address"].Value;
                Console.WriteLine(outputText);
            }
            Console.Read(); // keep the console window open
        }
    }
    /*
     * Output:
     * URLs found in the text:
     * URL:http://www.sohu.com; Protocol:http; Address:www.sohu.com
     * URL:ftp://ftp.sohu.com; Protocol:ftp; Address:ftp.sohu.com
     */
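The remark that the whole match counts as a group of its own can be checked directly. The following is a small sketch under stated assumptions: the class name GroupIndexDemo and the shortened input string are illustrative, not from the original article. It uses the same named-group pattern and lists every group by name and number through Regex.GetGroupNames and Regex.GroupNumberFromName.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the name and the shortened input are illustrative only.
public static class GroupIndexDemo
{
    public static void Main()
    {
        string text = "then find ftp://ftp.sohu.com is better.";
        Regex regex = new Regex(@"\b(?<protocol>\S+)://(?<address>\S+)\b");

        Match match = regex.Match(text); // first match only
        if (!match.Success)
        {
            Console.WriteLine("No match");
            return;
        }

        // Two named groups plus group 0, which is the whole match.
        Console.WriteLine("Group count: " + match.Groups.Count);
        Console.WriteLine("Group 0 equals match: " + (match.Groups[0].Value == match.Value));

        // List every group with its name, its number, and the text it captured.
        foreach (string name in regex.GetGroupNames())
        {
            int number = regex.GroupNumberFromName(name);
            Console.WriteLine(name + " (#" + number + "): " + match.Groups[number].Value);
        }
    }
}
/*
 * Output:
 * Group count: 3
 * Group 0 equals match: True
 * 0 (#0): ftp://ftp.sohu.com
 * protocol (#1): ftp
 * address (#2): ftp.sohu.com
 */
```

A named group can therefore be read either by name (match.Groups["protocol"]) or by number (match.Groups[1]). The name is usually the safer choice, since group numbers shift when the pattern is edited.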
