Java's Support for Regular Expressions (Part 1)

Java's regular expression support is exposed mainly through the String, Pattern, Matcher, and Scanner classes.

1. Pattern and Matcher

Let's start with an example that uses the Pattern and Matcher classes with regular expressions.

import java.util.regex.Matcher;

import java.util.regex.Pattern;

public class PatternTest {

    public static void main(String [ ] args) {

        String testString = "abcabcabcdefabc";

        String [] regexs = new String []{"abc+","(abc)+","(abc){2,}"};

        for(String regex:regexs){

            Pattern p = Pattern.compile(regex);

            Matcher m = p.matcher(testString);

            System.out.println("test regex: " + regex);

            while(m.find()){

                System.out.println("match " + m.group() + " at position " + m.start() + "-" + (m.end() -1));

            }

        }

    }

}

Running it produces:

test regex: abc+

match abc at position 0-2

match abc at position 3-5

match abc at position 6-8

match abc at position 12-14

test regex: (abc)+

match abcabcabc at position 0-8

match abc at position 12-14

test regex: (abc){2,}

match abcabcabc at position 0-8

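First, here is what each regular expression means:

abc+: matches abc, abcc, abccc, and so on (ab followed by one or more c).

(abc)+: matches one or more consecutive occurrences of abc. Because the quantifier is greedy, it takes the longest possible run.

(abc){2,}: abc repeated at least twice, so it matches abcabc, abcabcabc, and so on.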
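To make the greedy behaviour of (abc)+ concrete, here is a minimal sketch that compares it with the reluctant form (abc)+? on the same test string. The class name QuantifierDemo and the helper findAll are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {

    // Hypothetical helper: collects every match of regex in input, in order of appearance.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "abcabcabcdefabc";
        // Greedy: takes the longest run of abc it can.
        System.out.println(findAll("(abc)+", input));  // [abcabcabc, abc]
        // Reluctant: stops as soon as one abc has matched.
        System.out.println(findAll("(abc)+?", input)); // [abc, abc, abc, abc]
    }
}
```

The only difference between the two patterns is the trailing ?, which switches the quantifier from greedy to reluctant.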

To test whether an entire string matches a regular expression, use the following:

String testString = "abcabcabcdefabc";

System.out.println(Pattern.matches("abc+", testString));

System.out.println(Pattern.matches("abc+", "abccc"));

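The output is false and true. Pattern.matches succeeds only when the entire input matches the pattern, so the first call returns false even though testString contains abc.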
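Since whole-string matching is easy to confuse with searching, the following sketch puts the three Matcher operations side by side: matches (entire input), lookingAt (prefix of the input), and find (anywhere in the input). The class and method names are invented for this example.

```java
import java.util.regex.Pattern;

public class MatchModes {

    // True only if the entire input matches the regex.
    static boolean whole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if the regex matches starting at the beginning of the input.
    static boolean prefix(String regex, String input) {
        return Pattern.compile(regex).matcher(input).lookingAt();
    }

    // True if the regex matches anywhere in the input.
    static boolean anywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String testString = "abcabcabcdefabc";
        System.out.println(whole("abc+", testString));    // false
        System.out.println(prefix("abc+", testString));   // true
        System.out.println(anywhere("abc+", testString)); // true
        System.out.println(anywhere("abc+", "xxdefabc")); // true
        System.out.println(prefix("abc+", "xxdefabc"));   // false
    }
}
```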

To inspect sub-matches (capturing groups), use the group method:

import java.util.regex.Matcher;

import java.util.regex.Pattern;

public class PatternTest2 {

    public static void main(String [ ] args) {

        String poem = "'Twas brillig, and the slithy toves\n" +

            "Did gyre and gimble in the wabe.\n" +

            "All mimsy were the borogoves,\n" +

            "And the mome raths outgrabe.";

         Pattern p = Pattern.compile("(?m)(\\S+)\\s(\\S+\\s\\S+)$");

         Matcher m = p.matcher(poem);

         while(m.find()){

             for(int i=0;i<= m.groupCount();i++){

                 System.out.print("[" + m.group(i) + "]");

             }

             System.out.println("");

         }

    }

}

The output is:

[the slithy toves][the][slithy toves]

[in the wabe.][in][the wabe.]

[were the borogoves,][were][the borogoves,]

[mome raths outgrabe.][mome][raths outgrabe.]

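A few notes:

(?m) turns on multiline mode. Without it, "$" matches only at the end of the input; with (?m), "$" matches at the end of every line.

(\\S+)\\s(\\S+\\s\\S+)$ matches the last three words of each line. Note that it also contains two capturing groups (sub-matches); the code retrieves them with m.group(i), where group 0 is the whole match.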
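The effect of multiline mode on "$" is easiest to see by running the same pattern with and without the flag. This is a small sketch on a two-line input. MultilineDemo and lastThreeWords are hypothetical names, and the flag is passed as Pattern.MULTILINE rather than the embedded (?m), which has the same effect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultilineDemo {

    // Hypothetical helper: returns "group1 | group2" for every match.
    static List<String> lastThreeWords(String text, boolean multiline) {
        int flags = multiline ? Pattern.MULTILINE : 0;
        Matcher m = Pattern.compile("(\\S+)\\s(\\S+\\s\\S+)$", flags).matcher(text);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            // group(0) would be the whole match; groups 1 and 2 are the capturing groups.
            result.add(m.group(1) + " | " + m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "one two three\nfour five six";
        // Without MULTILINE, $ matches only at the end of the input.
        System.out.println(lastThreeWords(text, false)); // [four | five six]
        // With MULTILINE, $ also matches at the end of each line.
        System.out.println(lastThreeWords(text, true));  // [one | two three, four | five six]
    }
}
```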

To ignore case and enable multiline mode at the same time, combine the flags as in the following code:

import java.util.regex.Matcher;

import java.util.regex.Pattern;

public class PatternTest3 {

    public static void main(String [ ] args) {

        String testString = "java has regex\n" +

                    "JAVA has regex\n" +

                    "Java has regex";

        Pattern p = Pattern.compile("^java",Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

        Matcher m = p.matcher(testString);

        while(m.find()){

            System.out.println(m.group());

        }

    }

}

The output is:

java

JAVA

Java

Pattern.CASE_INSENSITIVE (?i) -- case-insensitive matching

Pattern.MULTILINE (?m) -- multiline mode

Pattern.COMMENTS (?x) -- permits whitespace and comments in the pattern
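Each flag constant has an embedded form that can be written inside the pattern itself. As a rough illustration, the sketch below builds the same "^java" pattern three ways and counts the matches; FlagDemo and countMatches are names invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FlagDemo {

    // Hypothetical helper: counts how many times the pattern is found in the input.
    static int countMatches(Pattern p, String input) {
        Matcher m = p.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "java has regex\nJAVA has regex\nJava has regex";

        // Flags passed as constants, combined with bitwise OR.
        Pattern byConstants = Pattern.compile("^java",
                Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

        // The same flags embedded at the start of the pattern.
        Pattern byEmbedded = Pattern.compile("(?im)^java");

        // COMMENTS mode: whitespace in the pattern is ignored and # starts a comment.
        Pattern withComments = Pattern.compile("^ java   # the word java at the start of a line",
                Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.COMMENTS);

        System.out.println(countMatches(byConstants, input));  // 3
        System.out.println(countMatches(byEmbedded, input));   // 3
        System.out.println(countMatches(withComments, input)); // 3
    }
}
```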

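To split a string into an array around the matches of a pattern, use the split method. String's split method also accepts a regular expression. For example: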

import java.util.regex.Pattern;

public class RegexSplit {

    public static void main(String [ ] args) {

        String testString = "This!!unusual use!!of exclamation!!points";

        Pattern p = Pattern.compile("!!");

        String [] sts = p.split(testString);

        for(String st:sts){

            System.out.print(st +"|");

        }

        System.out.println();

        sts = p.split(testString,3);

        for(String st:sts){

            System.out.print(st+"|");

        }

    }

}
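The listing above does not show its output, so here is a short sketch of what the two forms of split return, using String.split (which takes the same regex and limit arguments as Pattern.split). SplitDemo and splitLimited are hypothetical names for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitDemo {

    // Hypothetical helper: splits input around regex, producing at most limit pieces.
    static String[] splitLimited(String input, String regex, int limit) {
        return Pattern.compile(regex).split(input, limit);
    }

    public static void main(String[] args) {
        String s = "This!!unusual use!!of exclamation!!points";

        // No limit: split at every "!!".
        System.out.println(Arrays.toString(s.split("!!")));
        // [This, unusual use, of exclamation, points]

        // Limit 3: at most three pieces; the last piece keeps the remaining "!!".
        System.out.println(Arrays.toString(splitLimited(s, "!!", 3)));
        // [This, unusual use, of exclamation!!points]

        // Trailing empty strings are dropped unless the limit is negative.
        System.out.println("a!!b!!".split("!!").length);     // 2
        System.out.println("a!!b!!".split("!!", -1).length); // 3
    }
}
```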

For replacement, regular expressions offer replaceFirst and replaceAll. More complex replacements need appendReplacement, as follows:

import java.util.HashMap;

import java.util.Map;

import java.util.regex.Matcher;

import java.util.regex.Pattern;

public class RegexExam {

    public static void main(String args[]) {

        String template = "尊敬的客户${customerName}你好！本次消费金额${amount}，"

                + "您帐户${accountNumber}上的余额为${balance}，欢迎下次光临！";

        HashMap<String, String> data = new HashMap<String, String>();

        data.put("customerName", "刘明");

        data.put("accountNumber", "888888888");

        data.put("balance", "$1000000.00");

        data.put("amount", "$1000.00");

        try {

            System.out.println(composeMessage(template, data));

        }

        catch (Exception e) {

            e.printStackTrace();

        }

    }

    public static String composeMessage(String template, Map<String, String> data)

            throws Exception {

        // Use the reluctant quantifier .+? here; the greedy .+ would run from the first ${ to the last } and give the wrong result

        String regex = "\\$\\{(.+?)\\}";

        Pattern pattern = Pattern.compile(regex);

        Matcher matcher = pattern.matcher(template);

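        /*

         * sb accumulates the result: each appendReplacement call appends the text

         * between the previous match and this one, followed by the replacement.

         */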

        StringBuffer sb = new StringBuffer();

        while (matcher.find()) {

            String name = matcher.group(1); // key name

            String value = data.get(name); // value for that key, or null if absent

            if (value == null) {

                value = "";

            }

            else {

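                /*

                 * $ and \ are special characters in a replacement string; a literal $ must be escaped as \$.

                 * To turn $ into \$, the replacement argument is written "\\\\\\$" in Java source: each \\ in a string literal is one \, giving \\\$, which the replacement syntax reads as a literal \ followed by a literal $.

                 * The escaped value is used as the replacement string below (it is not a regular expression).

                 */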

                value = value.replaceAll("\\$", "\\\\\\$");

                //System.out.println("value=" + value);

            }

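            /*

             * After the escaping above, a value such as "$1000.00" has become "\$1000.00".

             */

            matcher.appendReplacement(sb, value);

            // Debug output, disabled so the program prints only the final message:

            //System.out.println("sb = " + sb.toString());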

        }

        // Finally, append the remaining tail after the last match; here the tail is "，欢迎下次光临！"

        matcher.appendTail(sb);

        return sb.toString();

    }

}

Running it produces:

尊敬的客户刘明你好！本次消费金额$1000.00，您帐户888888888上的余额为$1000000.00，欢迎下次光临！
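The manual escaping of $ in the listing above handles dollar signs but not backslashes. As an alternative sketch (not the approach the listing takes), the standard Matcher.quoteReplacement method escapes both. The class QuoteReplacementDemo, the helper fill, and the sample template and data are all invented for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteReplacementDemo {

    // Hypothetical helper: fills ${name} placeholders from data; unknown names become "".
    static String fill(String template, Map<String, String> data) {
        Matcher matcher = Pattern.compile("\\$\\{(.+?)\\}").matcher(template);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            String value = data.getOrDefault(matcher.group(1), "");
            // quoteReplacement escapes both $ and \ so the value is inserted literally.
            matcher.appendReplacement(sb, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("amount", "$1000.00");
        data.put("path", "C:\\temp");
        System.out.println(fill("Amount: ${amount}, path: ${path}, missing: [${none}]", data));
        // Amount: $1000.00, path: C:\temp, missing: []
    }
}
```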

Use reset to apply an existing Matcher object to a new string:

import java.util.regex.Matcher;

import java.util.regex.Pattern;

public class RegexReset {

    public static void main(String [ ] args) {

        String str = "fix the rug with bags";

        Pattern pattern = Pattern.compile("[frb][aiu][gx]");

        Matcher matcher = pattern.matcher(str);

        while(matcher.find()){

            System.out.print(matcher.group() + " ");

        }

        System.out.println();

        matcher.reset("fix the rig with rags");

        while(matcher.find()){

            System.out.print(matcher.group() + " ");

        }

    }

}

The output is:

fix rug bag

fix rig rag
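Besides reset(CharSequence), Matcher also has a no-argument reset() that rewinds the matcher to the start of its current input. The sketch below shows both forms; ResetDemo and findAll are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResetDemo {

    // Hypothetical helper: collects all remaining matches from the matcher's current position.
    static List<String> findAll(Matcher m) {
        List<String> matches = new ArrayList<>();
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        Matcher matcher = Pattern.compile("[frb][aiu][gx]").matcher("fix the rug with bags");

        matcher.find(); // consumes "fix"
        System.out.println(findAll(matcher)); // [rug, bag]

        // reset() with no argument rewinds to the start of the same input.
        matcher.reset();
        System.out.println(findAll(matcher)); // [fix, rug, bag]

        // reset(CharSequence) switches to a new input and starts from its beginning.
        matcher.reset("fix the rig with rags");
        System.out.println(findAll(matcher)); // [fix, rig, rag]
    }
}
```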
