How can I check whether a string can be fully exhausted by matches of a regular expression?

Time: 2022-09-13 16:23:50

So the problem is to determine if every character in a string would be included in a match of a particular regex. Or, to state it differently, if the set of all of the character positions that could be included in some match of a particular regex includes all the character positions in the string.

My thought is to do something like this:

boolean matchesAll(String myString, Matcher myMatcher) {
    // matched[i] becomes true once character i falls inside some match
    boolean[] matched = new boolean[myString.length()];
    for (myMatcher.reset(myString); myMatcher.find();)
        for (int idx = myMatcher.start(); idx < myMatcher.end(); idx++)
            matched[idx] = true;

    boolean allMatched = true;
    for (boolean charMatched : matched)
        allMatched &= charMatched;

    return allMatched;
}

Is there a better way to do this, however?

Also, as I was writing this, it occurred to me that this would not do what I want in cases like

matchesAll("abcabcabc", Pattern.compile("(abc){2}").matcher("")); // returns false

because Matcher only tries to match starting at the end of the last match. I want it to return true, because if you start the matcher at position 3, it could include the third abc in a match.
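
To make the difference concrete, here is a small sketch comparing repeated find() calls with find(int) on the example above. The class and method names (FindDemo, sequentialSpans, spanFrom) are made up for illustration and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindDemo {
    // Spans reported by repeated find() calls; each call resumes after the previous match.
    static List<String> sequentialSpans(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> spans = new ArrayList<>();
        while (m.find()) {
            spans.add(m.start() + "-" + m.end());
        }
        return spans;
    }

    // Span of the first match found at or after 'from', or null if there is none.
    static String spanFrom(String input, String regex, int from) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(from) ? m.start() + "-" + m.end() : null;
    }

    public static void main(String[] args) {
        System.out.println(sequentialSpans("abcabcabc", "(abc){2}")); // [0-6]
        System.out.println(spanFrom("abcabcabc", "(abc){2}", 3));     // 3-9
    }
}
```

The sequential scan never reports the span 3-9, because after the match at 0-6 only one "abc" is left. Starting the search at index 3 does find it.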

boolean matchesAll(String myString, Matcher myMatcher) {

    boolean[] matched = new boolean[myString.length()];

    // Bind the matcher to this input; find(int) then restarts the search at idx
    myMatcher.reset(myString);
    for (int idx = 0; idx < myString.length() && myMatcher.find(idx);
            idx = myMatcher.start() + 1) {

        for (int idx2 = myMatcher.start(); idx2 < myMatcher.end(); idx2++)
            matched[idx2] = true;
    }

    boolean allMatched = true;
    for (boolean charMatched : matched)
        allMatched &= charMatched;

    return allMatched;
}

Is there any way to make this code better, faster, or more readable?

2 solutions

#1


This works:

private static boolean fullyCovered(final String input,
    final Pattern pattern)
{
    // If the string is empty, check that it is matched by the pattern
    if (input.isEmpty())
        return pattern.matcher(input).find();

    final int len = input.length();
    // All initialized to false by default
    final boolean[] covered = new boolean[len];

    final Matcher matcher = pattern.matcher(input);

    for (int index = 0; index < len; index++) {
        // Try to match at this index. find(int) may also report a match
        // that starts later; such a match does not cover this index.
        if (matcher.find(index) && matcher.start() == index)
            setCovered(covered, index, matcher.end());
        // If neither this match nor an earlier one covers the character,
        // it's a failure
        if (!covered[index])
            return false;
    }

    // We have finished parsing the string: it is fully covered.
    return true;
}

private static void setCovered(final boolean[] covered,
    final int beginIndex, final int endIndex)
{
    for (int i = beginIndex; i < endIndex; i++)
        covered[i] = true;
}

It will probably not be any faster to execute, but I surmise it is easier to read ;) Also, .find(int) resets the matcher, so this is safe.
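
As a rough, self-contained way to try the idea out, the sketch below is a compact variant of the same scan that uses a BitSet instead of a boolean array. It is not the code above: the class name CoverageCheck is made up, and the variant assumes that a character counts as covered only if the match the engine picks at some start position reaches it.

```java
import java.util.BitSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CoverageCheck {
    // Returns true if every character lies inside a match that starts at some index.
    static boolean fullyCovered(String input, Pattern pattern) {
        Matcher matcher = pattern.matcher(input);
        if (input.isEmpty()) {
            return matcher.find();
        }
        BitSet covered = new BitSet(input.length());
        for (int index = 0; index < input.length(); index++) {
            // find(int) resets the matcher first, so earlier calls do not affect this one.
            if (matcher.find(index) && matcher.start() == index) {
                covered.set(index, matcher.end()); // marks [index, end)
            }
            if (!covered.get(index)) {
                return false; // no match starting at or before index reaches it
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Pattern twice = Pattern.compile("(abc){2}");
        System.out.println(fullyCovered("abcabcabc", twice)); // true
        System.out.println(fullyCovered("abcabcab", twice));  // false
        System.out.println(fullyCovered("xabc", Pattern.compile("abc"))); // false
    }
}
```

One caveat with this approach: only the match the regex engine prefers at each start position is considered, so a pattern such as "a|ab" may be reported as not covering "ab" even though a different alternative could.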

#2


I have 2 answers for you, although I am not sure I understand the question right.

  1. Call the Pattern.matcher(str2match).matches() method instead of find(). In one shot, a true return value tells you that the entire string is matched.
  2. Prepend "^" (beginning of string) to the regex and append "$" (end of string) before passing it to Pattern.compile(str).

The 2 solutions can go together, too. An example class follows - you can copy it into AllMatch.java, compile it with "javac AllMatch.java" and run it as "java AllMatch" (I assume you have "." in your CLASSPATH). Just pick the solution you find more elegant :) Happy New Year!

import java.util.regex.Pattern;

public class AllMatch {

    private final Pattern pattern;

    public AllMatch(String reStr) {
        // Anchor the expression to the start and end of the input; the
        // non-capturing group keeps the anchors applied to the whole expression
        pattern = Pattern.compile("^(?:" + reStr + ")$");
    }

    public boolean checkMatch(String s) {
        // matches() succeeds only if the whole string matches
        return pattern.matcher(s).matches();
    }

    public static void main(String[] args) {
        int n = args.length;
        String rexp2Match = (n > 0) ? args[0] : "(abc)+",
            testString = (n > 1) ? args[1] : "abcabcabc",
            matchMaker = new AllMatch(rexp2Match)
                    .checkMatch(testString) ? "" : "un";
        System.out.println("[AllMatch] match " + matchMaker +
                    "successful");
    }

}
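
For comparison, here is a short sketch of how matches() and find() behave on the pattern from the question. The class and method names (MatchesVsFind, wholeMatch, containsMatch) are hypothetical and only serve the illustration.

```java
import java.util.regex.Pattern;

public class MatchesVsFind {
    // True only if the entire input is one match of the regex.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if some substring of the input matches the regex.
    static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("(abc){2}", "abcabc"));       // true
        System.out.println(wholeMatch("(abc){2}", "abcabcabc"));    // false
        System.out.println(containsMatch("(abc){2}", "abcabcabc")); // true
    }
}
```

Note that matches() asks whether the whole string is a single match. For "abcabcabc" and "(abc){2}" it returns false, even though every character can be covered by overlapping matches, which appears to be what the question is after.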
