Why doesn't this regex work as expected in Java?

Time: 2021-06-18 21:44:32

Trivial regex question (the answer is most probably Java-specific):

"#This is a comment in a file".matches("^#")

This returns false. As far as I can see, ^ means what it always means and # has no special meaning, so I'd translate ^# as "A '#' at the beginning of the string". Which should match. And so it does, in Perl:

perl -e "print '#This is a comment'=~/^#/;"

This prints "1", so I'm pretty sure the answer is something Java-specific. Would somebody please enlighten me?

Thank you.

3 answers

#1


17  

Matcher.matches() checks to see if the entire input string is matched by the regex.

Since your regex only matches the very first character, it returns false.

You'll want to use Matcher.find() instead.

Granted, it can be a bit tricky to find the concrete specification, but it's there:

  • String.matches() is defined as doing the same thing as Pattern.matches(regex, str).
  • Pattern.matches() in turn is defined as Pattern.compile(regex).matcher(input).matches(). Pattern.compile() returns a Pattern, and Pattern.matcher() returns a Matcher.
  • Matcher.matches() is documented like this (note the word "entire"):

    Attempts to match the entire region against the pattern.
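
To make the difference concrete, here is a minimal sketch comparing the three Matcher operations on the string from the question. The class and helper names (MatchesVsFind, matchesWhole, findsAnywhere, matchesPrefix) are made up for illustration; only the java.util.regex calls are real API.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // True only if the regex consumes the whole input (what String.matches does).
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if the regex matches somewhere in the input (comparable to Perl's =~).
    static boolean findsAnywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // True if the regex matches a prefix of the input, without needing to reach the end.
    static boolean matchesPrefix(String regex, String input) {
        return Pattern.compile(regex).matcher(input).lookingAt();
    }

    public static void main(String[] args) {
        String line = "#This is a comment in a file";
        System.out.println(line.matches("^#"));         // false: "^#" does not cover the whole string
        System.out.println(matchesWhole("^#", line));   // false: same call chain, spelled out
        System.out.println(findsAnywhere("^#", line));  // true: a '#' is found at the start
        System.out.println(matchesPrefix("#", line));   // true: lookingAt() anchors at the start only
    }
}
```

With find() the ^ anchor still matters, because find() would otherwise accept a '#' anywhere in the string. lookingAt() always starts at the beginning of the region, so the anchor is redundant there.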

#2


2  

The matches method matches your regex against the entire string.

So try adding a .* to match the rest of the string.

"#This is a comment in a file".matches("^#.*")

which returns true. You can even drop both anchors (start and end) from the regex, because the matches method behaves as if they were present. So in the above case we could also have used "#.*" as the regex.
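
As a rough check of that claim, the sketch below runs the anchored and unanchored variants side by side. It also shows one caveat not mentioned above: by default . does not match line terminators, so "#.*" fails on a multi-line string unless DOTALL mode is enabled with (?s). The class name FullMatchPatterns and the helper isComment are hypothetical.

```java
class FullMatchPatterns {
    // Hypothetical helper: a line counts as a comment if it starts with '#'.
    static boolean isComment(String line) {
        return line.matches("#.*");
    }

    public static void main(String[] args) {
        String line = "#This is a comment in a file";

        // All three forms give the same result because matches() tests the whole string.
        System.out.println(line.matches("^#.*"));   // true
        System.out.println(line.matches("^#.*$"));  // true
        System.out.println(line.matches("#.*"));    // true

        // Caveat: '.' stops at a line terminator unless DOTALL is enabled.
        String multiLine = "#first\nsecond";
        System.out.println(multiLine.matches("#.*"));      // false
        System.out.println(multiLine.matches("(?s)#.*"));  // true

        // If no real pattern is needed, startsWith avoids the regex entirely.
        System.out.println(line.startsWith("#"));   // true
    }
}
```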

#3


0  

This should meet your expectations:

"#This is a comment in a file".matches("^#.*$")

Now the input String matches the pattern "First char shall be #, the rest shall be any char"

Following Joachim's comment, the following is equivalent:

"#This is a comment in a file".matches("#.*")
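
If the goal is to pick comment lines out of a file, one possible approach is to compile the original "^#" pattern once and use it with find semantics, so no trailing .* is needed. This is only a sketch: the class name CommentFilter, the method commentLines, and the sample lines are invented for illustration. Pattern.asPredicate() (Java 8 and later) tests each string with find().

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class CommentFilter {
    // Compiled once and reused; asPredicate() applies find(), not matches().
    private static final Pattern COMMENT = Pattern.compile("^#");

    // Returns only the lines whose first character is '#'.
    static List<String> commentLines(List<String> lines) {
        return lines.stream()
                .filter(COMMENT.asPredicate())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "#This is a comment in a file",
                "key=value",
                "  # indented, so it does not start with '#'",
                "#another comment");
        // Prints the first and last entries only.
        System.out.println(commentLines(lines));
    }
}
```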
