What is the difference between (.*?) and (.*)? in regular expressions?

Time: 2021-09-28 13:58:33

I've been using regular expressions for a while, but I'm not an expert on the subtleties of what particular rules do. I've always used (.*?) for matching with a restriction, on the understanding that it stops at the first chance it gets, whereas (.*)? keeps going and is greedier.

But I have no real basis for thinking that; I only believe it because I read it once upon a time.

Now I'd like to know: is there a difference? And if so, what is it?

4 Solutions

#1


12  

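(.*?) is a group containing a non-greedy match: the .* inside it takes as few characters as it can.

(.*)? is an optional group containing a greedy match: the .* inside it takes as many characters as it can, and the group as a whole may be present 0 or 1 times.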
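To see the two side by side, here is a minimal sketch. It uses Python's `re` module rather than a specific engine from the question (that choice is an assumption, as is the helper name `first_group`), and it anchors both patterns at the start of the same string with nothing after the group to force a longer match.

```python
import re


def first_group(pattern, text):
    """Return what group 1 captured when `pattern` is matched at the start of `text`."""
    m = re.match(pattern, text)
    return m.group(1) if m else None


text = "abc"

# Lazy quantifier inside the group: nothing follows the group,
# so matching zero characters is already enough.
print(repr(first_group(r"(.*?)", text)))  # ''

# Greedy quantifier inside an optional group: .* grabs everything it can.
print(repr(first_group(r"(.*)?", text)))  # 'abc'
```

With nothing after the group, the lazy version is satisfied by the empty string, while the greedy version consumes the whole input.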

#2


4  

Because * means "zero or more", it all gets slightly confusing. The two ?'s are quite different, which is easier to see with a separate example of each:

fo*? will match only f if you give it foo. That is, this ? makes the match non-greedy. Removing it makes the pattern match foo.

fo? will match f, but also fo. That is, this ? makes the match optional: the part it applies to (in this case only the o) must be present 0 or 1 times. Removing it makes that part required: it must then be present exactly once, so only fo will still match.
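Both cases are easy to check. The following is a small sketch in Python's `re` module (an assumption; the answer itself is engine-neutral), with a hypothetical helper `matched` that returns the text matched at the start of the input, or None if there is no match.

```python
import re


def matched(pattern, text):
    """Return the text matched at the start of `text`, or None if there is no match."""
    m = re.match(pattern, text)
    return m.group(0) if m else None


# ? after a quantifier: lazy versus greedy
print(matched(r"fo*?", "foo"))  # f
print(matched(r"fo*", "foo"))   # foo

# ? after a character: the o is optional
print(matched(r"fo?", "f"))     # f
print(matched(r"fo?", "fo"))    # fo

# without the ?, the o is required
print(matched(r"fo", "f"))      # None
print(matched(r"fo", "fo"))     # fo
```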

And while we're on the different meanings of ? in regexps, there's one more: a ? immediately following a ( is a prefix for several special constructs, such as lookaround. That meaning is unlike either of the ones you ask about.
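As a rough illustration of that third meaning, here are two of the `(?` constructs in Python's `re` module (an assumption; the exact set of constructs varies between engines): a lookahead and a non-capturing group. The sample strings are made up for the example.

```python
import re

# (?=...) is a lookahead: "foo" matches only when "bar" follows,
# and "bar" itself is not part of the match.
lookahead = re.compile(r"foo(?=bar)")
hit = lookahead.search("foobar")
miss = lookahead.search("foobaz")
print(hit.group(0))  # foo
print(miss)          # None

# (?:...) groups without capturing, so there is no group 1 to read back.
non_capturing = re.match(r"(?:ab)+", "ababab")
print(non_capturing.group(0))  # ababab
print(non_capturing.groups())  # ()
```

In neither case does the ? make anything optional or lazy; it only marks the parenthesis as a special construct.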

#3


3  

The ? has different meanings.

  1. When it follows a character or a group, it is a quantifier, matching 0 or 1 occurrences of the preceding construct.

  2. When it follows a quantifier, it modifies the matching behaviour of that quantifier, making it match lazily (ungreedy).
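The two roles in the list above can be sketched with a pair of everyday patterns. This uses Python's `re` module as an assumed host, and the words and the HTML snippet are invented for the illustration.

```python
import re

# Role 1: ? as a quantifier. The u may appear 0 or 1 times.
optional_u = re.compile(r"colou?r")
words = ["color", "colour", "colouur"]
accepted = [w for w in words if optional_u.fullmatch(w)]
print(accepted)  # ['color', 'colour']

# Role 2: ? as a modifier on another quantifier. It turns the greedy + into a lazy one.
html = "<b>bold</b>"
greedy = re.match(r"<.+>", html).group(0)
lazy = re.match(r"<.+?>", html).group(0)
print(greedy)  # <b>bold</b>
print(lazy)    # <b>
```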

#4


3  

Others have pointed out the difference between greedy and non-greedy matches. Here is an example of the different results you can see in practice. Since regular expressions are usually embedded in a host language, I'm going to use Perl as the host. In Perl, enclosing part of a pattern in parentheses assigns whatever that part matched to special variables ($1, $2, and so on). So in this case both patterns may match, but what's assigned to those variables may differ:

For example, let's say your match string is 'hello'. Both patterns match it, but the captured portions ($1) differ:

'hello' =~ /(.*?)l/;
# $1 == 'he' 

'hello' =~ /(.*)?l/;
# $1 == 'hel'
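For readers without Perl at hand, roughly the same experiment can be run in Python's `re` module (an assumption; `whole_and_group` is a hypothetical helper, and `group(1)` plays the role of `$1`). It also reports the overall match, and adds a third pattern without the trailing ? for comparison.

```python
import re


def whole_and_group(pattern, text):
    """Return (entire match, group 1) for the first match of `pattern` in `text`."""
    m = re.search(pattern, text)
    return (m.group(0), m.group(1)) if m else None


print(whole_and_group(r"(.*?)l", "hello"))  # ('hel', 'he')
print(whole_and_group(r"(.*)?l", "hello"))  # ('hell', 'hel')

# Dropping the ? after the group gives the same result here:
# .* can already match nothing, so making the group optional adds little.
print(whole_and_group(r"(.*)l", "hello"))   # ('hell', 'hel')
```

The lazy group stops before the first l, while the greedy group backtracks only as far as the last l.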
