
Time: 2021-09-28 13:58:33

I've been using regex for a while, but I'm no expert on the subtleties of what particular rules do. I've always used (.*?) for matching, on the understanding that it is restrictive and stops at the first chance it gets, whereas (.*)? would continue and be greedier.


But I have no real basis for that belief; it's just something I read once upon a time.


So I'd like to know: is there a difference? And if so, what is it?


4 solutions



(.*?) is a group containing a non-greedy match.


(.*)? is an optional group containing a greedy match.
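To make the two definitions concrete, here is a minimal sketch using Python's `re` module as the host (an assumption; the thread itself names no host language at this point). The helper `first_group` is a hypothetical name introduced only for illustration. With nothing after the group to force it to grow, the lazy version captures nothing, while the optional greedy group captures everything:

```python
import re

def first_group(pattern: str, text: str):
    """Return capture group 1 of a match anchored at the start of text."""
    m = re.match(pattern, text)
    return m.group(1) if m else None

# Lazy quantifier inside the group: take as little as possible.
lazy = first_group(r"(.*?)", "hello")

# Greedy quantifier inside an optional group: take as much as possible.
greedy = first_group(r"(.*)?", "hello")

print(repr(lazy))    # ''
print(repr(greedy))  # 'hello'
```

The lazy group only grows when the rest of the pattern requires it, which is why the difference usually shows up once something follows the group.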




Because * means "zero or more", it all gets slightly confusing. The two ?s do quite different things, which is clearer with a separate example of each:


fo*? will match only f if you supply it foo. That is, this ? makes the match non-greedy. Removing it makes it match foo.


fo? will match f, but also fo. That is, this ? makes the match optional: the part that it applies to (in this case only o) must be present 0 or 1 times. Removing it makes the match required: it must then be present exactly once, so only fo will still match.
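The two examples above can be checked with a short sketch. It uses Python's `re` module as an assumed host, and `matched` is a hypothetical helper name, not something from the thread:

```python
import re

def matched(pattern: str, text: str):
    """Return the text matched at the start of `text`, or None if no match."""
    m = re.match(pattern, text)
    return m.group(0) if m else None

# ? after a quantifier: makes * lazy.
print(matched(r"fo*?", "foo"))  # f
print(matched(r"fo*", "foo"))   # foo

# ? after a character: makes that character optional (0 or 1 times).
print(matched(r"fo?", "f"))     # f
print(matched(r"fo?", "fo"))    # fo

# Without the ?, the o is required.
print(matched(r"fo", "f"))      # None
```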


And while we're on the different meanings of ? in regexps, there's one more: a ? immediately following a ( is a prefix for several special constructs, such as lookaround. That is, its meaning is unlike either of the ones you ask about.
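As a rough illustration of that third meaning, the sketch below shows three common `(?` constructs in Python's `re` module (an assumed host; the sample strings are made up for the example). In none of them does the ? act as a quantifier or a laziness modifier:

```python
import re

# (?=...) positive lookahead: 'bar' must follow, but is not consumed.
lookahead = re.search(r"foo(?=bar)", "foobar")

# (?!...) negative lookahead: the match fails if 'bar' follows.
negative = re.search(r"foo(?!bar)", "foobar")

# (?:...) non-capturing group: groups for repetition, creates no capture.
non_capturing = re.match(r"(?:ab)+", "ababab")

print(lookahead.group(0))      # foo
print(negative)                # None
print(non_capturing.group(0))  # ababab
print(non_capturing.groups())  # ()
```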




The ? has different meanings.


  1. When it follows a character or a group, it is a quantifier that matches 0 or 1 occurrences of the preceding construct.


  2. When it follows a quantifier, it modifies that quantifier's matching behaviour, making it lazy (ungreedy).
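Both meanings can even appear side by side: in `o??` the first ? is the 0-or-1 quantifier and the second makes that quantifier lazy. A small sketch, again assuming Python's `re` module as the host:

```python
import re

# o?  : optional o, greedy, so the o is taken when available.
greedy_optional = re.match(r"fo?", "foo").group(0)

# o?? : optional o, lazy, so the o is skipped when the match can succeed without it.
lazy_optional = re.match(r"fo??", "foo").group(0)

# When the whole string must match, the lazy version still takes the o.
forced = re.fullmatch(r"fo??", "fo").group(0)

print(greedy_optional)  # fo
print(lazy_optional)    # f
print(forced)           # fo
```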



Others have pointed out the difference between greedy and non-greedy matches. Here is an example of the different results you can see in practice. Since regular expressions are usually embedded in a host language, I'm going to use Perl as the host. In Perl, enclosing part of a pattern in parentheses assigns whatever that part matched to special variables ($1, $2, and so on). So in this case both patterns may match, but what ends up in those variables may differ:


For example, let's say your match string is 'hello'. Both patterns would match it, but the matched portions ($1) differ:


'hello' =~ /(.*?)l/;
# $1 eq 'he': the lazy .*? stops before the first 'l'

'hello' =~ /(.*)?l/;
# $1 eq 'hel': the greedy .* backs off only as far as the last 'l'
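For readers without Perl at hand, the same comparison can be sketched in Python, where `re.search` plays the role of Perl's unanchored `=~` and `group(1)` corresponds to `$1` (the choice of Python is an assumption; the answer above uses Perl). It also prints the whole match, which differs between the two patterns for this input:

```python
import re

lazy = re.search(r"(.*?)l", "hello")
greedy = re.search(r"(.*)?l", "hello")

# group(0) is the whole match, group(1) is the captured group (Perl's $1).
print(lazy.group(0), lazy.group(1))      # hel he
print(greedy.group(0), greedy.group(1))  # hell hel
```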


