Regex confusion: ? inside vs. outside the parentheses

Date: 2021-04-28 23:35:13

This regex:

(a)?b\1c

does not match "bc" while this one:

(a?)b\1c

does match it. Why is this? I thought the two patterns were identical.

3 answers

#1


6  

In your first example, (a)?b\1c, \1 refers to the (a) group, so the group must actually have matched an a:

  • abac will match
  • bac won't match (no a precedes the b, so group 1 is unset)
  • bc won't match

In your second example, (a?)b\1c, \1 refers to (a?), where the a is optional, so the group always takes part in the match and may capture the empty string:

  • abac will match
  • bac won't match
  • bc will match

The backreference does not care about the ? outside the group (in the first example); it only cares about what is inside the parentheses.
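
One way to check this behaviour is to run both patterns against the three test strings. The sketch below assumes Python's re module, since the question does not name a regex engine; the names OUTSIDE, INSIDE and full_matches are made up for illustration. Engines differ here: in JavaScript, for instance, a backreference to a group that did not take part in the match matches the empty string, so the first pattern would accept bc there.

```python
import re

# The two patterns from the question.
OUTSIDE = re.compile(r"(a)?b\1c")  # ? applies to the whole group
INSIDE = re.compile(r"(a?)b\1c")   # ? applies only to the a inside the group


def full_matches(pattern, text):
    """Return True if the whole of text matches pattern."""
    return pattern.fullmatch(text) is not None


for text in ("abac", "bac", "bc"):
    print(text, full_matches(OUTSIDE, text), full_matches(INSIDE, text))
# abac True True
# bac False False
# bc False True
```

Only bc tells the two patterns apart: with the ? outside, group 1 is unset and \1 fails; with the ? inside, group 1 holds the empty string and \1 succeeds.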

#2


3  

It's a bit confusing, but let's see. I will start with the second regular expression:

(a?)b\1c

When this tries to match bc, it first tries (a?). Since bc does not start with an a, the group captures the empty string "". When \1 refers to it later, \1 matches the empty string, which is always possible.

Now for the first regular expression:

(a)?b\1c

(a) tries to match an a and fails, but since the entire group (a)? is optional, the regular expression continues. It then finds a b, which is fine, and reaches \1. But (a)? did not match anything, not even the empty string, so the match fails.

So the difference between the two regexes is that in (a?) the capturing group captures an empty string, which can be referenced later and matched successfully using \1, whereas (a)? creates an optional capturing group that did not match anything, so referencing it later with \1 always fails unless the group actually matched an a.
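
The distinction between "captured the empty string" and "did not match at all" can be observed directly. This is a minimal sketch assuming Python's re module, which reports a group that did not take part in the match as None; the shortened patterns (a?)b and (a)?b and the variable names are chosen only for illustration.

```python
import re

# Group 1 takes part in the match and captures the empty string.
inside = re.match(r"(a?)b", "bc")
print(repr(inside.group(1)))   # ''

# Group 1 is skipped entirely, so Python reports it as None (unset).
outside = re.match(r"(a)?b", "bc")
print(repr(outside.group(1)))  # None
```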

#3


2  

In the first version, the parentheses capture a, so \1 matches a.

In the second regex, the parentheses capture a?, which means "0 or 1 a", so \1 matches whatever was actually captured: either a or the empty string.

Since a is optional in the second regex, bc matches the rest of the pattern (b\1c), with \1 standing for the empty string.
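
It may help to see that \1 repeats the text the group captured and does not re-apply the sub-pattern a?. The sketch below, again assuming Python's re module, compares the second regex with a hypothetical variant, (a?)ba?c, in which a? is simply written out a second time; the variable names are illustrative.

```python
import re

backref = re.compile(r"(a?)b\1c")   # \1 must repeat what group 1 captured
repeated = re.compile(r"(a?)ba?c")  # a? is applied a second time, independently

for text in ("abac", "abc", "bac", "bc"):
    print(text, bool(backref.fullmatch(text)), bool(repeated.fullmatch(text)))
# abac True True
# abc False True
# bac False True
# bc True True
```

With the backreference, both occurrences of a must be present or both absent; with the repeated a?, each one is optional on its own.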
