Same regex doesn't match twice

Time: 2021-09-20 11:06:10

While trying to solve a problem in my Perl script, I finally broke it down to this situation:

my $content = 'test';
if($content =~ m/test/g) {
    print "1\n";
} 
if($content =~ m/test/g) {
    print "2\n";
} 
if($content =~ m/test/g) {
    print "3\n";
} 

Output:

1
3

My real case is a bit different, but in the end it comes down to the same thing: I'm confused about why the second regex isn't matching. Does anyone have an explanation for this? I realized that /g seems to be the cause, and of course it isn't needed in my example. But is this output normal behaviour, and if so, why?

2 solutions

#1


7  

This is exactly what /g in scalar context is supposed to do.

The first time it matches "test". The second match tries to start matching in the string after where the previous match left off, and fails. The third match then tries again from the beginning of the string (and succeeds) because the second match failed and you didn't also specify /c.

(/c keeps it from restarting at the beginning if a match fails; if your second match was /test/gc, the second and third match would both fail.)
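
To see the position bookkeeping described above, here is a minimal sketch that prints pos() after each attempt. The helper name try_match and its second argument are made up for illustration; the second attempt uses /gc, as in the parenthetical above, so the second and third attempts should both fail.

```perl
use strict;
use warnings;

my $content = 'test';

# Hypothetical helper: runs /test/g (or /test/gc when $keep_pos is true)
# in scalar context and reports the result plus the saved match position.
sub try_match {
    my ($str_ref, $keep_pos) = @_;
    my $ok  = $keep_pos ? ($$str_ref =~ /test/gc) : ($$str_ref =~ /test/g);
    my $pos = pos($$str_ref);
    return ($ok ? 'match' : 'fail') . ', pos=' . (defined $pos ? $pos : 'undef');
}

my @results = (
    try_match(\$content, 0),   # /g:  matches, position moves to 4
    try_match(\$content, 1),   # /gc: fails, position stays at 4
    try_match(\$content, 0),   # /g:  fails from 4, position is reset
    try_match(\$content, 0),   # /g:  starts over and matches again
);

print "$_\n" for @results;
# match, pos=4
# fail, pos=4
# fail, pos=undef
# match, pos=4
```

With plain /g everywhere (as in the question), the second line would read "fail, pos=undef" and the third attempt would match again.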

#2


7  

Generally speaking, if (/.../g) makes no sense and should be replaced with if (/.../) [1].


You wouldn't expect the following to match twice:

my $content = "test";
my $i = 0;
# The body runs once: the second /g attempt finds nothing and ends the loop.
while ($content =~ /test/g) {
   print(++$i, "\n");
}

So why would you expect the following to match twice:

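my $content = "test";
my $i = 0;
# First /g attempt: matches and remembers where it stopped.
if ($content =~ /test/g) {
   print(++$i, "\n");
}

# Second /g attempt: continues from that position, so it fails.
if ($content =~ /test/g) {
   print(++$i, "\n");
}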

They're the same!
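
As a rough illustration of the advice at the top of this answer, the sketch below counts how often the same test succeeds in three consecutive attempts, once without /g and once with it. The variable names are made up for the example.

```perl
use strict;
use warnings;

my $plain  = 'test';
my $global = 'test';

# Without /g every attempt starts at the beginning of the string.
my $hits_without_g = 0;
for (1 .. 3) {
    $hits_without_g++ if $plain =~ /test/;
}

# With /g in scalar context each attempt resumes where the last one stopped,
# so the pattern is: match, fail (position reset), match.
my $hits_with_g = 0;
for (1 .. 3) {
    $hits_with_g++ if $global =~ /test/g;
}

print "without /g: $hits_without_g, with /g: $hits_with_g\n";
# without /g: 3, with /g: 2
```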

Let's imagine $content contains testtest.

  1. The 1st time $content =~ /test/g is evaluated in scalar context,
    it matches the first test.
  2. The 2nd time $content =~ /test/g is evaluated in scalar context,
    it matches the second test.
  3. The 3rd time $content =~ /test/g is evaluated in scalar context,
    it returns false to indicate there are no more matches.
    This also resets the position at which future matches against $content will start.
  4. The 4th time $content =~ /test/g is evaluated in scalar context,
    it matches the first test.
  5. ...
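
The steps above can be reproduced with a short script. This is only a sketch; the @trace array and the message wording are made up for the example.

```perl
use strict;
use warnings;

my $content = 'testtest';
my @trace;

for my $attempt (1 .. 4) {
    if ($content =~ /test/g) {
        # pos() is the offset just past the end of the last successful match.
        push @trace, "attempt $attempt: matched, ending at offset " . pos($content);
    }
    else {
        # A failed /g match (without /c) resets the position.
        push @trace, "attempt $attempt: no match, position reset";
    }
}

print "$_\n" for @trace;
# attempt 1: matched, ending at offset 4
# attempt 2: matched, ending at offset 8
# attempt 3: no match, position reset
# attempt 4: matched, ending at offset 4
```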

  1. There are advanced uses for if (/\G.../gc), but that's different. if (/.../g) only makes sense if you're unrolling a while loop (e.g. while (1) { ...; last if !/.../g; ... }).
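
For the curious, one common use of if (/\G.../gc) is a small hand-written tokenizer. The sketch below is an assumed example, not something from the question: the input string and the token names (NUM, IDENT, OP) are invented for illustration.

```perl
use strict;
use warnings;

my $input = 'x = 42 + y';
my @tokens;

while (1) {
    # \G anchors each attempt at the current position. /c keeps that position
    # when an attempt fails, so the next branch tries the same spot.
    if    ($input =~ /\G\s+/gc)            { next }
    elsif ($input =~ /\G(\d+)/gc)          { push @tokens, "NUM($1)" }
    elsif ($input =~ /\G([A-Za-z_]\w*)/gc) { push @tokens, "IDENT($1)" }
    elsif ($input =~ /\G([=+])/gc)         { push @tokens, "OP($1)" }
    else                                   { last }   # end of input or unknown character
}

print join(' ', @tokens), "\n";
# IDENT(x) OP(=) NUM(42) OP(+) IDENT(y)
```

Here the saved position is the whole point, which is why this case differs from a plain if (/.../g) test.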
