Global regular expression matching in while() on a backtick result

Date: 2022-12-16 12:16:46

This script searches for lines containing words and prints them, re-reading the source file on each iteration:

# cat mm.pl
#!/usr/bin/perl
use strict;
use warnings;

while( `cat aa` =~ /(\w+)/g ) {
    print "$1\n";
}

Input file:

# cat aa
aa
bb
cc

Result:

# ./mm.pl
aa
bb
cc

Please explain why this script doesn't run forever.

On every while iteration, the regex engine's offset should be reset, because the expression being matched has changed (a new cat is forked).

I thought Perl did some kind of caching of the cat result, but strace shows that cat was spawned 4 times (3 for the 3 lines + 1 for the final, false while condition):

# strace -f ./mm.pl 2>&1 | grep cat | grep -v ENOENT
[pid 22604] execve("/bin/cat", ["cat", "aa"], [/* 24 vars */] <unfinished ...>
[pid 22605] execve("/bin/cat", ["cat", "aa"], [/* 24 vars */] <unfinished ...>
[pid 22606] execve("/bin/cat", ["cat", "aa"], [/* 24 vars */] <unfinished ...>
[pid 22607] execve("/bin/cat", ["cat", "aa"], [/* 24 vars */] <unfinished ...>

On the other hand, the following example runs forever:

# cat kk.pl
#!/usr/bin/perl
use strict;
use warnings;

my $d = 'aaa';
while( $d =~ /(\w+)/g ) {
    print "$1\n";
    $d = 'aaa';
}

What is the difference between the two scripts? What am I missing?

1 solution

#1


7  

The position at which //g left off is stored in magic added to the scalar against which the matching was performed.

$ perl -MDevel::Peek -e'$_ = "abc"; Dump($_); /./g; Dump($_);'
SV = PV(0x32169a0) at 0x3253ee0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x323bae0 "abc"\0
  CUR = 3
  LEN = 10
  COW_REFCNT = 1
SV = PVMG(0x326c040) at 0x3253ee0
  REFCNT = 1
  FLAGS = (SMG,POK,IsCOW,pPOK)
  IV = 0
  NV = 0
  PV = 0x323bae0 "abc"\0
  CUR = 3
  LEN = 10
  COW_REFCNT = 2
  MAGIC = 0x323d050
    MG_VIRTUAL = &PL_vtbl_mglob
    MG_TYPE = PERL_MAGIC_regex_global(g)
    MG_FLAGS = 0x40
      BYTES
    MG_LEN = 1
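
The stored position can also be inspected from Perl itself with the built-in pos() function. The following is a minimal sketch, not part of the original answer; the helper name positions_after_each_match is made up for illustration, and the input string simply mirrors the contents of the aa file from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: records pos() after each successful //g match.
sub positions_after_each_match {
    my ($text) = @_;
    my @positions;
    while ($text =~ /\w+/g) {
        # pos() reports the offset where the last //g match on $text ended.
        push @positions, pos($text);
    }
    return @positions;
}

# Same content as the "aa" file in the question.
print join(",", positions_after_each_match("aa\nbb\ncc\n")), "\n";   # prints 2,5,8
```

Each call to the match operator resumes from the offset recorded on that particular scalar, which is why the loop advances only if the same scalar is matched every time.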

This means the only way the behaviour observed is possible in the backticks example is if the match operator matched against the same scalar all four times it was evaluated! How is that possible? It's because backticks is one of the operators that uses a TARG.

Creating a scalar is relatively expensive since it requires up to three memory allocations! In order to increase performance, a scalar called TARG is associated with each instance of some operators. When an operator with a TARG is evaluated, it may populate the TARG with the value to return and return the TARG (rather than allocating and returning a new one).

"So what?", you might ask. After all, you've already demonstrated that assigning to a scalar resets the match position associated with that scalar. That's what's supposed to happen, but it doesn't for backticks.

Magic not only allows information to be attached to a variable, it also attaches functions to be called under certain conditions. The magic added by //g attaches a function that should be called after the scalar is modified (which is indicated by the SMG flag in the dump above). This function is what clears the position when a value is assigned to the scalar.
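
The reset on assignment can be observed directly. This is a small sketch under the assumption that pos() is an adequate window onto the stored position; the helper name pos_before_and_after_assignment is hypothetical. It also mirrors what happens in kk.pl, where $d is reassigned inside the loop body.

```perl
use strict;
use warnings;

# Hypothetical helper: returns pos() before and after re-assigning the scalar.
sub pos_before_and_after_assignment {
    my $s = 'aaa bbb';
    $s =~ /\w+/g;               # first match ends after "aaa"
    my $before = pos($s);       # 3
    $s = 'aaa bbb';             # assignment triggers the set magic
    my $after = pos($s);        # undef: the match position was cleared
    return ($before, $after);
}

my ($before, $after) = pos_before_and_after_assignment();
printf "before=%s after=%s\n", $before, defined $after ? $after : 'undef';
```

Because the position is cleared on every assignment, the next //g match in kk.pl starts from the beginning again, so that loop never terminates.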

The assignment operator handles the magic properly, but the backticks operator does not. It doesn't expect magic to have been added to its TARG, so it doesn't check whether there is any, and the function that clears the match position is never called.
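
Since the terminating behaviour of the original script relies on this quirk, one way to avoid depending on it is to capture the command output once and iterate over matches in an ordinary variable. The sketch below is only an illustration, not something the answer prescribes; words_in is a hypothetical helper, and a literal string stands in for the backtick result so that the example does not need the aa file.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every word in $text, in order.
sub words_in {
    my ($text) = @_;            # fresh lexical copy on every call
    my @words;
    while ($text =~ /(\w+)/g) { # position is tracked on $text, which is never reassigned
        push @words, $1;
    }
    return @words;
}

# In the real script this would be: my $output = `cat aa`;
my $output = "aa\nbb\ncc\n";
print "$_\n" for words_in($output);
```

Written this way, cat would run only once, and the loop ends for the documented reason: the final //g match fails on a scalar whose position was tracked normally.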
