Why does pos() report matches for non-capturing groups?

Date: 2022-05-01 09:02:22

I want to match the placeholders (bare ?, without quotes) in a parameterized SQL query like this:

UPDATE `table` SET `col1`=? WHERE `col2`=? AND `x`="??as"

(I know I should be using SQL::Parser instead. Bear with me here.)

This regex (?:`.+?`)|(?:'.+?')|(?:".+?")|(\?) matches the bare question marks in `col1`=? and `col2`=? but skips the question marks inside the double quotes in `x`="??as", as I want it to. You can see this working at https://regex101.com/r/iH4aV2/3.

On that site the regex is run by PCRE. If I run this bit of Perl:

# same regex and test string
my $x = 'UPDATE `table` SET `col1`=? WHERE `col2`=? AND `x`="??as"';          

while ($x =~ /(?:`.+?`)|(?:'.+?')|(?:".+?")|(\?)/g) {
    print "A:".pos($x)."\n";
}

I get:

A:14
A:25
A:27
A:40
A:42
A:50
A:57

I was expecting to get only the positions of the bare question marks, like on the regex101 site:

A:27
A:42

Why is this happening? Can I make Perl's regex engine behave like PCRE?

1 answer

#1

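pos() reports the end offset of the last successful match, whichever alternative produced it. The quoted alternatives are non-capturing, but they still match, so each one advances pos() and runs the loop body; only $1 stays undefined. The simplest solution is to check whether the capturing parens actually captured something before checking pos: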

# same test string and regex as in the question
my $x = 'UPDATE `table` SET `col1`=? WHERE `col2`=? AND `x`="??as"';
while ($x =~ /(?:`.+?`)|(?:'.+?')|(?:".+?")|(\?)/g) {
  if (defined($1)) {
    print "A:".pos($x)."\n";
  }
}

This yields the desired results.
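
To see where the extra positions come from, here is a minimal sketch that records every overall match together with whether the capturing group took part in it. The @matches array and its hash keys (text, end, captured) are names made up for this illustration; $& is Perl's built-in variable holding the text of the last successful match.

```perl
use strict;
use warnings;

# Same test string and regex as in the question.
my $x = 'UPDATE `table` SET `col1`=? WHERE `col2`=? AND `x`="??as"';

# Record each overall match: its text, its end offset as reported by pos(),
# and whether group 1 (the bare question mark) participated.
my @matches;
while ($x =~ /(?:`.+?`)|(?:'.+?')|(?:".+?")|(\?)/g) {
    push @matches, {
        text     => $&,
        end      => pos($x),
        captured => defined $1 ? 1 : 0,
    };
}

printf "%-8s end=%2d captured=%d\n", $_->{text}, $_->{end}, $_->{captured}
    for @matches;
```

All seven matches show up in the list, but only the two bare question marks have captured=1, which is exactly what the defined($1) check filters on.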

(You could also use the fancy (*SKIP) and (*FAIL) verbs mentioned in the comments, but this seems cleaner.)
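
For comparison, a rough sketch of the verb-based variant, assuming Perl 5.10 or later (where the backtracking control verbs are available). The exact pattern from the comments is not shown here, so this is one plausible form rather than the commenter's own: the quoted runs are matched and then thrown away with (*SKIP)(*FAIL), so only a bare question mark can produce a successful match and no capture group is needed. The @positions array is a name chosen for this example.

```perl
use strict;
use warnings;

# Same test string as in the question.
my $x = 'UPDATE `table` SET `col1`=? WHERE `col2`=? AND `x`="??as"';

# First alternative: consume a quoted run, then force a failure.
# (*SKIP) makes the engine resume searching after the quoted run
# instead of retrying from inside it.
# Second alternative: a bare question mark, the only way to succeed.
my @positions;
while ($x =~ /(?:`.+?`|'.+?'|".+?")(*SKIP)(*FAIL)|\?/g) {
    push @positions, pos($x);
}

print "A:$_\n" for @positions;   # prints A:27 and A:42
```

The loop body now runs only for the placeholders, at the cost of a pattern that is harder to read than the defined($1) check.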
