Regex - replace if number is not equal to (debugging problem)

Time: 2022-05-11 10:25:25

Could someone tell me what I am doing wrong here? This is my example data:

/docman/?view=document&alias=690-uchwala-rady-miasta-nr-xv-100-2015-zmieniajaca-uchwale-nr-xxxiii-151-2012&category_slug=gminne-programy-i-strategie&Itemid=123
/docman/?view=document&alias=690-uchwala-rady-miasta-nr-xv-100-2015-zmieniajaca-uchwale-nr-xxxiii-151-2012&category_slug=gminne-programy-i-strategie&Itemid=494
/docman/?view=document&alias=690-uchwala-rady-miasta-nr-xv-100-2015-zmieniajaca-uchwale-nr-xxxiii-151-2012&category_slug=gminne-programy-i-strategie&Itemid=527

I need to find every Itemid other than 527 and replace its number with 494. This is my code:

$pattern = '/(.*)(Itemid=)(?!527)([1-9]*)/';
$replacement = "494";
$row->text = preg_replace($pattern, '$1'.'$2'.$replacement, $row->text);

And I get something like this:

/docman/?view=document&alias=690-uchwala-rady-miasta-nr-xv-100-2015-zmieniajaca-uchwale-nr-xxxiii-151-2012&category_slug=gminne-programy-i-strategie&94=
/docman/?view=document&alias=690-uchwala-rady-miasta-nr-xv-100-2015-zmieniajaca-uchwale-nr-xxxiii-151-2012&category_slug=gminne-programy-i-strategie&94=
/docman/?view=document&alias=690-uchwala-rady-miasta-nr-xv-100-2015-zmieniajaca-uchwale-nr-xxxiii-151-2012&category_slug=gminne-programy-i-strategie&Itemid=527

This doesn't work properly in PHP, yet on RegExr it does (with the g flag, though).

Now if I add something after the second group reference, like this:

$row->text = preg_replace($pattern, '$1'.'$2'."WTF?".$replacement, $row->text);

Then it starts to look right... but there is this unneeded WTF? inside the link:

/docman/?view=document&alias=690-uchwala-rady-miasta-nr-xv-100-2015-zmieniajaca-uchwale-nr-xxxiii-151-2012&category_slug=gminne-programy-i-strategie&Itemid=WTF?494
/docman/?view=document&alias=690-uchwala-rady-miasta-nr-xv-100-2015-zmieniajaca-uchwale-nr-xxxiii-151-2012&category_slug=gminne-programy-i-strategie&Itemid=WTF?494
/docman/?view=document&alias=690-uchwala-rady-miasta-nr-xv-100-2015-zmieniajaca-uchwale-nr-xxxiii-151-2012&category_slug=gminne-programy-i-strategie&Itemid=527

I don't get it. What am I doing wrong, and how can I debug this? :(

2 Answers

#1


Try this code:

$row->text = preg_replace($pattern, '$1'.'${2}'.$replacement, $row->text);

The reason your previous code was not working is that '$2' and '494' were concatenated into '$2494'. PHP reads at most two digits after the $, so instead of capture group #2 it looked for capture group #24, which does not exist and is replaced with an empty string, followed by the literal 94. To avoid the issue, add {} around the group number so that the 494 is kept separate from the 2.
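
To see the difference for yourself, here is a minimal sketch. It assumes a shortened version of the first sample URL (the long alias parameter is left out for readability) and runs the question's pattern with both replacement strings:

```php
<?php
// Shortened sample line (assumption: the alias parameter is omitted).
$line = '/docman/?view=document&category_slug=gminne-programy-i-strategie&Itemid=123';
$pattern = '/(.*)(Itemid=)(?!527)([1-9]*)/';
$replacement = '494';

// '$1'.'$2'.'494' concatenates to the string '$1$2494'.
// PHP reads at most two digits after "$": group 1, group 24, literal "94".
$ambiguous = preg_replace($pattern, '$1' . '$2' . $replacement, $line);

// Braces end the group number explicitly: group 1, group 2, literal "494".
$fixed = preg_replace($pattern, '$1' . '${2}' . $replacement, $line);

echo $ambiguous, PHP_EOL; // ends in "&94": "Itemid=" is lost
echo $fixed, PHP_EOL;     // ends in "&Itemid=494"
```

Printing the concatenated replacement string before passing it to preg_replace is a quick way to spot this kind of collision.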

If you look at this Regex101 demo, you can see the error inside the Substitute section, highlighted in red

If you look at the new Regex101 demo, you can see the new Substitute working

Also, you can change your RegEx to be shorter and more efficient:

(.*Itemid=)(?!527)(\d*)

Combine (.*)(Itemid=) into (.*Itemid=). This means your substitution can become ${1}494

Also use \d to match a digit rather than [1-9], which excludes 0 (unless you really do not want 0 to match, but an ID can contain a 0, and your RegEx missed that case)

Then, you could make it safer, so I would suggest the following final RegEx and Substitution:

(.*Itemid=)(?!527$)(\d*)(.*)
${1}494$3

This allows for URL variables after the Itemid, like this, ...&Itemid=494&Foo=Bar

It also ensures that IDs which merely start with 527, such as &Itemid=5279, are still matched and replaced
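
One caveat worth checking before relying on this: $ only matches at the end of the subject (or of a line, with the m flag), so as far as I can tell (?!527$) stops protecting 527 once another parameter follows it. The sketch below runs the final pattern against a few made-up URLs and compares it with a hypothetical variant, (?!527(?!\d)), which is my own assumption and not part of the answer above:

```php
<?php
// The answer's final pattern and substitution.
$pattern = '/(.*Itemid=)(?!527$)(\d*)(.*)/';
$substitution = '${1}494$3';

// Hypothetical variant (not from the answer): "527 not followed by another digit".
$variant = '/(.*Itemid=)(?!527(?!\d))(\d*)(.*)/';

// Made-up test URLs, shortened for readability.
$urls = [
    '/docman/?view=document&Itemid=123&Foo=Bar',
    '/docman/?view=document&Itemid=5279',
    '/docman/?view=document&Itemid=527',
    '/docman/?view=document&Itemid=527&Foo=Bar',
];

// preg_replace accepts an array of subjects and returns an array.
$withAnswer  = preg_replace($pattern, $substitution, $urls);
$withVariant = preg_replace($variant, $substitution, $urls);

print_r($withAnswer);
print_r($withVariant);
```

With the answer's pattern the last URL is rewritten to Itemid=494 even though its ID is 527. The variant leaves it alone and still replaces 5279.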

#2


Use the braces to define the capture group ID in an unambiguous manner:

$row->text = preg_replace($pattern, '$1'.'${2}'.$replacement, $row->text);
                                           ^^^

Since the $replacement starts with a digit, the regex engine is looking for a capture group #24 value ($24 is parsed as the 24th capturing group). To avoid that, use braces around the group ID after $ in the replacement pattern. Or use named captures to avoid the issue altogether.
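
A sketch of the named-capture route, assuming preg_replace_callback is acceptable here: as far as I know, the replacement string of preg_replace only understands numbered references, so the named group is read inside a callback instead. The group name prefix and the shortened URL are made up for illustration:

```php
<?php
// The named group "prefix" takes the place of numbered group 1, so no $n
// reference ever sits next to the digits of the new ID.
$pattern = '/(?<prefix>.*Itemid=)(?!527$)[0-9]*/';
$replacement = '494';

// Shortened sample URL (assumption).
$line = '/docman/?view=document&Itemid=123';

$result = preg_replace_callback(
    $pattern,
    function (array $m) use ($replacement): string {
        // $m['prefix'] holds everything up to and including "Itemid=".
        return $m['prefix'] . $replacement;
    },
    $line
);

echo $result, PHP_EOL; // /docman/?view=document&Itemid=494
```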

I also suggest adding $ after 527 in the pattern ((?!527$)) so as not to exclude IDs that start with 527. And surely you can merge the first 2 groups into one and use [0-9] instead of [1-9] to match IDs like 206 where a 0 can appear inside, and remove the unnecessary capturing group from [0-9]*:

(.*Itemid=)(?!527$)[0-9]*
                 ^

and replace with '${1}'.$replacement.
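
Putting it together, as a minimal sketch: it assumes $row->text holds several lines at once, which is why the m flag is added so that $ in the lookahead can match at the end of each line. The base URL is shortened and the 527 line is moved to the middle to show that. Neither the flag nor the variable names come from the answer:

```php
<?php
// Shortened stand-in for the question's URLs (assumption: one URL per line).
$base = '/docman/?view=document&category_slug=gminne-programy-i-strategie&Itemid=';
$text = $base . "123\n" . $base . "527\n" . $base . '494';

$replacement = '494';

// m flag: "$" inside the lookahead also matches at the end of each line.
$pattern = '/(.*Itemid=)(?!527$)[0-9]*/m';

// '${1}' keeps the group number separate from the digits that follow.
$result = preg_replace($pattern, '${1}' . $replacement, $text);

echo $result, PHP_EOL;
```

The 123 line becomes 494, while the 527 line and the existing 494 line come out unchanged.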
