regex.replace does not replace everything in this expression.

Time: 2022-02-16 05:39:32
[regex]::replace('test test','^(.*?)test', 'barf')

returns 'barf test'

Why doesn't it replace all occurrences of 'test'? This must have something to do with the position at which a subsequent replace iteration begins.

4 Solutions

#1


2  

Quick answer: you anchored the pattern at the beginning of the input (^), and your first group ((.*?)) captured nothing but the empty string, because the first occurrence of test sits right at the beginning of the input and a lazy quantifier stops as soon as it can. Furthermore, you don't use the capture in your replacement string. Had you used a "normal" (greedy) quantifier, the match would have run through the last occurrence of test, and everything up to and including it would have been replaced.
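
To make the lazy versus greedy difference concrete, here is a minimal sketch that runs the question's pattern both ways. The variable names $lazy and $greedy are only illustrative, and the greedy pattern is a variation that does not appear in the question.

```powershell
# Lazy group: (.*?) settles for the empty string, so only the leading 'test' is matched.
$lazy = [regex]::Replace('test test', '^(.*?)test', 'barf')

# Greedy group: (.*) takes 'test ' and the match runs through the last 'test'.
$greedy = [regex]::Replace('test test', '^(.*)test', 'barf')

$lazy    # barf test
$greedy  # barf
```

Neither variant produces 'barf barf', since the ^ anchor allows only one match in either case.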

Long answer: a regex never needs to match the whole input, only the parts which are necessary. What's more, when cycling through an input, the regex engine will start the next round from the position where it successfully completed a match.
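
One way to watch the engine resume after each match is to list the matches with their start positions. This is only a sketch using [regex]::Matches; the variable names are illustrative.

```powershell
# Every 'test' in the input, with the index where each match starts.
$plain = [regex]::Matches('test test', 'test')
$plain | ForEach-Object { '{0} at index {1}' -f $_.Value, $_.Index }
# test at index 0
# test at index 5

# With the ^ anchor, the scan resumes at index 4, where ^ can no longer match.
$anchored = [regex]::Matches('test test', '^(.*?)test')
$anchored.Count   # 1
```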

Here, you want to replace the character sequence test. Note that this also means the test inside testosterone (or untested) will be matched. If you want to match test as a "word", use the word anchor \b.

This works (tested on PowerShell v2):

[regex]::replace('test test','\btest\b', 'barf')

The engine in action looks something like this:

# beginning
regex: |\btest\b
input: |test test
# \b: matched,  beginning of input followed by word character
regex: \b|test\b
input: |test test
# literal matching of t, then e, then s, then t
regex: \btest|\b
input: test| test
# \b: match, word character followed by non word character
regex: \btest\b|
input: test| test
# replacement
regex: \btest\b|
input: barf| test
# beginning of second round
regex: |\btest\b
input: barf| test
# \b: match, word character followed by non word character
regex: \b|test\b
input: barf| test
# t: not matched. Failed matching. Proceeding to next character
regex: |\btest\b
input: barf |test
# \b: match
regex: \b|test\b
input: barf |test
# literal matching of t, then e, then s, then t
regex: \btest|\b
input: barf test|
# \b: match, word character followed by end of input
regex: \btest\b|
input: barf test|
# replacement
regex: \btest\b|
input: barf barf|
# beginning of next round
regex: |\btest\b
input: barf barf|
# end of input: end of processing
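
To see what the \b anchors buy you, the following sketch compares the bare pattern with the word-anchored one. The sample input, built from the testosterone and untested examples mentioned above, is made up for illustration.

```powershell
$sample = 'test testosterone untested'

# Bare pattern: every 'test' substring is replaced, even inside longer words.
$bare = [regex]::Replace($sample, 'test', 'barf')

# Word-anchored pattern: only the standalone word 'test' is replaced.
$word = [regex]::Replace($sample, '\btest\b', 'barf')

$bare   # barf barfosterone unbarfed
$word   # barf testosterone untested
```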

#2


0  

Because once the first 'test' is found at the beginning of the string (with /(.*?)/ matching an empty string) the next search starts after that string. Straight away the /^/ cannot match, so no more replacements are made.

The regex engine doesn't find all ways that a pattern could possibly match: it claims the first match that it comes across and moves on.
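
A small sketch of that "first match wins" behaviour, using [regex]::Match. The alternation example a|ab is an extra illustration that does not come from the question.

```powershell
# 'test test' offers two ways for this pattern to match: 'test' and 'test test'.
# The engine reports the one it reaches first and never the other.
$first = [regex]::Match('test test', '^(.*?)test')
$first.Value                   # test
$first.Groups[1].Value.Length  # 0, the group captured the empty string

# Same idea with alternation: the left branch succeeds first, so 'ab' is never tried.
$alt = [regex]::Match('ab', 'a|ab')
$alt.Value                     # a
```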

#3


0  

This is because .*? matches as little as possible, including the empty string. So you are matching only the first "test" and replacing it.

The main reason is your anchor ^. It means your regex can match only once, at the start. After the replacement, the regex continues from the end of that match, but at that position the anchor no longer holds, so your regex is done.

From your comment

BUT! WHY DOESN'T this replace both: [regex]::replace("test`ntest",'^(.*?)test', 'barf') (The "test`ntest" has a newline in the middle, so the second instance should match the ^.)

By default, the anchor ^ matches only the start of the string. If you use the modifier m (Multiline), the anchor ^ also matches the start of each line.
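
Here is a sketch of the Multiline behaviour, assuming the input from the comment was meant to contain PowerShell's backtick-n newline escape. It uses the RegexOptions overload of [regex]::Replace and the equivalent inline (?m) form; the variable names are illustrative.

```powershell
$text = "test`ntest"   # `n is PowerShell's newline escape inside double quotes

# Default options: ^ matches only at the very start of the string.
$byDefault = [regex]::Replace($text, '^(.*?)test', 'barf')

# Multiline option: ^ also matches right after each newline.
$options = [System.Text.RegularExpressions.RegexOptions]::Multiline
$multiline = [regex]::Replace($text, '^(.*?)test', 'barf', $options)

# The inline form (?m) switches on the same option from inside the pattern.
$inline = [regex]::Replace($text, '(?m)^(.*?)test', 'barf')

$byDefault -eq "barf`ntest"   # True
$multiline -eq "barf`nbarf"   # True
$inline -eq $multiline        # True
```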

If you want to replace all occurrences of "test", then match only "test", without the leading ^(.*?) part.

#4


-1  

The question mark is the lazy operator. It tries to quit as soon as it can. Remove it and thy will be done.
