Regex to replace an arbitrary number of matches at the start of a line

Time: 2022-05-11 10:25:01

I have text with this structure:

1.  Text1
2.  Text 2. It has a number with a dot.
3.  1.   Text31

I want to get this text:

# Text1
# Text 2. It has a number with a dot. (notice that this number did not get replaced)
## Text31

I tried the following, but it does not work:

var pattern = @"^(\s*\d+\.\s*)+";
var replaced = Regex.Replace(str, pattern, "#", RegexOptions.Multiline);

Basically, it should start matching at the start of every line and replace every matched group with a # symbol. Currently, if more than one group is matched, everything is replaced by a single # symbol. The pattern I am using is probably incorrect. Can anyone come up with a solution?

2 Solutions

#1


5  

You may use

(?:\G|^)\s*\d+\.

It matches at the start of the string, at the end of the previous successful match, or at the start of a line, and then consumes zero or more whitespace characters, one or more digits, and a dot.

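As a minimal sketch of how this pattern behaves, the snippet below applies it to a string modeled on the question's sample. The variable names and the use of C# top-level statements are assumptions made for illustration; only the pattern and the Regex.Replace call come from the thread.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input modeled on the text from the question.
var input = "1.  Text1\n2.  Text 2. It has a number with a dot.\n3.  1.   Text31";

// \G lets a further "digits + dot" group match right where the previous match ended,
// so every group is a separate match and receives its own '#'.
var pattern = @"(?:\G|^)\s*\d+\.";
var replaced = Regex.Replace(input, pattern, "#", RegexOptions.Multiline);

Console.WriteLine(replaced);
// #  Text1
// #  Text 2. It has a number with a dot.
// ##   Text31
```

Note that the whitespace following the last replaced marker is left untouched, so the spacing after the # symbols is whatever the input had there.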

Details

  • (?:\G|^) - the start of the string or the end of the previous match (\G), or the start of a line (^)
  • \s* - zero or more whitespace characters (to match only horizontal whitespace and avoid overflowing onto the next line(s), replace with [\s-[\r\n]]* or [\p{Zs}\t]*)
  • \d+ - one or more digits (to match only ASCII digits, replace with [0-9]+ or pass the RegexOptions.ECMAScript option to the Regex constructor)
  • \. - a dot.

The RegexOptions.Multiline option must be passed to the Regex constructor (or to the static Regex.Replace call) to make ^ match the start of a line. Alternatively, add the inline version of the option, (?m), at the start of the pattern.

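The following sketch combines two of the notes above: it uses the inline (?m) option instead of RegexOptions.Multiline, and it contrasts \s* with the horizontal-only class [\s-[\r\n]]*. The input string is a hypothetical edge case (a line that contains nothing but a marker), chosen to make the overflow visible; it is not taken from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input: the first line holds only a marker, so a line break follows it directly.
var input = "1.\n2. Text";

// \s* also matches the line break, so the marker on the second line is
// treated as a continuation of the first line and the line break is consumed.
var withAnyWhitespace = Regex.Replace(input, @"(?m)(?:\G|^)\s*\d+\.", "#");

// [\s-[\r\n]]* excludes \r and \n, so matching stops at the line break
// and the second line is picked up separately through ^.
var horizontalOnly = Regex.Replace(input, @"(?m)(?:\G|^)[\s-[\r\n]]*\d+\.", "#");

Console.WriteLine(withAnyWhitespace); // ## Text
Console.WriteLine(horizontalOnly);    // "#", then "# Text" on the next line
```

Whether the overflow matters depends on the data; for input like the question's sample, where every marker is followed by text on the same line, both variants give the same result.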

For more details about the \G anchor, see Continuing at The End of The Previous Match.

See the RegexStorm demo.

#2


0  

Try

(?<![a-z].*)\s*\d+\.

It looks for a sequence of digits (\d+) followed by a dot (\.), optionally preceded by any number of whitespace characters (\s*). The match, in turn, must not be preceded by a letter on the same line, which is checked by the negative lookbehind (?<![a-z].*) at the beginning of the regex.

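A short sketch of the lookbehind variant, again on a string modeled on the question's sample (variable names are illustrative assumptions). It relies on .NET's support for variable-length lookbehind, and because . does not match a line break by default, the check stays within the current line without needing RegexOptions.Multiline.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample input modeled on the text from the question.
var input = "1.  Text1\n2.  Text 2. It has a number with a dot.\n3.  1.   Text31";

// The negative lookbehind rejects any position that has a lowercase ASCII letter
// earlier on the same line, so the "2." inside the sentence is skipped.
var pattern = @"(?<![a-z].*)\s*\d+\.";
var replaced = Regex.Replace(input, pattern, "#");

Console.WriteLine(replaced);
// #  Text1
// #  Text 2. It has a number with a dot.
// ##   Text31
```

One caveat worth keeping in mind: [a-z] covers only lowercase ASCII letters, so a line whose text before a later number is entirely uppercase would still have that number replaced unless the class is widened or RegexOptions.IgnoreCase is added.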

Here at RegEx Storm.
