Regular expression to apply backspace characters

Time: 2022-09-28 23:23:47

I have a string coming from a telnet client. This string contains backspace characters which I need to apply. Each backspace should remove one previously typed character.

I'm trying to do this in a single replace using a regular expression:

string txt = "Hello7\b World123\b\b\b";
txt = Regex.Replace(txt, ".\\\b", "", RegexOptions.ECMAScript);

This results in "Hello World12". Of course, I want "12" to be removed too, but it obviously doesn't match my expression.

Somehow, the replacement should repeat until there are no more matches. Any ideas on how to achieve this with a single regular expression?

2 solutions

#1


4  

This is basically a variant of "How can we match a^n b^n with Java regex?", so we can reuse the answer from there:

var regex = new Regex(@"(?:[^\b](?=[^\b]*((?>\1?)[\b])))+\1");
Console.WriteLine(regex.Replace("Hello7\b World123\b\b\b", ""));

Additionally, the .NET regex engine supports balancing groups, so we could use a different pattern:

var regex = new Regex(@"(?<L>[^\b])+(?<R-L>[\b])+(?(L)(?!))");

(This means:

  1. Match one or more non-backspaces, capturing each under the name "L",

  2. followed by one or more backspaces, capturing each under the name "R", with the condition that every "R" must have one corresponding "L",

  3. if any "L"s are left over, abandon the match (as (?!) never matches).

)
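
The balancing-group pattern relies on a .NET-specific feature. In an engine without it, the repetition the question asks about can be done outside the pattern instead: apply a simple "one character followed by one backspace" replacement until nothing changes. Below is a minimal Python sketch of that idea, not part of the original answer; the names PAIR and apply_backspaces_loop are made up for illustration.

```python
import re

# Inside a character class, \b means the backspace character (U+0008),
# so this matches one non-backspace followed by one backspace.
PAIR = re.compile(r"[^\b][\b]")


def apply_backspaces_loop(text: str) -> str:
    """Repeat the simple replacement until a pass removes nothing."""
    while True:
        text, count = PAIR.subn("", text)
        if count == 0:
            return text


print(apply_backspaces_loop("Hello7\b World123\b\b\b"))  # Hello World
```

Each pass removes the character/backspace pairs that are directly adjacent, so interleaved input such as "ab\bc\b\b" also collapses to an empty string after a few passes. A backspace with nothing before it is left in the output; whether to drop it is a separate decision.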

#2


3  

I wouldn't use a regular expression for this: the result is hard to read, and I suspect it isn't even possible with plain regular expressions, without Perl-like extensions. My suggestion would be something like this (Python):

BACKSPACE = '\b'
text = 'Hello7\b World123\b\b\b'  # sample input from the question

stack = []
for char in text:
    if char == BACKSPACE:
        # A backspace removes the last kept character, if there is one.
        if stack:
            stack.pop()
    else:
        stack.append(char)

result = ''.join(stack)  # 'Hello World'

It's immediately clear what happens and how it works.
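
For reuse, the loop above can be wrapped in a function, which also makes the edge cases easy to check. This is a small sketch in the same spirit; the name apply_backspaces is hypothetical, and silently dropping a backspace that has nothing to delete is an assumption about the desired behaviour.

```python
BACKSPACE = "\b"


def apply_backspaces(text: str) -> str:
    """Return text with each backspace deleting the previously kept character."""
    kept = []
    for ch in text:
        if ch == BACKSPACE:
            # Assumption: a backspace with nothing to delete is ignored.
            if kept:
                kept.pop()
        else:
            kept.append(ch)
    return "".join(kept)


print(apply_backspaces("Hello7\b World123\b\b\b"))  # Hello World
print(repr(apply_backspaces("ab\bc\b\b")))          # ''
print(apply_backspaces("\bab"))                     # ab
```

The function makes a single pass over the input, so it runs in time linear in the length of the string, and it handles typing and deleting in any order.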
