Reading from a string to the end of the string

Time: 2021-04-22 21:46:28

Hi, I have a fairly simple question, but I am not a regex ace. I have a string that looks something like this:

Some text

Error codes:

10001 iTPM full self test
10003 less than minimum required
10004 bad tag value
10005 bad param size 
10006 fail check

Using regex, I am trying to get the text that follows Error codes: (without the header itself), up to the end of the string.

So far I've got:

(?<=Error codes:\n)(?s)(.*?)(fail check)

It works, but it is a stopgap: I want to replace the last group with "read to the end of the string", and so far I have had no luck.

The text contains line breaks, and they must be preserved in the result.

Let's say C# will be my language of choice.

The expected outcome should look like:

10001 iTPM full self test
10003 less than minimum required
10004 bad tag value
10005 bad param size 
10006 fail check

I want to read to the end of the string because I cannot be sure that new codes will not be added later.

2 solutions

#1


Since C# is your language of choice, I suggest combining LINQ and regular expressions:

using System; // StringSplitOptions, Console, Environment
using System.Linq;
using System.Text.RegularExpressions;

...

string source =
  @"Some text

Error codes:

10001 iTPM full self test
10003 less than minimum required
10004 bad tag value
10005 bad param size
10006 fail check";

var result = source
  .Split(new char[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
  .SkipWhile(line => !line.StartsWith("Error codes:")) // drop everything before the header
  .Skip(1)                                             // drop the "Error codes:" header itself
  .Select(line => Regex.Match(line, @"^(?<code>[0-9]+)\s*(?<name>.+)$"))
  .Where(match => match.Success) // or .TakeWhile(match => match.Success) to stop at the first non-code line
  .Select(match => $"{match.Groups["code"].Value} {match.Groups["name"].Value}")
  .ToArray(); // represent the result as an array

Test:

Console.Write(string.Join(Environment.NewLine, result));

Outcome:

10001 iTPM full self test
10003 less than minimum required
10004 bad tag value
10005 bad param size
10006 fail check
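
The choice between Where and TakeWhile only matters when something other than code lines follows the list. The sketch below is an illustration under assumptions: the sample input with a trailing "Notes:" section is hypothetical, and a Skip(1) is added so that the "Error codes:" header line is dropped before the match test (otherwise TakeWhile would stop on the header and return nothing).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical input: a "Notes:" section follows the error codes.
string sample = "Some text\n\nError codes:\n\n10001 iTPM full self test\n10006 fail check\n\nNotes:\n2 retries allowed";

// Shared part of the pipeline: split into lines and drop everything
// up to and including the "Error codes:" header.
Match[] matches = sample
  .Split(new char[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
  .SkipWhile(line => !line.StartsWith("Error codes:"))
  .Skip(1) // the header line itself is not a code line
  .Select(line => Regex.Match(line, @"^(?<code>[0-9]+)\s*(?<name>.+)$"))
  .ToArray();

// Where: keeps every line that looks like a code, all the way to the end of the string.
string[] viaWhere = matches.Where(m => m.Success).Select(m => m.Value).ToArray();

// TakeWhile: stops at the first line that does not look like a code ("Notes:").
string[] viaTakeWhile = matches.TakeWhile(m => m.Success).Select(m => m.Value).ToArray();

Console.WriteLine(viaWhere.Length);     // 3 ("2 retries allowed" also starts with digits)
Console.WriteLine(viaTakeWhile.Length); // 2
```

If the list of codes is always the last thing in the string, both variants should give the same result; TakeWhile is the safer choice when unrelated text may follow.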

#2


Try the regex below: a lookbehind on Error codes: followed by two line breaks.

(?<=Error codes:\n\n)[\w\s]+

RegexDemo
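
Two things are worth checking with the pattern above: [\w\s]+ stops at the first character that is neither a word character nor whitespace (a comma or a hyphen in a description, for example), and \n\n expects exactly two LF characters, so it will not match Windows-style \r\n line endings. A possible C# sketch that reads to the end of the string instead is shown below; it relies on .NET allowing variable-length lookbehind and on RegexOptions.Singleline, and the shortened sample string is an assumption made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Shortened sample input (assumed for illustration).
string source = "Some text\n\nError codes:\n\n10001 iTPM full self test\n10003 less than minimum required\n10006 fail check";

// (?<=Error codes:\s*)  lookbehind: the header plus any whitespace after it,
//                       checked but not included in the match
// \S.*                  the first non-whitespace character, then everything up to
//                       the end of the string, because Singleline lets '.' match '\n'
Match m = Regex.Match(source, @"(?<=Error codes:\s*)\S.*", RegexOptions.Singleline);

// TrimEnd only removes trailing whitespace; line breaks inside the block are kept.
string codes = m.Success ? m.Value.TrimEnd() : string.Empty;

Console.WriteLine(codes);
```

Because nothing in the pattern names the last code, newly added codes are picked up without changing the regex.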
