Hi, I have a fairly simple question, but I am not a regex ace. I have a string that looks something like this:
Some text
Error codes:
10001 iTPM full self test
10003 less than minimum required
10004 bad tag value
10005 bad param size
10006 fail check
Using a regex, I am trying to get the text that follows "Error codes:" (without the marker itself) up to the end of the string.
So far I've got:
(?<=Error codes:\n)(?s)(.*?)(fail check)
It works, but it is a stopgap solution. I want to replace the last group with "read until the end of the string", but so far no luck.
The text contains line breaks, and they need to be kept because this information is needed.
Let's say C# will be my language of choice.
The expected outcome should look like:
10001 iTPM full self test
10003 less than minimum required
10004 bad tag value
10005 bad param size
10006 fail check
I want to read to the end of the string because I cannot be sure that new codes will not be added later.
2 solutions
#1
1
Since you said "Lets say c# will be my choice of language", I suggest combining Linq and regular expressions:
using System.Linq;
using System.Text.RegularExpressions;
...
string source =
@"Some text
Error codes:
10001 iTPM full self test
10003 less than minimum required
10004 bad tag value
10005 bad param size
10006 fail check";
var result = source
.Split(new char[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
.SkipWhile(line => !line.StartsWith("Error codes:"))
.Select(line => Regex.Match(line, @"^(?<code>[0-9]+)\s*(?<name>.+)$"))
.Where(match => match.Success) // Or .TakeWhile(match => match.Success)
.Select(match => $"{match.Groups["code"].Value} {match.Groups["name"].Value}")
.ToArray(); // let's represent result as an array
Test:
Console.Write(string.Join(Environment.NewLine, result));
Outcome:
10001 iTPM full self test
10003 less than minimum required
10004 bad tag value
10005 bad param size
10006 fail check
#2
1
Try the regex below, which uses a lookbehind for "Error codes:" followed by two line breaks.
(?<=Error codes:\n\n)[\w\s]+
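Note that [\w\s]+ stops at the first character that is neither a word character nor whitespace (a comma or a dot in a description, for example), and that the lookbehind above requires exactly "\n\n" after the marker. A minimal sketch of a more tolerant variant in C# is shown below. It assumes the marker text is exactly "Error codes:" and relies on .NET supporting variable-length lookbehind. The class name ErrorCodeExtractor and the method name ExtractAfterMarker are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ErrorCodeExtractor
{
    // Hypothetical helper: returns everything after the "Error codes:" marker,
    // starting at the first non-whitespace character, or an empty string
    // when the marker is missing or nothing follows it.
    public static string ExtractAfterMarker(string source)
    {
        // (?<=Error codes:\s*)  lookbehind: the marker plus any whitespace/line breaks
        // \S                    the match must start on a non-whitespace character
        // .*                    with RegexOptions.Singleline, '.' also matches '\n',
        //                       so the match runs to the end of the string
        Match match = Regex.Match(
            source,
            @"(?<=Error codes:\s*)\S.*",
            RegexOptions.Singleline);

        return match.Success ? match.Value : string.Empty;
    }
}
```

Calling ErrorCodeExtractor.ExtractAfterMarker(source) on the sample text should return the code lines with their line breaks intact, regardless of how many codes are appended later.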