How do I find the first regex in a list of regexes that matches my input?

Time: 2021-05-19 06:47:03

Is there another way to write the following?

string input;

var match = Regex.Match(input, @"Type1");

if (!match.Success)
{
  match = Regex.Match(input, @"Type2");
}

if (!match.Success)
{
  match = Regex.Match(input, @"Type3");
}

Basically, I want to run my string through a gamut of expressions and see which one sticks.

3 solutions

#1


10  

var patterns = new[] { "Type1", "Type2", "Type3" };
Match match = null; // initialized so it can be read after the loop
foreach (string pattern in patterns)
{
    match = Regex.Match(input, pattern);
    if (match.Success)
        break;
}

or

var patterns = new[] { "Type1", "Type2", "Type3" };
var match = patterns
    .Select(p => Regex.Match(input, p))
    .FirstOrDefault(m => m.Success);

// In your original example, match will be the last match if all are
// unsuccessful. I expect this is an accident, but if you want this
// behavior, use this declaration in place of the one above (which
// leaves match null when nothing matches):
var match = patterns
    .Select(p => Regex.Match(input, p))
    .FirstOrDefault(m => m.Success)
    ?? Regex.Match(input, patterns[patterns.Length - 1]);

Because LINQ to Objects uses deferred execution, Regex.Match is only called until the first successful match is found; the remaining patterns are never tried, so you don't have to worry about this approach being too eager.
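
To make the deferred-execution point concrete, here is a minimal sketch that counts how often Regex.Match actually runs. The `calls` counter and the sample input are my own additions for illustration, not part of the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "a Type2 value"; // sample input, assumed for this sketch
var patterns = new[] { "Type1", "Type2", "Type3" };

int calls = 0; // counts how many times Regex.Match actually runs

var match = patterns
    .Select(p => { calls++; return Regex.Match(input, p); })
    .FirstOrDefault(m => m.Success);

// FirstOrDefault returns null when no pattern matches, so check before use.
Console.WriteLine(match?.Value ?? "(no match)"); // prints "Type2"
Console.WriteLine(calls);                        // prints 2: "Type3" was never tried
```

If the query were materialized first (for example with ToList() before FirstOrDefault), all three patterns would be evaluated.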

#2


5  

Yes, I would write it like this to avoid executing the Regex match multiple times:

        match = Regex.Match(input, @"Type1|Type2|Type3");

        if (match.Success)
        {
            // match.Captures holds a single capture here (the overall match), so this loop
            // runs once. Regex.Match(string, string) only returns the first occurrence;
            // use Regex.Matches to enumerate every occurrence within the input.
            foreach (Capture capture in match.Captures)
            {
                // if you care to determine which one (Type1, Type2, or Type3) each capture is
                switch (capture.Value)
                {
                    case "Type1":
                        // ...
                        break;
                    case "Type2":
                        // ...
                        break;
                    case "Type3":
                        // ...
                        break;
                }
            }
        }

Alternatively, if you have an arbitrary list of patterns that you want to check:

        // assumption is that patterns contains a list of valid Regex expressions
        match = Regex.Match(input, string.Join("|", patterns));

        if (match.Success)
        {
            // pick ONE of these return statements, depending on the method's return type

            // return the text of the first occurrence
            return match.Captures[0].Value;

            // or: return an IEnumerable<string> of the captured text
            // return match.Captures.OfType<Capture>().Select(capture => capture.Value);
        }
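
One caveat worth sketching: a joined alternation returns the leftmost match in the input, which is not necessarily the first pattern in the list that matches. The sample input below is my own, and wrapping each pattern in (?:...) is a precaution I am adding so that an inline option such as (?i) in one pattern does not carry over into the next; neither comes from the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "Type2 then Type1"; // sample input, assumed for this sketch
var patterns = new[] { "Type1", "Type2", "Type3" };

// Trying the patterns one at a time: list order decides.
var byListOrder = patterns
    .Select(p => Regex.Match(input, p))
    .FirstOrDefault(m => m.Success);

// One combined alternation: position in the input decides.
string combined = string.Join("|", patterns.Select(p => "(?:" + p + ")"));
Match byPosition = Regex.Match(input, combined);

Console.WriteLine(byListOrder?.Value); // prints "Type1"
Console.WriteLine(byPosition.Value);   // prints "Type2"
```

So the single-regex version is only a drop-in replacement for the loop when you do not care which pattern wins if several of them occur in the input.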

Here is another approach that uses named capture groups to index each pattern. When a match is found, we determine which of the capture groups matched.

I very much dislike this code due to the repeated, unnecessary concatenation of "Pattern" with the index, but I'm not sure how to do this more cleanly:

EDIT: I've cleaned up this code a bit by using a dictionary.

        // assumption is that patterns contains a list of valid Regex expressions
        int i = 0;
        var mapOfGroupNameToPattern = patterns.ToDictionary(pattern => "Pattern" + (i++));

        match = Regex.Match(input, string.Join("|", mapOfGroupNameToPattern.Select(x => "(?<" + x.Key + ">" + x.Value + ")")));

        if (match.Success)
        {
            foreach (var pattern in mapOfGroupNameToPattern)
            {
                if (match.Groups[pattern.Key].Captures.Count > 0)
                {
                    // this is the pattern that was matched
                    return pattern.Value;
                }
            }
        }
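
As a possible variation on the named-group idea, the sketch below skips the dictionary and uses the list index directly, checking Group.Success instead of counting captures. The helper name FindMatchedPatternIndex and the sample inputs are hypothetical; treat this as one way to arrange the code rather than the answer's own method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the index of the pattern whose named group
// took part in the match, or -1 if nothing matched.
static int FindMatchedPatternIndex(string input, IReadOnlyList<string> patterns)
{
    // Builds "(?<Pattern0>...)|(?<Pattern1>...)|..." from the list positions.
    string combined = string.Join("|",
        patterns.Select((p, i) => "(?<Pattern" + i + ">" + p + ")"));

    Match match = Regex.Match(input, combined);
    if (!match.Success)
        return -1;

    for (int i = 0; i < patterns.Count; i++)
    {
        // Group.Success is true only for the alternative that actually matched.
        if (match.Groups["Pattern" + i].Success)
            return i;
    }
    return -1;
}

var patterns = new[] { "Type1", "Type2", "Type3" };
Console.WriteLine(FindMatchedPatternIndex("xx Type2 yy", patterns)); // prints 1
Console.WriteLine(FindMatchedPatternIndex("nothing here", patterns)); // prints -1
```

Patterns that contain their own numbered backreferences would need extra care here, since combining them into one expression renumbers the unnamed groups.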

#3


0  

Another way of doing it. It iterates through the whole list, but you can test a variable number of strings without having to write x number of if statements.

string input = "Type1";
List<string> stringsToTest = new List<string>() { @"Type1", @"Type2", @"Type3" };

var q = from string t in stringsToTest
        where Regex.IsMatch(input, t)
        select t;

// This way you can see which (and how many) patterns in the list matched the input
foreach(string s in q)
{
    Console.WriteLine(s);
}
