Extract the value that starts after a specific string and runs up to a given character

Time: 2022-09-13 08:31:21

Given the following multiline string:

WHERE ([EXTENT1].[MY_ID] IN (151,152,214,218,931,932,933,1067,1412,1414,13807,14347,14349,14446)) AND ([EXTENT1].[MY_OTHER_ID] IN (14264, 14335, 14385, 14398, 14603, 14650, 15164, 15374)) AND ([EXTENT2].[PERSON_ID] IN (28,933,14446,179,152,14349,14347,933,130,218,933,1067,931,151,214,152,933,145,931,145,5809,14347,14349,14349,1414,142,1412,179,152,14347,152,90,13807,932,931))
    )  AS [FILTER1]
    GROUP BY [K1], [K2]
)  AS [GROUPBY1]

I want to extract the parenthesized values of the IN clause for [MY_ID]. I can use the regex (?<=\.\[MY_ID\].*IN.*\().* to cut off the first portion of the string, which returns:

 151,152,214,218,931,932,933,1067,1412,1414,13807,14347,14349,14446)) AND ([EXTENT1].[MY_OTHER_ID] IN (14264, 14335, 14385, 14398, 14603, 14650, 15164, 15374)) AND ([EXTENT2].[PERSON_ID] IN (28,933,14446,179,152,14349,14347,933,130,218,933,1067,931,151,214,152,933,145,931,145,5809,14347,14349,14349,1414,142,1412,179,152,14347,152,90,13807,932,931))

But I can't figure out how to make it stop at the first closing ) of the IN clause.

What I am after is: 151,152,214,218,931,932,933,1067,1412,1414,13807,14347,14349,14446

The regex will eventually be used with the .NET regex engine.

2 Solutions

#1


Because the .NET regex engine keeps every capture of a repeated capturing group, you can use

\.\[MY_ID]\s*IN\s*\(((?:,?(\d+))+)

and grab either the Group 1 value (that will be 151,152,214,218,931,932,933,1067,1412,1414,13807,14347,14349,14446) or all the captures from the Group 2 capture collection as an array/list.

Explanation:

  • \.\[MY_ID] - a literal .[MY_ID]
  • \s* - 0+ whitespace characters
  • IN\s* - the word IN followed by 0+ whitespace characters
  • \( - a literal opening (
  • ((?:,?(\d+))+) - Group 1, capturing 1+ sequences of:
    • ,? - an optional comma
    • (\d+) - Group 2, capturing 1+ digits

And here is a C# demo:

using System;
using System.Linq;
using System.Text.RegularExpressions;

var s = "WHERE ([EXTENT1].[MY_ID] IN (151,152,214,218,931,932,933,1067,1412,1414,13807,14347,14349,14446)) AND ([EXTENT1].[MY_OTHER_ID] IN (14264, 14335, 14385, 14398, 14603, 14650, 15164, 15374)) AND ([EXTENT2].[PERSON_ID] IN (28,933,14446,179,152,14349,14347,933,130,218,933,1067,931,151,214,152,933,145,931,145,5809,14347,14349,14349,1414,142,1412,179,152,14347,152,90,13807,932,931))\n    )  AS [FILTER1]\n    GROUP BY [K1], [K2]\n)  AS [GROUPBY1]";
var pattern = @"\.\[MY_ID]\s*IN\s*\(((?:,?(\d+))+)";
var matches = Regex.Matches(s, pattern);

// Group 2 capture collections: the individual numbers
var res1 = matches
    .Cast<Match>()
    .Select(m => m.Groups[2].Captures)
    .ToList();

// Group 1 values: the whole comma-separated substring
var res2 = matches
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToList();

foreach (var coll in res1)
    foreach (Capture v in coll)
        Console.WriteLine(v.Value);

Console.WriteLine("Ex. 2");
foreach (var v2 in res2)
    Console.WriteLine(v2);

#2


Try the following regex.

Regex: (?<=\[MY_ID\] IN \()[^)]*

Explanation:

  • (?<=\[MY_ID\] IN \() is a lookbehind that requires [MY_ID] IN ( immediately before the match.

  • [^)]* matches everything up to the first ), which closes the IN list.
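
This answer gives only the pattern, so here is a minimal C# sketch of how it might be applied. The shortened input string and the variable names (sql, value, ids) are assumptions made for illustration, not part of the original answer. Note that the pattern as written expects exactly one space on each side of IN; replacing those spaces with \s* would make it more tolerant.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Shortened sample input (assumption: same shape as the query in the question).
var sql = "WHERE ([EXTENT1].[MY_ID] IN (151,152,214)) AND ([EXTENT1].[MY_OTHER_ID] IN (14264, 14335))";

// The lookbehind anchors the match right after "[MY_ID] IN (";
// [^)]* then consumes everything up to, but not including, the first ")".
var match = Regex.Match(sql, @"(?<=\[MY_ID\] IN \()[^)]*");

// Fall back to an empty string when the clause is not found.
var value = match.Success ? match.Value : string.Empty;

// Optionally split the comma-separated list into integers.
var ids = value
    .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
    .Select(part => int.Parse(part.Trim()))
    .ToArray();

Console.WriteLine(value);       // 151,152,214
Console.WriteLine(ids.Length);  // 3
```

Because the lookbehind is zero-width, match.Value holds only the list itself, so no capture group is needed.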

