Given the following multiline string:
WHERE ([EXTENT1].[MY_ID] IN (151,152,214,218,931,932,933,1067,1412,1414,13807,14347,14349,14446)) AND ([EXTENT1].[MY_OTHER_ID] IN (14264, 14335, 14385, 14398, 14603, 14650, 15164, 15374)) AND ([EXTENT2].[PERSON_ID] IN (28,933,14446,179,152,14349,14347,933,130,218,933,1067,931,151,214,152,933,145,931,145,5809,14347,14349,14349,1414,142,1412,179,152,14347,152,90,13807,932,931))
) AS [FILTER1]
GROUP BY [K1], [K2]
) AS [GROUPBY1]
I want to extract the parenthesized values of the IN clause for [MY_ID]. I can use the following regex (?<=\.\[MY_ID\].*IN.*\().*
to truncate off the first portion of the string and return:
151,152,214,218,931,932,933,1067,1412,1414,13807,14347,14349,14446)) AND ([EXTENT1].[MY_OTHER_ID] IN (14264, 14335, 14385, 14398, 14603, 14650, 15164, 15374)) AND ([EXTENT2].[PERSON_ID] IN (28,933,14446,179,152,14349,14347,933,130,218,933,1067,931,151,214,152,933,145,931,145,5809,14347,14349,14349,1414,142,1412,179,152,14347,152,90,13807,932,931))
But I can't figure out how to make it stop at the first closing ) of the IN clause.
What I am after is: 151,152,214,218,931,932,933,1067,1412,1414,13807,14347,14349,14446
The regex will eventually be used with the .NET regex engine.
2 solutions
#1
1
Because the .NET regex engine keeps every capture of a repeated group (in a capture collection), you can use
\.\[MY_ID]\s*IN\s*\(((?:,?(\d+))+)
and grab either the Group 1 value (which will be 151,152,214,218,931,932,933,1067,1412,1414,13807,14347,14349,14446) or all the captures from the Group 2 capture collection as an array/list.
See the regex demo
Explanation:
- \.\[MY_ID] - a literal .[MY_ID]
- \s* - 0+ whitespace characters
- IN\s* - the word IN followed by 0+ whitespace characters
- \( - a literal opening (
- ((?:,?(\d+))+) - Group 1, capturing 1+ sequences of:
  - ,? - one or zero commas
  - (\d+) - Group 2, capturing 1+ digits
And here is a C# demo (it needs using directives for System, System.Linq and System.Text.RegularExpressions):
var s = "WHERE ([EXTENT1].[MY_ID] IN (151,152,214,218,931,932,933,1067,1412,1414,13807,14347,14349,14446)) AND ([EXTENT1].[MY_OTHER_ID] IN (14264, 14335, 14385, 14398, 14603, 14650, 15164, 15374)) AND ([EXTENT2].[PERSON_ID] IN (28,933,14446,179,152,14349,14347,933,130,218,933,1067,931,151,214,152,933,145,931,145,5809,14347,14349,14349,1414,142,1412,179,152,14347,152,90,13807,932,931))\n ) AS [FILTER1]\n GROUP BY [K1], [K2]\n) AS [GROUPBY1]";
var pattern = @"\.\[MY_ID]\s*IN\s*\(((?:,?(\d+))+)";
var matches = Regex.Matches(s, pattern);
var res1 = matches
    .Cast<Match>().Select(p => p.Groups[2].Captures) // Get the individual numbers (one capture collection per match)
    .ToList();
var res2 = matches
    .Cast<Match>().Select(p => p.Groups[1].Value) // Get the whole substring
    .ToList();
foreach (var coll in res1)
    foreach (var v in coll)
        Console.WriteLine(v);
Console.WriteLine("Ex. 2");
foreach (var v2 in res2)
    Console.WriteLine(v2);
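As a follow-up to the demo above, here is a minimal sketch of turning the Group 2 captures into integers. The names InClauseParser and ExtractIds are hypothetical and not part of the original answer. The sketch assumes the IN list holds only non-negative integers that fit in an int, and it looks only at the first [MY_ID] IN (...) occurrence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not from the original answer.
public static class InClauseParser
{
    // Same pattern as in the answer: Group 2 is captured once per number.
    private static readonly Regex MyIdIn =
        new Regex(@"\.\[MY_ID]\s*IN\s*\(((?:,?(\d+))+)");

    // Returns the numbers of the first [MY_ID] IN (...) list,
    // or an empty list when the pattern does not match.
    public static List<int> ExtractIds(string sql)
    {
        var m = MyIdIn.Match(sql);
        if (!m.Success) return new List<int>();

        return m.Groups[2].Captures
            .Cast<Capture>()                  // one Capture per repetition of (\d+)
            .Select(c => int.Parse(c.Value))  // assumes each value fits in an int
            .ToList();
    }
}
```

Calling InClauseParser.ExtractIds(s) with the string from the demo would return the fourteen MY_ID values as a List<int>, which avoids splitting the Group 1 substring on commas by hand.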
#2
1
Try the following regex.
Regex: (?<=\[MY_ID\] IN \()[^)]*
Explanation:
- (?<=\[MY_ID\] IN \() - a lookbehind that requires [MY_ID] IN ( immediately before the match, without making it part of the match
- [^)]* - matches everything up to (but not including) the first ), which marks the closing parenthesis of the IN list
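The lookbehind approach can be tried in C# with a sketch like the one below. The names LookbehindDemo and ExtractList are hypothetical, and returning an empty string when nothing matches is an arbitrary choice made for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical wrapper for illustration; not from the original answer.
public static class LookbehindDemo
{
    // Pattern from the answer: a lookbehind for "[MY_ID] IN (",
    // then every character up to (not including) the first ")".
    private const string Pattern = @"(?<=\[MY_ID\] IN \()[^)]*";

    // Returns the comma-separated list inside the first [MY_ID] IN (...),
    // or an empty string when the pattern does not match.
    public static string ExtractList(string sql)
    {
        var m = Regex.Match(sql, Pattern);
        return m.Success ? m.Value : string.Empty;
    }
}
```

Note that this pattern expects exactly one space on each side of IN. Since .NET allows variable-length lookbehinds, replacing those spaces with \s* should make it tolerant of other spacing, though that variant is not part of the original answer.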