I have written some code, but it doesn't work correctly. Below you can find my regex, my input, and the output I expect. I am using a non-capturing group because I want to read the text until I reach the word "Bundle", but I don't want to include that word in the captured text. I can't tell what I have done wrong that keeps it from working.
Here is my code:
Pattern pattern = Pattern.compile(
"((Bundle\\s+Components)|(Included\\s+Components))\\s+(.*?)(?:Bundle)", Pattern.DOTALL);
Matcher matcher = pattern.matcher(tableInformation);
while (matcher.find()) {
String bundleComponents = matcher.group();
System.out.println(bundleComponents);
}
Here are the examples: Example 1:
Bundle Components bla blah\blabla?!()\\ANY CHARACTER IS POSSIBLE HERE, EVEN LINEBREAK,blah blah
Bundle Type
Example 2:
Included Components
blah blah, like above,
Bundle Type
Output I expect for Ex. 1:
Bundle Components bla blah\blabla?!()\\ANY CHARACTER IS POSSIBLE HERE, EVEN LINEBREAK,blah blah
Output I expect for Ex. 2:
Included Components
blah blah, like above,
What I get as the output for Ex. 1:
Bundle Components bla blah\blabla?!()\\ANY CHARACTER IS POSSIBLE HERE, EVEN LINEBREAK,blah blah
Bundle Type
What I get as the output for Ex. 2:
Included Components
blah blah, like above,
Bundle Type
2 solutions
#1
1
The full match contains everything the regex consumes, including the text matched by non-capturing groups. To leave that text out, read the appropriate capturing group instead of the full match. The other solution is to use a positive lookahead instead of the non-capturing group. Check the regex below. I also removed some unnecessary (IMO) groups.
(?:Bundle\s+Components|Included\s+Components)\s+.*?(?=Bundle)
It yields a single result, the full match, with no capturing groups.
PS: The line break just before "Bundle" will also be included in the match with this solution.
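To see the difference in Java, here is a minimal sketch that runs the question's original pattern and the lookahead pattern from this answer against a sample input modeled on Example 2. The class name, method names and the sample string are illustrative assumptions, not part of the original post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadDemo {
    // Hypothetical sample input, modeled on Example 2 from the question.
    static final String INPUT =
        "Included Components\nblah blah, like above,\nBundle Type";

    // The question's pattern: (?:Bundle) is not captured, but it is still
    // consumed, so group() (the full match) ends with "Bundle".
    static String withNonCapturingGroup(String text) {
        Matcher m = Pattern.compile(
            "((Bundle\\s+Components)|(Included\\s+Components))\\s+(.*?)(?:Bundle)",
            Pattern.DOTALL).matcher(text);
        return m.find() ? m.group() : null;
    }

    // The lookahead pattern: (?=Bundle) only checks that "Bundle" follows,
    // without consuming it, so group() stops right before it.
    static String withLookahead(String text) {
        Matcher m = Pattern.compile(
            "(?:Bundle\\s+Components|Included\\s+Components)\\s+.*?(?=Bundle)",
            Pattern.DOTALL).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Brackets make the trailing line break visible in the output.
        System.out.println("[" + withNonCapturingGroup(INPUT) + "]");
        System.out.println("[" + withLookahead(INPUT) + "]");
    }
}
```

The second result still ends with the line break mentioned in the PS; calling trim() on it is one way to drop that.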
#2
1
You can do this with a positive lookahead, since the pattern inside a lookahead group is not included in the match:
((?:Bundle\\s+Components)|(?:Included\\s+Components))\\s+(.*?)(?=Bundle)
(not tested)
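Since the pattern above is marked as not tested, here is a small sketch that plugs it into the question's find() loop. The class name, the extract helper and the sample text are assumptions made for illustration; trim() is added only to drop the line break left before "Bundle".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BundleSections {
    // The pattern from this answer, compiled with DOTALL so that "." also
    // matches line breaks, as in the question's code.
    private static final Pattern SECTION = Pattern.compile(
        "((?:Bundle\\s+Components)|(?:Included\\s+Components))\\s+(.*?)(?=Bundle)",
        Pattern.DOTALL);

    // Collects every section found in the text, without trailing whitespace.
    static List<String> extract(String tableInformation) {
        List<String> sections = new ArrayList<>();
        Matcher matcher = SECTION.matcher(tableInformation);
        while (matcher.find()) {
            // group() is the full match; the lookahead keeps "Bundle" out of it.
            sections.add(matcher.group().trim());
        }
        return sections;
    }

    public static void main(String[] args) {
        // Hypothetical input combining both examples from the question.
        String tableInformation =
            "Bundle Components bla blah\nmore text\nBundle Type\n"
            + "Included Components\nblah blah, like above,\nBundle Type";
        for (String section : extract(tableInformation)) {
            System.out.println(section);
            System.out.println("---");
        }
    }
}
```

One caveat with this sketch: because .*? is lazy, a section whose body itself contains the word "Bundle" would be cut off at that first occurrence.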