How can a Java regex find the parent match?

Time: 2020-12-30 19:23:52

Any page from Wikipedia:

...
abas asdn asf asfs af
{{Template1
|a = Name surname
|b = jhsdf sdf
|c = {{Template2}}
|d = 
|e = [[f]] and [[g]]
|h = asd asdasfgasgasg asgas jygh trdx dftf xcth
|i = 73
|j = {{Template2|abc|123}}
|j = {{Template3|aa=kkk|bb={{Template4|cc=uu}}}}
}}

asd wetd gdsgwew g

{{OtherTemplate
|sdf = 213
}}
...

How can I find Template1's content (starting at |a and ending at the matching }}) with a Java regex?

I tried:

String pattern = "\\{\\{\\s*Template1\\s*(.*?)\\}\\}";

Pattern p = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
Matcher m = p.matcher(content);

while (m.find()) {
    if (!m.group().equals("")) {
        System.out.println(m.group());
        System.out.println("-----------------------");
    }
}

But here the regex finds the first }} (the one that closes Template2) and then stops.
I want it to skip a }} while any {{ is still open, so that I get the top-level (parent) match.

I want to get the content of the top-level Template1, between its outermost {{ and }}.

EDIT:

Please keep in mind that I am parsing the content after removing all whitespace:

content = content.replaceAll("\\s+", "");

Think of the content as being written on a single line.

3 answers

#1


1  

/^{{Template1(.*?)^}}/sm

returns:

|a = Name surname
|b = jhsdf sdf
|c = {{Template2}}
|d = 
|e = [[f]] and [[g]]
|h = asd asdasfgasgasg asgas jygh trdx dftf xcth
|i = 73
|j = {{Template2|abc|123}}
|j = {{Template3|aa=kkk|bb={{Template4|cc=uu}}}}

https://regex101.com/r/qC6cM1/1 (DEMO)
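
For Java specifically, the pattern above needs two adjustments: the braces have to be escaped (an unescaped { is usually rejected by java.util.regex as an illegal repetition), and the /sm flags become Pattern.DOTALL | Pattern.MULTILINE. A minimal sketch, where the class name LineAnchoredTemplateMatch and the method extractBody are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LineAnchoredTemplateMatch {
    // Java port of /^{{Template1(.*?)^}}/sm.
    // MULTILINE lets ^ match at the start of every line,
    // DOTALL lets . match line terminators as well.
    private static final Pattern TEMPLATE1 = Pattern.compile(
            "^\\{\\{Template1(.*?)^\\}\\}",
            Pattern.MULTILINE | Pattern.DOTALL);

    /** Returns the body of Template1, or null if the pattern does not match. */
    public static String extractBody(String content) {
        Matcher m = TEMPLATE1.matcher(content);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened version of the sample page from the question.
        String content = String.join("\n",
                "abas asdn asf asfs af",
                "{{Template1",
                "|a = Name surname",
                "|c = {{Template2}}",
                "|j = {{Template3|aa=kkk|bb={{Template4|cc=uu}}}}",
                "}}",
                "",
                "{{OtherTemplate",
                "|sdf = 213",
                "}}");
        System.out.println(extractBody(content));
    }
}
```

Note that this relies on the closing }} sitting at the start of its own line, so it finds nothing once all whitespace (including newlines) has been stripped from the content.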

#2


0  

\\{\\{\\s*Template1\\s*(.*?)\\n\\}\\}

                        ^^

Just include \n. See demo.

https://regex101.com/r/uF4oY4/72

#3


0  

I think a parser would do a better job in this case, but if you want a regex, how about this one:

{{Template1(?:[^{}]*?(?:{{[^}]+?}}))+(?:[}\n\s]+})*

DEMO

I assumed that your input is a single line.
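
To make the parser suggestion concrete: java.util.regex has no recursion or balancing construct, so nested {{ ... }} pairs are easier to handle by counting depth. This is a rough sketch, not a full wikitext parser. The class TemplateScanner and the method extractBody are hypothetical names, and it assumes that the template name directly follows {{ (true once whitespace has been removed) and that braces only ever appear as balanced {{ / }} pairs:

```java
public class TemplateScanner {
    /**
     * Returns the text between "{{" + name and its matching "}}",
     * or null if the template is missing or its braces are unbalanced.
     * Assumption: braces only occur as "{{" / "}}" pairs.
     */
    public static String extractBody(String content, String name) {
        String opener = "{{" + name;
        int start = content.indexOf(opener);
        if (start < 0) {
            return null; // template not present
        }
        int bodyStart = start + opener.length();
        int depth = 1;   // we are inside the outer template
        int i = bodyStart;
        while (i < content.length()) {
            if (content.startsWith("{{", i)) {
                depth++;          // a nested template opens
                i += 2;
            } else if (content.startsWith("}}", i)) {
                depth--;          // a template closes
                if (depth == 0) {
                    return content.substring(bodyStart, i);
                }
                i += 2;
            } else {
                i++;
            }
        }
        return null; // ran out of input before the outer template closed
    }

    public static void main(String[] args) {
        // Whitespace-stripped input, as described in the question.
        String content = "abas{{Template1|c={{Template2}}"
                + "|j={{Template3|aa=kkk|bb={{Template4|cc=uu}}}}}}"
                + "{{OtherTemplate|sdf=213}}";
        System.out.println(extractBody(content, "Template1"));
    }
}
```

Because it matches on the prefix {{Template1, it would also pick up a template named Template10; add a check on the character that follows the name (| or }) if that matters.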
