Splitting a string in Java using a regular expression

Time: 2022-12-25 21:43:28

I am trying to split a string using the regex "A.*B", which works just fine for retrieving the strings between 'A' and 'B'. But the dot '.' does not match the newline characters \n and \r. Can you please guide me on how to achieve this?

Thanks


Thanks all. Pattern.DOTALL worked like a charm.

I have another question related to this. What should be done if I need to extract all the strings between 'A' and 'B' (that is, every substring that matches the regex above)?

I tried using find() and group() of the Matcher class, but with the pattern below it seems to return the whole string.

Pattern p = Pattern.compile("A.*B", Pattern.DOTALL);
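A likely reason for the "whole string" result is that ".*" is greedy: with DOTALL it runs from the first 'A' to the last 'B' in the input. A minimal sketch of one way around this, using the reluctant quantifier ".*?" in a find() loop, is below. The class name ExtractBetween, the helper extractAll, and the sample input are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExtractBetween {
    // ".*?" is reluctant: each match stops at the first 'B' after an 'A'.
    // DOTALL lets '.' match line terminators as well.
    private static final Pattern A_TO_B = Pattern.compile("A.*?B", Pattern.DOTALL);

    // Hypothetical helper: collects every A...B match found in the input.
    static List<String> extractAll(String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = A_TO_B.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Sample input (an assumption): two A...B runs, the second spanning a newline.
        for (String s : extractAll("xA1B yA2\n3B z")) {
            System.out.println("[" + s + "]");
        }
    }
}
```

If only the text between the delimiters is wanted, a capturing group such as "A(.*?)B" with m.group(1) should work the same way.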

6 answers

#1


0  

Have a look at java.util.regex.Pattern.compile(String regex, int flags), especially the DOTALL flag.

#2


1  

Use a java.util.regex.Pattern with the DOTALL flag (MULTILINE only changes where ^ and $ match; it does not affect '.'):

import java.util.regex.Pattern;

Pattern pattern = Pattern.compile("A.*B", Pattern.DOTALL);
String[] parts = pattern.split(string);
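For comparison, here is a small sketch of how the two flags behave on input where 'A' and 'B' sit on different lines. The class name FlagComparison, the helper findsMatch, and the sample text are assumptions made for the example.

```java
import java.util.regex.Pattern;

public class FlagComparison {
    // Hypothetical helper: true if "A.*B" matches somewhere in the input
    // when compiled with the given flags (0 means no flags).
    static boolean findsMatch(String input, int flags) {
        return Pattern.compile("A.*B", flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // 'A' and 'B' are separated by a line terminator.
        String text = "A first line\nsecond line B";
        System.out.println("no flags:  " + findsMatch(text, 0));
        System.out.println("MULTILINE: " + findsMatch(text, Pattern.MULTILINE));
        System.out.println("DOTALL:    " + findsMatch(text, Pattern.DOTALL));
    }
}
```

Only the DOTALL variant finds a match here, because MULTILINE affects the anchors ^ and $ rather than the dot.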

#3


1  

Compile the regex with this option: Pattern regex = Pattern.compile("A.*B", Pattern.DOTALL);
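As a rough illustration of what split() does with a DOTALL pattern: it returns the pieces of the input that lie outside the matches, not the text between 'A' and 'B'. The class name SplitDemo, the helper splitOnAToB, and the sample input are invented for this sketch.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitDemo {
    // Hypothetical helper: splits the input around every "A...B" run,
    // letting '.' cross line terminators via DOTALL.
    static String[] splitOnAToB(String input) {
        return Pattern.compile("A.*B", Pattern.DOTALL).split(input);
    }

    public static void main(String[] args) {
        // The A...B run spans two lines; split() keeps what surrounds it.
        String[] parts = splitOnAToB("head A one\ntwo B tail");
        System.out.println(Arrays.toString(parts));
    }
}
```

To retrieve the matched text itself, a Matcher with find() and group() is the more direct tool.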

#4


1  

Try "A[\\s\\S]*B" (inside a character class a dot is a literal '.', so "[.\\s]" would only match dots and whitespace).

Or you may specify the DOTALL switch so that "." will match even line terminators. Take a look at the documentation of the Pattern class.
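A quick way to check how a dot behaves inside a character class is to compare a few patterns against the same multi-line input. This is only a sketch; the class name CharClassDemo, the helper matchesWhole, and the sample text are assumptions. The embedded flag "(?s)" is the inline form of DOTALL.

```java
import java.util.regex.Pattern;

public class CharClassDemo {
    // Hypothetical helper: true if the regex matches the entire input.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String text = "A line one\nline two B";
        // '.' inside [...] is a literal dot, so letters are not covered.
        System.out.println("A[.\\s]*B    -> " + matchesWhole("A[.\\s]*B", text));
        // [\s\S] covers every character, including line terminators.
        System.out.println("A[\\s\\S]*B  -> " + matchesWhole("A[\\s\\S]*B", text));
        // (?s) turns on DOTALL from inside the pattern.
        System.out.println("(?s)A.*B   -> " + matchesWhole("(?s)A.*B", text));
    }
}
```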

#5


0  

I assume you use the Pattern and Matcher classes for this.

Have you tried providing DOTALL to your Pattern.compile() method? MULTILINE only changes where ^ and $ match; DOTALL is the flag that lets '.' match line terminators.

Pattern.compile(regex, Pattern.DOTALL)

'.' = Any character (may or may not match line terminators)

http://docs.oracle.com/javase/1.4.2/docs/api/java/util/regex/Pattern.html#lt

#6


0  

Try changing your regex to "A(.|\\s)*B". This means A, followed by any character (.) or any whitespace character (\s) any number of times, followed by B (the \s must be double-escaped in Java code).

Reference for regular expressions (constructs, special characters, etc.) in Java: http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
