I need to split a string on any of the following sequences:
- 1 or more spaces
- 0 or more spaces, followed by a comma, followed by 0 or more spaces
- 0 or more spaces, followed by "=>", followed by 0 or more spaces
I haven't worked with Java regexes before, so I'm a little confused. Thanks!
Example:
add r10,r12 => r10
store r10 => r1
3 answers
#1
27
Just create a regex that matches any of your three cases and pass it to the split method:
string.split("\\s*(=>|,|\\s)\\s*");
The regex here literally means:
- Zero or more whitespace characters (\\s*)
- An arrow, a comma, or a whitespace character (=>|,|\\s)
- Zero or more whitespace characters (\\s*)
You can replace the whitespace class \\s (which matches spaces, tabs, line breaks, etc.) with a plain space character if necessary.
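As a quick check, here is a minimal sketch that applies the pattern above to the two example lines from the question. The class name SplitDemo and the helper tokens are made up for illustration; only String.split and Arrays.toString come from the standard library.

```java
import java.util.Arrays;

// Hypothetical demo class; the name and the helper below are illustrative only.
class SplitDemo {
    // Optional whitespace, then an arrow, a comma, or one whitespace
    // character, then optional whitespace.
    static final String DELIMITER = "\\s*(=>|,|\\s)\\s*";

    // Splits one line into its tokens using the delimiter pattern.
    static String[] tokens(String line) {
        return line.split(DELIMITER);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokens("add r10,r12 => r10")));
        // prints [add, r10, r12, r10]
        System.out.println(Arrays.toString(tokens("store r10 => r1")));
        // prints [store, r10, r1]
    }
}
```

Note that String.split drops trailing empty strings, so a line that ends with a delimiter does not produce an extra empty token at the end.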
#2
13
Strictly translated
For simplicity, I'm going to interpret your indication of "space" (" ") as "any whitespace" (\s).
Translating your spec more or less "word for word" means delimiting on any of:
- 1 or more spaces: \s+
- 0 or more spaces (\s*), followed by a comma (,), followed by 0 or more spaces (\s*): \s*,\s*
- 0 or more spaces (\s*), followed by a "=>" (=>), followed by 0 or more spaces (\s*): \s*=>\s*
To match any of the above: (\s+|\s*,\s*|\s*=>\s*)
Reduced form
However, your spec can be "reduced" to:
- 0 or more spaces: \s*
- followed by either a space, comma, or "=>": (\s|,|=>)
- followed by 0 or more spaces: \s*
Put it all together: \s*(\s|,|=>)\s*
The reduced form avoids some corner cases in which the strictly translated form produces unexpected empty "matches", for instance when a comma or "=>" is preceded by spaces.
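To make that corner case concrete, here is a small sketch that runs both forms on the first example line from the question. The class name StrictVsReduced is hypothetical. In the strict form the \s+ alternative is tried first, so it consumes the space before "=>" as a delimiter of its own; the "=>" then starts a second delimiter immediately after it, leaving an empty token in between.

```java
import java.util.Arrays;

// Hypothetical demo class comparing the two patterns discussed above.
class StrictVsReduced {
    static final String STRICT = "(\\s+|\\s*,\\s*|\\s*=>\\s*)";
    static final String REDUCED = "\\s*(\\s|,|=>)\\s*";

    public static void main(String[] args) {
        String input = "add r10,r12 => r10";

        // Strict form: the space before "=>" and the "=> " that follows
        // are matched as two separate delimiters.
        System.out.println(Arrays.toString(input.split(STRICT)));
        // prints [add, r10, r12, , r10]

        // Reduced form: the leading \s* absorbs the space, so " => "
        // is matched as a single delimiter.
        System.out.println(Arrays.toString(input.split(REDUCED)));
        // prints [add, r10, r12, r10]
    }
}
```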
Code
Here's some code:
import java.util.regex.Pattern;

public class Temp {
    // Strictly translated form:
    //private static final String REGEX = "(\\s+|\\s*,\\s*|\\s*=>\\s*)";

    // "Reduced" form:
    private static final String REGEX = "\\s*(\\s|=>|,)\\s*";

    private static final String INPUT =
        "one two,three=>four , five six => seven,=>";

    public static void main(final String[] args) {
        final Pattern p = Pattern.compile(REGEX);
        final String[] items = p.split(INPUT);
        // Shorthand for above:
        // final String[] items = INPUT.split(REGEX);
        for (final String s : items) {
            System.out.println("Match: '" + s + "'");
        }
    }
}
Output:
Match: 'one'
Match: 'two'
Match: 'three'
Match: 'four'
Match: 'five'
Match: 'six'
Match: 'seven'
#3
3
String[] splitArray = subjectString.split(" *(,|=>| ) *");
should do it.
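One difference from the \s-based patterns is worth spelling out: this version treats only the plain space character as whitespace. The sketch below, with a hypothetical class name and a made-up tab-separated input, shows how the two variants behave when a tab appears between tokens.

```java
// Hypothetical demo class; the input string is invented for illustration.
class SpaceVsWhitespace {
    static final String SPACES_ONLY = " *(,|=>| ) *";
    static final String ANY_WHITESPACE = "\\s*(,|=>|\\s)\\s*";

    public static void main(String[] args) {
        // A tab, rather than a space, separates "add" from "r10".
        String input = "add\tr10,r12 => r10";

        // The tab is not a delimiter here, so "add\tr10" stays one token
        // and the result has 3 tokens.
        System.out.println(input.split(SPACES_ONLY).length);

        // \s matches the tab as well, so the result has 4 tokens.
        System.out.println(input.split(ANY_WHITESPACE).length);
    }
}
```

Which variant fits depends on whether tabs or line breaks can appear in the input at all; for space-only input the two behave the same.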