How to split a String with a regular expression in Java without losing the digits

Date: 2021-12-17 21:39:10


I have the following string:
(R01)(R10)
and the output should look like this:
1 10

I was using \\)|\\(|[A-Z], but it doesn't work.
What should I do?
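
For reference, here is a minimal sketch of what that pattern does, assuming it was passed to String.split (the question does not say so explicitly; the class and method names are made up for illustration). Every parenthesis and letter counts as a separator, so adjacent separators leave empty strings behind and the leading zero is never touched.

```java
import java.util.Arrays;

class SplitAttempt {
    // Splits on ')', '(' or any single uppercase letter, as in the question.
    static String[] splitWithOriginalPattern(String input) {
        return input.split("\\)|\\(|[A-Z]");
    }

    public static void main(String[] args) {
        // prints [, , 01, , , 10]
        System.out.println(Arrays.toString(splitWithOriginalPattern("(R01)(R10)")));
    }
}
```

The digits are still there, but they are mixed with empty tokens and "01" keeps its zero, which is what the answers below deal with.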

5 solutions

#1


You can use the following regex:

"\\(R0*(\\d+)\\)"

It means that the expression:

  • should be in parentheses "\\( \\)"
  • starts with the character R
  • is then followed by zero or more 0 characters, 0*
  • is followed by one or more digits that you capture in a group, (\\d+)

0* will consume every 0 that appears before the first digit matched by \\d+. So in the case of '000', 0* ends up consuming the first two 0s, since at least one digit is needed afterwards (which will be the last 0). The engine gets there by backtracking: 0* first grabs all three zeros, then gives one back so that \\d+ can match.

For example:

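String s = "(R0)(R10)(R001)(R000)";
// 0* skips the leading zeros; group 1 captures the remaining digits
Pattern p = Pattern.compile("\\(R0*(\\d+)\\)");

Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group(1)); // the number without its leading zeros
}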

Output:

0
10
1
0

#2


For the regex part, you probably want to learn about lookahead/lookbehind:

(?<=R)\d+

and then use Integer.parseInt on the matches.
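
A minimal sketch of how those two steps might fit together, assuming a Matcher loop is used to collect the matches (the class and method names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindExtract {
    // (?<=R) asserts that an 'R' precedes the match without consuming it,
    // so each match is just the run of digits.
    private static final Pattern DIGITS_AFTER_R = Pattern.compile("(?<=R)\\d+");

    static List<Integer> extract(String input) {
        List<Integer> numbers = new ArrayList<>();
        Matcher m = DIGITS_AFTER_R.matcher(input);
        while (m.find()) {
            // parseInt reads "01" as 1, which drops the leading zero
            numbers.add(Integer.parseInt(m.group()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extract("(R01)(R10)")); // prints [1, 10]
    }
}
```

Because the zeros are removed by the integer parsing rather than by the pattern, a value such as (R000) still comes out as 0.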

For practising regexes: http://www.regexplanet.com/advanced/java/index.html, among many other sites.

#3


try {
    String resultString = subjectString.replaceAll("([^\\d][0]|\\D)", "");
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
} catch (IllegalArgumentException ex) {
    // Syntax error in the replacement text (unescaped $ signs?)
} catch (IndexOutOfBoundsException ex) {
    // Non-existent backreference used in the replacement text
}

Explanation:

([^\d][0]|\D)

Match the regex below and capture its match into backreference number 1 «([^\d][0]|\D)»
   Match this alternative «[^\d][0]»
      Match a single character that is NOT a “digit” «[^\d]»
      Match the character “0” literally «[0]»
   Or match this alternative «\D»
      Match a single character that is NOT a “digit” «\D»
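
As a quick check of the pattern explained above, assuming subjectString is the question's (R01)(R10): the replacement removes every separator, so the digits run together into one string. The second method is a hypothetical variation, not part of this answer, that keeps a space between the numbers before parsing them.

```java
import java.util.Arrays;

class StripNonDigits {
    // The answer's pattern: removes a non-digit followed by one 0, or any single non-digit.
    static String strip(String subject) {
        return subject.replaceAll("([^\\d][0]|\\D)", "");
    }

    // Hypothetical variation: turn each run of non-digits into one space,
    // then split and let Integer.parseInt discard the leading zeros.
    static int[] toNumbers(String subject) {
        String spaced = subject.replaceAll("\\D+", " ").trim();
        if (spaced.isEmpty()) {
            return new int[0]; // no digits at all
        }
        return Arrays.stream(spaced.split(" ")).mapToInt(Integer::parseInt).toArray();
    }

    public static void main(String[] args) {
        System.out.println(strip("(R01)(R10)"));                      // prints 110
        System.out.println(Arrays.toString(toNumbers("(R01)(R10)"))); // prints [1, 10]
    }
}
```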

#4


This code will work for you:

public static void main(String[] args) {
    String s = "(R01)(R10)";
    s = s.replaceAll(".*?(\\d+.*\\d+).*", "$1"); // strip leading/trailing non-numeric characters
    String[] arr = s.split("\\D+"); // split on runs of non-numeric characters
    for (int i = 0; i < arr.length; i++) {
        arr[i] = String.valueOf(Integer.parseInt(arr[i])); // parse as base 10, i.e. remove the leading "0"
    }
    for (String str : arr)
        System.out.println(str);
}

Output:

1
10

#5


You can do this:

String in = "(R01)(R10)";
System.out.println(Arrays.toString(
    Pattern.compile("(?:\\D+0*)").splitAsStream(in)
    .filter(x -> x.length() > 0).toArray()
));

Output: [1, 10]

The advantage of this construct is you can easily extend it, for example to get floats instead of strings:

String in = "(R01)(R10)";
System.out.println(Arrays.toString(
    Pattern.compile("(?:\\D+)").splitAsStream(in)
    .filter(x -> x.length() > 0).map(Float::parseFloat)
    .toArray()
));

Output: [1.0, 10.0]
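
Along the same lines, here is a sketch that yields int values instead. The zeros are left to Integer.parseInt rather than to a 0* in the split pattern, so an input such as (R0) still produces 0 (with 0* in the delimiter, that lone zero would be swallowed along with the separator). The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitToInts {
    private static final Pattern NON_DIGITS = Pattern.compile("\\D+");

    static int[] parse(String in) {
        return NON_DIGITS.splitAsStream(in)
                .filter(x -> !x.isEmpty())      // drop the empty leading token
                .mapToInt(Integer::parseInt)    // "01" -> 1, "000" -> 0
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("(R01)(R10)"))); // prints [1, 10]
    }
}
```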
