Splitting a String in Java with the [a-z] regular expression

I have two regular expressions:

[a-c] : any character from a-c

[a-z] : any character from a-z

And a test:

public static void main(String[] args) {
    String s = "abcde";
    String[] arr1 = s.split("[a-c]");
    String[] arr2 = s.split("[a-z]");

    System.out.println(arr1.length); //prints 4 : "", "", "", "de"
    System.out.println(arr2.length); //prints 0 
}

Why does the second split behave like this? I would expect a result containing 6 empty strings ("").

3 solutions

#1

According to the documentation of the single-argument String.split:

This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.

To keep the trailing empty strings, use the two-argument version and specify a negative limit:

    String s = "abcde";
    String[] arr1 = s.split("[a-c]", -1); // ["", "", "", "de"]
    String[] arr2 = s.split("[a-z]", -1); // ["", "", "", "", "", ""]

#2

By default, split discards trailing empty strings. In the arr2 case, they were all trailing empty strings, so they were all discarded.
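
A minimal sketch of that behavior, assuming a hypothetical helper `splitToString` that is not part of the original question: leading empty strings survive the single-argument split, while trailing ones are dropped.

```java
import java.util.Arrays;

class TrailingEmptyDemo {
    // Hypothetical helper: runs the single-argument split and formats the result.
    static String splitToString(String input, String regex) {
        return Arrays.toString(input.split(regex));
    }

    public static void main(String[] args) {
        // Matches at the start leave leading empty strings, which are kept.
        System.out.println(splitToString("abcde", "[a-c]")); // [, , , de]
        // Matches at the end leave trailing empty strings, which are discarded.
        System.out.println(splitToString("deabc", "[a-c]")); // [de]
        // Every piece is empty, so every piece counts as trailing and is discarded.
        System.out.println(splitToString("abcde", "[a-z]")); // []
    }
}
```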

To get 6 empty strings, pass a negative limit as the second parameter to the split method, which will keep all trailing empty strings.

String[] arr2 = s.split("[a-z]", -1);

From the documentation of the limit parameter n: "If n is non-positive then the pattern will be applied as many times as possible and the array can have any length."
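
To make the role of the limit concrete, here is a small sketch comparing a positive, a zero, and a negative limit on the string from the question. The helper name `countPieces` is made up for illustration.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Hypothetical helper: number of elements returned for a given limit.
    static int countPieces(String input, String regex, int limit) {
        return input.split(regex, limit).length;
    }

    public static void main(String[] args) {
        String s = "abcde";
        // Positive limit: the pattern is applied at most limit - 1 times,
        // and the rest of the input ends up in the last element.
        System.out.println(Arrays.toString(s.split("[a-z]", 3))); // [, , cde]
        // Zero: same as the single-argument split, trailing empty strings are dropped.
        System.out.println(countPieces(s, "[a-z]", 0));  // 0
        // Negative: the pattern is applied as often as possible and nothing is dropped.
        System.out.println(countPieces(s, "[a-z]", -1)); // 6
    }
}
```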

#3

String.split():

Splits this string around matches of the given regular expression.

"Around" means that the matches themselves are removed. For example, splitting "a,b,c" on commas yields just "a", "b", and "c".

The first split removes the a, b, and c.

The second removes all letters, and therefore every character in the string, so only empty strings remain; these are all trailing and get discarded.
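
The idea of splitting "around" matches can be sketched as follows. The helper `countMatches` is a hypothetical name introduced here; it counts the matches with `java.util.regex.Matcher` so they can be compared with the number of pieces left when nothing is discarded.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AroundMatchesDemo {
    // Hypothetical helper: counts how many times the regex matches in the input.
    static int countMatches(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The commas are removed; the text between them is kept.
        System.out.println(Arrays.toString("a,b,c".split(","))); // [a, b, c]
        // Five letters match, so there are five cuts in "abcde"...
        System.out.println(countMatches("abcde", "[a-z]")); // 5
        // ...which leaves six (empty) pieces when trailing empties are kept.
        System.out.println("abcde".split("[a-z]", -1).length); // 6
    }
}
```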
