Difference between Kotlin and Java String split with Regex

If we have a val txt: kotlin.String = "1;2;3;" and want to split it into an array of numbers, we can try the following:

val numbers = txt.split(";".toRegex())
// gives: [1, 2, 3, ]

The trailing empty String is included in the result of CharSequence.split.

On the other hand, if we look at Java Strings, the result is different:

val numbers2 = (txt as java.lang.String).split(";")
// gives: [1, 2, 3]

This time, using java.lang.String.split, the result does not include the trailing empty String. This behaviour is actually intended, according to the corresponding Javadoc:

This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
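
The equivalence described in that Javadoc passage can be checked with a small Java sketch. The class and method names (SplitDefaultLimit, oneArg, withLimit) are made up for illustration; only String.split itself comes from the JDK.

```java
import java.util.Arrays;

class SplitDefaultLimit {
    // One-argument split: documented to behave like split(regex, 0).
    static String[] oneArg(String input, String regex) {
        return input.split(regex);
    }

    // Two-argument split with an explicit limit.
    static String[] withLimit(String input, String regex, int limit) {
        return input.split(regex, limit);
    }

    public static void main(String[] args) {
        String txt = "1;2;3;";
        System.out.println(Arrays.toString(oneArg(txt, ";")));        // [1, 2, 3]
        System.out.println(Arrays.toString(withLimit(txt, ";", 0)));  // [1, 2, 3]
        System.out.println(Arrays.toString(withLimit(txt, ";", -1))); // [1, 2, 3, ]
    }
}
```

With a limit of 0 the trailing empty string is dropped; a negative limit keeps it.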

In Kotlin's version, 0 is also the default limit argument, as documented, yet internally Kotlin maps that 0 to the negative value -1 when java.util.regex.Pattern::split is called:

nativePattern.split(input, if (limit == 0) -1 else limit).asList()
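
To see what that mapping does in isolation, here is a rough Java sketch that imitates only the quoted line. KotlinStyleSplit and its split method are hypothetical names, not part of any library, and the sketch makes no claim about the rest of Kotlin's implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class KotlinStyleSplit {
    // Hypothetical helper mirroring the quoted mapping: a limit of 0 becomes -1,
    // so Pattern.split never takes its "drop trailing empty strings" path.
    static List<String> split(String input, String regex, int limit) {
        Pattern nativePattern = Pattern.compile(regex);
        return Arrays.asList(nativePattern.split(input, limit == 0 ? -1 : limit));
    }

    public static void main(String[] args) {
        // Plain Pattern.split with limit 0 drops the trailing empty string.
        System.out.println(Arrays.toString(Pattern.compile(";").split("1;2;3;", 0))); // [1, 2, 3]
        // With the 0 -> -1 mapping the trailing empty string is kept.
        System.out.println(split("1;2;3;", ";", 0)); // [1, 2, 3, ]
    }
}
```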

It seems to be working as intended, but I'm wondering why the language restricts the Java API here, since the behaviour of a limit of 0 is no longer available.

1 Answer

#1

The implementation implies that what is lost in Kotlin is the behavior java.lang.String.split gives when passed limit = 0. From my point of view, it was removed to achieve consistency between the possible options in Kotlin.

Consider a string a:b:c:d: and a pattern :.

Take a look at what we can have in Java:

limit < 0 → [a, b, c, d, ]
limit = 0 → [a, b, c, d]
limit = 1 → [a:b:c:d:]
limit = 2 → [a, b:c:d:]
limit = 3 → [a, b, c:d:]
limit = 4 → [a, b, c, d:]
limit = 5 → [a, b, c, d, ] (goes on the same as with limit < 0)
limit = 6 → [a, b, c, d, ]
...
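
The listing above can be reproduced with a short Java loop. This is only a sketch; SplitLimitTable and row are illustrative names, and the arrow is printed as "->" to keep the source ASCII-only.

```java
import java.util.Arrays;

class SplitLimitTable {
    // Formats one row of the table for the given limit.
    static String row(String input, String regex, int limit) {
        return "limit = " + limit + " -> " + Arrays.toString(input.split(regex, limit));
    }

    public static void main(String[] args) {
        // -1 stands in for any negative limit.
        for (int limit = -1; limit <= 6; limit++) {
            System.out.println(row("a:b:c:d:", ":", limit));
        }
    }
}
```

Only the limit = 0 row ends without either a trailing empty entry or a retained delimiter.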

It appears that the limit = 0 option is somewhat unique: it has the trailing : neither replaced by an additional entry, as with limit < 0 or limit >= 5, nor retained in the last resulting item (as with limit in 1..4).

It seems to me that the Kotlin API improves the consistency here: there's no special case that, in some sense, loses the information about the last delimiter followed by an empty string – it's left in place either as the delimiter in the last resulting item or as a trailing empty entry.

IMO, the Kotlin function seems to better fit the principle of least astonishment. The zero limit in java.lang.String.split, on the contrary, looks more like a special value that modifies the method's semantics. So do the negative values, which make no intuitive sense as a limit and are not clear without digging through the Javadoc.
