What is the time complexity of the String#substring()
method in Java?
Java中String#substring()方法的时间复杂度是多少?
5 个解决方案
#1
93
New answer
新回答
As of update 6 within Java 7's lifetime, the behaviour of substring
changed to create a copy - so every String
refers to a char[]
which is not shared with any other object, as far as I'm aware. So at that point, substring()
became an O(n) operation where n is the numbers in the substring.
在Java 7的生命周期内更新6时,子字符串的行为发生了变化,从而创建了一个副本——因此,每一个字符串都引用一个char[],这是我所知道的,而不是与任何其他对象共享的。在那个时候,substring()变成了O(n)操作其中n是子串中的数字。
Old answer: pre-Java 7
老答:pre-Java 7
Undocumented - but in practice O(1) if you assume no garbage collection is required, etc.
无文件-但在实践中O(1)如果你认为不需要垃圾收集,等等。
It simply builds a new String
object referring to the same underlying char[]
but with different offset and count values. So the cost is the time taken to perform validation and construct a single new (reasonably small) object. That's O(1) as far as it's sensible to talk about the complexity of operations which can vary in time based on garbage collection, CPU caches etc. In particular, it doesn't directly depend on the length of the original string or the substring.
它只是构建一个新的字符串对象,引用相同的底层char[],但是使用不同的偏移量和计数值。因此,花费时间来执行验证和构造一个新的(合理的小的)对象。这是O(1),对于基于垃圾收集、CPU缓存等不同时间的操作的复杂性来说,这是非常明智的。特别是,它不直接依赖于原始字符串的长度或子字符串。
#2
27
It was O(1) in older versions of Java - as Jon stated, it just created a new String with the same underlying char[], and a different offset and length.
在Java的旧版本中,它是O(1), Jon说,它只是创建了一个新的字符串,具有相同的底层char[],以及一个不同的偏移量和长度。
However, this has actually changed started with Java 7 update 6.
然而,这实际上是由Java 7更新开始的。
The char[] sharing was eliminated, and the offset and length fields were removed. substring() now just copies all the characters into a new String.
消除了char[]共享,删除了偏移量和长度字段。substring()现在将所有字符复制到一个新字符串中。
Ergo, substring is O(n) in Java 7 update 6
Ergo, substring是O(n)在Java 7更新6中。
#3
5
It's now linear complexity. This is after fixing a memory leak issue for substring.
现在是线性复杂度。这是在修复子字符串的内存泄漏问题之后。
So from Java 1.7.0_06 remember that String.substring now has a linear complexity instead of a constant one.
所以从Java 1.7.0_06,记住这个字符串。子字符串现在有一个线性复杂度,而不是常数。
#4
2
O(1) because no copying of the original string is done, it just creates a new wrapper object with different offset information.
O(1)因为没有复制原来的字符串,它只是创建了一个新的包装对象,其中包含了不同的偏移量信息。
#5
1
Judge for yourself from following, but Java's performance drawbacks lie somewhere else, not here in substring of a string. Code:
从以下方面判断您自己,但是Java的性能缺陷在其他地方,而不是在字符串的子字符串中。代码:
public static void main(String[] args) throws IOException {
String longStr = "asjf97zcv.1jm2497z20`1829182oqiwure92874nvcxz,nvz.,xo" +
"aihf[oiefjkas';./.,z][p\\°°°°°°°°?!(*#&(@*&#!)^(*&(*&)(*&" +
"fasdznmcxzvvcxz,vc,mvczvcz,mvcz,mcvcxvc,mvcxcvcxvcxvcxvcx";
int[] indices = new int[32 * 1024];
int[] lengths = new int[indices.length];
Random r = new Random();
final int minLength = 6;
for (int i = 0; i < indices.length; ++i)
{
indices[i] = r.nextInt(longStr.length() - minLength);
lengths[i] = minLength + r.nextInt(longStr.length() - indices[i] - minLength);
}
long start = System.nanoTime();
int avoidOptimization = 0;
for (int i = 0; i < indices.length; ++i)
//avoidOptimization += lengths[i]; //tested - this was cheap
avoidOptimization += longStr.substring(indices[i],
indices[i] + lengths[i]).length();
long end = System.nanoTime();
System.out.println("substring " + indices.length + " times");
System.out.println("Sum of lengths of splits = " + avoidOptimization);
System.out.println("Elapsed " + (end - start) / 1.0e6 + " ms");
}
Output:
输出:
substring 32768 times Sum of lengths of splits = 1494414 Elapsed 2.446679 ms
If it is O(1) or not, depends. If you just reference same String in memory, then imagine very long String, you make substring and stop referencing long one. Wouldn't be nice to release memory for long one?
如果是O(1),则取决于。如果你只是在内存中引用相同的字符串,那么想象一下很长的字符串,你会生成子字符串并停止引用长字符串。长时间释放内存不是很好吗?
#1
93
New answer
新回答
As of update 6 within Java 7's lifetime, the behaviour of substring
changed to create a copy - so every String
refers to a char[]
which is not shared with any other object, as far as I'm aware. So at that point, substring()
became an O(n) operation where n is the numbers in the substring.
在Java 7的生命周期内更新6时,子字符串的行为发生了变化,从而创建了一个副本——因此,每一个字符串都引用一个char[],这是我所知道的,而不是与任何其他对象共享的。在那个时候,substring()变成了O(n)操作其中n是子串中的数字。
Old answer: pre-Java 7
老答:pre-Java 7
Undocumented - but in practice O(1) if you assume no garbage collection is required, etc.
无文件-但在实践中O(1)如果你认为不需要垃圾收集,等等。
It simply builds a new String
object referring to the same underlying char[]
but with different offset and count values. So the cost is the time taken to perform validation and construct a single new (reasonably small) object. That's O(1) as far as it's sensible to talk about the complexity of operations which can vary in time based on garbage collection, CPU caches etc. In particular, it doesn't directly depend on the length of the original string or the substring.
它只是构建一个新的字符串对象,引用相同的底层char[],但是使用不同的偏移量和计数值。因此,花费时间来执行验证和构造一个新的(合理的小的)对象。这是O(1),对于基于垃圾收集、CPU缓存等不同时间的操作的复杂性来说,这是非常明智的。特别是,它不直接依赖于原始字符串的长度或子字符串。
#2
27
It was O(1) in older versions of Java - as Jon stated, it just created a new String with the same underlying char[], and a different offset and length.
在Java的旧版本中,它是O(1), Jon说,它只是创建了一个新的字符串,具有相同的底层char[],以及一个不同的偏移量和长度。
However, this has actually changed started with Java 7 update 6.
然而,这实际上是由Java 7更新开始的。
The char[] sharing was eliminated, and the offset and length fields were removed. substring() now just copies all the characters into a new String.
消除了char[]共享,删除了偏移量和长度字段。substring()现在将所有字符复制到一个新字符串中。
Ergo, substring is O(n) in Java 7 update 6
Ergo, substring是O(n)在Java 7更新6中。
#3
5
It's now linear complexity. This is after fixing a memory leak issue for substring.
现在是线性复杂度。这是在修复子字符串的内存泄漏问题之后。
So from Java 1.7.0_06 remember that String.substring now has a linear complexity instead of a constant one.
所以从Java 1.7.0_06,记住这个字符串。子字符串现在有一个线性复杂度,而不是常数。
#4
2
O(1) because no copying of the original string is done, it just creates a new wrapper object with different offset information.
O(1)因为没有复制原来的字符串,它只是创建了一个新的包装对象,其中包含了不同的偏移量信息。
#5
1
Judge for yourself from following, but Java's performance drawbacks lie somewhere else, not here in substring of a string. Code:
从以下方面判断您自己,但是Java的性能缺陷在其他地方,而不是在字符串的子字符串中。代码:
public static void main(String[] args) throws IOException {
String longStr = "asjf97zcv.1jm2497z20`1829182oqiwure92874nvcxz,nvz.,xo" +
"aihf[oiefjkas';./.,z][p\\°°°°°°°°?!(*#&(@*&#!)^(*&(*&)(*&" +
"fasdznmcxzvvcxz,vc,mvczvcz,mvcz,mcvcxvc,mvcxcvcxvcxvcxvcx";
int[] indices = new int[32 * 1024];
int[] lengths = new int[indices.length];
Random r = new Random();
final int minLength = 6;
for (int i = 0; i < indices.length; ++i)
{
indices[i] = r.nextInt(longStr.length() - minLength);
lengths[i] = minLength + r.nextInt(longStr.length() - indices[i] - minLength);
}
long start = System.nanoTime();
int avoidOptimization = 0;
for (int i = 0; i < indices.length; ++i)
//avoidOptimization += lengths[i]; //tested - this was cheap
avoidOptimization += longStr.substring(indices[i],
indices[i] + lengths[i]).length();
long end = System.nanoTime();
System.out.println("substring " + indices.length + " times");
System.out.println("Sum of lengths of splits = " + avoidOptimization);
System.out.println("Elapsed " + (end - start) / 1.0e6 + " ms");
}
Output:
输出:
substring 32768 times Sum of lengths of splits = 1494414 Elapsed 2.446679 ms
If it is O(1) or not, depends. If you just reference same String in memory, then imagine very long String, you make substring and stop referencing long one. Wouldn't be nice to release memory for long one?
如果是O(1),则取决于。如果你只是在内存中引用相同的字符串,那么想象一下很长的字符串,你会生成子字符串并停止引用长字符串。长时间释放内存不是很好吗?