A recent question about string literals in .NET caught my eye. I know that string literals are interned so that different strings with the same value refer to the same object. I also know that a string can be interned at runtime:
string now = string.Intern(DateTime.Now.ToString());
Obviously a string that is interned at runtime resides on the heap, but I had assumed that a literal is placed in the program's data segment (and said so in my answer to said question). However, I don't remember seeing this anywhere. I assumed it because that's how I would do it, and because the ldstr IL instruction is used to get literals and no allocation seems to take place, which seems to back me up.
To cut a long story short, where do string literals reside? On the heap, in the data segment, or someplace I haven't thought of?
Edit: If string literals do reside on the heap, when are they allocated?
7 Answers
#1
102
Strings in .NET are reference types, so they are always on the heap (even when they are interned). You can verify this using a debugger such as WinDbg.
If you have the class below:
using System;

class SomeType {
    public void Foo() {
        string s = "hello world";
        Console.WriteLine(s);
        Console.WriteLine("press enter");
        Console.ReadLine();
    }
}
And you call Foo() on an instance, you can use WinDbg to inspect the heap.
In a small program the reference will most likely be stored in a register, so the easiest way to find the reference to the specific string is to run !dso. This gives us the address of the string in question:
0:000> !dso
OS Thread Id: 0x1660 (0)
ESP/REG Object Name
002bf0a4 025d4bf8 Microsoft.Win32.SafeHandles.SafeFileHandle
002bf0b4 025d4bf8 Microsoft.Win32.SafeHandles.SafeFileHandle
002bf0e8 025d4e5c System.Byte[]
002bf0ec 025d4c0c System.IO.__ConsoleStream
002bf110 025d4c3c System.IO.StreamReader
002bf114 025d4c3c System.IO.StreamReader
002bf12c 025d5180 System.IO.TextReader+SyncTextReader
002bf130 025d4c3c System.IO.StreamReader
002bf140 025d5180 System.IO.TextReader+SyncTextReader
002bf14c 025d5180 System.IO.TextReader+SyncTextReader
002bf15c 025d2d04 System.String hello world // THIS IS THE ONE
002bf224 025d2ccc System.Object[] (System.String[])
002bf3d0 025d2ccc System.Object[] (System.String[])
002bf3f8 025d2ccc System.Object[] (System.String[])
Now use !gcgen to find out which generation the instance is in:
0:000> !gcgen 025d2d04
Gen 0
It's in generation zero, i.e. it has just been allocated. Who's rooting it?
0:000> !gcroot 025d2d04
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 0 OSTHread 1660
ESP:2bf15c:Root:025d2d04(System.String)
Scan Thread 2 OSTHread 16b4
DOMAIN(000E4840):HANDLE(Pinned):6513f4:Root:035d2020(System.Object[])->
025d2d04(System.String)
The ESP entry is the stack of our Foo() method, but notice that we have an object[] as well. That's the intern table. Let's take a look.
0:000> !dumparray 035d2020
Name: System.Object[]
MethodTable: 006984c4
EEClass: 00698444
Size: 528(0x210) bytes
Array: Rank 1, Number of elements 128, Type CLASS
Element Methodtable: 00696d3c
[0] 025d1360
[1] 025d137c
[2] 025d139c
[3] 025d13b0
[4] 025d13d0
[5] 025d1400
[6] 025d1424
...
[36] 025d2d04 // THIS IS OUR STRING
...
[126] null
[127] null
I reduced the output somewhat, but you get the idea.
In conclusion: strings are on the heap, even when they are interned. The intern table holds a reference to the instance on the heap. In other words, interned strings are not collected during GC because the intern table roots them.
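The same conclusion can be checked from managed code without a debugger. The sketch below is not part of the original answer: the class InternTableDemo and its helper methods are made-up names for illustration. It relies on string.IsInterned, which returns the interned instance for a given value or null if there is none, and it assumes the code is JIT-compiled in the usual way so that literals end up in the intern table.

```csharp
using System;

static class InternTableDemo
{
    // Hypothetical helper: true if this exact instance is the one the intern table refers to.
    public static bool IsTheInternedInstance(string s)
    {
        return object.ReferenceEquals(string.IsInterned(s), s);
    }

    // Builds a string at run time whose value is very unlikely to be interned already.
    public static string BuildAtRuntime()
    {
        return "hello world " + Guid.NewGuid().ToString();
    }

    public static void Main()
    {
        string literal = "hello world";
        string built = BuildAtRuntime();

        // The literal is expected to be the instance the intern table holds.
        Console.WriteLine(IsTheInternedInstance(literal));

        // A freshly built string is an ordinary heap object the table does not know about.
        Console.WriteLine(string.IsInterned(built) == null);

        // string.Intern returns the table's reference for this value.
        string interned = string.Intern(built);
        Console.WriteLine(IsTheInternedInstance(interned));
    }
}
```

In both cases the table only hands back a reference to a System.String object, which fits the WinDbg output above: the object[] is a root, and the strings themselves are regular heap objects.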
#2
12
In Java (from the Java Glossary):
In Sun’s JVM, the interned Strings (which includes String literals) are stored in a special pool of RAM called the perm gen, where the JVM also loads classes and stores natively compiled code. However, the interned Strings behave no differently than had they been stored in the ordinary object heap.
#3
3
Correct me if I am wrong but don't all objects reside on the heap, in both Java and .NET?
#4
1
In .NET, string literals, when "interned", are stored in a special data structure called the "intern table". This is separate from the heap and the stack. Not all strings are interned, however. I'm pretty sure that those that aren't are stored on the heap.
I don't know about Java.
#5
1
I found this on MSDN's site about the ldstr IL instruction:
The ldstr instruction pushes an object reference (type O) to a new string object representing the specific string literal stored in the metadata. The ldstr instruction allocates the requisite amount of memory and performs any format conversion required to convert the string literal from the form used in the file to the string format required at runtime.
The Common Language Infrastructure (CLI) guarantees that the result of two ldstr instructions referring to two metadata tokens that have the same sequence of characters return precisely the same string object (a process known as "string interning").
This implies that the string literals are in fact stored on the heap in .NET (unlike Java as pointed out by mmyers).
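The interning guarantee quoted above can be observed from C#. This is a minimal sketch, not something taken from the MSDN page; LdstrDemo and its methods are hypothetical names. It assumes that each literal below is loaded with its own ldstr instruction and that the assembly is JIT-compiled normally.

```csharp
using System;

static class LdstrDemo
{
    // Each method loads its literal with a separate ldstr instruction.
    public static string First()  { return "hello world"; }
    public static string Second() { return "hello world"; }

    // Builds an equal string at run time; the result does not come from ldstr.
    public static string Copy(string s)
    {
        return new string(s.ToCharArray());
    }

    public static void Main()
    {
        string a = First();
        string b = Second();
        string c = Copy(a);

        // Both literals are expected to be the same object.
        Console.WriteLine(object.ReferenceEquals(a, b));
        // The copy has the same value...
        Console.WriteLine(a == c);
        // ...but is a separate heap allocation.
        Console.WriteLine(object.ReferenceEquals(a, c));
        // Interning the copy is expected to hand back the literal's instance.
        Console.WriteLine(object.ReferenceEquals(a, string.Intern(c)));
    }
}
```

The copy is included only for contrast: it shows that reference identity comes from interning, not from the strings having equal contents.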
#6
0
In Java, strings, like all objects, reside on the heap. Only local variables of primitive type (ints, chars) and references to objects reside on the stack.
#7
-1
Interned Strings in Java are located in a separate pool called the String Pool. This pool is maintained by the String class and resides on the normal heap (not the perm pool mentioned above, which is used for storing class data).
As I understand it, not all Strings are interned, but calling myString.intern() returns a String that is guaranteed to come from the String Pool.
See also: http://www.javaranch.com/journal/200409/ScjpTipLine-StringsLiterally.html and the javadoc http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#intern()