C# to IL 6: Reference and Value Types

Time: 2020-12-22 13:48:37

An interface is a reference type, even though it contains no code at all. For this reason,
we cannot instantiate an interface; we can only use it as a construct for creating new types.
An interface defines a contract that is left to the class to implement.
An interface can have static fields. If an interface contains 10 abstract virtual functions,
then the class implementing that interface has to supply the code for all 10 of them.
If a class does not provide all the function implementations, then we cannot use the
class. In such a scenario, a class derived from it must provide the missing implementations.
In IL, the C# interface keyword becomes a class with the interface modifier, which the
documentation describes as a semantic attribute.
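
The contract idea can be sketched in C# as follows. This is a minimal illustration, not the chapter's own listing: the names yyy, ppp, qqq and zzz are hypothetical, and the functions return strings only so that the result is easy to check.

```csharp
using System;

interface yyy
{
    string a1();   // prototype only; the implementing class supplies the body
    string a2();
}

// Implements a1 only. a2 stays abstract, so ppp itself cannot be instantiated.
abstract class ppp : yyy
{
    public string a1() { return "ppp a1"; }
    public abstract string a2();
}

// The derived class supplies the missing implementation.
class qqq : ppp
{
    public override string a2() { return "qqq a2"; }
}

class zzz
{
    public static void Main()
    {
        yyy y = new qqq();            // new yyy() or new ppp() would not compile
        Console.WriteLine(y.a1());    // ppp a1
        Console.WriteLine(y.a2());    // qqq a2
    }
}
```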

We are not allowed to place any code in an interface. An interface consists only of the
function prototype, followed by a pair of empty curly braces {}.

vijay1 is a function created in the interface yyy. As this is not permitted, the IL assembler
produces the domino effect of errors shown above.

Variables cannot be placed inside an interface either. This rule is similar to that of C#.
Even though the documentation says that we can place static fields in an interface, an
error was generated when we tried to do so.

If we try to create an instance of an interface using newobj, the assembler does not
generate any error, but the runtime throws an exception.

The above program has only one function, a1, in the interface. This function has a pair of
curly braces {}, but as mentioned earlier, we are not allowed to place any code within them.
The class zzz has been derived from yyy, using the keyword implements. The assembler
does not check whether all the functions of the interface are implemented; this check is
done only at runtime, and hence, an exception is generated. Thus, we can see that most IL
errors occur at run time, and not at compile time.

To reiterate what we have said earlier, an interface in C# becomes a class directive in IL
with the interface modifier added to it. The two functions a1 and a2 become actual
functions in the class ddd, and are marked as virtual, newslot and abstract, i.e.
having no implementation.

We cannot place any code, including a ret, in a method that is marked as abstract. This
modifier signifies that the code for the function will be provided from some other source.
In spite of what the documentation says, a static constructor cannot be placed in an
interface.

For the purposes of inheritance, C# does not differentiate between an interface and a class.
There is, however, a subtle difference between them: we can derive from more than one
interface, but not from more than one class.
In IL, there is a marked differentiation between an interface and a class. We extend from a
class and implement an interface. This is the same syntax that the Java programming
language uses.
When one compares C# with Java, their support for features such as these should be
highlighted.
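
A minimal C# sketch of this difference, with hypothetical names (bbb for the base class, ddd and eee for the interfaces): C# lists everything after a single colon, whereas the corresponding IL class header separates the two roles with extends and implements.

```csharp
using System;

class bbb { public string b1() { return "bbb b1"; } }
interface ddd { string a1(); }
interface eee { string a2(); }

// At most one base class, but any number of interfaces.
// In IL the header reads roughly: extends bbb implements ddd, eee
class yyy : bbb, ddd, eee
{
    public string a1() { return "yyy a1"; }
    public string a2() { return "yyy a2"; }
}

class zzz
{
    public static void Main()
    {
        yyy a = new yyy();
        Console.WriteLine(a.b1());   // bbb b1 (inherited from the class)
        Console.WriteLine(a.a1());   // yyy a1 (implemented for ddd)
        Console.WriteLine(a.a2());   // yyy a2 (implemented for eee)
    }
}
```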

We created two locals, one of type class yyy and the other of type interface ddd. Then, we
created an object of type yyy and initialized the variable V_0 to it.
The statement d = a is translated as follows: the value of V_0 is loaded on the stack, and the
instruction stloc.1 is used to initialize the variable V_1. Thereafter, the function a1 of the
interface ddd is called.
We loaded the value of the variable V_1 on the stack. Since the function is called through
the interface, we used callvirt instead of call. If we had called it through the object of type
yyy, then we would have used call.
Thus, IL understands that a call through an interface object is to be treated in a special
manner. We can change the last occurrence of ldloc.1 to ldloc.0, since both hold the same
value. A call to an interface is evaluated at run time, as the assembler does not convert it
into a class access. In the locals directive, the word class, and not interface, is placed in
front of ddd.
Thus, calling a function through an interface requires callvirt, since there is
no code in the interface. A callvirt takes more time to execute than the plain call
instruction. However, callvirt introduces dynamism at runtime.
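
A C# program of the shape described here might look like the sketch below. The bodies of ddd and yyy are assumptions, since only their names and the statement d = a are given in the text.

```csharp
using System;

interface ddd { string a1(); }

class yyy : ddd
{
    public string a1() { return "yyy a1"; }
}

class zzz
{
    public static void Main()
    {
        yyy a = new yyy();   // the object is created once
        ddd d = a;           // d = a copies the reference only
        // The call goes through the interface, so the target
        // method is looked up at run time.
        Console.WriteLine(d.a1());                           // yyy a1
        Console.WriteLine(Object.ReferenceEquals(a, d));     // True
    }
}
```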

We have created a class yyy that implements two interfaces, ddd and eee, both of which
declare the same function a1.
Since we want a separate implementation for each, we have to preface each occurrence of
a1 with the name of the interface, i.e. either ddd or eee, in the class yyy.
We have created two objects of type yyy and stored them in variables of types
ddd and eee. Since the function is called through an interface pointer, we have to use
callvirt instead of call. In the IL code, the two interfaces are created as shown earlier.
In class yyy, we implement ddd and eee, but since the two functions cannot have the
same name, we have to preface the name of the function with the name of the interface.

In the method, we have used a directive called .override. This directive clearly specifies
which function of which interface the method overrides. Calling a function through an
interface is a run time issue. The CLR does all the routine work.
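
In C# terms this is explicit interface implementation. The following is a minimal sketch under the assumption that a1 takes no parameters; the returned strings are illustrative only.

```csharp
using System;

interface ddd { string a1(); }
interface eee { string a1(); }

class yyy : ddd, eee
{
    // Each a1 is prefixed with the interface it belongs to.
    // The compiler emits a .override directive for each of them.
    string ddd.a1() { return "ddd.a1"; }
    string eee.a1() { return "eee.a1"; }
}

class zzz
{
    public static void Main()
    {
        ddd d = new yyy();
        eee e = new yyy();
        Console.WriteLine(d.a1());   // ddd.a1
        Console.WriteLine(e.a1());   // eee.a1
    }
}
```

Neither a1 is reachable through a variable of type yyy; the object must first be viewed as ddd or eee.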

The above program clears a large number of cobwebs. Let us start analysing this program
from the beginning.
We have a class interface ddd that has two functions a1 and a2. We then create a class
yyy, that implements ddd and contains code for only one function a1.
This makes the class incomplete and hence, we tag it with the modifier abstract. However,
this modifier is optional. One more class xxx is created, that derives from yyy and
implements the second function a2. All goes well so far.
Then, using call, the function a2 of class xxx is called. However, when we call the same
function off the interface ddd using callvirt, the function is called off class yyy and not xxx.
This is because the function in class xxx has nothing to do with the one in class yyy.
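
The same behaviour can be reproduced in C#. The sketch below is an approximation rather than the chapter's program: it assumes that yyy carries code for both a1 and a2, and that a2 in xxx is a new, unrelated function.

```csharp
using System;

interface ddd { string a1(); string a2(); }

class yyy : ddd
{
    public string a1() { return "yyy a1"; }
    public string a2() { return "yyy a2"; }
}

class xxx : yyy
{
    // 'new' hides yyy.a2 instead of overriding it, so this a2
    // is not connected to the interface ddd.
    public new string a2() { return "xxx a2"; }
}

class zzz
{
    public static void Main()
    {
        xxx x = new xxx();
        ddd d = x;
        Console.WriteLine(x.a2());   // xxx a2
        Console.WriteLine(d.a2());   // yyy a2
    }
}
```
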
Compare this example with the override modifier example shown earlier. If we eliminate the
code of function a2 from the class yyy, we get the following error:

The function a2 is present in the class xxx, but it has been eliminated from the class yyy.
However, the function needs to be present in both the classes.
The same rules of nesting apply to interfaces also. Nothing stops an interface from
implementing another interface, using the keyword implements. Here, the word
implements may be misleading.
The keyword implies that the class that implements this interface must provide the code
for it.

An interface has six restrictions:
i. All methods must be either virtual or static.
ii. The virtual methods must be abstract and public.
iii. No instance fields are allowed.
iv. An interface is abstract and cannot be instantiated.
v. An interface cannot inherit from a class.
vi. Under no circumstances can an interface contain code.

A structure handles memory more efficiently than a class. IL does not support a struct type
directly. As IL does not recognise a structure, it does not enforce the following rules:
• Constructors must have parameters.
• All members of a structure must be initialised before leaving the constructor.
Also, structures are derived from ValueType and not directly from Object.
The type system of the .NET world is simplicity personified. It divides all the known types
into one of two categories: a value type or a reference type.
• A reference type is known by a reference, that is, a memory location that stores the
address where the object resides in memory.
• A value type, however, is directly stored in the memory location occupied by the
variable that represents the type.
Value types are used to represent small data items like local variables, integers, numbers
with decimal places etc. The memory allocated is on the stack and not on the heap.
To access a reference type, the address stored in the variable has to be read first and then
followed to the object. This is not true for a value type. Hence, there is no overhead of an
indirection involved with a value type and therefore, it is much more efficient.
The disadvantages of a value type are that it cannot be derived from and, if the data it
represents is fairly large, copying it on the stack is not an efficient way of
representing that type. There is no need to instantiate a variable of value type, as it is
already instantiated. Apart from these variations, value types are similar to reference types.

In the above example, we have created a value class or a value type called xxx, that has
two public fields i and j. We used the instruction initobj to create a new value type.
To display the value of i, we first created a local variable that represents our value class. In
our case, the variable is called v. ldloca is used to load the address of the variable v on the
stack. Then we called initobj with the name of the value class xxx as a parameter, thus
creating a new value type.
We then load the address of the value type v on the stack again and call ldfld. This
instruction needs the address of the value type on the stack to work with. The only reason
that the value of i is ZERO is that the instruction initobj guarantees that all members of
the value type will be initialized to zero.
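
Seen from C#, the closest equivalent is a struct created with its parameterless constructor, which the compiler translates into initobj. A minimal sketch, reusing the names xxx, i, j and v from the text:

```csharp
using System;

struct xxx
{
    public int i;
    public int j;
}

class zzz
{
    public static void Main()
    {
        xxx v = new xxx();        // compiled to ldloca + initobj, not newobj
        Console.WriteLine(v.i);   // 0
        Console.WriteLine(v.j);   // 0
    }
}
```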

The correct way to initialize a value class is to call the constructor. We first have to load
the address of the value type v on the stack. Then, since the constructor expects a single
parameter on the stack, we place the number 2 on the stack using ldc. The constructor is
then called in the same manner as we call any other function.
In the constructor, we first place the this pointer on the stack. The this pointer, or the first
invisible parameter to a function, is a reference to the starting location of the object in
memory. Parameter 1 is placed on the stack and stfld is called.
The constructor initializes all members of a value class. The static fields of a value class
are initialized when the value type is first loaded.

Not using initobj, as in the above example, leaves the value type with a random value.
The use of initobj is optional. This instruction requires a managed pointer to an instance of
the value type and it is one of the few instructions that does not return anything on the
stack. The constructor is never called by the initobj instruction. The sole role initobj
performs is to initialize all the value class members to ZERO.
While verifying code, one should ensure that all the fields of a value type are assigned a
value before they are read or passed as parameters to a method. The code in the constructor
assigns values to every field.
You can see the contrast between initobj and newobj. Value types use initobj whereas
reference types use newobj. Also, value types are derived from System.ValueType.
Value types can have static, instance and virtual methods. Here, the static methods are
called in the same manner as in a class.

To call an instance function of a value class, there is no need for either initobj or the
constructor call. However, it is a good practice to do so. We have to place the address of the
value type, i.e. the this pointer, on the stack and then place the parameters. The function a1
uses the this pointer to access the fields.
We modified the function a1 to read as follows:
.method public virtual instance void a1(int32 p) il managed
Despite making the function virtual, the program executes as before. The order of the
virtual modifier is very important.
You may recall that a virtual function has to be called using the instruction callvirt and not
the instruction call. However, in the case of a value class, we cannot use callvirt. Instead,
the instruction call is used.

In the above program, we specified an interface ddd, that contains a single function called
a1. We created a value class xxx that implements ddd.
Our intention is to call the function a1 from the interface ddd. As mentioned earlier, to call
a function off an interface, the instruction callvirt has to be used and not the instruction
call, as an interface does not contain any code.
The callvirt instruction requires a reference type on the stack because it does not work
with value types. Thus, we use ldloca to load the address of the value type on the stack.
Then, we use the box instruction to convert it into a reference type.
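
In C# the boxing happens implicitly when a struct is assigned to an interface variable. The sketch below assumes a field i and a string-returning a1, neither of which is given in the text, to make the boxed copy visible.

```csharp
using System;

interface ddd { string a1(); }

struct xxx : ddd
{
    public int i;
    public string a1() { return "xxx a1 " + i; }
}

class zzz
{
    public static void Main()
    {
        xxx v = new xxx();
        v.i = 5;
        ddd d = v;                   // box: a copy of v is placed on the heap
        v.i = 6;                     // changes v only, not the boxed copy
        Console.WriteLine(d.a1());   // xxx a1 5
        Console.WriteLine(v.a1());   // xxx a1 6
    }
}
```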

If we comment out the box instruction, the following exception is generated because callvirt
looks for a boxed type on the stack:

The constructor assigns values to fields. It places the values on the stack and uses stfld to
assign the values to the fields. The question that arises is what happens when we
equate reference objects with each other.
The explanation is very simple: A reference object is simply a memory location stored in a
local variable. The variable V_0 contains a reference to the newly created object in memory.
We place this value on the stack and use stloc.1 to initialize the variable V_1 to this value.
Thus, a reference object is a number representing the memory location of an object. Here,
the same number is stored in the objects a and b. Hence b.i displays the number 10. Here,
the constructor does not get called again, as no new object is created.
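
A C# sketch of this situation follows. The class name yyy and the constructor parameter are assumptions; only the variables a and b, the field i and the value 10 come from the text.

```csharp
using System;

class yyy
{
    public int i;
    public yyy(int p) { i = p; }
}

class zzz
{
    public static void Main()
    {
        yyy a = new yyy(10);   // newobj: the constructor runs once
        yyy b = a;             // ldloc.0 / stloc.1: only the reference is copied
        Console.WriteLine(b.i);                            // 10
        Console.WriteLine(Object.ReferenceEquals(a, b));   // True
    }
}
```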

This program and the next one should be read in conjunction if you want to grasp the
following:
• Concepts of boxing and unboxing
• The major difference between a class and a structure.
The concept of a structure is not supported by IL. On conversion to IL, a struct becomes a
class with the modifier value added to it. It is sealed and derived from ValueType, and is
hence referred to as a value class.
In the C# program, an object a is created, which is an instance of the structure xxx. The
constructor is passed the constant value 1, which is used to initialize the int field x to 1.
Then, the object b is initialized to a. Next, we change the value of the member x from 1 to 2
using the value object a.

We display the value of the field x using b. The cast operator is used as the data type of b
is Object and not xxx. We notice that there are two x ints in memory:
• One with a value of 2 that is associated with a,
• The other one with the value of 1 that is associated with b.
So much for the C# program; let us see what happens in our IL program. We create 3
locals in IL, i.e. two variables V_0 and V_2 of the class xxx and one, V_1, that looks like an
Object.
We place the address of V_0 on the stack followed by the value 1. Then, we call the
constructor using call and not newobj, since we have a value class or structure and not a
pure class. The constructor initializes the field x to 1.
We have to convert this value class to a pure object that is an instance of the class Object.
We load the address of V_0 and call the box instruction, which converts a value class into
a class and places the reference of the newly created object on the stack.
Then, we store this reference in the local variable V_1 using stloc.1. This is the code
generated when the statement object b = a is converted to IL. We have created a fresh
object using the box instruction. Thus there are two xxx objects in memory, one as the
value object V_0 and one as a reference object V_1.
We now need to set the field x of the value object to 2. To do so, the constant 2 is placed
on the stack and stfld is called. The easier part of the code is over.
The problem is in the expression WriteLine(((xxx)b).x). The object b or V_1 is a reference
object. We have to cast it to a value object. To do this, we need to unbox it. The act of
converting a reference object to a value object is called unboxing. The unbox instruction
requires a reference type on the stack and places on the stack a pointer to a value type
whose data type is specified by the type name following the instruction, here xxx.
The instruction ldobj loads an instance of xxx on the stack whose pointer is already
present on the stack. We store this instance in V_2 and load this value type again on the
stack. Then we load the value of x and display it using WriteLine.
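
The C# program described above can be sketched as follows. It is reconstructed from the description, so details such as the constructor parameter name are assumptions.

```csharp
using System;

struct xxx
{
    public int x;
    public xxx(int i) { x = i; }
}

class zzz
{
    public static void Main()
    {
        xxx a = new xxx(1);
        object b = a;     // box: a second, independent copy on the heap
        a.x = 2;          // changes the value object a only
        Console.WriteLine(((xxx)b).x);   // 1 (unbox, then read x)
        Console.WriteLine(a.x);          // 2
    }
}
```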

c is an object of type yyy and holds a value of 1 in its member x. Since yyy is a class and
not a structure, initializing the object d to c does not create a new object in memory;
instead, d points to the same object referenced by c. Thus, we have one yyy object in
memory, and any changes made to the value of x using d will be reflected when using c and
vice-versa.
Here, as we are dealing with a class, the instruction newobj is used to create it. To initialize
the object d to c, we first use ldloc.0 to place its value on the stack and then use the
instruction stloc.1 to initialize local V_1.
Then, we initialize c.x to 2 in the usual manner, by first placing the reference on the stack
using ldloc.0, then placing the value 2 on the stack and calling stfld.
Object d is already a reference object and yyy is a class. Hence we simply use castclass. It
is easy to use casting here because neither boxing nor unboxing is required to be carried
out.
The important point to be mentioned is that we are not creating another yyy in memory,
and hence, there is only one field x in memory. This was not the case earlier, when a
structure was used.
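
For comparison, here is the same sketch with a class in place of the structure. As before, the constructor parameter name is an assumption.

```csharp
using System;

class yyy
{
    public int x;
    public yyy(int i) { x = i; }
}

class zzz
{
    public static void Main()
    {
        yyy c = new yyy(1);
        object d = c;     // no boxing: d refers to the same object as c
        c.x = 2;
        Console.WriteLine(((yyy)d).x);   // 2 (castclass, no unboxing)
    }
}
```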

This example is deceptively similar to the one above.
First we take a reference object b and equate it to a value object f. Then we cast the
reference type object b to a value type int. The C# compiler gives us no errors but a
runtime exception is thrown.
Back to IL. We first create a long or an int64 V_0 using locals. Then we create an Object
V_1 and finally an int V_2.
Thereafter, we place 1 on the stack, convert it into 8 bytes using conv.i8 and use stloc.0 to
store the value in V_0. The address is then placed on the stack, as we need to use the box
instruction to convert it into a reference type, which is finally stored in b or V_1.
Next, we unbox the object b, the one created out of a value type, and store it in an int. To
do this, we place the reference on the stack and call unbox. This places the address of the
value on the stack, and ldind.i4 is used to fetch the value stored at this address. Then, we
use stloc.2 to initialize the variable V_2. The exception arises because b holds a boxed
int64, and a boxed value can only be unboxed to the type it was boxed from, not to int.
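
A minimal C# sketch of the failing cast, with a try/catch block and a working alternative added for illustration (neither is part of the program described above):

```csharp
using System;

class zzz
{
    public static void Main()
    {
        long f = 1;
        object b = f;              // boxes an int64
        try
        {
            int i = (int)b;        // unbox to int32: the boxed type is int64
            Console.WriteLine(i);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException");
        }
        int ok = (int)(long)b;     // unbox to int64 first, then convert to int
        Console.WriteLine(ok);     // 1
    }
}
```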

Languages decide how you write code and name variables. In C# a field and a parameter
to a function can have the same name, but the parameter name has more visibility than
the field name.
In the function abc, this.x refers to the field, whereas plain x refers to the parameter. In IL,
this dilemma does not arise, as we have one set of instructions that deals with fields, a
second set that deals with parameters to functions and a third set that deals with locals.
Thus there is no way that a name clash can ever occur.
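
A short C# sketch of the naming situation. The class name yyy and the value 7 are hypothetical; the function abc and the name x come from the text.

```csharp
using System;

class yyy
{
    public int x;

    public void abc(int x)
    {
        // Inside abc, plain x is the parameter; this.x is the field.
        // In IL the parameter is read with ldarg and the field written with stfld.
        this.x = x;
    }
}

class zzz
{
    public static void Main()
    {
        yyy a = new yyy();
        a.abc(7);
        Console.WriteLine(a.x);   // 7
    }
}
```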

We take one more program on structures before we close this chapter. We have created a
struct containing a field i and a function abc. Structures are value objects and are stored
on the stack and not on the heap. Thus, the word value has been used in the locals
directive. It is a class, but of a value type.
In the definition of the structure, we have added two modifiers, sealed and value.
Therefore, we cannot derive from this value class. Everything else is similar to a class.