The Truth About .NET Objects And Sharing Them Between AppDomains

Time: 2024-01-01 12:54:45

From http://geekswithblogs.net/akraus1/archive/2012/07/25/150301.aspx

Some time ago I wrote about how big a .NET object is. Jon Skeet has also written a very detailed post about object sizes in .NET. I wanted to know whether we can deduce the object size not by experiment (measuring) but by looking at the Rotor source code. There is indeed a simple definition in the object headers of how big a .NET object must minimally be. A CLR object is still a (sophisticated) structure that lives at an address which the garbage collector changes quite often.

image

The picture above shows that every .NET object contains an object header, which records which thread in which AppDomain has locked the object (that is, has called Monitor.Enter on it). Next comes the Method Table pointer, which defines a managed type for one AppDomain. If the assembly is loaded AppDomain neutral, this pointer to the type object has the same value in all AppDomains. This basic building block of the CLR type system is also visible in managed code via Type.TypeHandle.Value, which has IntPtr size.
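To make the last point concrete, here is a minimal sketch that reads the type handle from managed code. The class name TypeHandleDemo and its helper method are made up for illustration. That the printed value is the Method Table pointer is the claim from the paragraph above, not something the API itself promises.

```csharp
using System;

static class TypeHandleDemo
{
    // Hypothetical helper: returns the raw value behind Type.TypeHandle.
    // On the CLR versions discussed here this is the Method Table pointer.
    public static IntPtr MethodTablePointer(Type type)
    {
        return type.TypeHandle.Value;
    }

    public static void Main()
    {
        IntPtr fromTypeof = MethodTablePointer(typeof(string));
        IntPtr fromInstance = MethodTablePointer("hello".GetType());

        // The handle is pointer sized: 4 bytes in a 32 bit process, 8 bytes in a 64 bit process.
        Console.WriteLine("IntPtr.Size = {0}", IntPtr.Size);
        Console.WriteLine("typeof(string) handle = 0x{0:X}", fromTypeof.ToInt64());

        // Within one AppDomain every instance of a type reports the same handle.
        Console.WriteLine("same handle: {0}", fromTypeof == fromInstance);
    }
}
```

Printing the same value from two AppDomains is a quick way to check whether a type was really loaded AppDomain neutral.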

\sscli20\clr\src\vm\object.h

//
// The generational GC requires that every object be at least 12 bytes
// in size.
#define MIN_OBJECT_SIZE (2*sizeof(BYTE*) + sizeof(ObjHeader))
A .NET object has basically this layout:

class Object
{
  protected:
    MethodTable* m_pMethTab;

};
class ObjHeader
{
  private:
    // !!! Notice: m_SyncBlockValue *MUST* be the last field in ObjHeader.
    DWORD m_SyncBlockValue; // the Index and the Bits
};

For x86 the minimum size is therefore 12 bytes = 2*4+4, and for x64 it is 24 bytes = 2*8+8. The ObjHeader struct is padded with another 4 bytes on x64, which adds up to 24 bytes for every object instance. The MIN_OBJECT_SIZE definition contains a factor of two, whereas we would expect 8 bytes (object header plus Method Table pointer) as the minimum size of an empty object on x86. The reason is that it makes little sense to define empty objects. Most meaningful objects have at least one member variable of class type, which is another pointer-sized member, hence the minimum size of 12 bytes on x86 and 24 bytes on x64.
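The arithmetic above can be written down as a small sketch. It does not read anything from the runtime. It only mirrors the MIN_OBJECT_SIZE formula, under the assumption stated above that the 4 byte m_SyncBlockValue is padded up to pointer size on x64. The class and method names are made up for illustration.

```csharp
using System;

static class MinObjectSize
{
    // Mirrors MIN_OBJECT_SIZE = 2*sizeof(BYTE*) + sizeof(ObjHeader) from object.h.
    // pointerSize is 4 for x86 and 8 for x64.
    public static int Compute(int pointerSize)
    {
        const int syncBlockValueSize = 4; // DWORD m_SyncBlockValue

        // Assumption taken from the text: the header is padded to pointer size on x64.
        int headerSize = Math.Max(syncBlockValueSize, pointerSize);

        // Two pointer sized slots: the Method Table pointer plus one pointer sized member.
        return 2 * pointerSize + headerSize;
    }

    public static void Main()
    {
        Console.WriteLine("x86: {0} bytes", Compute(4)); // prints "x86: 12 bytes"
        Console.WriteLine("x64: {0} bytes", Compute(8)); // prints "x64: 24 bytes"
    }
}
```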

It is interesting to know that the garbage collector does not know anything about AppDomains. For the GC the managed heap consists only of objects which are rooted or not, and it cleans up everything which is not rooted anymore. I found this out during the development of WMemoryProfiler, which uses DumpHeap of Windbg to get all object references from the managed heap. When I accessed all objects found this way I actually got objects from other AppDomains as well. And they did work! It is therefore possible to share objects directly between AppDomains.

Why would you want to do that? Well, it is fun and you can do really dirty stuff with it. Do you remember that you cannot unload assemblies from an AppDomain? Yes, that is still true, but why would you ever want to unload an assembly? Mostly because you are doing some dynamic code generation which will at some point dominate your overall memory consumption if you load the generated assemblies into your AppDomain. I have seen this many times with dynamic query generation. The problem is that if you load the dynamically created code into another AppDomain, you need to serialize the data to the other AppDomain as well, because you cannot share plain objects between AppDomains. Serializing potentially large amounts of data across AppDomains is prohibitively slow, and therefore people live with the restriction that code generation will increase the working set quite a lot. With some tricks you can now share plain objects between AppDomains and get unloadable code as well.

Warning: The following is well beyond the specs, but it does work from .NET 2.0 up to 4.5.
Do not try this at work!

When you load an assembly into your (default) AppDomain you load it only for your current AppDomain. The types defined there are not shared anywhere. There is one exception though: the types defined in mscorlib are always shared between all AppDomains. The mscorlib assembly is loaded into a so-called Shared Domain. This is not a real AppDomain but simply a placeholder domain into which all assemblies are loaded that can be shared between AppDomains. An assembly loaded into the Shared Domain is therefore loaded AppDomain neutral. Assemblies loaded AppDomain neutral have one special behavior:

AppDomain neutral assemblies are never unloaded even when no “real” AppDomain is using them anymore.
The picture below shows in which scenarios assemblies are loaded AppDomain neutral (green) from the Shared Domain.

image

The first one is the most common one. You load an assembly into the default AppDomain. This defaults to LoaderOptimization.SingleDomain, where every assembly is compiled from scratch again when other AppDomains with no special flags are created. Only the basic CLR types located in mscorlib are loaded AppDomain neutral and are always shared between AppDomains no matter what flags are used (with .NET 1.1 this was not the case).

If you create e.g. 5 AppDomains with the default settings, you will load and JIT every assembly (except mscorlib) again and get different types for each AppDomain, although they were all loaded from the same assembly in the same loader context.

The opposite is LoaderOptimization.MultiDomain where every assembly is loaded as AppDomain neutral assembly which ensures that all assemblies loaded in any AppDomain which have this attribute set are loaded and JITed only once and share the types (=Same Method Table pointer) between AppDomains.

An interesting hybrid is LoaderOptimization.MultiDomainHost, which loads only those assemblies AppDomain neutral that are loaded from the GAC. That means if you load the same assembly once from the GAC and a second time unsigned from your probing path, you will not get identical types but different Method Table pointers for the types.
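As a rough illustration of the three settings, the following sketch creates one child AppDomain per LoaderOptimization value and unloads it again. It assumes the .NET Framework (creating AppDomains is not supported on .NET Core and later), and the class and helper names are made up for illustration.

```csharp
using System;

static class LoaderOptimizationDemo
{
    // Hypothetical helper: builds the setup object for a child AppDomain.
    public static AppDomainSetup SetupFor(LoaderOptimization optimization)
    {
        return new AppDomainSetup
        {
            ApplicationBase = AppDomain.CurrentDomain.BaseDirectory,
            LoaderOptimization = optimization
        };
    }

    public static void Main()
    {
        LoaderOptimization[] options =
        {
            LoaderOptimization.SingleDomain,    // nothing shared except mscorlib
            LoaderOptimization.MultiDomain,     // everything loaded AppDomain neutral
            LoaderOptimization.MultiDomainHost  // only GAC assemblies loaded AppDomain neutral
        };

        foreach (LoaderOptimization option in options)
        {
            AppDomain child = AppDomain.CreateDomain("Child_" + option, null, SetupFor(option));
            Console.WriteLine("{0} uses {1}", child.FriendlyName, child.SetupInformation.LoaderOptimization);
            AppDomain.Unload(child);
        }
    }
}
```

Which assemblies actually end up in the Shared Domain for each setting can then be checked with !DumpDomain in Windbg.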

Since we know that the GC does not know anything about AppDomains (at least the managed heap objects do not contain any information about which AppDomain they reside in), we should be able to pass an object via a pointer to another AppDomain. This could come in handy if we generate a lot of code dynamically for each query made against a data source, but want a way to get rid of the compiled code by unloading the Query AppDomain from time to time, without the normally required cost of copying the data to be queried every time from the Default AppDomain into the Query AppDomain.

image

You can see this in action in the sample application AppDomainTests, which is part of the test suite for WMemoryProfiler. Here is the code for the main application, which creates AppDomains in a loop, sends data via an IntPtr to another AppDomain, and gets the calculated result back without marshalling the data by value to the other AppDomain.

using System;
using System.Linq;
using System.Reflection;

class Program
{
    /// <summary>
    /// Show how to pass an object by reference directly into another AppDomain without serializing it at all.
    /// </summary>
    /// <param name="args"></param>
    [LoaderOptimization(LoaderOptimization.MultiDomainHost)]
    static public void Main(string[] args)
    {
        for (int i = 0; i < 10000; i++) // try it often to see how the AppDomains behave
        {
            // To load our assembly AppDomain neutral we need to use MultiDomainHost on our hosting and child domain.
            // If not, we would get different Method Tables for the same types, which would result in
            // InvalidCastExceptions for the same type.
            // Prerequisite for MultiDomainHost is that the assembly we share the data with is
            //   a) Installed into the GAC (which requires a strong name as well)
            // If you used MultiDomain instead it would work, but all AppDomain neutral assemblies would never be unloaded.
            var other = AppDomain.CreateDomain("Test" + i.ToString(), AppDomain.CurrentDomain.Evidence, new AppDomainSetup
            {
                LoaderOptimization = LoaderOptimization.MultiDomainHost,
            });

            // Create gate object in other AppDomain
            DomainGate gate = (DomainGate)other.CreateInstanceAndUnwrap(Assembly.GetExecutingAssembly().FullName, typeof(DomainGate).FullName);

            // now let's create some data
            CrossDomainData data = new CrossDomainData();
            data.Input = Enumerable.Range(0, 10).ToList();

            // process it in other AppDomain
            DomainGate.Send(gate, data);

            // Display result calculated in other AppDomain
            Console.WriteLine("Calculation in other AppDomain got: {0}", data.Aggregate);

            AppDomain.Unload(other);
            // check in debugger now if UnitTests.dll has been unloaded.
            Console.WriteLine("AppDomain unloaded");
        }
    }
}

To enable code unloading in the other AppDomain I used LoaderOptimization.MultiDomainHost, which forces all non-GAC assemblies to be unloadable. At the same time we must ensure that the assembly that defines CrossDomainData is loaded from the GAC to get an equal MethodTable pointer across all AppDomains. The actual magic happens in the DomainGate class, which has a method DoSomething that expects not a CLR object but the object address as parameter, to weasel a plain CLR reference into another AppDomain. This sounds highly dirty and it certainly is, but it is also quite cool ;-).

using System;
using System.Linq;
using System.Runtime;

/// <summary>
/// Enables sharing of data between AppDomains as plain objects without any marshalling overhead.
/// </summary>
class DomainGate : MarshalByRefObject
{
    /// <summary>
    /// Operate on a plain object which is shared from another AppDomain.
    /// </summary>
    /// <param name="gcCount">Total number of GCs</param>
    /// <param name="objAddress">Address to managed object.</param>
    public void DoSomething(int gcCount, IntPtr objAddress)
    {
        // The address is only valid as long as no GC has moved the object in the meantime.
        if (gcCount != ObjectAddress.GCCount)
        {
            throw new NotSupportedException("During the call a GC did happen. Please try again.");
        }

        // If you get an exception here, disable Project/Debugging/Enable Visual Studio Hosting Process.
        // The AppDomain which is used there seems to use LoaderOptimization.SingleDomain
        CrossDomainData data = (CrossDomainData)PtrConverter<Object>.Default.ConvertFromIntPtr(objAddress);

        // process input data from other domain
        foreach (var x in data.Input)
        {
            Console.WriteLine(x);
        }

        OtherAssembliesUsage user = new OtherAssembliesUsage();

        // generate output data
        data.Aggregate = data.Input.Aggregate((x, y) => x + y);
    }

    public static void Send(DomainGate gate, object o)
    {
        var old = GCSettings.LatencyMode;
        try
        {
            GCSettings.LatencyMode = GCLatencyMode.Batch; // try to keep the GC out of our stuff
            // Key is the object address, Value is the GC count at the time the address was taken
            var addandGCCount = ObjectAddress.GetAddress(o);
            gate.DoSomething(addandGCCount.Value, addandGCCount.Key);
        }
        finally
        {
            GCSettings.LatencyMode = old;
        }
    }
}

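The gcCount check in DoSomething relies on ObjectAddress, whose source is not shown here. As a minimal sketch of how such a guard could look (this is an assumption, not the WMemoryProfiler implementation, and the names GcGuard, TotalCollections and RanWithoutGc are made up), you can sum GC.CollectionCount over all generations and compare the value taken with the address against the value seen when the address is used:

```csharp
using System;

static class GcGuard
{
    // Sum of the collection counts of all generations.
    // Any garbage collection increases at least one of them.
    public static int TotalCollections()
    {
        int total = 0;
        for (int generation = 0; generation <= GC.MaxGeneration; generation++)
        {
            total += GC.CollectionCount(generation);
        }
        return total;
    }

    // Runs the action and returns true only if no GC was observed while it ran.
    public static bool RanWithoutGc(Action action)
    {
        int before = TotalCollections();
        action();
        return before == TotalCollections();
    }

    public static void Main()
    {
        // A forced collection is always detected.
        Console.WriteLine(RanWithoutGc(() => GC.Collect())); // prints "False"
    }
}
```

Such a check only narrows the race window. A GC that runs after the comparison but before the object is dereferenced would still go unnoticed.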
To get an object address I use Marshal.UnsafeAddrOfPinnedArrayElement and then try to work around the many race conditions this imposes. But it is not as bad as it sounds, since you only need to pass an object via a pointer once into the other AppDomain and can then use it as a data gateway to exchange input and output data. This way you can pass data via a pointer to another AppDomain which can be fully unloaded after you are done with it. To make the code unloadable I need to use LoaderOptimization.MultiDomainHost for all AppDomains. The data exchange type is located in another assembly which is strong named, and you need to put it into the GAC before you run the sample. Otherwise it will fail with this exception:

Unhandled Exception: System.InvalidCastException: [A]AppDomainTests.CrossDomainData cannot be cast to [B]AppDomainTests.CrossDomainData. Type A originates from 'StrongNamedDomainGateDll, Version=1.0.0.0, Culture=neutral, PublicKeyToken=98f280cda3cbf035' in the context 'Default' at location 'C:\Source\WindbgAuto\bin\AnyCPU\Release\StrongNamedDomainGateDll.dll'. Type B originates from 'StrongNamedDomainGateDll, Version=1.0.0.0, Culture=neutral, PublicKeyToken=98f280cda3cbf035' in the context 'Default' at location 'C:\Source\WindbgAuto\bin\AnyCPU\Release\StrongNamedDomainGateDll.dll'.
at AppDomainTests.DomainGate.DoSomething(Int32 gcCount, IntPtr objAddress) in C:\Source\WindbgAuto\Tests\AppDomainTests\DomainGate.cs:line 24
at AppDomainTests.DomainGate.DoSomething(Int32 gcCount, IntPtr objAddress)
at AppDomainTests.DomainGate.Send(DomainGate gate, Object o) in C:\Source\WindbgAuto\Tests\AppDomainTests\DomainGate.cs:line 50
at AppDomainTests.Program.Main(String[] args) in C:\Source\WindbgAuto\Tests\AppDomainTests\Program.cs:line 41

At first it looks a little pointless to deny a cast to a type which was loaded in the default loader context from the very same assembly. But we know now that the Method Table pointer for CrossDomainData is different between the two AppDomains. When you install the assembly into the GAC (be sure to use the .NET 4 gacutil!) the error goes away and we then get:

0
1
2
3
4
5
6
7
8
9
Calculation in other AppDomain got: 45

which shows that we can read data and modify it directly across AppDomains. If you use this code in production and it breaks, I have warned you. This is far beyond what the MS engineers want us to do, and it can break the CLR in subtle, unintended ways I have not found yet. Now you have (hopefully) a much better understanding of how the CLR type system and the managed heap work. If questions are left, start the application and look at !DumpDomain, !DumpHeap -stat and their related commands to see for yourself.