CLR via C# 3rd - 01 - The CLR's Execution Model

Date: 2023-03-09 22:32:17
1. Assembly
     A managed module is a standard 32-bit Microsoft Windows portable executable (PE32) file or a standard 64-bit Windows portable executable (PE32+) file that requires the CLR to execute.
     Managed Module = IL + metadata.
     Assembly offers a way to treat a group of files as a single entity.
     An assembly is a collection of one or more files containing type definitions and resource files.
     Here are some characteristics of assemblies:
  • An assembly defines the reusable types.
  • An assembly is marked with a version number.
  • An assembly can have security information associated with it.
     An assembly contains a block of data called the manifest. The manifest is simply another set of metadata tables. These tables describe the files that make up the assembly, the publicly exported types implemented by the files in the assembly, and the resource or data files that are associated with the assembly.
     By default, the compilers actually do the work of turning the emitted managed module into an assembly; that is, the C# compiler emits a managed module that contains a manifest. The manifest indicates that the assembly consists of just the one file. So, for projects that have just one managed module and no resource (or data) files, the assembly will be the managed module, and you don't have any additional steps to perform during your build process.
     An assembly's modules also include information about referenced assemblies (including their version numbers). This information makes an assembly self-describing.
     To summarize, an assembly is a unit of reuse, versioning, and security. It allows you to partition your types and resources into separate files so that you, and consumers of your assembly, get to determine which files to package together and deploy. Once the CLR loads the file containing the manifest, it can determine which of the assembly’s other files contain the types and resources the application is referencing. Anyone consuming the assembly is required to know only the name of the file containing the manifest; the file partitioning is then abstracted away from the consumer and can change in the future without breaking the application’s behavior.
     If you have multiple types that can share a single version number and security settings, it is recommended that you place all of the types in a single file rather than spread the types out over separate files, let alone separate assemblies. The reason is performance. Loading a file/assembly takes the CLR and Windows time to find the assembly, load it, and initialize it. The fewer files/assemblies loaded the better, because loading fewer assemblies helps reduce working set and also reduces fragmentation of a process’s address space. Finally, NGen.exe can perform better optimizations when processing larger files.
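     The self-describing nature of an assembly can be observed through reflection. The following is a minimal sketch, not taken from the book; the class name AssemblyInfoDemo is hypothetical. It prints the identity of its own assembly, the assemblies it references (with version numbers), and the modules it consists of.

```csharp
using System;
using System.Reflection;

public static class AssemblyInfoDemo
{
    // Returns the identity (name, version, and so on) of the assembly
    // that defines this class.
    public static AssemblyName GetOwnIdentity()
    {
        return typeof(AssemblyInfoDemo).Assembly.GetName();
    }

    public static void Main()
    {
        Assembly asm = typeof(AssemblyInfoDemo).Assembly;
        Console.WriteLine("Assembly: " + GetOwnIdentity().Name);
        Console.WriteLine("Version:  " + GetOwnIdentity().Version);

        // Referenced assemblies, including their version numbers.
        foreach (AssemblyName reference in asm.GetReferencedAssemblies())
            Console.WriteLine("References: " + reference.FullName);

        // Modules (files) that make up the assembly.
        foreach (Module module in asm.GetModules())
            Console.WriteLine("Module: " + module.Name);
    }
}
```

     For a typical single-file project the module list contains exactly one entry, which matches the default case described above.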
2. MSCorEE.dll
     You can tell whether the .NET Framework is installed by looking for the MSCorEE.dll file in the %SystemRoot%\System32 directory. However, several versions of the .NET Framework can be installed on a single machine simultaneously. To determine exactly which versions are installed, examine the subkeys under the following registry key:
     HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\NET Framework Setup\NDP
 
     On an x86 version of Windows, the x86 version of MSCorEE.dll can be found in the C:\Windows\System32 directory. On an x64 or IA64 version of Windows, the x86 version of MSCorEE.dll can be found in the C:\Windows\SysWow64 directory, whereas the 64-bit version (x64 or IA64) can be found in the C:\Windows\System32 directory (for backward compatibility reasons).
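     The registry lookup described above can also be done in code. This is a sketch under assumptions: it only works on Windows, the class and method names (InstalledFrameworks, GetVersionKeyNames) are hypothetical, and it lists only the immediate subkey names without interpreting them.

```csharp
using System;
using Microsoft.Win32;

public static class InstalledFrameworks
{
    public const string NdpKeyPath = @"SOFTWARE\Microsoft\NET Framework Setup\NDP";

    // Returns the names of the subkeys under the NDP key, or an empty
    // array when the key is missing or the platform is not Windows.
    public static string[] GetVersionKeyNames()
    {
        if (Environment.OSVersion.Platform != PlatformID.Win32NT)
            return new string[0];

        using (RegistryKey ndp = Registry.LocalMachine.OpenSubKey(NdpKeyPath))
        {
            return ndp == null ? new string[0] : ndp.GetSubKeyNames();
        }
    }

    public static void Main()
    {
        foreach (string name in GetVersionKeyNames())
            Console.WriteLine(name);
    }
}
```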
3. Platforms
     If your assembly files contain only type-safe managed code, you are writing code that should work on both 32-bit and 64-bit versions of Windows. No source code changes are required for your code to run on either version of Windows. In fact, the resulting EXE/DLL file produced by the compiler will run on 32-bit Windows as well as the x64 and IA64 versions of 64-bit Windows! In other words, the one file will run on any machine that has a version of the .NET Framework installed on it.
     On extremely rare occasions, developers want to write code that works only on a specific version of Windows. Developers might do this when using unsafe code or when interoperating with unmanaged code that is targeted to a specific CPU architecture. To aid these developers, the C# compiler offers a /platform command-line switch. This switch allows you to specify whether the resulting assembly can run on x86 machines running 32-bit Windows versions only, x64 machines running 64-bit Windows only, or Intel Itanium machines running 64-bit Windows only. If you don’t specify a platform, the default is anycpu, which indicates that the resulting assembly can run on any version of Windows. Users of Visual Studio can set a project’s target platform by displaying the project’s property pages, clicking the Build tab, and then selecting an option in the Platform Target list.
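     To see the effect of the /platform switch, a program can report the bitness of the process it ends up running in. This is a minimal sketch; the class name PlatformCheck is hypothetical, and the Environment.Is64BitProcess and Environment.Is64BitOperatingSystem properties assume .NET Framework 4 or later.

```csharp
using System;

// Possible builds (csc is the C# compiler):
//   csc /platform:anycpu PlatformCheck.cs
//   csc /platform:x86    PlatformCheck.cs
public static class PlatformCheck
{
    // IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit process.
    public static int PointerBits()
    {
        return IntPtr.Size * 8;
    }

    public static void Main()
    {
        Console.WriteLine("Process is {0}-bit", PointerBits());
        Console.WriteLine("64-bit process: " + Environment.Is64BitProcess);
        Console.WriteLine("64-bit OS:      " + Environment.Is64BitOperatingSystem);
    }
}
```

     Built with /platform:x86 and run on 64-bit Windows, the program should report a 32-bit process, because Windows runs it under WoW64.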
4. Tools
     CLRVer.exe
          Command-line utility that shows all of the CLR versions installed on a machine.
     DumpBin.exe
          Command-line utility that examines the header information emitted in a managed module by the compiler.
     CorFlags.exe
          Command-line utility that examines the header information emitted in a managed module by the compiler.
5. Running an executable file
     When running an executable file, Windows examines this EXE file’s header to determine whether the application requires a 32-bit or 64-bit address space. A file with a PE32 header can run with a 32-bit or 64-bit address space, and a file with a PE32+ header requires a 64-bit address space. Windows also checks the CPU architecture information embedded inside the header to ensure that it matches the CPU type in the computer. Lastly, 64-bit versions of Windows offer a technology that allows 32-bit Windows applications to run. This technology is called WoW64 (for Windows on Windows64). It even allows 32-bit applications with x86 native code in them to run on an Itanium machine, because WoW64 can emulate the x86 instruction set, albeit with a significant performance cost.
     After Windows has examined the EXE file’s header to determine whether to create a 32-bit process, a 64-bit process, or a WoW64 process, Windows loads the x86, x64, or IA64 version of MSCorEE.dll into the process’s address space.
     Then, the process’s primary thread calls a method defined inside MSCorEE.dll. This method initializes the CLR, loads the EXE assembly, and then calls its entry point method (Main). At this point, the managed application is up and running.
6. IL - Intermediate Language
     IL is a CPU-independent machine language created by Microsoft after consultation with several external commercial and academic language/compiler writers. IL is a much higher-level language than most CPU machine languages. IL can access and manipulate object types and has instructions to create and initialize objects, call virtual methods on objects, and manipulate array elements directly. It even has instructions to throw and catch exceptions for error handling. You can think of IL as an object-oriented machine language.
     Microsoft provides an IL Assembler, ILAsm.exe. Microsoft also provides an IL Disassembler, ILDasm.exe.
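     Besides ILDasm.exe, reflection offers a quick way to confirm that a compiled method really is stored as IL. The sketch below is illustrative only; the class name IlPeek is hypothetical. It dumps the raw IL bytes of a small method. The exact bytes depend on compiler settings (debug builds add extra instructions), so no specific output is claimed beyond the final byte, 0x2A, which is the ret opcode.

```csharp
using System;
using System.Reflection;

public static class IlPeek
{
    public static int Add(int a, int b)
    {
        return a + b;
    }

    // Returns the raw IL bytes the compiler emitted for the Add method.
    public static byte[] GetAddIl()
    {
        MethodInfo method = typeof(IlPeek).GetMethod("Add");
        MethodBody body = method.GetMethodBody();
        return body.GetILAsByteArray();
    }

    public static void Main()
    {
        // Prints the IL as hex pairs; the last pair is 2A (ret).
        Console.WriteLine(BitConverter.ToString(GetAddIl()));
    }
}
```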
7. JIT Compiler
     Just before the Main method executes, the CLR detects all of the types that are referenced by Main’s code. This causes the CLR to allocate an internal data structure that is used to manage access to the referenced types. In Figure 1-4 of the book, the Main method refers to a single type, Console, causing the CLR to allocate a single internal structure. This internal data structure contains an entry for each method defined by the Console type. Each entry holds the address where the method’s implementation can be found. When initializing this structure, the CLR sets each entry to an internal, undocumented function contained inside the CLR itself. I call this function JITCompiler.
     When Main makes its first call to WriteLine, the JITCompiler function is called. The JITCompiler function is responsible for compiling a method’s IL code into native CPU instructions. Because the IL is being compiled “just in time,” this component of the CLR is frequently referred to as a JITter or a JIT compiler.
     A performance hit is incurred only the first time a method is called. All subsequent calls to the method execute at the full speed of the native code because verification and compilation to native code don’t need to be performed again.
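     The first-call cost can be observed roughly with a stopwatch. This is only a sketch with hypothetical names (JitTiming, SumTo, TimeOneCall). The measured numbers vary from run to run, and the first measurement may also include one-time costs unrelated to SumTo, so treat the output as an indication rather than a benchmark.

```csharp
using System;
using System.Diagnostics;

public static class JitTiming
{
    public static long SumTo(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++) total += i;
        return total;
    }

    // Times a single call of SumTo and returns the elapsed stopwatch ticks.
    public static long TimeOneCall()
    {
        Stopwatch watch = Stopwatch.StartNew();
        SumTo(10);
        watch.Stop();
        return watch.ElapsedTicks;
    }

    public static void Main()
    {
        // The first call includes JIT compilation of SumTo; the second does not.
        Console.WriteLine("First call:  " + TimeOneCall() + " ticks");
        Console.WriteLine("Second call: " + TimeOneCall() + " ticks");
    }
}
```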
8. Unsafe code
     When the JIT compiler attempts to compile an unsafe method, it checks to see if the assembly containing the method has been granted the System.Security.Permissions.SecurityPermission with the System.Security.Permissions.SecurityPermissionFlag’s SkipVerification flag set. If this flag is set, the JIT compiler will compile the unsafe code and allow it to execute. The CLR is trusting this code and is hoping the direct address and byte manipulations do not cause any harm. If the flag is not set, the JIT compiler throws either a System.InvalidProgramException or a System.Security.VerificationException, preventing the method from executing. In fact, the whole application will probably terminate at this point, but at least no harm can be done.
     Microsoft supplies a utility called PEVerify.exe, which examines all of an assembly’s methods and notifies you of any methods that contain unsafe code. You may want to consider running PEVerify.exe on assemblies that you are referencing; this will let you know if there may be problems running your application via the intranet or Internet.
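     For reference, this is what an unsafe method looks like in C#. It is a minimal sketch with hypothetical names (UnsafeDemo, Sum), and it must be compiled with the compiler's /unsafe switch. The method manipulates memory addresses directly, which is exactly what verification cannot prove safe.

```csharp
using System;

public static class UnsafeDemo
{
    // Sums an array by walking it with a raw pointer. The method is marked
    // unsafe, and the file must be compiled with the /unsafe switch.
    public static unsafe int Sum(int[] values)
    {
        int total = 0;
        // fixed pins the array so the garbage collector cannot move it.
        fixed (int* start = values)
        {
            for (int* p = start; p < start + values.Length; p++)
                total += *p;
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(Sum(new[] { 1, 2, 3, 4 }));   // prints 10
    }
}
```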
9. The framework class library - FCL
     The .NET Framework includes the Framework Class Library (FCL).
     Some general FCL namespaces:
     Namespace                          Description
     System                             All of the basic types used by every application
     System.Data                        Types for communicating with a database and processing data
     System.IO                          Types for doing stream I/O and walking directories and files
     System.Net                         Types that allow for low-level network communications and working with some common Internet protocols
     System.Runtime.InteropServices     Types that allow managed code to access unmanaged OS platform facilities such as COM components and functions in Win32 or custom DLLs
     System.Security                    Types used for protecting data and resources
     System.Text                        Types to work with text in different encodings, such as ASCII and Unicode
     System.Threading                   Types used for asynchronous operations and synchronizing access to resources
     System.Xml                         Types used for processing Extensible Markup Language (XML) schemas and data
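     As a small illustration of how these namespaces combine, the sketch below uses System.Text to encode a string and System.IO to push the bytes through an in-memory stream. The class and method names (FclSampler, RoundTrip) are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class FclSampler
{
    // Encodes text as UTF-8 (System.Text), writes the bytes to an
    // in-memory stream (System.IO), and decodes them again.
    public static string RoundTrip(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        using (MemoryStream stream = new MemoryStream())
        {
            stream.Write(bytes, 0, bytes.Length);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip("hello, FCL"));
    }
}
```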
10. The Common Type System - CTS
     It should be obvious to you that the CLR is all about types. Types expose functionality to your applications and other types. Types are the mechanism by which code written in one programming language can talk to code written in a different programming language. Because types are at the root of the CLR, Microsoft created a formal specification, the Common Type System (CTS), that describes how types are defined and how they behave.
     (1) Type members
     The CTS specification states that a type can contain zero or more members.
     Field: A data variable that is part of the object’s state. Fields are identified by their name and type.
     Method: A function that performs an operation on the object, often changing the object’s state. Methods have a name, a signature, and modifiers. The signature specifies the number of parameters (and their sequence), the types of the parameters, whether a value is returned by the method, and if so, the type of the value returned by the method.
     Property: To the caller, this member looks like a field. But to the type implementer, it looks like a method (or two). Properties allow an implementer to validate input parameters and object state before accessing the value, and to calculate a value only when necessary. They also allow a user of the type to have simplified syntax. Finally, properties allow you to create read-only or write-only “fields.”
     Event: An event allows a notification mechanism between an object and other interested objects. For example, a button could offer an event that notifies other objects when the button is clicked.
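     The four kinds of members map directly onto C# syntax. The Counter class below is a hypothetical example, not from the book, that declares one member of each kind.

```csharp
using System;

public class Counter
{
    // Field: part of the object's state.
    private int count;

    // Event: notifies interested objects when the count changes.
    public event EventHandler Changed;

    // Property: looks like a field to callers, but runs validation code.
    public int Count
    {
        get { return count; }
        set
        {
            if (value < 0) throw new ArgumentOutOfRangeException("value");
            count = value;
            EventHandler handler = Changed;
            if (handler != null) handler(this, EventArgs.Empty);
        }
    }

    // Method: performs an operation that changes the object's state.
    public void Increment()
    {
        Count = Count + 1;
    }
}

public static class CounterDemo
{
    public static void Main()
    {
        Counter counter = new Counter();
        counter.Changed += (sender, e) => Console.WriteLine("Changed");
        counter.Increment();
        Console.WriteLine(counter.Count);
    }
}
```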
     (2) Member access
     A type that is visible to a caller can further restrict the ability of the caller to access the type’s members. The following list shows the valid options for controlling access to a member:
     Private: The member is accessible only by other members in the same class type.
     Family: The member is accessible by derived types, regardless of whether they are within the same assembly. Note that many languages (such as C++ and C#) refer to family as protected.
     Family and assembly: The member is accessible by derived types, but only if the derived type is defined in the same assembly. Many languages (such as Visual Basic, and C# at the time of the book) don’t offer this access control; C# 7.2 later added it as private protected. Of course, IL Assembly language makes it available.
     Assembly: The member is accessible by any code in the same assembly. Many languages refer to assembly as internal.
     Family or assembly: The member is accessible by derived types in any assembly. The member is also accessible by any types in the same assembly. C# refers to family or assembly as protected internal.
     Public: The member is accessible by any code in any assembly.
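     In C# these options correspond to access modifiers. The Account and SavingsAccount classes below are hypothetical and only show which keyword matches which CTS term.

```csharp
using System;

public class Account
{
    private decimal balance;                 // Private
    protected string owner = "unknown";      // Family
    internal int auditCount;                 // Assembly
    protected internal bool flagged;         // Family or assembly

    public decimal Balance                   // Public
    {
        get { return balance; }
    }

    public void Deposit(decimal amount)
    {
        balance += amount;
        auditCount++;
    }
}

public class SavingsAccount : Account
{
    public string Describe()
    {
        // owner and flagged are reachable because this type derives from Account.
        // balance is private to Account and would not compile here.
        return owner + (flagged ? " (flagged)" : "");
    }
}

public static class AccessDemo
{
    public static void Main()
    {
        SavingsAccount account = new SavingsAccount();
        account.Deposit(5m);
        Console.WriteLine(account.Describe() + ": " + account.Balance);
    }
}
```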
     (3) The CTS allows a type to derive from only one base class.
 
     (4) All types must (ultimately) inherit from a predefined type: System.Object.
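     Both rules can be checked with reflection. In this sketch (the names Animal, Dog, and RootOf are hypothetical), every class has exactly one BaseType, and following the chain ends at System.Object.

```csharp
using System;

public class Animal { }
public class Dog : Animal { }

public static class InheritanceDemo
{
    // Walks the single BaseType chain and returns the last type reached.
    public static Type RootOf(Type type)
    {
        while (type.BaseType != null) type = type.BaseType;
        return type;
    }

    public static void Main()
    {
        Console.WriteLine(RootOf(typeof(Dog)));   // prints System.Object
        Console.WriteLine(RootOf(typeof(int)));   // prints System.Object
    }
}
```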
11. The Common Language Specification
     If you intend to create types that are easily accessible from other programming languages, you need to use only features of your programming language that are guaranteed to be available in all other languages. To help you with this, Microsoft has defined a Common Language Specification (CLS) that details for compiler vendors the minimum set of features their compilers must support if these compilers are to generate types compatible with other components written by other CLS-compliant languages on top of the CLR.
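     In C#, you can ask the compiler to check CLS compliance by applying the CLSCompliant attribute to the assembly. The Measurements class below is a hypothetical sketch. With the attribute set to true, the compiler is expected to warn about the public method that returns an unsigned integer, because unsigned types are not part of the CLS.

```csharp
using System;

[assembly: CLSCompliant(true)]

public class Measurements
{
    // CLS-compliant: Int32 is available in every CLS-compliant language.
    public int Count()
    {
        return 3;
    }

    // Not CLS-compliant: the compiler should warn about the UInt32 return type.
    public uint RawCount()
    {
        return 3;
    }

    // Private members are not checked, because other languages cannot see them.
    private uint Hidden()
    {
        return 0;
    }
}

public static class ClsDemo
{
    public static void Main()
    {
        Measurements m = new Measurements();
        Console.WriteLine(m.Count() + " " + m.RawCount());
    }
}
```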
12. Interoperability with Unmanaged Code
     Specifically, the CLR supports three interoperability scenarios:
  • Managed code can call an unmanaged function in a DLL.
  • Managed code can use an existing COM component (server).
  • Unmanaged code can use a managed type (server).
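     The first scenario, calling an unmanaged function in a DLL, is usually done with P/Invoke. The sketch below is Windows-only and uses a hypothetical class name (NativeInterop). It calls the Win32 function GetCurrentProcessId in kernel32.dll and compares the result with the managed equivalent.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

public static class NativeInterop
{
    // Declares the unmanaged GetCurrentProcessId function exported by kernel32.dll.
    [DllImport("kernel32.dll")]
    public static extern uint GetCurrentProcessId();

    public static void Main()
    {
        uint nativeId = GetCurrentProcessId();
        int managedId = Process.GetCurrentProcess().Id;
        Console.WriteLine("Native: {0}, managed: {1}", nativeId, managedId);
    }
}
```

     Both values should be the same, since they identify the same process.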