I want to understand precisely what happens behind the scenes when I spawn a new thread in .NET, as in this example:
Thread t = new Thread(DoWork); //I am not interested in DoWork per se
t.Start();
1. What thread-related objects are created in CLR and Windows kernel?
2. Why are those objects needed?
3. How much managed/unmanaged memory (heap and stack) is allocated on x86, x64 Windows?
UPDATE
I am looking for objects such as the managed thread object, which I assume is t, plus perhaps some additional managed objects; the kernel thread object, the user-mode thread environment block, and the like.
Many thanks!
3 answers
#1
9
Win32 and Kernel memory allocated
I'm not exactly sure how the .NET part works, but if the runtime does decide to create a real OS thread, it will eventually call the Win32 API CreateThread in kernel32.dll, probably from mscorlib.ni.dll.
By default, new threads get 1 MB of virtual address space reserved for the stack, which is committed as needed. This can be controlled with the maxStackSize parameter of the Thread constructor. The main thread's stack size comes from a parameter in the executable file itself.
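As a minimal sketch of the maxStackSize parameter mentioned above, the following uses the Thread(ThreadStart, int) constructor overload. The class and method names (StackSizeDemo, RunWithStack) and the 256 KB figure are illustrative assumptions, not part of the original answer, and the runtime or OS may round the requested size up.

```csharp
using System;
using System.Threading;

static class StackSizeDemo
{
    // Hypothetical helper: starts a thread with the requested stack
    // reservation and reports whether its body actually ran.
    public static bool RunWithStack(int maxStackSize)
    {
        bool ran = false;
        // Second constructor argument is the maximum stack size in bytes.
        Thread t = new Thread(() => ran = true, maxStackSize);
        t.Start();
        t.Join(); // Join makes the write to 'ran' visible here.
        return ran;
    }

    static void Main()
    {
        // Ask for 256 KB instead of the default reservation.
        bool ok = RunWithStack(256 * 1024);
        Console.WriteLine("Thread with custom stack size ran: " + ok);
    }
}
```

A smaller reservation lowers the per-thread address space cost, but a thread that recurses deeply can then overflow its stack sooner.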
In the process's address space, a TEB (thread environment block) will be allocated. Incidentally, the FS register on x86 points to the TEB, which is used for things like thread local storage and structured exception handling (SEH). There are probably other things allocated by Win32 that are not documented.
In creating the Win32 thread, the Win32 server process (csrss.exe) is contacted. In Process Explorer you can see that csrss holds open handles to all Win32 processes and threads, for some kind of bookkeeping.
DLLs loaded in the process will be notified of the new thread and may allocate their own memory for tracking the thread.
The kernel will create an ETHREAD object (derived from KTHREAD) from the kernel non-paged pool to track the thread's state. A kernel stack will also be allocated (12 KB by default on x86), which can be paged out (unless the thread is in a kernel mode wait state).
Why so many things need to allocate memory for a thread
Threads are the smallest preemptively scheduled unit that the OS provides and there is a lot of context connected to them. Many different components need to provide separate context for each thread because system services need to be able to deal with multiple threads doing different things all at the same time.
Some services require you to declare new threads to them explicitly but most are expected to work with new threads automatically. Sometimes this means allocating space right when the thread is started. As the thread engages other services, the amount of memory used to track the thread can increase as those services set up their own context for the thread.
How much memory is allocated
It's hard to say how much memory is allocated for a thread since it is spread across several address spaces and heaps. It will vary between Windows versions, installed components and what is loaded into the process currently.
The largest cost is generally accepted to be the 1 MB of address space reserved by default for each new thread's stack, but even with this default a single process can run many hundreds of threads without running out of address space.
If the design uses many more OS threads than there are CPUs in the system, it should be reviewed. Work queues backed by a thread pool, or lightweight threads with user-mode scheduling (fibers or another library's implementation), should be able to handle multithreading without requiring an excessive number of OS threads, which makes the memory cost of the threads unimportant.
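To illustrate the work-queue approach, here is a small sketch that pushes many short work items through the .NET ThreadPool and counts how many distinct threads actually executed them. The names PoolDemo and CountThreadsUsed are hypothetical, and the number of pool threads used will vary by machine and load, so no particular output is implied.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class PoolDemo
{
    // Queues itemCount small work items and returns how many distinct
    // pool threads ended up executing them.
    public static int CountThreadsUsed(int itemCount)
    {
        var threadIds = new HashSet<int>();
        using (var done = new CountdownEvent(itemCount))
        {
            for (int i = 0; i < itemCount; i++)
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    // Record which managed thread ran this item.
                    lock (threadIds)
                    {
                        threadIds.Add(Thread.CurrentThread.ManagedThreadId);
                    }
                    done.Signal();
                });
            }
            // Block until every queued item has signalled completion.
            done.Wait();
        }
        return threadIds.Count;
    }

    static void Main()
    {
        int used = CountThreadsUsed(1000);
        Console.WriteLine("1000 work items ran on " + used + " pool thread(s)");
    }
}
```

The point of the design is that the per-thread costs described above are paid once per pool thread rather than once per work item.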
#2
2
So this is a really complicated question that does not really have a great answer of "x".
1. The CLR is not required to map a single CLR thread to a single OS fiber. So... this is hard to answer. I think the current version of .NET (4.0) attempts to use a 1-to-1 relationship between CLR threads and OS fibers when possible on all OSes. For previous versions of .NET (more like <= 1.1), I'm not sure this was the case on all OSes. The scheduler handles most of these objects and they won't be part of any .NET object graph. This scheduler is part of the CLR and not part of the Thread object. If you dig into the IL, you'll see many internal calls for actual execution.
2. I assume the question is "Why are those objects needed?" If so, it's because the OS host actually has to have the fiber on which to execute the code for that thread. Using the ThreadPool can greatly reduce the cost of creating them each time.
3. Sorry... it depends. A lot of it is unmanaged as well, which means the OS host could choose to handle this differently depending on load and system version.
"The logical abstraction of a thread of control is captured by an instance of the System.Threading.Thread object in the class library." http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-335.pdf
So the ECMA standard really doesn't say anything about the topic. But luckily we have...
"Because the CLR thread object is per-fiber, any information hanging off of it is also per-fiber. Thread.ManagedThreadId returns a stable ID that flows around with the CLR thread. It is not dependent on the identity of the physical OS thread, which means using it implies no form of affinity. Different fibers running on the same thread return different IDs." From Joe Duffy http://www.bluebytesoftware.com/blog/2006/11/10/FibersAndTheCLR.aspx
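The managed ID described in the quote can be observed directly. The sketch below starts a thread and reads Thread.CurrentThread.ManagedThreadId from inside it; ThreadIdDemo and IdOfNewThread are hypothetical names used only for illustration. It shows the managed ID only and makes no claim about the underlying OS thread or fiber.

```csharp
using System;
using System.Threading;

static class ThreadIdDemo
{
    // Starts a new thread and returns the ManagedThreadId observed inside it.
    public static int IdOfNewThread()
    {
        int id = 0;
        var t = new Thread(() => id = Thread.CurrentThread.ManagedThreadId);
        t.Start();
        t.Join(); // Join makes the write to 'id' visible here.
        return id;
    }

    static void Main()
    {
        int mainId = Thread.CurrentThread.ManagedThreadId;
        int workerId = IdOfNewThread();
        // The two IDs differ because both managed threads exist at once.
        Console.WriteLine("main: " + mainId + ", worker: " + workerId);
    }
}
```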
#3
1
Look here; there is a mapping between managed (i.e. CLR) primitives and unmanaged (i.e. NT kernel) ones that may answer most of your questions.