I'm designing the kernel (which I'm actually going to call the "core" just to be different, but it's basically the same thing) for an OS I'm working on. The specifics of the OS itself are irrelevant if I can't get multitasking, memory management, and other basics up and running, so I need to work on those first. I have some questions about designing a malloc routine.
I figure that malloc() is going to be either part of the kernel itself (I'm leaning towards this) or part of the program, but I'll have to write my own implementation of the C standard library either way, so I get to write a malloc. My question in this regard is actually rather simple: how does C (or C++) manage its heap?
What I've always been taught in theory classes is that the heap is an ever-expanding piece of memory, starting at a specified address, and in a lot of senses behaving like a stack. In this way, I know that variables declared in global scope are at the beginning, and more variables are "pushed" onto the heap as they are declared in their respective scopes, and variables that go out of scope are simply left in memory space, but that space is marked as free so the heap can expand more if it needs to.
What I need to know is, how on earth does C actually handle a dynamically expanding heap in this manner? Does a compiled C program make its own calls to a malloc routine and handle its own heap, or do I need to provide it with an automatically expanding space? Also, how does the C program know where the heap begins?
Oh, and I know that the same concepts apply to other languages, but I would like any examples to be in C/C++ because I'm most comfortable with that language. I also would like to not worry about other things such as the stack, as I think I'm able to handle things like this on my own.
So I suppose my real question is, other than malloc/free (which handles getting and freeing pages for itself, etc) does a program need the OS to provide anything else?
Thanks!
EDIT: I'm more interested in how C uses malloc in relation to the heap than in the actual workings of the malloc routine itself. If it helps, I'm doing this on x86, but C is cross-platform, so it shouldn't matter. ^_^
EDIT FURTHER: I understand that I may be getting terms confused. I was taught that the "heap" was where the program stored things like global/local variables. I'm used to dealing with a "stack" in assembly programming, and I just realized that I probably mean that instead. A little research on my part shows that "heap" is more commonly used to refer to the total memory that a program has allocated for itself, or, the total number (and order) of pages of memory the OS has provided.
So, with that in mind, how do I deal with an ever expanding stack? (it does appear that my C theory class was mildly... flawed.)
7 answers
#1
15
malloc is generally implemented in the C runtime in userspace, relying on specific OS system calls to map in pages of virtual memory. The job of malloc and free is to manage those pages of memory, which are fixed in size (typically 4 KB, but sometimes bigger), and to slice and dice them into pieces that applications can use.
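To illustrate the "slice and dice" part, here is a minimal sketch of carving aligned chunks out of a single page. It assumes a static 4 KB array stands in for a page obtained from the OS, and it never frees anything; the names `page_alloc_chunk`, `PAGE_SIZE`, and `ALIGNMENT` are made up for this example and do not come from any real C library.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed page size */
#define ALIGNMENT 16u     /* assumed alignment for every chunk */

/* Stand-in for one page handed out by the OS (hypothetical). */
static _Alignas(16) unsigned char page[PAGE_SIZE];
static size_t page_used = 0;   /* bytes already handed out */

/* Round n up to the next multiple of ALIGNMENT. */
static size_t align_up(size_t n) {
    return (n + ALIGNMENT - 1) & ~(size_t)(ALIGNMENT - 1);
}

/* Carve the next chunk out of the page; NULL when it does not fit. */
void *page_alloc_chunk(size_t size) {
    size_t need = align_up(size);
    /* need < size catches wrap-around for absurdly large requests. */
    if (size == 0 || need < size || need > PAGE_SIZE - page_used) {
        return NULL;
    }
    void *chunk = &page[page_used];
    page_used += need;
    return chunk;
}
```

A real malloc also has to track freed chunks and fetch further pages when this one runs out, which is where most of the complexity lives.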
See, for example, the GNU libc implementation.
For a much simpler implementation, check out the MIT operating systems class from last year. Specifically, see the final lab handout, and take a look at lib/malloc.c. This code uses the operating system JOS developed in the class. The way it works is that it reads through the page tables (provided read-only by the OS), looking for unmapped virtual address ranges. It then uses the sys_page_alloc and sys_page_unmap system calls to map and unmap pages into the current process.
#2
13
There are multiple ways to tackle the problem.
Most often, C programs have their own malloc/free functionality, which serves the small objects. Initially (and whenever its memory is exhausted) the memory manager asks the OS for more memory. The traditional ways to do this are mmap and sbrk on the Unix variants (GlobalAlloc / LocalAlloc on Win32).
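As a rough sketch of what "asking the OS for more memory" looks like on a Unix-like system, the following wraps an anonymous mmap. The wrapper names `os_get_pages` and `os_release_pages` are invented for this example; `mmap`, `munmap`, and `sysconf` are the real POSIX calls. On your own OS the equivalent would be whatever system call you choose to expose.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Ask the OS for `pages` fresh, zero-filled pages; NULL on failure. */
void *os_get_pages(size_t pages) {
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0 || pages == 0) return NULL;
    if (pages > SIZE_MAX / (size_t)page_size) return NULL;  /* overflow guard */
    size_t len = pages * (size_t)page_size;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hand the pages back to the OS. Returns 0 on success, -1 on failure. */
int os_release_pages(void *p, size_t pages) {
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0) return -1;
    return munmap(p, pages * (size_t)page_size);
}
```

A userspace malloc would call something like `os_get_pages` only when its own free memory runs out, then split the returned region into smaller allocations itself.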
I suggest that you take a look at the Doug Lea memory allocator (google: dlmalloc) from a memory provider's (e.g. the OS's) point of view. That allocator is top-notch and has hooks for all major operating systems. If you want to know what a high-performance allocator expects from an OS, this code is your first choice.
#3
4
Are you confusing the heap and the stack?
I ask because you mention "an ever expanding piece of memory", scope and pushing variables on the heap as they are declared. That sure sounds like you are actually talking about the stack.
In the most common C implementations, declarations of automatic variables like
int i;
are generally going to result in i being allocated on the stack. In general, malloc won't get involved unless you explicitly invoke it, or some library call you make invokes it.
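A small sketch of the distinction, assuming a typical implementation (the C standard itself only talks about storage durations, not about a "stack" or "heap"). The function `copy_to_heap` and the variable `global_counter` are invented for this example.

```c
#include <stdlib.h>
#include <string.h>

int global_counter = 0;   /* static storage: exists for the whole program run */

/* Returns a heap copy of `src`; the caller must free() it. */
char *copy_to_heap(const char *src) {
    size_t len = strlen(src) + 1;   /* automatic variable, typically on the stack */
    char *copy = malloc(len);       /* the pointer is automatic; the bytes are on the heap */
    if (copy != NULL) {
        memcpy(copy, src, len);
    }
    global_counter++;
    return copy;                    /* the heap bytes outlive this stack frame */
}
```

Only the explicit `malloc` call touches the heap here; `len` and `copy` disappear when the function returns, and `global_counter` was laid out when the program was loaded.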
I'd recommend looking at "Expert C Programming" by Peter Van Der Linden for background on how C programs typically work with the stack and the heap.
#4
1
Compulsory reading: Knuth - Art of Computer Programming, Volume 1, Chapter 2, Section 2.5. Otherwise, you could read Kernighan & Ritchie "The C Programming Language" to see an implementation; or, you could read Plauger "The Standard C Library" to see another implementation.
I believe that what you need to do inside your core will be somewhat different from what the programs outside the core see. In particular, the in-core memory allocation for programs will be dealing with virtual memory, etc., whereas the programs outside the core simply see the results of what the core has provided.
#5
1
Read about virtual memory management (paging). It's highly CPU-specific, and every OS implements VM management specially for every supported CPU. If you're writing your OS for x86/amd64, read their respective manuals.
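As one concrete instance of how CPU-specific this is, here is a sketch of how a virtual address is split under classic 32-bit x86 paging with 4 KB pages and no PAE: 10 bits of page-directory index, 10 bits of page-table index, and a 12-bit offset. The function names are invented for this example, and other modes (PAE, amd64 long mode) split the address differently.

```c
#include <stdint.h>

/* 32-bit x86, 4 KB pages, no PAE: the address splits 10 + 10 + 12 bits. */

/* Bits 31..22 select the page-directory entry. */
uint32_t pd_index(uint32_t vaddr) { return vaddr >> 22; }

/* Bits 21..12 select the entry within that page table. */
uint32_t pt_index(uint32_t vaddr) { return (vaddr >> 12) & 0x3FFu; }

/* Bits 11..0 are the byte offset inside the 4 KB page. */
uint32_t page_offset(uint32_t vaddr) { return vaddr & 0xFFFu; }
```

For example, the address 0xC0101234 decomposes into directory index 0x300, table index 0x101, and offset 0x234.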
#6
0
Generally, the C library handles the implementation of malloc, requesting memory from the OS (either via anonymous mmap or, in older systems, sbrk) as necessary. So your kernel side of things should handle allocating whole pages via something like one of those means.
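On the kernel side, "allocating whole pages" needs some record of which physical page frames are in use. One common and simple approach is a bitmap with one bit per frame. The sketch below assumes 1024 frames of 4 KB starting at physical address 0; all names and sizes here are made up for illustration and a real kernel would derive them from the memory map the bootloader reports.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_COUNT 1024u   /* assumed: 1024 frames = 4 MB of RAM */
#define FRAME_SIZE  4096u   /* assumed frame size */

static uint8_t frame_bitmap[FRAME_COUNT / 8];   /* 1 bit per frame, 1 = in use */

/* Find the first free frame, mark it used, return its index; -1 if none left. */
long frame_alloc(void) {
    for (size_t i = 0; i < FRAME_COUNT; i++) {
        if (!(frame_bitmap[i / 8] & (1u << (i % 8)))) {
            frame_bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
            return (long)i;
        }
    }
    return -1;
}

/* Mark a frame free again. Out-of-range indices are ignored. */
void frame_free(long frame) {
    if (frame < 0 || (size_t)frame >= FRAME_COUNT) return;
    frame_bitmap[frame / 8] &= (uint8_t)~(1u << (frame % 8));
}

/* Physical address where a frame starts (assuming frame 0 is at address 0). */
uintptr_t frame_address(long frame) {
    return (uintptr_t)frame * FRAME_SIZE;
}
```

The system call that backs mmap or sbrk would then take frames from an allocator like this and map them into the calling process's address space.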
Then it's up to malloc to dole out memory in a way that doesn't fragment the free memory too much. I'm not too au fait with the details of this, though; however, the term arena comes to mind. If I can hunt down a reference, I'll update this post.
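One simple way to limit fragmentation is to serve all requests of a given size class from a pool of equally sized blocks, so that any freed block can satisfy any later request. The sketch below is a fixed-size pool with an intrusive free list. It is not how glibc's arenas work, only an illustration of the general idea, and every name and size in it is an assumption.

```c
#include <stddef.h>

#define BLOCK_SIZE  64u   /* assumed size class */
#define BLOCK_COUNT 64u   /* 64 blocks * 64 bytes = one 4 KB page */

typedef union block {
    union block *next;                  /* valid only while the block is free */
    unsigned char payload[BLOCK_SIZE];  /* what the caller gets to use */
} block_t;

static block_t pool[BLOCK_COUNT];
static block_t *free_list = NULL;
static int pool_ready = 0;

/* Thread every block onto the free list. */
static void pool_init(void) {
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
    pool_ready = 1;
}

/* Pop a block off the free list; NULL when the pool is exhausted. */
void *pool_alloc(void) {
    if (!pool_ready) pool_init();
    block_t *b = free_list;
    if (b != NULL) free_list = b->next;
    return b;
}

/* Push a block back; it becomes the next one handed out. */
void pool_free(void *p) {
    if (p == NULL) return;
    block_t *b = p;
    b->next = free_list;
    free_list = b;
}
```

Because every block is the same size, freeing and reallocating never leaves unusable gaps; the cost is wasted space when a request is much smaller than `BLOCK_SIZE`.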
#7
0
Danger, danger!! If you're even considering attempting kernel development, you should be very aware of the cost of your resources and their relatively limited availability...
One thing about recursion is that it's very expensive (at least in kernel land). You're not going to see many functions written to simply continue unabated, or else your kernel will panic.
To underscore my point here (at *.com, heh), check out this post from the NT Debugging blog about kernel stack overflows. Specifically:
· On x86-based platforms, the kernel-mode stack is 12K.
· On x64-based platforms, the kernel-mode stack is 24K. (x64-based platforms include systems with processors using the AMD64 architecture and processors using the Intel EM64T architecture).
· On Itanium-based platforms, the kernel-mode stack is 32K with a 32K backing store.
That's really not a whole lot.
The Usual Suspects
1. Using the stack liberally.
2. Calling functions recursively.
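One common way to avoid both suspects is to replace recursion with a loop and a worklist of known, fixed size, so stack use no longer grows with the input. The sketch below counts the nodes of a binary tree that way. The `node_t` type, the `MAX_PENDING` bound, and the choice to fail with -1 when the bound is hit are all assumptions made for this example.

```c
#include <stddef.h>

typedef struct node {
    struct node *left;
    struct node *right;
} node_t;

#define MAX_PENDING 64   /* assumed bound on the worklist */

/* Count nodes without recursion. Returns -1 if the worklist would overflow. */
long count_nodes(const node_t *root) {
    const node_t *pending[MAX_PENDING];   /* fixed cost: MAX_PENDING pointers */
    size_t top = 0;
    long count = 0;

    if (root != NULL) pending[top++] = root;
    while (top > 0) {
        const node_t *n = pending[--top];
        count++;
        if (n->left != NULL) {
            if (top == MAX_PENDING) return -1;
            pending[top++] = n->left;
        }
        if (n->right != NULL) {
            if (top == MAX_PENDING) return -1;
            pending[top++] = n->right;
        }
    }
    return count;
}
```

The function's stack footprint is the same whether the tree has three nodes or three thousand, and an oversized input produces an error return instead of a stack overflow. In a kernel with a 12K stack, even this worklist might be better placed in static or separately allocated memory.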
If you read over the blog a bit, you will see how hard kernel development can be, with its rather unique set of issues. Your theory class was not wrong; it was simply simple. ;)
Going from theory to kernel development is about as significant a context switch as is possible (save, perhaps, for throwing some hypervisor interaction into the mix!!).
Anyhow, never assume; validate and test your expectations.