I've heard of stackless languages. However I don't have any idea how such a language would be implemented. Can someone explain?
我听说过无堆语言。但是,我不知道如何实现这样的语言。谁能解释一下?
8 个解决方案
#1
The modern operating systems we have (Windows, Linux) operate with what I call the "big stack model". And that model is wrong, sometimes, and motivates the need for "stackless" languages.
我们拥有的现代操作系统(Windows,Linux)以我称之为“大堆栈模型”的方式运行。有时,这种模式是错误的,并激发了对“无堆叠”语言的需求。
The "big stack model" assumes that a compiled program will allocate "stack frames" for function calls in a contiguous region of memory, using machine instructions to adjust registers containing the stack pointer (and optional stack frame pointer) very rapidly. This leads to fast function call/return, at the price of having a large, contiguous region for the stack. Because 99.99% of all programs run under these modern OSes work well with the big stack model, the compilers, loaders, and even the OS "know" about this stack area.
“大堆栈模型”假设编译的程序将在连续的存储器区域中为函数调用分配“堆栈帧”,使用机器指令非常快速地调整包含堆栈指针(和可选的堆栈帧指针)的寄存器。这导致快速的函数调用/返回,代价是具有堆栈的大的连续区域。因为在这些现代操作系统下运行的所有程序中99.99%适用于大堆栈模型,编译器,加载器甚至操作系统“都知道”这个堆栈区域。
One common problem all such applications have is, "how big should my stack be?". With memory being dirt cheap, mostly what happens is that a large chunk is set aside for the stack (MS defaults to 1Mb), and typical application call structure never gets anywhere near to using it up. But if an application does use it all up, it dies with an illegal memory reference ("I'm sorry Dave, I can't do that"), by virtue of reaching off the end of its stack.
所有这些应用程序的一个常见问题是“我的堆栈应该有多大?”。由于内存很便宜,所以大多数情况下会发生大量的数据块(MS默认为1Mb),并且典型的应用程序调用结构永远不会接近使用它。但是如果一个应用程序确实全部使用它,它会因为非法内存引用而死(“我很抱歉Dave,我不能这样做”),因为它已经到了堆栈的末尾。
Most so-called called "stackless" languages aren't really stackless. They just don't use the contiguous stack provided by these systems. What they do instead is allocate a stack frame from the heap on each function call. The cost per function call goes up somewhat; if functions are typically complex, or the language is interpretive, this additional cost is insignificant. (One can also determine call DAGs in the program call graph and allocate a heap segment to cover the entire DAG; this way you get both heap allocation and the speed of classic big-stack function calls for all calls inside the call DAG).
大多数所谓的“无堆栈”语言并非真正无堆叠。它们只是不使用这些系统提供的连续堆栈。他们所做的是在每次函数调用时从堆中分配堆栈帧。每个函数调用的成本有所上升;如果函数通常很复杂,或者语言是解释性的,那么这个额外的成本是微不足道的。 (也可以在程序调用图中确定调用DAG并分配一个堆段来覆盖整个DAG;这样就可以获得堆分配和调用DAG内所有调用的经典大堆函数调用的速度)。
There are several reasons for using heap allocation for stack frames:
使用堆分配堆栈帧有几个原因:
1) If the program does deep recursion dependent on the specific problem it is solving, it is very hard to preallocate a "big stack" area in advance because the needed size isn't known. One can awkwardly arrange function calls to check to see if there's enough stack left, and if not, reallocate a bigger chunk, copy the old stack and readjust all the pointers into the stack; that's so awkward that I don't know of any implementations. Allocating stack frames means the application never has to say its sorry until there's literally no allocatable memory left.
1)如果程序根据它正在解决的特定问题进行深度递归,则很难预先分配“大堆栈”区域,因为所需的大小是未知的。可以笨拙地安排函数调用以检查是否有足够的堆栈,如果没有,重新分配更大的块,复制旧堆栈并重新调整所有指针进入堆栈;这太尴尬了,我不知道任何实现。分配堆栈帧意味着应用程序永远不必说对不起,直到实际上没有可分配的内存。
2) The program forks subtasks. Each subtask requires its own stack, and therefore can't use the one "big stack" provided. So, one needs to allocate stacks for each subtask. If you have thousands of possible subtasks, you might now need thousands of "big stacks", and the memory demand suddenly gets ridiculous. Allocating stack frames solves this problem. Often the subtask "stacks" refer back to the parent tasks to implement lexical scoping; as subtasks fork, a tree of "substacks" is created called a "cactus stack".
2)程序分叉子任务。每个子任务都需要自己的堆栈,因此不能使用提供的一个“大堆栈”。因此,需要为每个子任务分配堆栈。如果你有数千个可能的子任务,你现在可能需要成千上万的“大堆栈”,内存需求突然变得荒谬。分配堆栈帧解决了这个问题。子任务“堆栈”通常会引用父任务来实现词法作用域;作为子任务fork,创建一个称为“仙人掌堆栈”的“substacks”树。
3) Your language has continuations. These require that the data in lexical scope visible to the current function somehow be preserved for later reuse. This can be implemented by copying parent stack frames, climbing up the cactus stack, and proceeding.
3)你的语言有延续。这些要求以某种方式保留当前函数可见的词法范围中的数据以供以后重用。这可以通过复制父堆栈帧,爬上仙人掌堆栈并继续进行来实现。
The PARLANSE programming langauge I implemented does 1) and 2). I'm working on 3).
我实现的PARLANSE编程语言有1)和2)。我正在努力3)。
#2
Stackless Python still has a Python stack (though it may have tail call optimization and other call frame merging tricks), but it is completely divorced from the C stack of the interpreter.
Stackless Python仍然有一个Python堆栈(虽然它可能有尾调用优化和其他调用框架合并技巧),但它完全脱离了解释器的C堆栈。
Haskell (as commonly implemented) does not have a call stack; evaluation is based on graph reduction.
Haskell(通常实现)没有调用堆栈;评估基于图表缩减。
#3
There is a nice article about the language framework Parrot at http://www.linux-mag.com/cache/7373/1.html. Parrot does not use the stack for calling and this article explains the technique a bit.
在http://www.linux-mag.com/cache/7373/1.html上有一篇关于语言框架Parrot的好文章。 Parrot不使用堆栈进行调用,本文稍微解释了该技术。
#4
In the stackless environments I'm more or less familiar with (Turing machine, assembly, and Brainfuck), it's common to implement your own stack. There is nothing fundamental about having a stack built into the language.
在无框架环境中,我或多或少熟悉(图灵机,汇编和Brainfuck),实现自己的堆栈很常见。将语言堆叠在语言中并没有什么根本。
In the most practical of these, assembly, you just choose a region of memory available to you, set the stack register to point to the bottom, then increment or decrement to implement your pushes and pops.
在最实用的这些程序集中,您只需选择一个可用的内存区域,将堆栈寄存器设置为指向底部,然后递增或递减以实现您的推送和弹出。
EDIT: I know some architectures have dedicated stacks, but they aren't necessary.
编辑:我知道有些架构有专门的堆栈,但它们不是必需的。
#5
There is an easy to understand description of continuations on this article: http://www.defmacro.org/ramblings/fp.html
本文有一个易于理解的延续描述:http://www.defmacro.org/ramblings/fp.html
Continuations are something you can pass into a function in a stack-based language, but which can also be used by a language's own semantics to make it "stackless". Of course the stack is still there, but as Ira Baxter described, it's not one big contiguous segment.
Continuations是你可以在基于堆栈的语言中传递给函数的东西,但它也可以被语言自己的语义用来使它“无堆栈”。当然堆栈仍然存在,但正如Ira Baxter描述的那样,它不是一个大的连续段。
#6
Say you wanted to implement stackless C. The first thing to realize is that this doesn't need a stack:
假设您想要实现无堆栈C.首先要意识到的是,这不需要堆栈:
a == b
But, does this?
但是,这样吗?
isequal(a, b) { return a == b; }
No. Because a smart compiler will inline calls to isequal
, turning them into a == b
. So, why not just inline everything? Sure, you will generate more code but if getting rid of the stack is worth it to you then this is easy with a small tradeoff.
不会。因为智能编译器会将对isequal的调用内联,所以将它们转换为== b。那么,为什么不直接内联一切呢?当然,你会产生更多的代码但是如果摆脱堆栈对你来说是值得的,那么这很容易通过一个小的权衡。
What about recursion? No problem. A tail-recursive function like:
递归怎么样?没问题。尾递归函数如:
bang(x) { return x == 1 ? 1 : x * bang(x-1); }
Can still be inlined, because really it's just a for loop in disguise:
仍然可以内联,因为它实际上只是伪装的for循环:
bang(x) {
for(int i = x; i >=1; i--) x *= x-1;
return x;
}
In theory a really smart compiler could figure that out for you. But a less-smart one could still flatten it as a goto:
理论上,一个非常聪明的编译器可以为您解决这个问题。但是一个不那么聪明的人仍然可以将它作为一个转到:
ax = x;
NOTDONE:
if(ax > 1) {
x = x*(--ax);
goto NOTDONE;
}
There is one case where you have to make a small trade off. This can't be inlined:
有一种情况你必须做一个小的交易。这不能内联:
fib(n) { return n <= 2 ? n : fib(n-1) + fib(n-2); }
Stackless C simply cannot do this. Are you giving up a lot? Not really. This is something normal C can't do well very either. If you don't believe me just call fib(1000)
and see what happens to your precious computer.
Stackless C根本无法做到这一点。你放弃了很多吗?并不是的。这是正常的C也无法做得很好的事情。如果你不相信我只是打电话给fib(1000),看看你宝贵的电脑会发生什么。
#7
Call me ancient, but I can remember when the FORTRAN standards and COBOL did not support recursive calls, and therefore didn't require a stack. Indeed, I recall the implementations for CDC 6000 series machines where there wasn't a stack, and FORTRAN would do strange things if you tried to call a subroutine recursively.
叫我古老,但我记得FORTRAN标准和COBOL不支持递归调用,因此不需要堆栈。实际上,我记得没有堆栈的CDC 6000系列机器的实现,如果你试图递归调用子程序,FORTRAN会做一些奇怪的事情。
For the record, instead of a call-stack, the CDC 6000 series instruction set used the RJ instruction to call a subroutine. This saved the current PC value at the call target location and then branches to the location following it. At the end, a subroutine would perform an indirect jump to the call target location. That reloaded saved PC, effectively returning to the caller.
对于记录,CDC 6000系列指令集使用RJ指令调用子程序而不是调用堆栈。这将当前PC值保存在呼叫目标位置,然后分支到其后的位置。最后,子程序将执行到呼叫目标位置的间接跳转。重新加载保存的PC,有效地返回给调用者。
Obviously, that does not work with recursive calls. (And my recollection is that the CDC FORTRAN IV compiler would generate broken code if you did attempt recursion ...)
显然,这不适用于递归调用。 (我的回忆是,如果你尝试递归,CDC FORTRAN IV编译器会生成破坏的代码......)
#8
Please feel free to correct me if I'm wrong, but I would think that allocating memory on the heap for each function call frame would cause extreme memory thrashing. The operating system does after all have to manage this memory. I would think that the way to avoid this memory thrashing would be a cache for call frames. So if you need a cache anyway, we might as well make it contigous in memory and call it a stack.
如果我错了,请随意纠正我,但我认为在堆上为每个函数调用框分配内存会导致极端的内存抖动。毕竟操作系统必须管理这个内存。我认为避免这种内存颠簸的方法是调用帧的缓存。所以如果你还需要一个缓存,我们不妨在内存中使它成为一个堆栈。
#1
The modern operating systems we have (Windows, Linux) operate with what I call the "big stack model". And that model is wrong, sometimes, and motivates the need for "stackless" languages.
我们拥有的现代操作系统(Windows,Linux)以我称之为“大堆栈模型”的方式运行。有时,这种模式是错误的,并激发了对“无堆叠”语言的需求。
The "big stack model" assumes that a compiled program will allocate "stack frames" for function calls in a contiguous region of memory, using machine instructions to adjust registers containing the stack pointer (and optional stack frame pointer) very rapidly. This leads to fast function call/return, at the price of having a large, contiguous region for the stack. Because 99.99% of all programs run under these modern OSes work well with the big stack model, the compilers, loaders, and even the OS "know" about this stack area.
“大堆栈模型”假设编译的程序将在连续的存储器区域中为函数调用分配“堆栈帧”,使用机器指令非常快速地调整包含堆栈指针(和可选的堆栈帧指针)的寄存器。这导致快速的函数调用/返回,代价是具有堆栈的大的连续区域。因为在这些现代操作系统下运行的所有程序中99.99%适用于大堆栈模型,编译器,加载器甚至操作系统“都知道”这个堆栈区域。
One common problem all such applications have is, "how big should my stack be?". With memory being dirt cheap, mostly what happens is that a large chunk is set aside for the stack (MS defaults to 1Mb), and typical application call structure never gets anywhere near to using it up. But if an application does use it all up, it dies with an illegal memory reference ("I'm sorry Dave, I can't do that"), by virtue of reaching off the end of its stack.
所有这些应用程序的一个常见问题是“我的堆栈应该有多大?”。由于内存很便宜,所以大多数情况下会发生大量的数据块(MS默认为1Mb),并且典型的应用程序调用结构永远不会接近使用它。但是如果一个应用程序确实全部使用它,它会因为非法内存引用而死(“我很抱歉Dave,我不能这样做”),因为它已经到了堆栈的末尾。
Most so-called called "stackless" languages aren't really stackless. They just don't use the contiguous stack provided by these systems. What they do instead is allocate a stack frame from the heap on each function call. The cost per function call goes up somewhat; if functions are typically complex, or the language is interpretive, this additional cost is insignificant. (One can also determine call DAGs in the program call graph and allocate a heap segment to cover the entire DAG; this way you get both heap allocation and the speed of classic big-stack function calls for all calls inside the call DAG).
大多数所谓的“无堆栈”语言并非真正无堆叠。它们只是不使用这些系统提供的连续堆栈。他们所做的是在每次函数调用时从堆中分配堆栈帧。每个函数调用的成本有所上升;如果函数通常很复杂,或者语言是解释性的,那么这个额外的成本是微不足道的。 (也可以在程序调用图中确定调用DAG并分配一个堆段来覆盖整个DAG;这样就可以获得堆分配和调用DAG内所有调用的经典大堆函数调用的速度)。
There are several reasons for using heap allocation for stack frames:
使用堆分配堆栈帧有几个原因:
1) If the program does deep recursion dependent on the specific problem it is solving, it is very hard to preallocate a "big stack" area in advance because the needed size isn't known. One can awkwardly arrange function calls to check to see if there's enough stack left, and if not, reallocate a bigger chunk, copy the old stack and readjust all the pointers into the stack; that's so awkward that I don't know of any implementations. Allocating stack frames means the application never has to say its sorry until there's literally no allocatable memory left.
1)如果程序根据它正在解决的特定问题进行深度递归,则很难预先分配“大堆栈”区域,因为所需的大小是未知的。可以笨拙地安排函数调用以检查是否有足够的堆栈,如果没有,重新分配更大的块,复制旧堆栈并重新调整所有指针进入堆栈;这太尴尬了,我不知道任何实现。分配堆栈帧意味着应用程序永远不必说对不起,直到实际上没有可分配的内存。
2) The program forks subtasks. Each subtask requires its own stack, and therefore can't use the one "big stack" provided. So, one needs to allocate stacks for each subtask. If you have thousands of possible subtasks, you might now need thousands of "big stacks", and the memory demand suddenly gets ridiculous. Allocating stack frames solves this problem. Often the subtask "stacks" refer back to the parent tasks to implement lexical scoping; as subtasks fork, a tree of "substacks" is created called a "cactus stack".
2)程序分叉子任务。每个子任务都需要自己的堆栈,因此不能使用提供的一个“大堆栈”。因此,需要为每个子任务分配堆栈。如果你有数千个可能的子任务,你现在可能需要成千上万的“大堆栈”,内存需求突然变得荒谬。分配堆栈帧解决了这个问题。子任务“堆栈”通常会引用父任务来实现词法作用域;作为子任务fork,创建一个称为“仙人掌堆栈”的“substacks”树。
3) Your language has continuations. These require that the data in lexical scope visible to the current function somehow be preserved for later reuse. This can be implemented by copying parent stack frames, climbing up the cactus stack, and proceeding.
3)你的语言有延续。这些要求以某种方式保留当前函数可见的词法范围中的数据以供以后重用。这可以通过复制父堆栈帧,爬上仙人掌堆栈并继续进行来实现。
The PARLANSE programming langauge I implemented does 1) and 2). I'm working on 3).
我实现的PARLANSE编程语言有1)和2)。我正在努力3)。
#2
Stackless Python still has a Python stack (though it may have tail call optimization and other call frame merging tricks), but it is completely divorced from the C stack of the interpreter.
Stackless Python仍然有一个Python堆栈(虽然它可能有尾调用优化和其他调用框架合并技巧),但它完全脱离了解释器的C堆栈。
Haskell (as commonly implemented) does not have a call stack; evaluation is based on graph reduction.
Haskell(通常实现)没有调用堆栈;评估基于图表缩减。
#3
There is a nice article about the language framework Parrot at http://www.linux-mag.com/cache/7373/1.html. Parrot does not use the stack for calling and this article explains the technique a bit.
在http://www.linux-mag.com/cache/7373/1.html上有一篇关于语言框架Parrot的好文章。 Parrot不使用堆栈进行调用,本文稍微解释了该技术。
#4
In the stackless environments I'm more or less familiar with (Turing machine, assembly, and Brainfuck), it's common to implement your own stack. There is nothing fundamental about having a stack built into the language.
在无框架环境中,我或多或少熟悉(图灵机,汇编和Brainfuck),实现自己的堆栈很常见。将语言堆叠在语言中并没有什么根本。
In the most practical of these, assembly, you just choose a region of memory available to you, set the stack register to point to the bottom, then increment or decrement to implement your pushes and pops.
在最实用的这些程序集中,您只需选择一个可用的内存区域,将堆栈寄存器设置为指向底部,然后递增或递减以实现您的推送和弹出。
EDIT: I know some architectures have dedicated stacks, but they aren't necessary.
编辑:我知道有些架构有专门的堆栈,但它们不是必需的。
#5
There is an easy to understand description of continuations on this article: http://www.defmacro.org/ramblings/fp.html
本文有一个易于理解的延续描述:http://www.defmacro.org/ramblings/fp.html
Continuations are something you can pass into a function in a stack-based language, but which can also be used by a language's own semantics to make it "stackless". Of course the stack is still there, but as Ira Baxter described, it's not one big contiguous segment.
Continuations是你可以在基于堆栈的语言中传递给函数的东西,但它也可以被语言自己的语义用来使它“无堆栈”。当然堆栈仍然存在,但正如Ira Baxter描述的那样,它不是一个大的连续段。
#6
Say you wanted to implement stackless C. The first thing to realize is that this doesn't need a stack:
假设您想要实现无堆栈C.首先要意识到的是,这不需要堆栈:
a == b
But, does this?
但是,这样吗?
isequal(a, b) { return a == b; }
No. Because a smart compiler will inline calls to isequal
, turning them into a == b
. So, why not just inline everything? Sure, you will generate more code but if getting rid of the stack is worth it to you then this is easy with a small tradeoff.
不会。因为智能编译器会将对isequal的调用内联,所以将它们转换为== b。那么,为什么不直接内联一切呢?当然,你会产生更多的代码但是如果摆脱堆栈对你来说是值得的,那么这很容易通过一个小的权衡。
What about recursion? No problem. A tail-recursive function like:
递归怎么样?没问题。尾递归函数如:
bang(x) { return x == 1 ? 1 : x * bang(x-1); }
Can still be inlined, because really it's just a for loop in disguise:
仍然可以内联,因为它实际上只是伪装的for循环:
bang(x) {
for(int i = x; i >=1; i--) x *= x-1;
return x;
}
In theory a really smart compiler could figure that out for you. But a less-smart one could still flatten it as a goto:
理论上,一个非常聪明的编译器可以为您解决这个问题。但是一个不那么聪明的人仍然可以将它作为一个转到:
ax = x;
NOTDONE:
if(ax > 1) {
x = x*(--ax);
goto NOTDONE;
}
There is one case where you have to make a small trade off. This can't be inlined:
有一种情况你必须做一个小的交易。这不能内联:
fib(n) { return n <= 2 ? n : fib(n-1) + fib(n-2); }
Stackless C simply cannot do this. Are you giving up a lot? Not really. This is something normal C can't do well very either. If you don't believe me just call fib(1000)
and see what happens to your precious computer.
Stackless C根本无法做到这一点。你放弃了很多吗?并不是的。这是正常的C也无法做得很好的事情。如果你不相信我只是打电话给fib(1000),看看你宝贵的电脑会发生什么。
#7
Call me ancient, but I can remember when the FORTRAN standards and COBOL did not support recursive calls, and therefore didn't require a stack. Indeed, I recall the implementations for CDC 6000 series machines where there wasn't a stack, and FORTRAN would do strange things if you tried to call a subroutine recursively.
叫我古老,但我记得FORTRAN标准和COBOL不支持递归调用,因此不需要堆栈。实际上,我记得没有堆栈的CDC 6000系列机器的实现,如果你试图递归调用子程序,FORTRAN会做一些奇怪的事情。
For the record, instead of a call-stack, the CDC 6000 series instruction set used the RJ instruction to call a subroutine. This saved the current PC value at the call target location and then branches to the location following it. At the end, a subroutine would perform an indirect jump to the call target location. That reloaded saved PC, effectively returning to the caller.
对于记录,CDC 6000系列指令集使用RJ指令调用子程序而不是调用堆栈。这将当前PC值保存在呼叫目标位置,然后分支到其后的位置。最后,子程序将执行到呼叫目标位置的间接跳转。重新加载保存的PC,有效地返回给调用者。
Obviously, that does not work with recursive calls. (And my recollection is that the CDC FORTRAN IV compiler would generate broken code if you did attempt recursion ...)
显然,这不适用于递归调用。 (我的回忆是,如果你尝试递归,CDC FORTRAN IV编译器会生成破坏的代码......)
#8
Please feel free to correct me if I'm wrong, but I would think that allocating memory on the heap for each function call frame would cause extreme memory thrashing. The operating system does after all have to manage this memory. I would think that the way to avoid this memory thrashing would be a cache for call frames. So if you need a cache anyway, we might as well make it contigous in memory and call it a stack.
如果我错了,请随意纠正我,但我认为在堆上为每个函数调用框分配内存会导致极端的内存抖动。毕竟操作系统必须管理这个内存。我认为避免这种内存颠簸的方法是调用帧的缓存。所以如果你还需要一个缓存,我们不妨在内存中使它成为一个堆栈。