
Time: 2022-11-24 20:06:16

Is there any substantial optimization when omitting the frame pointer? If I have understood correctly by reading this page, -fomit-frame-pointer is used when we want to avoid saving, setting up and restoring frame pointers.


Is this done only for each function call, and if so, is it really worth avoiding a few instructions per function? Isn't that a trivial optimization? What are the actual implications of using this option, apart from the debugging limitations?


I compiled the following C code with and without this option:


int myf(int a, int b);

int main(void)
{
        int i;

        i = myf(1, 2);
}

int myf(int a, int b)
{
        return a + b;
}


# gcc -S -fomit-frame-pointer code.c -o withoutfp.s
# gcc -S code.c -o withfp.s


Running diff -u on the two files revealed the following differences in the assembly code:

--- withfp.s    2009-12-22 00:03:59.000000000 +0000
+++ withoutfp.s 2009-12-22 00:04:17.000000000 +0000
@@ -7,17 +7,14 @@
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
-       pushl   %ebp
-       movl    %esp, %ebp
        pushl   %ecx
-       subl    $36, %esp
+       subl    $24, %esp
        movl    $2, 4(%esp)
        movl    $1, (%esp)
        call    myf
-       movl    %eax, -8(%ebp)
-       addl    $36, %esp
+       movl    %eax, 20(%esp)
+       addl    $24, %esp
        popl    %ecx
-       popl    %ebp
        leal    -4(%ecx), %esp
        ret
        .size   main, .-main
@@ -25,11 +22,8 @@
 .globl myf
        .type   myf, @function
 myf:
-       pushl   %ebp
-       movl    %esp, %ebp
-       movl    12(%ebp), %eax
-       addl    8(%ebp), %eax
-       popl    %ebp
+       movl    8(%esp), %eax
+       addl    4(%esp), %eax
        ret
        .size   myf, .-myf
        .ident  "GCC: (GNU) 4.2.1 20070719 

Could someone please explain the key points in the code above where -fomit-frame-pointer actually made a difference?

Edit: objdump's output replaced with gcc -S's


4 Answers



-fomit-frame-pointer makes one extra register available for general-purpose use. I would assume this really matters only on 32-bit x86, which is somewhat starved for registers.*


You would expect EBP to no longer be saved and adjusted on every function call, some additional use of EBP in ordinary code, and fewer stack operations wherever EBP serves as a general-purpose register.


Your code is far too simple to see any benefit from this sort of optimization: you're not using enough registers. Also, you haven't turned on the optimizer, which might be necessary to see some of these effects.

* ISA registers, not micro-architecture registers.




The only downside of omitting it is that debugging is much more difficult.


The major upside is one extra general-purpose register, which can make a big difference to performance. Obviously this extra register is used only when needed (probably not in your very simple function); in some functions it makes more difference than in others.




You can often get more meaningful assembly code from GCC by using the -S argument to output the assembly:


$ gcc code.c -S -o withfp.s
$ gcc code.c -S -o withoutfp.s -fomit-frame-pointer
$ diff -u withfp.s withoutfp.s

The -S output contains no addresses, so the generated instructions can be compared directly. For your leaf function, myf, this gives:


-       pushl   %ebp
-       movl    %esp, %ebp
-       movl    12(%ebp), %eax
-       addl    8(%ebp), %eax
-       popl    %ebp
+       movl    8(%esp), %eax
+       addl    4(%esp), %eax

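With -fomit-frame-pointer, GCC no longer generates the code that pushes the frame pointer onto the stack and sets it up. The arguments are then addressed relative to %esp instead of %ebp, and their offsets drop by 4 bytes because the saved %ebp no longer sits on the stack below the return address.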




Profile your program to see if there is a significant difference.


Next, profile your development process. Is debugging easier or more difficult? Do you spend more time developing or less?


Optimizations without profiling are a waste of time and money.




