C++ tail recursion using 64-bit variables

I wrote a simple Fibonacci function in C++ (using Visual Studio) as an exercise, to test tail recursion and see how it works.

This is the code:

int fib_tail(int n, int res, int next) {
  if (n == 0) {
    return res;
  }
  return fib_tail(n - 1, next, res + next);
}

int main()
{
  fib_tail(10,0,1); //Tail Recursion works
}

When I compiled in Release mode, the optimized assembly used a JMP instruction instead of a CALL, so my conclusion was that tail recursion works. See the image below:

[Image: optimized assembly for the int version, showing the JMP instruction]

I wanted to run some performance tests by increasing the input variable n in my Fibonacci function, so I changed the variable type used in the function from int to unsigned long long. Then I passed a big number such as 10e+9.

This is the new function:

typedef  unsigned long long ULONG64;

ULONG64 fib_tail(ULONG64 n, ULONG64 res, ULONG64 next) {
   if (n == 0) {
     return res;
   }
   return fib_tail(n - 1, next, res + next);
}

int main()
{
  fib_tail(10e+9,0,1); //Tail recursion does not work
}

When I ran the code above I got a stack overflow exception, which made me think that tail recursion was not working. I looked at the assembly and indeed found this:

[Image: assembly for the unsigned long long version, showing a CALL instruction]

As you can see, there is now a CALL instruction, whereas I was expecting only a simple JMP. I don't understand why using an 8-byte variable disables tail recursion. Why doesn't the compiler perform the optimization in this case?

1 Answer

#1

This is one of those questions you would have to ask the people who do compiler optimisation at MS. There is really no technical reason why ANY return type should prevent a tail-recursive call from being turned into a jump. There may be OTHER reasons, such as "the code is too complex to understand" or some such.

clang 3.7, as of a build from a couple of weeks back, clearly figures it out:

_Z8fib_tailyyy:                         # @_Z8fib_tailyyy
    pushl   %ebp
    pushl   %ebx
    pushl   %edi
    pushl   %esi
    pushl   %eax
    movl    36(%esp), %ecx
    movl    32(%esp), %esi
    movl    28(%esp), %edi
    movl    24(%esp), %ebx
    movl    %ebx, %eax
    orl %edi, %eax
    je  .LBB0_1
    movl    44(%esp), %ebp
    movl    40(%esp), %eax
    movl    %eax, (%esp)            # 4-byte Spill
.LBB0_3:                                # %if.end
    movl    %ebp, %edx
    movl    (%esp), %eax            # 4-byte Reload
    addl    $-1, %ebx
    adcl    $-1, %edi
    addl    %eax, %esi
    adcl    %edx, %ecx
    movl    %ebx, %ebp
    orl %edi, %ebp
    movl    %esi, (%esp)            # 4-byte Spill
    movl    %ecx, %ebp
    movl    %eax, %esi
    movl    %edx, %ecx
    jne .LBB0_3
    jmp .LBB0_4
.LBB0_1:
    movl    %esi, %eax
    movl    %ecx, %edx
.LBB0_4:                                # %return
    addl    $4, %esp
    popl    %esi
    popl    %edi
    popl    %ebx
    popl    %ebp
    retl


main:                                   # @main
    subl    $28, %esp
    movl    $0, 20(%esp)
    movl    $1, 16(%esp)
    movl    $0, 12(%esp)
    movl    $0, 8(%esp)
    movl    $2, 4(%esp)
    movl    $1410065408, (%esp)     # imm = 0x540BE400
    calll   _Z8fib_tailyyy
    movl    %edx, f+4
    movl    %eax, f
    xorl    %eax, %eax
    addl    $28, %esp
    retl
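As a side check on the listing above: 10e+9 is a double literal equal to 10^10, which is converted to unsigned long long at the call. In this 32-bit build the argument is stored as two 32-bit halves, which is where the immediates 1410065408 and 2 come from. The following is only a sketch to confirm the arithmetic; low_half, high_half and kN are hypothetical names that do not appear in the original post.

```cpp
#include <cstdint>

// Hypothetical helpers (not from the original post): split a 64-bit
// value into the two 32-bit words a 32-bit target stores separately.
constexpr std::uint32_t low_half(std::uint64_t v) {
    return static_cast<std::uint32_t>(v & 0xFFFFFFFFu);
}
constexpr std::uint32_t high_half(std::uint64_t v) {
    return static_cast<std::uint32_t>(v >> 32);
}

// 10e+9 means 10 * 10^9 = 10^10. The value is integral and in range,
// so the conversion from double to a 64-bit unsigned integer is exact.
constexpr std::uint64_t kN = static_cast<std::uint64_t>(10e+9);

// These match the immediates stored at (%esp) and 4(%esp) in main.
static_assert(low_half(kN) == 1410065408u, "low word is 0x540BE400");
static_assert(high_half(kN) == 2u, "high word is 2");
```

Since 2 * 2^32 + 1410065408 = 10000000000, the call in the listing really is fib_tail(10^10, 0, 1).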

gcc 4.9.2 performs the same optimisation if you give it -O2 (but not at -O1, which was all clang needed).

(And of course also in 64-bit mode)
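If the result must not depend on whether a particular compiler eliminates the tail call, the recursion can be written as a loop by hand. This is a minimal sketch of the loop that the optimisation effectively produces, not the output of any specific compiler; fib_loop is a hypothetical name, and the parameters mirror fib_tail from the question.

```cpp
#include <cstdint>

// Hand-written loop equivalent of fib_tail(n, res, next).
// Each iteration does what one tail call would do: (n, res, next)
// becomes (n - 1, next, res + next), reusing the same stack frame.
std::uint64_t fib_loop(std::uint64_t n, std::uint64_t res, std::uint64_t next) {
    while (n != 0) {
        // Unsigned arithmetic wraps modulo 2^64, so large n does not
        // cause undefined behaviour, only a wrapped result.
        const std::uint64_t sum = res + next;
        res = next;
        next = sum;
        --n;
    }
    return res;
}
```

For example, fib_loop(10, 0, 1) returns 55, the same value as fib_tail(10, 0, 1). With n = 10e+9 the stack depth stays constant, although the returned value is only the Fibonacci number modulo 2^64.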
