I'm trying to do a 32x32 -> 64-bit multiply via the x86 mul instruction, but I only need the high dword of the result (the edx register). So naturally, I tried listing edx as an output register and eax as a clobbered register.
This seems natural to me, but eax is also an input register. When I try to tell GCC that eax is clobbered, it rejects the statement:
__asm__("mull\t%2" : "=d"(div10) : "%a"(UINT32_C(0x1999999A)), "r"(number)
: "cc", "rax");
If I try that, GCC reports these errors:
divmod10.cpp:76:91: error: can’t find a register in class ‘AREG’ while reloading ‘asm’
divmod10.cpp:76:91: error: ‘asm’ operand has impossible constraints
Omitting the clobber compiles, but breaks the code. GCC ends up assuming that eax still holds the constant after the first mul, which is incorrect:
movl $429496730, %eax
#APP
# 76 "divmod10.cpp" 1
mull %esi
# 0 "" 2
#NO_APP
movl %edx, %esi
#APP
# 78 "divmod10.cpp" 1
mull %edx
# 0 "" 2
#NO_APP
How do I tell GCC that eax is an input that the instruction also overwrites?
1 Solution
#1
3
Just add a dummy temporary as an output for eax; the compiler will optimize it away. For example:
__asm__("mull\t%3" : "=d"(div10), "=a"((int){0})
: "a"(UINT32_C(0x1999999A)), "r"(number) : "cc");
That's the easiest way I know to handle clobbered inputs.
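For reference, here is a minimal sketch of the same dummy-output idea wrapped in a function. The helper name `mul_high_by_1999999A` and the named variable `low` are illustrative assumptions, not part of the original code. A named dummy is used instead of a compound literal because compound literals are only a GNU extension in C++. Note that adding a second output shifts the operand numbering: `%0` and `%1` are the two outputs, `%2` is the constant in eax, and `%3` is `number`, so the instruction has to reference `%3`. The non-x86 branch is a portable fallback added only so the sketch compiles elsewhere.

```cpp
#include <cstdint>

// Hypothetical helper (name is illustrative): returns the high 32 bits of
// the 64-bit product 0x1999999A * number.
static inline std::uint32_t mul_high_by_1999999A(std::uint32_t number) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    std::uint32_t high;  // edx: upper half of the product
    std::uint32_t low;   // eax: dummy output, tells GCC that eax is overwritten
    __asm__("mull\t%3"
            : "=d"(high), "=a"(low)                    // %0, %1
            : "a"(UINT32_C(0x1999999A)), "r"(number)   // %2, %3
            : "cc");
    (void)low;  // intentionally unused
    return high;
#else
    // Portable fallback (assumption: used only when not targeting x86 with GCC-style asm).
    return static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(number) * UINT32_C(0x1999999A)) >> 32);
#endif
}
```

Because eax now appears as an output, GCC reloads the constant before any later use instead of assuming it survived the multiply.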