GCC / x86 inline asm: How do you tell gcc that the inline assembly section will modify %esp?

Time: 2021-09-08 03:10:16

While trying to make some old code work again (https://github.com/chaos4ever/chaos/blob/master/libraries/system/system_calls.h#L387, FWIW), I discovered that some of gcc's semantics seem to have changed in a subtle but still dangerous way over the last 10-15 years... :P

The code used to work well with older versions of gcc, like 2.95. Anyway, here is the code:

static inline return_type system_call_service_get(const char *protocol_name, service_parameter_type *service_parameter,
    tag_type *identification)
{
    return_type return_value;

    asm volatile("pushl %2\n"
                 "pushl %3\n"
                 "pushl %4\n"
                 "lcall %5, $0"
                 : "=a" (return_value),
                   "=g" (*service_parameter)
                 : "g" (identification),
                   "g" (service_parameter),
                   "g" (protocol_name),
                   "n" (SYSTEM_CALL_SERVICE_GET << 3));

    return return_value;
}

The problem with the code above is that gcc (4.7 in my case) will compile this to the following asm code (AT&T syntax):

# 392 "../system/system_calls.h" 1
pushl 68(%esp)  # This pointer (%esp + 0x68) is valid when the inline asm is entered.
pushl %eax
pushl 48(%esp)  # ...but this one is not (%esp + 0x48), since two dwords have now been pushed onto the stack, so %esp is not what the compiler expects it to be
lcall $456, $0

# Restoration of %esp at this point is done in the called method (i.e. lret $12)

The problem: the variables (identification and protocol_name) are on the stack in the calling context. So gcc (with optimizations turned on, unsure if it matters) will just take the values from there and hand them over to the inline asm section. But since I'm pushing stuff on the stack, the offsets that gcc calculates will be off by 8 in the third push (pushl 48(%esp)). :)
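
To make the off-by-8 concrete, here is a toy model in plain C rather than real assembly. It treats the stack as an array of dwords and replays the three pushes from the listing above. toy_stack, toy_push, toy_load and replay_broken_sequence are names made up for this sketch, and the slot contents are arbitrary:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a 32-bit, downward-growing stack; one slot is one dword. */
enum { STACK_SLOTS = 64 };

typedef struct {
    uint32_t slot[STACK_SLOTS];
    size_t sp;                 /* index of the top of stack, i.e. %esp / 4 */
} toy_stack;

/* Model of "pushl value": move the stack pointer down, then store. */
static void toy_push(toy_stack *s, uint32_t value)
{
    assert(s->sp > 0);         /* the toy stack must not overflow */
    s->sp -= 1;
    s->slot[s->sp] = value;
}

/* Model of the memory operand "byte_offset(%esp)", using the CURRENT sp. */
static uint32_t toy_load(const toy_stack *s, size_t byte_offset)
{
    return s->slot[s->sp + byte_offset / 4];
}

/* Replays the three pushes from the listing; returns the value pushed last. */
static uint32_t replay_broken_sequence(toy_stack *s, uint32_t eax)
{
    toy_push(s, toy_load(s, 68)); /* fine: sp still has its entry value     */
    toy_push(s, eax);             /* register operand: not affected         */
    toy_push(s, toy_load(s, 48)); /* sp moved by 8 bytes: reads entry + 40  */
    return s->slot[s->sp];
}
```

With identification stored at entry offset 68 and protocol_name at entry offset 48, the last push picks up whatever sat at entry offset 40, because %esp has already moved down by two dwords.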

This took me a long time to figure out; it wasn't at all obvious to me at first.

The easiest way around this is of course to use the r input constraint, to ensure that the value is in a register instead. But is there another, better way? One obvious way would of course be to rewrite the whole system call interface to not push stuff on the stack in the first place (and use registers instead, like e.g. Linux), but that's not a refactoring I feel like doing tonight...

Is there any way to tell gcc inline asm that "the stack is volatile"? How have you guys been handling stuff like this in the past?

Update later the same evening: I did find a relevant gcc ML thread (https://gcc.gnu.org/ml/gcc-help/2011-06/msg00206.html) but it didn't seem to help. It seems like specifying %esp in the clobber list should make gcc compute offsets from %ebp instead, but it doesn't work, and I suspect that -O2 -fomit-frame-pointer has an effect here. I have both of these flags enabled.

1 solution

#1


What works and what doesn't:

  1. I tried omitting -fomit-frame-pointer. No effect whatsoever. I included %esp, esp and sp in the list of clobbers.

  2. I tried omitting -fomit-frame-pointer and -O3. This actually produces code that works, since it relies on %ebp rather than %esp.

    pushl 16(%ebp)
    pushl 12(%ebp)
    pushl 8(%ebp)
    lcall $456, $0
    
  3. I tried specifying just -O3, without -fomit-frame-pointer, on my command line. This creates broken code (it relies on %esp being constant within the whole assembly block, i.e. no stack frame).

  4. I tried skipping -fomit-frame-pointer and just using -O2. Broken code, no stack frame.

  5. I tried just using -O1. Broken code, no stack frame.

  6. I tried adding cc as a clobber. No luck; it doesn't make any difference whatsoever.

  7. I tried changing the input constraints to ri, giving the input & output code below. This of course works but is slightly less elegant than I had hoped. Then again, perfect is the enemy of good so maybe I will have to live with this for now.

Input C code:

static inline return_type system_call_service_get(const char *protocol_name, service_parameter_type *service_parameter,
    tag_type *identification)
{
    return_type return_value;

    asm volatile("pushl %2\n"
                 "pushl %3\n"
                 "pushl %4\n"
                 "lcall %5, $0"
                 : "=a" (return_value),
                   "=g" (*service_parameter)
                 : "ri" (identification),
                   "ri" (service_parameter),
                   "ri" (protocol_name),
                   "n" (SYSTEM_CALL_SERVICE_GET << 3));

    return return_value;
}

Output asm code. As can be seen, it now uses registers instead, which should always be safe (but maybe somewhat less performant since the compiler has to move stuff around):

#APP
# 392 "../system/system_calls.h" 1
pushl %esi
pushl %eax
pushl %ebx
lcall $456, $0
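
If you want something you can actually run in user space to try the register constraints, the sketch below keeps the three pushes but replaces the far call with a bit of stand-in arithmetic, and cleans up the 12 bytes itself (in the real code the callee does that via lret $12). sum3_via_stack is a hypothetical name; the asm path is only compiled for 32-bit x86, and other targets get a plain C fallback:

```c
/* Hypothetical stand-in for the system call wrapper: pushes three operands
 * and reads them back from the stack instead of doing the lcall. */
static inline int sum3_via_stack(int a, int b, int c)
{
    int result;
#if defined(__GNUC__) && defined(__i386__)
    asm volatile("pushl %1\n\t"             /* a ends up at 8(%esp)       */
                 "pushl %2\n\t"             /* b ends up at 4(%esp)       */
                 "pushl %3\n\t"             /* c ends up at (%esp)        */
                 "movl (%%esp), %0\n\t"
                 "addl 4(%%esp), %0\n\t"
                 "addl 8(%%esp), %0\n\t"
                 "addl $12, %%esp"          /* restore %esp before leaving */
                 : "=&r" (result)
                 : "ri" (a), "ri" (b), "ri" (c)  /* register or immediate,
                                                    never offset(%esp)    */
                 : "cc", "memory");
#else
    /* Fallback for targets where pushl is not available. */
    result = a + b + c;
#endif
    return result;
}
```

The point of "ri" is the same as in the code above: gcc may hand the asm a register or a constant, but never a stack slot whose offset would go stale after the first push.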
