What does __asm__ do in C?

Time: 2022-01-14 15:00:02

I looked into some C code from

http://www.mcs.anl.gov/~kazutomo/rdtsc.html

They use things like "inline" and "asm", as in the following:

code1:

static __inline__ tick gettick (void) {
    unsigned a, d;
    __asm__ __volatile__("rdtsc": "=a" (a), "=d" (d) );
    return (((tick)a) | (((tick)d) << 32));
}

code2:

volatile int  __attribute__((noinline)) foo2 (int a0, int a1) {
    __asm__ __volatile__ ("");
}

I was wondering what code1 and code2 do.

3 Solutions

#1


48  

The __volatile__ modifier on an __asm__ block forces the compiler's optimizer to leave the code as-is. Without it, the optimizer may decide the block can be removed outright, or hoisted out of a loop with its result reused.

This is useful for the rdtsc instruction like so:

__asm__ __volatile__("rdtsc": "=a" (a), "=d" (d) )

This statement has no inputs, so the compiler might assume its result can be cached. __volatile__ is used to force it to read a fresh timestamp each time.
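
To see that statement in a complete function, here is a minimal sketch assuming GCC or Clang on x86. The name `read_tsc` and the `clock()` fallback for other targets are illustrative additions, not part of the original code:

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: returns the current timestamp-counter value. */
static inline uint64_t read_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t lo, hi;
    /* rdtsc leaves the low 32 bits in EAX ("=a") and the high 32 bits in
       EDX ("=d"). __volatile__ keeps the compiler from dropping the
       statement or merging repeated reads into one. */
    __asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t)hi << 32) | lo;
#else
    /* Assumption: on non-x86 targets, fall back to clock() so the sketch
       still builds; this is not a cycle counter. */
    return (uint64_t)clock();
#endif
}
```

Calling `read_tsc()` twice in a row should yield two separate reads of the counter rather than one reused value.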

When used alone, like this:

__asm__ __volatile__ ("")

It will not actually execute anything. You can extend this, though, to get a compile-time memory barrier that won't allow reordering any memory access instructions:

__asm__ __volatile__ ("":::"memory")
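
As a rough illustration of where such a barrier might go, the sketch below writes a payload and then a flag, with the barrier between the two stores. The names `payload`, `ready`, `publish` and the `COMPILER_BARRIER` macro are hypothetical, chosen only for this example:

```c
/* Hypothetical shared state: a payload and a flag announcing it is ready. */
int payload;
int ready;

/* Compile-time barrier only: it emits no instruction, but the "memory"
   clobber tells the compiler that memory may have changed, so it will not
   move memory accesses across this point. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

void publish(int value)
{
    payload = value;     /* first store */
    COMPILER_BARRIER();  /* the compiler keeps the store above before this line */
    ready = 1;           /* second store, emitted after the first */
}
```

This constrains only the compiler's code generation. The CPU may still reorder the stores at run time, so this is not by itself a substitute for proper synchronization between threads.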

The rdtsc instruction is a good example for volatile. rdtsc is usually used when you need to time how long some instructions take to execute. Imagine some code like this, where you want to time r1 and r2's execution:

__asm__ ("rdtsc": "=a" (a0), "=d" (d0) );
r1 = x1 + y1;
__asm__ ("rdtsc": "=a" (a1), "=d" (d1) );
r2 = x2 + y2;
__asm__ ("rdtsc": "=a" (a2), "=d" (d2) );

Here the compiler is actually allowed to cache the timestamp, and valid output might show that each line took exactly 0 clocks to execute. Obviously this isn't what you want, so you introduce __volatile__ to prevent caching:

__asm__ __volatile__("rdtsc": "=a" (a0), "=d" (d0));
r1 = x1 + y1;
__asm__ __volatile__("rdtsc": "=a" (a1), "=d" (d1));
r2 = x2 + y2;
__asm__ __volatile__("rdtsc": "=a" (a2), "=d" (d2));

Now you'll get a new timestamp each time, but there is still a problem: both the compiler and the CPU are allowed to reorder these statements relative to the surrounding code. The asm blocks could end up executing after r1 and r2 have already been calculated. To work around this, you'd add some barriers that force serialization:

__asm__ __volatile__("mfence;rdtsc": "=a" (a0), "=d" (d0) :: "memory");
r1 = x1 + y1;
__asm__ __volatile__("mfence;rdtsc": "=a" (a1), "=d" (d1) :: "memory");
r2 = x2 + y2;
__asm__ __volatile__("mfence;rdtsc": "=a" (a2), "=d" (d2) :: "memory");

Note the mfence instruction here, which enforces a CPU-side barrier, and the "memory" clobber in the asm statement, which enforces a compile-time barrier. On modern CPUs, you can replace mfence;rdtsc with rdtscp, which waits for preceding instructions to execute before reading the counter, for something more efficient.
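
Putting the fenced read into a small timing helper might look like the sketch below. It assumes GCC or Clang on x86; `read_tsc_fenced`, `time_add` and the `clock()` fallback for other targets are hypothetical names and choices made for this example:

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper mirroring the "mfence;rdtsc" pattern above. */
static inline uint64_t read_tsc_fenced(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t lo, hi;
    /* mfence is the CPU-side barrier; the "memory" clobber is the
       compile-time barrier. */
    __asm__ __volatile__("mfence; rdtsc" : "=a" (lo), "=d" (hi) :: "memory");
    return ((uint64_t)hi << 32) | lo;
#else
    /* Assumption: fall back to clock() on non-x86 targets so this builds. */
    return (uint64_t)clock();
#endif
}

/* Stores x + y through `result` and returns the ticks measured around it. */
static uint64_t time_add(int x, int y, int *result)
{
    uint64_t start = read_tsc_fenced();
    *result = x + y;  /* a memory store, so the "memory" clobbers keep it
                         between the two timestamp reads */
    uint64_t end = read_tsc_fenced();
    return end - start;
}
```

Writing the result through a pointer is a deliberate choice here: the "memory" clobber only orders memory accesses, so a value that lives purely in a register could still be computed outside the timed region.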

#2


2  

asm is for embedding native assembly code in C source code. For example:

int a = 2;
/* GCC extended asm, x86 AT&T syntax: write the constant 3 into a */
asm("movl $3, %0" : "=r" (a));
printf("%i", a); // will print 3

Compilers have different variants of it. In GCC, __asm__ is an alternate spelling of asm that remains available when the plain keyword is disabled (for example under -ansi or -std=c99).

volatile means the variable can be modified from outside (that is, not by the C program). For instance, when programming a microcontroller where the memory address 0x00001234 is mapped to some device-specific interface (e.g. when coding for the GameBoy, buttons/screen/etc. are accessed this way).

#include <stdint.h>
volatile uint8_t *const button1 = (volatile uint8_t *)0x00001111;

This disables compiler optimizations that rely on *button1 not changing unless the code itself changes it.
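
To make the effect concrete, here is a small sketch that polls such a register in a loop. Since no real hardware address is available here, an ordinary volatile variable (`fake_button_reg`, a hypothetical stand-in) plays the role of the device register, and `count_pressed` is likewise an illustrative name:

```c
#include <stdint.h>

/* Stand-in for a memory-mapped button register. On real hardware this would
   be a fixed address taken from the device documentation (assumption). */
static volatile uint8_t fake_button_reg;

/* Counts how many of `samples` reads saw the button bit (bit 0) set. */
static unsigned count_pressed(volatile const uint8_t *reg, unsigned samples)
{
    unsigned pressed = 0;
    for (unsigned i = 0; i < samples; i++) {
        /* Because *reg is volatile, every iteration performs a real load;
           the compiler may not hoist the read out of the loop. */
        if (*reg & 1u)
            pressed++;
    }
    return pressed;
}
```

Without the volatile qualifier, the compiler would be free to read the register once and reuse that value for every iteration.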

It has also been used in multi-threaded programming where a variable might be modified by another thread. That use is no longer recommended: volatile provides neither atomicity nor inter-thread ordering, and C11 atomics (<stdatomic.h>) are the intended tool.

inline is a hint to the compiler to "inline" calls to a function.

inline int f(int a) {
    return a + 1;
}

int a;
int b = f(a);

With inlining, this is compiled not into a function call to f but into the equivalent of int b = a + 1, as if f were a macro. Compilers mostly perform this optimization automatically depending on the function's usage and content. __inline__ in the question's code is GCC's alternate spelling of inline.

Similarly, __attribute__((noinline)) (GCC-specific syntax) prevents a function from being inlined.
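
The two can be compared side by side in a short sketch, assuming GCC or Clang; the function names are hypothetical. The empty asm statement in the second function is the same construct as in code2 from the question: GCC's documentation suggests placing `asm ("");` in a noinline function so that calls to it are not optimized away when the function has no side effects.

```c
/* Candidate for inlining: the compiler may substitute the body at call sites. */
static inline int add_one(int a)
{
    return a + 1;
}

/* GCC/Clang-specific attribute: this function is never inlined. */
static __attribute__((noinline)) int add_one_noinline(int a)
{
    /* Emits no instructions, but counts as a side effect, which keeps calls
       to this function from being optimized away entirely. */
    __asm__ __volatile__("");
    return a + 1;
}
```

Both functions return the same result; the difference only shows up in the generated assembly, where the second is expected to remain a real call.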

#3


-1  

Used after a declarator, the __asm__ keyword specifies the name to be used in assembler code for the function or variable. (This is a different use from the asm statements in the question.)
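
A minimal sketch of that declaration form, assuming GCC or Clang; `increment` and `my_increment` are hypothetical names picked for the example:

```c
/* The C-level name is `increment`, but the symbol emitted into the assembly
   output is `my_increment`. GCC accepts the asm label on a declaration,
   not on the definition itself, so the function is declared first. */
int increment(int a) __asm__("my_increment");

int increment(int a)
{
    return a + 1;
}
```

C code keeps calling `increment` as usual; the renamed symbol is only visible in the object file, for example when listing its symbols with a tool such as nm.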

The __volatile__ qualifier, commonly used in real-time and embedded programming, addresses a problem where optimization breaks code that repeatedly tests a status register for an ERROR or READY bit. __volatile__ was introduced as a way of telling the compiler that the object may change at any time and to force every reference to the object to be a genuine reference.
