Can different optimization levels lead to functionally different code?

Time: 2021-01-24 20:58:38

I am curious about the liberties that a compiler has when optimizing. Let's limit this question to GCC and C/C++ (any version, any flavour of standard):

Is it possible to write code which behaves differently depending on which optimization level it was compiled with?

The example I have in mind is printing different bits of text in various constructors in C++ and getting a difference depending on whether copies are elided (though I've not been able to make such a thing work).

Counting clock cycles is not permitted. If you have an example for a non-GCC compiler, I'd be curious, too, but I can't check it. Bonus points for an example in C. :-)

Edit: The example code should be standard compliant and not contain undefined behaviour from the outset.

Edit 2: Got some great answers already! Let me up the stakes a bit: The code must constitute a well-formed program and be standards-compliant, and it must compile to correct, deterministic programs in every optimization level. (That excludes things like race-conditions in ill-formed multithreaded code.) Also I appreciate that floating point rounding may be affected, but let's discount that.

I just hit 800 reputation, so I think I shall blow 50 reputation as bounty on the first complete example to conform to (the spirit of) those conditions; 25 if it involves abusing strict aliasing. (Subject to someone showing me how to send bounty to someone else.)

10 answers

#1


17  

The portion of the C++ standard that applies is §1.9 "Program execution". It reads, in part:

conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below. ...

A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible execution sequences of the corresponding instance of the abstract machine with the same program and the same input. ...

The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions. ...

So, yes, code may behave differently at different optimization levels, but (assuming the compiler is conforming at every level) it cannot behave observably differently.

EDIT: Allow me to correct my conclusion: Yes, code may behave differently at different optimization levels as long as each behavior is observably identical to one of the behaviors of the standard's abstract machine.

#2


15  

Is it possible to write code which behaves differently depending on which optimization level it was compiled with?

Only if you trigger a compiler bug.

EDIT

This example behaves differently on gcc 4.5.2:

/* Unbounded recursion: there is no base case, so foo never returns. */
void foo(int i) {
  foo(i+1);
}

int main(void) {
  foo(0);
  return 0;
}

Compiled with -O0, the program crashes with a segmentation fault.
Compiled with -O2, the program enters an endless loop.
A likely explanation: at -O0 every call pushes a new stack frame until the stack overflows, while at -O2 the tail call is turned into a jump.

#3


13  

Floating point calculations are a ripe source for differences. Depending on how the individual operations are ordered, you can get more/less rounding errors.

Less than safe multi-threaded code can also have different results depending on how memory accesses are optimized, but that's essentially a bug in your code anyhow.

And as you mentioned, side effects in copy constructors can vanish when optimization levels change.

#4


8  

For C, almost all operations are strictly defined in the abstract machine, and optimizations are only allowed if the observable result is exactly that of the abstract machine. Exceptions to that rule that come to mind:

  • undefined behavior doesn't have to be consistent between different compiler runs or executions of the faulty code
  • floating point operations may cause different rounding
  • arguments to function calls can be evaluated in any order
  • expressions with volatile qualified type may or may not be evaluated just for their side effects
  • identical const qualified compound literals may or may not be folded into one static memory location

#5


8  

OK, my flagrant play for the bounty, by providing a concrete example. I'll put together the bits from other people's answers and my comments.

For the purpose of different behaviour at different optimization levels, "optimization level A" shall denote gcc -O0 (I'm using version 4.3.4, but it doesn't matter much; I think any even vaguely recent version will show the difference I'm after), and "optimization level B" shall denote gcc -O0 -fno-elide-constructors.

Code is simple:

#include <iostream>

struct Foo {
    ~Foo() { std::cout << "~Foo\n"; }
};

int main() {
    Foo f = Foo();
}

Output at optimization level A:

~Foo

Output at optimization level B:

~Foo
~Foo

The code is totally legal, but the output is implementation-dependent because of copy constructor elision, and in particular it's sensitive to gcc's optimization flag that disables copy ctor elision.

Note that generally speaking, "optimization" refers to compiler transformations that can alter behavior that is undefined, unspecified or implementation-defined, but not behavior that is defined by the standard. So any example that satisfies your criteria is necessarily a program whose output is either unspecified or implementation-defined. In this case the standard leaves it unspecified whether copy ctors are elided; I just happen to be lucky that GCC reliably elides them pretty much whenever allowed, but has an option to disable that.

#6


4  

Anything that is Undefined Behavior according to the standard can change its behavior depending on optimization level (or moon-phase, for that matter).

#7


2  

The -fstrict-aliasing option can easily cause changes in behavior if you have two pointers of different types to the same block of memory. This is supposed to be invalid but is actually quite common.

#8


2  

Since copy constructor calls can be optimized away, even if they have side effects, having copy constructors with side-effects will cause unoptimized and optimized code to behave differently.

#9


1  

This C program invokes undefined behavior, but does display different results in different optimization levels:

#include <stdio.h>
/*
$ for i in 0 1 2 3 4 
    do echo -n "$i: " && gcc -O$i x.c && ./a.out 
  done
0: 5
1: 5
2: 5
3: -1
4: -1
*/

void f(int a) {
  int b;
  printf("%d\n", (int)(&a-&b));
}
int main() {
 f(0);
 return 0;
}

#10


-1  

Got an interesting example in my OS course today. We analyzed a software mutex that could be broken by optimization because the compiler does not know about the parallel execution.

The compiler can reorder statements that do not operate on dependent data. As I already stated, in parallelized code this dependency is hidden from the compiler, so the code could break. The example I gave would lead to some hard times in debugging, as the thread safety is broken and your code behaves unpredictably because of OS-scheduling issues and concurrent access errors.
