I am quite familiar with GCC's -O3 flag, but how does it differ from -Os, and in which situations should we prefer one over the other?
5 Answers
#1
20
The GCC documentation describes what these options do very explicitly.
-O3 tries to optimize code very heavily for performance. It includes all of the optimizations -O2 includes, plus some more.
-Os, on the other hand, instructs GCC to "optimize for size." It enables all -O2 optimizations which do not increase the size of the executable, and then it also toggles some optimization flags to further reduce executable size.
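To make the trade-off concrete, here is a minimal sketch of the kind of loop where the two levels tend to differ. The function name `scale_add` and the file name `scale.c` are hypothetical; the commands in the comment use only standard GCC options and the binutils `size` tool. At -O3 GCC may vectorize or otherwise expand such a loop, while at -Os it tends to keep it compact. The actual result depends on the GCC version and the target, so it is worth measuring rather than assuming.

```c
#include <stddef.h>

/* Hypothetical example, not taken from the GCC documentation.
 * Build it both ways and compare the object sizes:
 *   gcc -O3 -c scale.c -o scale_O3.o
 *   gcc -Os -c scale.c -o scale_Os.o
 *   size scale_O3.o scale_Os.o
 */
void scale_add(float *dst, const float *src, float factor, size_t n)
{
    /* A simple counted loop: a typical candidate for vectorization. */
    for (size_t i = 0; i < n; i++)
        dst[i] += factor * src[i];
}
```

Both builds compute the same result; only code size and speed are expected to differ.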
Note that I've been deliberately a bit vague with my descriptions - read the GCC documentation for a more in-depth discussion of exactly which flags are enabled for either optimization level.
I believe the -O* optimization levels are just that - mutually exclusive, distinct levels of optimization. It doesn't really make sense to mix them, since each level enables some flags that the other deliberately leaves out (if you pass several -O options, GCC simply uses the last one). If you want to mix and match (you probably don't, unless you have a really good reason to want a specific set of flags), you are best off reading the documentation and enabling or disabling the individual flags by hand.
I think I'll also link this article from the Gentoo Linux Wiki, which talks about optimization flags as they relate to building the packages for the operating system. Obviously not all of this is applicable, but it still contains some interesting information - for one:
Compiling with -O3 is not a guaranteed way to improve performance, and in fact in many cases can slow down a system due to larger binaries and increased memory usage. -O3 is also known to break several packages. Therefore, using -O3 is not recommended.
According to that article, -O2 is, most of the time, "as good as" -O3, and is safer to use because it is less likely to produce broken executables.
#2
5
I suggest reading the GCC documentation. -O3 is for getting fast-running code (even at the expense of some code bloat), while -Os optimizes for the size of the generated code.
There are tons of other (obscure) GCC optimization flags (e.g. -fgcse-sm), many of which are not enabled even at -O3.
You might also be interested in -flto (Link-Time Optimization), to be used in addition to e.g. -O3 or -Os, both at compile time and at link time. Then see also this answer.
Finally, take care to use the latest version of GCC (currently 4.8 at the end of 2013), because GCC's optimizations keep improving significantly.
You might also want to use -mtune=native (at least for x86).
And you might even write your own optimization pass, specific to your own particular libraries and APIs, perhaps using the MELT plugin.
As CmdrMoozy answered, you might prefer using -O2 over -O3 (but notice that recent GCC versions have improved their -O3 a lot, so the Gentoo citation recommending against -O3 and in favor of -O2 is becoming less relevant).
Also, as this Slashdotted STACK paper (by Xi Wang, Nickolai Zeldovich, M. Frans Kaashoek, and Armando Solar-Lezama) shows, many programs are not entirely C standard compliant and behave incorrectly when some valid optimizations are applied. Undefined behavior is a tricky subject.
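A classic illustration of this kind of problem (a sketch of my own, not an excerpt from the paper) is an overflow check that itself relies on signed overflow. The function names are hypothetical. Because signed overflow is undefined behavior in C, an optimizing compiler is allowed to assume it never happens and may simplify the first check away, whereas the second check is well defined at every optimization level.

```c
#include <limits.h>
#include <stdbool.h>

/* Relies on signed overflow, which is undefined behavior in C.
 * An optimizing compiler may assume x + 1 never overflows and
 * is allowed to reduce this function to "return false". */
bool next_overflows_unsafe(int x)
{
    return x + 1 < x;
}

/* Well-defined alternative: compare against the limit instead of
 * performing the addition that might overflow. */
bool next_overflows_safe(int x)
{
    return x == INT_MAX;
}
```

Code like the first function may appear to work at -O0 and then change behavior at -O2 or -O3, which is one way a higher optimization level seems to "break" a program that was never strictly conforming.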
BTW, notice that using -O3 usually makes compilation take much longer, and often (but not always) brings at most a few percent more performance than -O2 or even -O1 (it is even worse with -flto). This is why I rarely use it.
#3
2
It depends. Do you need to optimize for speed or for size?
-O3
Optimize yet more. -O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload, -ftree-loop-vectorize, -ftree-slp-vectorize, -fvect-cost-model, -ftree-partial-pre and -fipa-cp-clone options.
-O0
Reduce compilation time and make debugging produce the expected results. This is the default.
-Os
Optimize for size. -Os enables all -O2 optimizations that do not typically increase code size. It also performs further optimizations designed to reduce code size.
-Os disables the following optimization flags:
-falign-functions
-falign-jumps
-falign-loops
-falign-labels
-freorder-blocks
-freorder-blocks-and-partition
-fprefetch-loop-arrays
http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
Actually, -O is a shorthand for a long list of independent optimizations. If you don't know what you need, just go for -O3.
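If you want to check from inside the code how a translation unit was built, GCC predefines a couple of macros that reflect the optimization mode. The sketch below uses `__OPTIMIZE__` and `__OPTIMIZE_SIZE__`, which are documented GCC predefined macros; the function name `build_mode` is hypothetical. Note that these macros cannot distinguish -O2 from -O3. To see the exact list of flags a level enables, `gcc -Q --help=optimizers -O3` (or with `-Os`) prints it.

```c
/* __OPTIMIZE__ is defined in all optimizing compilations;
 * __OPTIMIZE_SIZE__ is additionally defined when optimizing for size (-Os). */
const char *build_mode(void)
{
#if defined(__OPTIMIZE_SIZE__)
    return "optimized for size";
#elif defined(__OPTIMIZE__)
    return "optimized (not for size)";
#else
    return "not optimized";
#endif
}
```

Which string is returned depends only on the flags used to compile this file, not on the rest of the program.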
#4
1
-O3 optimizes for speed, whereas -Os optimizes for space. That means -O3 will give you a fast executable, but it may be rather large, and -Os gives you a smaller executable, but it might be slower.
Space and time efficiency is usually a trade-off. Faster algorithms tend to take up more space, whereas in-place algorithms (algorithms that don't increase the space usage) tend to be less efficient.
Modern computers usually have plenty of memory, so -O3 is usually preferable. However, if you're programming for something with little RAM (like a small device), you might prefer -Os.
#5
0
This is not really possible to answer in general. A simple rule would be to optimize for speed on critical code paths and for size on non-critical code paths such as loading, ...
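With GCC, one way to approximate this per-function split without juggling several -O levels is the `hot` and `cold` function attributes. This is only a sketch under that assumption: the function names are hypothetical, and how much the attributes change the generated code depends on the GCC version and on the -O level in effect.

```c
#include <stddef.h>

/* hot: tells GCC this function is a hot spot, so it may be
 * optimized more aggressively. */
__attribute__((hot))
long sum_values(const int *values, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += values[i];
    return total;
}

/* cold: tells GCC this function is unlikely to be executed;
 * it is optimized for size rather than speed. */
__attribute__((cold))
void fill_defaults(int *values, size_t n)
{
    for (size_t i = 0; i < n; i++)
        values[i] = (int)i;
}
```

Here the one-off setup routine is marked cold and the routine expected to dominate the run time is marked hot; the rest of the program simply follows the global -O level.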
Some compilers can work in two passes to decide for you: the first pass creates a special executable with profiling support, you run the application to collect data, and a second compilation then decides what is best based on that data. This enables de-virtualization, better branch prediction, ...
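With GCC, the two-pass workflow described above is available through the -fprofile-generate and -fprofile-use options. The following is a minimal sketch: the file name `pgo_demo.c` and the function `count_negative` are hypothetical, and the file is assumed to also contain a `main` that calls the function on representative input.

```c
#include <stddef.h>

/* A two-pass build could look like this:
 *   gcc -O2 -fprofile-generate pgo_demo.c -o pgo_demo   (instrumented build)
 *   ./pgo_demo                                          (run to collect profile data)
 *   gcc -O2 -fprofile-use pgo_demo.c -o pgo_demo        (rebuild using the profile)
 */
size_t count_negative(const int *values, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        /* The profile records how often this branch is taken, so the
         * second compilation can lay the code out accordingly. */
        if (values[i] < 0)
            count++;
    }
    return count;
}
```

The profile is only as good as the training run, so the input used in the second step should resemble real workloads.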