Different floating point results with an optimizing compiler: compiler bug?

Date: 2022-08-29 23:48:52

The code below works on Visual Studio 2008 both with and without optimization, but on g++ it only works without optimization (-O0).

#include <cstdlib>
#include <iostream>
#include <cmath>

double round(double v, double digit)
{
    double pow = std::pow(10.0, digit);
    double t = v * pow;
    //std::cout << "t:" << t << std::endl;
    double r = std::floor(t + 0.5);
    //std::cout << "r:" << r << std::endl;
    return r / pow;
}

int main(int argc, char *argv[])
{
    std::cout << round(4.45, 1) << std::endl;
    std::cout << round(4.55, 1) << std::endl;
}

The output should be:

4.5
4.6

But g++ with optimization (-O1 to -O3) outputs:

4.5
4.5

If I add the volatile keyword before t, it works, so could this be some kind of optimization bug?

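For reference, the volatile workaround can be sketched like this (round_volatile is a hypothetical name for the modified function, not part of the original program):

```cpp
#include <cmath>

// Hypothetical variant of round() from the question: declaring t volatile
// forces the compiler to spill the product to a 64-bit double in memory,
// instead of keeping the 80-bit x87 value in a register across floor().
double round_volatile(double v, double digit)
{
    double p = std::pow(10.0, digit);
    volatile double t = v * p;
    return std::floor(t + 0.5) / p;
}
```

Called as in the question, round_volatile(4.55, 1) should yield 4.6 even in an optimized build.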
Tested on g++ 4.1.2 and 4.4.4.

Here is the result on ideone: http://ideone.com/Rz937

The g++ invocation I tested is simple:

g++ -O2 round.cpp

More interestingly, even if I turn on the /fp:fast option in Visual Studio 2008, the result is still correct.

Further question:

I was wondering, should I always turn on the -ffloat-store option?

Because the g++ versions I tested are the ones shipped with CentOS/Red Hat Linux 5 and CentOS/Red Hat 6.

I compiled many of my programs on these platforms, and I am worried this will cause unexpected bugs in them. It seems difficult to audit all my C++ code and the libraries I use for problems like this. Any suggestions?

Is anyone interested in why Visual Studio 2008 still works even with /fp:fast turned on? It seems Visual Studio 2008 handles this problem more reliably than g++.

6 Answers

#1


80  

Intel x86 processors use 80-bit extended precision internally, whereas double is normally 64-bit wide. Different optimization levels affect how often floating point values from CPU get saved into memory and thus rounded from 80-bit precision to 64-bit precision.

Use the -ffloat-store gcc option to get the same floating point results with different optimization levels.

Alternatively, use the long double type, which is normally 80 bits wide on gcc, to avoid rounding from 80-bit to 64-bit precision.

man gcc says it all:

   -ffloat-store
       Do not store floating point variables in registers, and inhibit
       other options that might change whether a floating point value is
       taken from a register or memory.

       This option prevents undesirable excess precision on machines such
       as the 68000 where the floating registers (of the 68881) keep more
       precision than a "double" is supposed to have.  Similarly for the
       x86 architecture.  For most programs, the excess precision does
       only good, but a few programs rely on the precise definition of
       IEEE floating point.  Use -ffloat-store for such programs, after
       modifying them to store all pertinent intermediate computations
       into variables.

#2


10  

"The output should be: 4.5 4.6." That's what the output would be if you had infinite precision, or if you were working with a device that used a decimal-based rather than a binary-based floating point representation. But you aren't. Most computers use the binary IEEE floating point standard.

As Maxim Yegorushkin already noted in his answer, part of the problem is that internally your computer is using an 80 bit floating point representation. This is just part of the problem, though. The basis of the problem is that any number of the form n.nn5 does not have an exact binary floating representation. Those corner cases are always inexact numbers.

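To see this inexactness concretely, one can print the nearest doubles with 17 significant digits (repr17 is just an illustrative helper name):

```cpp
#include <cstdio>
#include <string>

// Format a double with 17 significant digits, enough to uniquely
// identify the underlying binary64 value.
std::string repr17(double d)
{
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.17g", d);
    return buf;
}
```

repr17(4.45) yields "4.4500000000000002" (slightly above 4.45, so that corner case happens to round up correctly), while repr17(4.55) yields "4.5499999999999998" (slightly below 4.55, which is why that corner case breaks).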
If you really want your rounding to be able to reliably round these corner cases, you need a rounding algorithm that addresses the fact that n.n5, n.nn5, n.nnn5, etc. (but not n.5) is always inexact. Find the corner case that determines whether some input value rounds up or down, and return the rounded-up or rounded-down value based on a comparison to this corner case. And you do need to take care that an optimizing compiler will not put that found corner case in an extended precision register.

See "How does Excel successfully Rounds Floating numbers even though they are imprecise?" for such an algorithm.

Or you can just live with the fact that the corner cases will sometimes round erroneously.

#3


6  

Different compilers have different optimization settings. Some of the faster optimization settings do not maintain strict floating-point rules according to IEEE 754. Visual Studio has specific settings: /fp:strict, /fp:precise, and /fp:fast, where /fp:fast permits transformations that violate the standard. You might find that this flag is what controls the optimization in such settings. You may also find a similar setting in GCC which changes the behaviour.

If this is the case then the only thing that's different between the compilers is that GCC would look for the fastest floating point behaviour by default on higher optimisations, whereas Visual Studio does not change the floating point behaviour with higher optimization levels. Thus it might not necessarily be an actual bug, but intended behaviour of an option you didn't know you were turning on.

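For reference, the rough GCC analogue of MSVC's /fp:fast is -ffast-math (an approximate mapping, not an exact equivalence). Note that plain -O1 to -O3 do not enable -ffast-math; the difference observed in the question comes from 80-bit x87 register precision, not from fast-math transformations:

```shell
# /fp:fast    roughly corresponds to -ffast-math
# /fp:precise roughly corresponds to the GCC default
g++ -O2 -ffast-math round.cpp   # relaxed rules, like /fp:fast
g++ -O2 round.cpp               # default rules; still shows the x87 issue
```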
#4


3  

To those who can't reproduce the bug: do not uncomment the commented-out debug statements; they affect the result.

This implies that the problem is related to the debug statements. It looks like there's a rounding error caused by loading the values into registers during the output statements, which is why others found that you can fix this with -ffloat-store.

Further question:

I was wondering, should I always turn on -ffloat-store option?

To be flippant, there must be a reason that some programmers don't turn on -ffloat-store, otherwise the option wouldn't exist (likewise, there must be a reason that some programmers do turn on -ffloat-store). I wouldn't recommend always turning it on or always turning it off. Turning it on prevents some optimizations, but turning it off allows for the kind of behavior you're getting.

But, generally, there is some mismatch between binary floating point numbers (which the computer uses) and decimal floating point numbers (which people are familiar with), and that mismatch can cause behavior similar to what you're getting (to be clear, the behavior you're getting is not caused by this mismatch, but similar behavior can be). The thing is, since you already have some vagueness when dealing with floating point, I can't say that -ffloat-store makes it any better or any worse.

Instead, you may want to look into other solutions to the problem you're trying to solve (unfortunately, Koenig doesn't point to the actual paper, and I can't really find an obvious "canonical" place for it, so I'll have to send you to Google).


If you're not rounding for output purposes, I would probably look at std::modf() (in cmath) and std::numeric_limits<double>::epsilon() (in limits). Thinking over the original round() function, I believe it would be cleaner to replace the call to std::floor(d + .5) with a call to this function:

// this still has the same problems as the original rounding function
int round_up(double d)
{
    // return value will be coerced to int, and truncated as expected
    // you can then assign the int to a double, if desired
    return d + 0.5;
}

I think that suggests the following improvement:

// this won't work for negative d ...
// this may still round some numbers up when they should be rounded down
int round_up(double d)
{
    double floor;
    d = std::modf(d, &floor);
    return floor + (d + .5 + std::numeric_limits<double>::epsilon());
}

A simple note: std::numeric_limits<T>::epsilon() is defined as "the smallest number added to 1 that creates a number not equal to 1." You usually need to use a relative epsilon (i.e., scale epsilon somehow to account for the fact that you're working with numbers other than "1"). The sum of d, .5 and std::numeric_limits<double>::epsilon() should be near 1, so grouping that addition means that std::numeric_limits<double>::epsilon() will be about the right size for what we're doing. If anything, std::numeric_limits<double>::epsilon() will be too large (when the sum of all three is less than one) and may cause us to round some numbers up when we shouldn't.


Nowadays, you should consider std::nearbyint().

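A sketch of how std::nearbyint() might be used for decimal rounding (round_digits is a hypothetical helper; note that the default rounding mode is round-to-nearest-even, so exact halves go to the even neighbor rather than always up):

```cpp
#include <cmath>

// Round v to `digits` decimal places using the current rounding mode
// (round-to-nearest-even by default, i.e. banker's rounding).
double round_digits(double v, int digits)
{
    double p = std::pow(10.0, digits);
    return std::nearbyint(v * p) / p;
}
```

With the question's inputs, round_digits(4.55, 1) gives 4.6, but round_digits(4.45, 1) gives 4.4, because 4.45 * 10 lands exactly on 44.5 and the tie rounds to the even 44.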
#5


0  

Personally, I have hit the same problem going the other way - from gcc to VS. In most instances I think it is better to avoid optimisation. The only time it is worthwhile is when you're dealing with numerical methods involving large arrays of floating point data. Even after disassembling, I'm often underwhelmed by the compiler's choices. Very often it's just easier to use compiler intrinsics or to write the assembly yourself.

#6


0  

The accepted answer is correct if you are compiling to an x86 target that doesn't include SSE2. All modern x86 processors support SSE2, so if you can take advantage of it, you should:

-mfpmath=sse -msse2 -ffp-contract=off

Let's break this down.

-mfpmath=sse -msse2. This rounds inside SSE2 registers, which is much faster than storing every intermediate result to memory. Note that this is already the default on GCC for x86-64. From the GCC wiki:

On more modern x86 processors that support SSE2, specifying the compiler options -mfpmath=sse -msse2 ensures all float and double operations are performed in SSE registers and correctly rounded. These options do not affect the ABI and should therefore be used whenever possible for predictable numerical results.

-ffp-contract=off. Controlling rounding isn't enough for an exact match, however. FMA (fused multiply-add) instructions can change the rounding behavior relative to their non-fused counterparts, so we need to disable them. This is the default on Clang, not GCC. As explained by this answer:

An FMA has only one rounding (it effectively keeps infinite precision for the internal temporary multiply result), while an ADD + MUL has two.

By disabling FMA, we get results that exactly match on debug and release, at the cost of some performance (and accuracy). We can still take advantage of other performance benefits of SSE and AVX.

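That single extra rounding can be demonstrated directly (a small sketch; the volatile store is only there to stop the compiler from contracting the separate multiply and subtract into an FMA on targets where contraction is the default):

```cpp
#include <cmath>

// a = 1 + 2^-30 is exactly representable; a*a = 1 + 2^-29 + 2^-60.
// The fused path subtracts 1 from the full-precision product, so the
// 2^-60 tail survives; the unfused path first rounds the product to
// 1 + 2^-29, and the tail is lost.
bool fma_rounds_once()
{
    double a = 1.0 + std::pow(2.0, -30);
    double fused = std::fma(a, a, -1.0);  // one rounding
    volatile double prod = a * a;         // product rounded, forced to memory
    double unfused = prod - 1.0;          // a second rounding already happened
    return fused != unfused;
}
```

On IEEE-754 doubles, fma_rounds_once() returns true.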