I am working with an array of doubles called indata (in the heap, allocated with malloc), and a local double called sum.
I wrote two different functions to compare values in indata, and obtained different results. Eventually I determined that the discrepancy was due to one function using an expression in a conditional test, and the other function using a local variable in the same conditional test. I expected these to be equivalent.
My function A uses:
if (indata[i]+indata[j] > max) hi++;
and my function B uses:
sum = indata[i]+indata[j];
if (sum>max) hi++;
After going through the same data set and max, I end up with different values of hi depending on which function I use. I believe function B is correct, and function A is misleading. Similarly, when I try the snippet below:
sum = indata[i]+indata[j];
if ((indata[i]+indata[j]) != sum) etc.
that conditional will evaluate to true.
While I understand that floating-point numbers do not necessarily provide an exact representation, why does that inexact representation change when evaluated as an expression vs. stored in a variable? Is the recommended best practice to always evaluate a double expression like this into a variable prior to a conditional? Thanks!
1 solution
#1
12
I suspect you're using 32-bit x86, the only common architecture subject to excess precision. In C, expressions of type float and double are actually evaluated as float_t or double_t, whose relationships to float and double are reflected in the FLT_EVAL_METHOD macro. In the case of x86 (with the x87 FPU), both are defined as long double because the FPU is not actually capable of performing arithmetic at single or double precision. (It has mode bits intended to allow that, but the behavior is slightly wrong and thus can't be used.)
Assigning to an object of type float or double is one way to force rounding and get rid of the excess precision, but you can also just add a gratuitous cast to (double) if you prefer to leave it as an expression without assignments.
Note that forcing rounding to the desired precision is not equivalent to performing the arithmetic at the desired precision; instead of one rounding step (during the arithmetic) you now have two (during the arithmetic, and again to drop unwanted precision), and in cases where the first rounding gives you an exact midpoint, the second rounding can go in the 'wrong' direction. This issue is generally called double rounding, and it makes excess precision significantly worse than nominal precision for certain types of calculations.