There is a strange behavior I cannot understand. I accept that floating-point numbers are approximations, so even an operation that obviously should return a whole number can come out as something with a fractional part.
I'm doing this:
int num = (int)(195.95F * 100);
and since it's a floating-point operation I get 19594 instead of 19595, which is more or less what I would expect.
What puzzles me is that if I do
float flo = 195.95F * 100;
int num = (int) flo;
I get the correct result of 19595.
Any idea why this happens?
6 answers
#1
I looked to see whether this was the compiler doing the math at compile time, but it behaves this way even if you force the multiplication to happen at run time:
// requires: using System.Runtime.CompilerServices;
static void Main()
{
    int i = (int)(GetF() * GetI()); // 19594
    float f = GetF() * GetI();
    int j = (int)f; // 19595
}

// NoInlining keeps the JIT from folding the calls into constants.
[MethodImpl(MethodImplOptions.NoInlining)]
static int GetI() { return 100; }

[MethodImpl(MethodImplOptions.NoInlining)]
static float GetF() { return 195.95F; }
It looks like the difference is whether the value stays in a register (which is wider than a normal r4) or is forced into a float variable:
L_0001: call float32 Program::GetF()
L_0006: call int32 Program::GetI()
L_000b: conv.r4
L_000c: mul
L_000d: conv.i4
L_000e: stloc.0
vs
L_000f: call float32 Program::GetF()
L_0014: call int32 Program::GetI()
L_0019: conv.r4
L_001a: mul
L_001b: stloc.1
L_001c: ldloc.1
L_001d: conv.i4
L_001e: stloc.2
The only difference is the stloc.1 / ldloc.1 pair.
This is supported by the fact that in an optimised build (which removes the local variable) I get the same answer (19594) for both.
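The effect of a wider intermediate can be reproduced deterministically by making the precision explicit. This is only a sketch, not the JIT's actual code path: it assumes the register case behaves like a double (or wider) intermediate, and the names `PrecisionDemo`, `TruncateWide`, and `TruncateNarrow` are made up for illustration.

```csharp
using System;

static class PrecisionDemo
{
    // Hypothetical helper: keep the product in double precision, mimicking
    // an intermediate that is wider than float32, then truncate.
    public static int TruncateWide(float value)
    {
        double product = (double)value * 100; // about 19594.9997
        return (int)product;                  // truncation drops the fraction
    }

    // Hypothetical helper: the explicit (float) cast forces the product to be
    // rounded to float32 before truncation, much like the stloc/ldloc round trip.
    public static int TruncateNarrow(float value)
    {
        float product = (float)(value * 100); // nearest float32 is 19595
        return (int)product;
    }

    static void Main()
    {
        Console.WriteLine(TruncateWide(195.95F));   // 19594
        Console.WriteLine(TruncateNarrow(195.95F)); // 19595
    }
}
```

The explicit casts take the choice of precision away from the runtime, so both results should be the same in debug and optimised builds.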
#2
This code...
namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            float result = 195.95F*100;
            int intresult = (int)(195.95F * 100);
        }
    }
}
gives this IL:
.method private hidebysig static void Main(string[] args) cil managed
{
.entrypoint
// Code size 14 (0xe)
.maxstack 1
.locals init ([0] float32 result,
[1] int32 intresult)
IL_0000: nop
IL_0001: ldc.r4 19595.
IL_0006: stloc.0
IL_0007: ldc.i4 0x4c8a
IL_000c: stloc.1
IL_000d: ret
} // end of method Program::Main
Look at IL_0001: the compiler has already done the calculation (and the 0x4c8a loaded at IL_0007 is 19594). Otherwise, there is the decimal-to-binary conversion problem.
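To make the decimal-to-binary conversion problem concrete, the stored bits of 195.95F can be decoded and compared using exact integer arithmetic. This is a minimal sketch that assumes a normal, positive float; `FloatBitsDemo` and `Decompose` are hypothetical names introduced for illustration.

```csharp
using System;

static class FloatBitsDemo
{
    // Hypothetical helper: returns the 24-bit significand (with the implicit
    // leading 1 restored) and the power-of-two exponent of a normal, positive
    // float, so that value == significand * 2^exponent exactly.
    public static void Decompose(float value, out long significand, out int exponent)
    {
        int bits = BitConverter.ToInt32(BitConverter.GetBytes(value), 0);
        significand = (bits & 0x7FFFFF) | 0x800000;  // restore the implicit bit
        exponent = ((bits >> 23) & 0xFF) - 127 - 23; // unbias, then scale
    }

    static void Main()
    {
        long significand;
        int exponent;
        Decompose(195.95F, out significand, out exponent);

        // 195.95F is stored as 12841779 * 2^-16, slightly below 195.95.
        Console.WriteLine(significand + " * 2^" + exponent);

        // Compare in units of 2^-16 (valid because the exponent is -16).
        long storedTimes100 = significand * 100; // 1284177900
        long exactTarget = 19595L * 65536;       // 1284177920
        Console.WriteLine(storedTimes100 < exactTarget); // True
    }
}
```

The stored value times 100 falls 20/65536 short of 19595, so any computation that keeps that shortfall and then truncates yields 19594.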
#3
Mark's answer is correct in that it is the conversion between nativefloat and float32/float64.
This is covered in the CLR ECMA spec but David Notario explains this far better than I could.
#4
Try converting float to double in your second example:
double flo = 195.95F * 100;
int num = (int) flo;
I'm guessing in your first example the compiler is using double to hold the intermediate result, and so in the float case you're losing precision.
#5
When you multiply by 100, that is an integer, so it is doing an implicit conversion at that step. If you put an "F" behind the 100, I'll bet they'd be the same.
I typically use a parenthesized cast when it is a reference type. When it is a value type, I try to use the Convert static methods.
Try Convert.ToSingle(YourNumber); for a more reliable conversion.
HTH
#6
I can't answer why the second one works and the first one doesn't. However, I can tell you that 195.95 is a non-terminating fraction in binary, and as such round-off errors like this one are bound to happen.
Try converting to a double rather than a float. You could also use a money or decimal type rather than a float. That will store the number differently and more accurately.
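As a rough illustration of that suggestion, the sketch below scales the value with C#'s `decimal` type and, for comparison, rounds a float-based product instead of truncating it. `RoundingDemo`, `ScaleWithDecimal`, and `ScaleWithRounding` are hypothetical names, and whether `decimal` or rounding is the better fit depends on the application.

```csharp
using System;

static class RoundingDemo
{
    // decimal stores base-10 digits, so 195.95m * 100 is exactly 19595.00
    // and the truncating cast loses nothing.
    public static int ScaleWithDecimal(decimal amount)
    {
        return (int)(amount * 100);
    }

    // If the input must stay a float, round to the nearest integer
    // instead of truncating toward zero.
    public static int ScaleWithRounding(float amount)
    {
        return (int)Math.Round((double)amount * 100);
    }

    static void Main()
    {
        Console.WriteLine(ScaleWithDecimal(195.95m));  // 19595
        Console.WriteLine(ScaleWithRounding(195.95F)); // 19595
    }
}
```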
For more on floating point numbers and the IEEE representation, go here: