I've recently come across some code which has a loop of the form
for (int i = 0; i < 1e7; i++){
}
I question the wisdom of doing this since 1e7 has a floating point type, and will cause i to be promoted when evaluating the stopping condition. Should this be a cause for concern?
4 Answers
#1
50
The elephant in the room here is that the range of an int could be as small as -32767 to +32767, and the behaviour on assigning a larger value than this to such an int is undefined.
But, as for your main point, indeed it should concern you as it is a very bad habit. Things could go wrong as yes, 1e7 is a floating point double type.
The fact that i will be converted to a floating point type due to type promotion rules is somewhat moot: the real damage is done if there is unexpected truncation of the apparent integral literal. By way of a "proof by example", consider first the loop
for (std::uint64_t i = std::numeric_limits<std::uint64_t>::max() - 1024; i ++< 18446744073709551615ULL; ){
std::cout << i << "\n";
}
This outputs every consecutive value of i in the range, as you'd expect. Note that std::numeric_limits<std::uint64_t>::max() is 18446744073709551615ULL, which is 1 less than the 64th power of 2. (Here I'm using a slide-like "operator" ++< which is useful when working with unsigned types. Many folk consider --> and ++< as obfuscating, but in scientific programming they are common, particularly -->.)
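Now on my machine, a double is an IEEE754 64 bit floating point. (Such a scheme is particularly good at representing powers of 2 exactly: an IEEE754 double can represent every power of 2 up to 2^1023 exactly.) So 18,446,744,073,709,551,616 (the 64th power of 2) can be represented exactly as a double. Just below that value adjacent doubles are 2048 apart, so the nearest representable number before it is 18,446,744,073,709,549,568. That leaves 18,446,744,073,709,550,592 (which is 1024 less than the 64th power of 2) exactly halfway between the two, and with the default rounding mode it rounds up to 18,446,744,073,709,551,616 when converted to a double.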
So now let's write the loop as
for (std::uint64_t i = std::numeric_limits<std::uint64_t>::max() - 1024; i ++< 1.8446744073709551615e19; ){
std::cout << i << "\n";
}
On my machine that will only output one value of i: 18,446,744,073,709,550,592 (the number that we've already seen). This proves that 1.8446744073709551615e19 is a floating point type. If the compiler were allowed to treat the literal as an integral type then the output of the two loops would be equivalent.
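For anyone who wants to check the same two facts without printing anything, here is a minimal sketch. It assumes an IEEE 754 double with round-to-nearest conversions, as on typical desktop compilers, and the helper name survives_double_round_trip is made up for illustration. The static_asserts confirm the type of the literal and of a mixed int/double expression; the helper reports whether a std::uint64_t value comes back unchanged after a trip through double.

```cpp
#include <cstdint>
#include <type_traits>

// 1e7 is a literal of type double, and an int operand combined with it is
// converted to double by the usual arithmetic conversions.
static_assert(std::is_same<decltype(1e7), double>::value,
              "1e7 is a double literal");
static_assert(std::is_same<decltype(0 + 1e7), double>::value,
              "int + double yields double");

// Hypothetical helper: true if v is unchanged by a round trip through double.
bool survives_double_round_trip(std::uint64_t v) {
    const double d = static_cast<double>(v);
    // If v rounded up to 2^64, converting back would be out of range for
    // std::uint64_t (undefined behaviour), so report a failure instead.
    if (d >= 18446744073709551616.0) {
        return false;
    }
    return static_cast<std::uint64_t>(d) == v;
}
```

Under those assumptions the helper returns true for 10000000 and false for std::numeric_limits<std::uint64_t>::max(), which is the loss the second loop runs into.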
#2
14
It will work, assuming that your int is at least 32 bits.
However, if you really want to use exponential notation, it is better to define an integer constant outside the loop and use a proper cast, like this:
const int MAX_INDEX = static_cast<int>(1.0e7);
...
for (int i = 0; i < MAX_INDEX; i++) {
...
}
Considering this, I'd say it is much better to write
const int MAX_INDEX = 10000000;
or, if you can use C++14:
const int MAX_INDEX = 10'000'000;
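One possible refinement, sketched here as a suggestion rather than as part of the original answer: brace-initialising the constant makes the compiler reject the definition on a platform where int is too narrow, because list-initialisation does not allow narrowing of a constant that does not fit. The count_iterations function is a hypothetical helper that only shows the all-integer stopping condition in use.

```cpp
// Brace initialisation rejects narrowing conversions at compile time, so this
// definition fails to compile if int cannot hold ten million.
constexpr int MAX_INDEX{10000000};

// Hypothetical helper: counts iterations with an all-integer comparison,
// so no conversion to double happens in the loop condition.
int count_iterations(int limit) {
    int n = 0;
    for (int i = 0; i < limit; ++i) {
        ++n;
    }
    return n;
}
```

Calling count_iterations(MAX_INDEX) returns 10000000 on any platform where the constant compiles.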
#3
10
1e7 is a literal of type double, and usually double is 64-bit IEEE 754 format with a 52-bit mantissa. Roughly every tenth power of 2 corresponds to a third power of 10, so double should be able to represent integers up to at least 10^(5*3) = 10^15, exactly. And if int is 32-bit then int has roughly 10^(3*3) = 10^9 as max value (asking Google search it says that "2**31 - 1" = 2 147 483 647, i.e. twice the rough estimate).
So, in practice it's safe on current desktop systems and larger.
But C++ allows int to be just 16 bits, and on e.g. an embedded system with that small int, one would have Undefined Behavior.
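The estimates above can be checked against the actual platform with std::numeric_limits. This is only a sketch; the constant names are invented for illustration, and the remark that double has 53 significant bits (52 stored plus one implicit) assumes IEEE 754 binary64.

```cpp
#include <limits>

// Significand bits of double: 53 on an IEEE 754 binary64 implementation.
constexpr int double_bits = std::numeric_limits<double>::digits;

// Value bits of int, excluding the sign bit: 31 for a 32-bit int.
constexpr int int_bits = std::numeric_limits<int>::digits;

// True when every int value converts to double without rounding.
constexpr bool int_fits_in_double =
    std::numeric_limits<double>::radix == 2 && int_bits <= double_bits;

// True when int is wide enough for the loop bound in the question.
constexpr bool int_can_hold_1e7 =
    std::numeric_limits<int>::max() >= 10000000L;

static_assert(int_fits_in_double, "some int values would be rounded");
static_assert(int_can_hold_1e7, "int is too narrow for a 1e7 loop bound");
```

On a 16-bit-int target the second static_assert fails at compile time, turning the undefined behaviour into a build error.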
#4
2
If the intention is to loop for an exact integer number of iterations, for example when iterating over exactly all the elements in an array, then comparing against a floating point value is maybe not such a good idea, solely for accuracy reasons. Since the implicit cast of an integer to float will truncate integers toward zero, there's no real danger of out-of-bounds access; it will just cut the loop short.
Now the question is: When do these effects actually kick in? Will your program experience them? The floating point representation usually used these days is IEEE 754. As long as the exponent is 0 a floating point value is essentially an integer. C double precision floats have 52 bits for the mantissa, which gives you integer precision up to a value of 2^52, which is on the order of 1e15. Unless you specify with a suffix f that you want a floating point literal to be interpreted as single precision, the literal will be double precision and the implicit conversion will target that as well. So as long as your loop end condition is less than 2^52 it will work reliably!
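To see where integer precision runs out, one can measure the gap between a double and its next larger neighbour. The sketch below assumes IEEE 754 doubles, and gap_above is a hypothetical helper name. As long as the gap is at most 1, every integer in that range is represented exactly.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: distance from x to the next larger representable double.
double gap_above(double x) {
    return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}
```

With IEEE 754 doubles, gap_above(1e7) is far below 1, gap_above(std::ldexp(1.0, 52)) is exactly 1, and gap_above(std::ldexp(1.0, 53)) is 2, so beyond that point some integers can no longer be told apart.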
Now one question you have to think about on the x86 architecture is efficiency. The very first 80x87 FPUs came in a different package, and later a different chip, and as a result getting values into the FPU registers is a bit awkward at the x86 assembly level. Depending on what your intentions are it might make a difference in runtime for a realtime application; but that's premature optimization.
TL;DR: Is it safe to do? Most certainly yes. Will it cause trouble? It could cause numerical problems. Could it invoke undefined behavior? Depends on how you use the loop end condition, but if i is used to index an array and for some reason the array length ended up in a floating point variable, always truncating toward zero, it's not going to cause a logical problem. Is it a smart thing to do? Depends on the application.