Question says it all. Does anyone know if the following...
size_t div(size_t value) {
const size_t x = 64;
return value / x;
}
...is optimized into?
size_t div(size_t value) {
return value >> 6;
}
Do compilers do this? (My interest lies in GCC). Are there situations where it does and others where it doesn't?
I would really like to know, because every time I write a division that could be optimized like this, I spend some mental energy wondering whether precious fractions of a second are being wasted on a division where a shift would suffice.
4 answers
#1
43
Even with g++ -O0 (yes, -O0!), this happens. Your function compiles down to:
_Z3divm:
.LFB952:
pushq %rbp
.LCFI0:
movq %rsp, %rbp
.LCFI1:
movq %rdi, -24(%rbp)
movq $64, -8(%rbp)
movq -24(%rbp), %rax
shrq $6, %rax
leave
ret
Note the shrq $6, which is a right shift by 6 places.
With -O1, the unnecessary junk is removed:
_Z3divm:
.LFB1023:
movq %rdi, %rax
shrq $6, %rax
ret
Results on g++ 4.3.3, x64.
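If you would rather check the equivalence than read assembly, something along these lines should do. It is only a sketch: the names div_by_64, shift_by_6 and check_forms_agree are made up for illustration, and it simply compares the two forms for unsigned values.

```cpp
#include <cassert>
#include <cstddef>

// Division by a power-of-two constant, as in the question.
std::size_t div_by_64(std::size_t value) {
    const std::size_t x = 64;
    return value / x;
}

// The hand-written shift the question asks about.
std::size_t shift_by_6(std::size_t value) {
    return value >> 6;
}

// Hypothetical helper: asserts that both forms agree for every value in [0, limit).
void check_forms_agree(std::size_t limit) {
    for (std::size_t v = 0; v < limit; ++v) {
        assert(div_by_64(v) == shift_by_6(v));
    }
}
```

For an unsigned type the two are the same operation by definition, so there is nothing to gain from writing the shift by hand.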
#2
29
Most compilers will go even further than reducing division by powers of 2 into shifts - they'll often convert integer division by a constant into a series of multiplication, shift, and addition instructions to get the result instead of using the CPU's built-in divide instruction (if there even is one).
For example, MSVC converts division by 71 to the following:
// volatile int y = x / 71;
8b 0c 24 mov ecx, DWORD PTR _x$[esp+8] ; load x into ecx
b8 49 b4 c2 e6 mov eax, -423447479 ; magic happens starting here...
f7 e9 imul ecx ; edx:eax = x * 0xe6c2b449
03 d1 add edx, ecx ; edx = x + edx
c1 fa 06 sar edx, 6 ; edx >>= 6 (with sign fill)
8b c2 mov eax, edx ; eax = edx
c1 e8 1f shr eax, 31 ; eax >>= 31 (no sign fill)
03 c2 add eax, edx ; eax += edx
89 04 24 mov DWORD PTR _y$[esp+8], eax
So, you get a divide by 71 with a multiply, a couple shifts and a couple adds.
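To make the sequence above easier to follow, here is a rough C++ transcription of it for a 32-bit signed x. This is a sketch, not what any compiler literally emits from source: the names div71_magic and check_div71 are made up, and it assumes that >> on a negative signed value is an arithmetic shift (guaranteed since C++20, and what mainstream compilers did before that).

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the listing: multiply by the magic constant, keep the high
// 32 bits, add x back, shift right by 6, then add 1 for negative results.
std::int32_t div71_magic(std::int32_t x) {
    const std::int64_t magic = -423447479;  // 0xE6C2B449 read as a signed 32-bit value
    const std::int64_t product = magic * x; // imul: full 64-bit product (edx:eax)
    std::int32_t q = static_cast<std::int32_t>(product >> 32);  // edx = high half
    q += x;   // add edx, ecx
    q >>= 6;  // sar edx, 6 (assumed arithmetic shift)
    // shr eax, 31 / add eax, edx: add the sign bit so the result rounds toward zero
    q += static_cast<std::int32_t>(static_cast<std::uint32_t>(q) >> 31);
    return q;
}

// Hypothetical helper: compares against the built-in division on a range
// of values around zero plus the two extremes.
void check_div71() {
    for (std::int32_t x = -100000; x <= 100000; ++x) {
        assert(div71_magic(x) == x / 71);
    }
    assert(div71_magic(INT32_MAX) == INT32_MAX / 71);
    assert(div71_magic(INT32_MIN) == INT32_MIN / 71);
}
```

The constant works because 0xE6C2B449 is 2^38 / 71 rounded up; since that does not fit in a signed 32-bit value, the code multiplies by (magic - 2^32) and adds x back afterwards.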
For more details on what's going on, consult Henry Warren's "Hacker's Delight" book or the companion webpage:
- http://www.hackersdelight.org/
There's an added online chapter that provides some additional information about division by constants using multiplication/shift/add with magic numbers, and a page with a little JavaScript program that'll calculate the magic numbers you need.
The companion site for the book is well worth reading (as is the book) - particularly if you're interested in bit-level micro optimizations.
Another article that I discovered just now that discusses this optimization: http://blogs.msdn.com/devdev/archive/2005/12/12/502980.aspx
#3
19
Only when it can determine that the argument is non-negative. That's the case for your example (size_t is unsigned), but ever since C99 specified round-towards-zero semantics for integer division, it has become harder to optimize division by powers of two into shifts, because a shift and a division give different results for negative arguments.
In reaction to Michael's comment below, here is one way the division r=x/p; of x by a known power of two p can indeed be translated by the compiler:
if (x<0)
x += p-1;
r = x >> (log2 p);
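Spelled out for p = 64 on a 32-bit signed integer, that could look like the sketch below. The function names are made up for illustration, and as before it assumes >> on a negative signed value is an arithmetic shift (guaranteed since C++20, common practice before).

```cpp
#include <cassert>
#include <cstdint>

// Plain arithmetic shift: rounds toward negative infinity.
std::int32_t shift_only(std::int32_t x) {
    return x >> 6;
}

// Shift with the bias from the snippet above: rounds toward zero,
// matching what x / 64 is required to return.
std::int32_t div64_biased(std::int32_t x) {
    const std::int32_t p = 64;
    if (x < 0) {
        x += p - 1;  // cannot overflow: x is negative here
    }
    return x >> 6;   // log2(64) == 6
}

// Hypothetical helper: the biased shift must agree with the built-in division.
void check_div64_biased() {
    for (std::int32_t x = -100000; x <= 100000; ++x) {
        assert(div64_biased(x) == x / 64);
    }
}
```

For x = -1 the plain shift gives -1 while -1 / 64 is 0, which is exactly the difference the extra add has to correct.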
Since the OP was asking whether he should think about these things, one possible answer would be "only if you know the dividend's sign better than the compiler or know that it doesn't matter if the result is rounded towards 0 or -∞".
#4
3
Yes, compilers generate the most optimal code for such simplistic calculations. However, why you are insisting specifically on "shifts" is not clear to me. The optimal code for a given platform might easily turn out to be something different from a "shift".
In the general case, the old and beaten-to-death idea that a "shift" is somehow the most optimal way to implement power-of-two multiplications and divisions has very little practical relevance on modern platforms. It is a good way to illustrate the concept of "optimization" to newbies, but no more than that.
Your original example is not really representative, because it uses an unsigned type, which greatly simplifies the implementation of the division operation. The "round towards zero" requirement of the C and C++ languages makes it impossible to do division with a mere shift if the operand is signed.