Consider the following function:
template <class T, class Priority>
void MutableQueue<T, Priority>::update(const T& item, const Priority& priority)
{
...
}
Would modern x86-64 compilers be smart enough to pass the priority argument by value rather than reference if the priority type could fit within a register?
3 Answers
#1
As @black mentioned, optimizations are compiler and platform dependent. That said, we typically expect a number of optimizations to happen day-to-day when using a good optimizing compiler. For instance, we count on function inlining, register allocation, converting constant multiplications and divisions to bit-shifts when possible, etc.
To answer your question
Would modern x86-64 compilers be smart enough to pass the priority argument by value rather than reference if the priority type could fit within a register?
I'll simply try it out. See for yourself:
- GCC latest (without inlining)
- Clang 3.5.1 (without inlining)
This is the code:
template<typename T>
T square(const T& num) {
    return num * num;
}

int sq(int x) {
    return square(x);
}
GCC at -O3, -O2, and -O1 reliably performs this optimization.
Clang 3.5.1, on the other hand, does not seem to perform this optimization.
Should you count on such optimization happening? Not always, and not absolutely--the C++ standard says nothing about when an optimization like this could take place. In practice, if you are using GCC, you can 'expect' the optimization to take place.
If you absolutely positively want to ensure that such optimization happens, you will want to use template specialization.
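As a hedged illustration of the specialization route: an explicit specialization of a function template has to keep the primary template's parameter list, so specializing square<int> on its own would still take const int&. One way around that is to move the function into a class template and specialize the class. The sketch below assumes that approach; Square and self_check_square are hypothetical names, not part of the question's code.

```cpp
#include <cassert>
#include <type_traits>

// Primary template: the general case keeps the const-reference parameter.
template <typename T>
struct Square {
    static T apply(const T& num) { return num * num; }
};

// Explicit specialization for int: the signature itself takes the value,
// so nothing is left to the optimizer.
template <>
struct Square<int> {
    static int apply(int num) { return num * num; }
};

// Compile-time check of the parameter types that were actually chosen.
static_assert(std::is_same<decltype(&Square<int>::apply), int (*)(int)>::value,
              "int should be passed by value");
static_assert(std::is_same<decltype(&Square<long long>::apply),
                           long long (*)(const long long&)>::value,
              "other types still go through the primary template");

// Hypothetical usage helper.
inline void self_check_square() {
    assert(Square<int>::apply(7) == 49);
    assert(Square<long long>::apply(3LL) == 9LL);
}
```

The cost of this pattern is one specialization per type you care about, which is why a trait that picks the parameter type automatically is often preferred.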
#2
The compiler may perform this optimization, but it is not required to.
To force the "best" parameter type, you can use Boost's call_traits: http://www.boost.org/doc/libs/1_55_0/libs/utility/call_traits.htm
Replace const T& (where passing by value would also be correct) with call_traits<T>::param_type.
So your code may become:
template <class T, class Priority>
void MutableQueue<T, Priority>::update(typename boost::call_traits<T>::param_type item,
                                       typename boost::call_traits<Priority>::param_type priority)
{
    ...
}
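If Boost is not available, the idea behind call_traits can be approximated with the standard library alone. The following is only a sketch: pass, pass_t, square_with, and the cutoff used here (trivially copyable and no larger than a pointer) are assumptions made for illustration and do not reproduce Boost's actual rules.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for call_traits<T>::param_type.
// Assumed rule: small, trivially copyable types go by value,
// everything else goes by const reference.
template <typename T>
struct pass {
    using type = typename std::conditional<
        std::is_trivially_copyable<T>::value && sizeof(T) <= sizeof(void*),
        T,
        const T&>::type;
};

template <typename T>
using pass_t = typename pass<T>::type;

static_assert(std::is_same<pass_t<int>, int>::value,
              "int goes by value");
static_assert(std::is_same<pass_t<std::string>, const std::string&>::value,
              "std::string goes by const reference");

// T appears only in a non-deduced context here, so callers must name it.
template <typename T>
T square_with(pass_t<T> num) {
    return num * num;
}
```

Because T sits in a non-deduced context, a free function needs an explicit template argument, as in square_with<int>(6). In MutableQueue<T, Priority>::update this is not an issue, since T and Priority are already fixed by the class template.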
#3
This is entirely platform and compiler dependent, as is the way arguments are passed to a function.
These specifics are defined by the ABI of the system the program runs on. Some ABIs have many registers available and pass most arguments in them, some push every argument onto the stack, and some use registers only up to the N-th parameter and the stack for the rest.
Again, it is something you cannot rely on; you can check it in a couple of ways, though. The C++ language has no concept of a register.
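One of those ways is to read the generated assembly. The sketch below assumes GCC or Clang: __attribute__((noinline)) is a compiler extension used here only to keep the calls from being inlined away, and by_ref, by_val, call_both, and the file name check.cpp are made up for the example.

```cpp
#include <cassert>

// Assumes GCC or Clang; noinline keeps the calls visible in the assembly.
__attribute__((noinline)) int by_ref(const int& num) { return num * num; }
__attribute__((noinline)) int by_val(int num) { return num * num; }

// Build with, for example:  g++ -O2 -S check.cpp
// and compare the two function bodies in check.s. Under the System V
// x86-64 calling convention, by_val receives its argument in edi, while
// by_ref, as declared, receives an address in rdi and loads through it.
// Whether the compiler also emits a by-value clone of by_ref for local
// callers is exactly what you are checking for.
int call_both(int x) {
    return by_ref(x) + by_val(x);
}
```

The result only tells you what one compiler version did with one set of flags, which is consistent with the point above that this cannot be relied on.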