将minmax_element与min_element和max_element一起使用是否有效率优势?

时间:2022-08-13 22:49:54

std::minmax_element : returns a pair consisting of an iterator to the smallest element as the first element and an iterator to the greatest element as the second.

std :: minmax_element:返回一对由迭代器组成的最小元素作为第一个元素,最大元素的迭代器作为第二个元素。

std::min_element : returns an iterator to the smallest element in the range [first, last).

std :: min_element:将迭代器返回到[first,last]范围内的最小元素。

std::max_element : returns an iterator to the largest element in the range [first, last).

std :: max_element:将迭代器返回到[first,last]范围内的最大元素。


Does std::minmax_element uses sorting of complete list to achieve this?

std :: minmax_element是否使用完整列表的排序来实现此目的?

Is the overhead of processing returned pair from std::minmax_element worthy enough?

从std :: minmax_element处理返回对的开销是否足够值得?

4 个解决方案

#1


11  

You do not have to worry about std::minmax_element doing any sorting. It leaves the range in the exact way it was traversed. The reason it is more efficient is it can find both the max and min in a single pass where when looking for max and min separately you have to do two full traversals.

您不必担心std :: minmax_element正在进行任何排序。它以精确的方式离开范围。它更有效的原因是它可以在单个通道中找到最大值和最小值,而在分别查找最大值和最小值时,您必须执行两次完整遍历。

std::minmax_element has the complexity of max(floor(3/2(N−1)), 0) where as std::max_element and std::min_element each are max(N-1,0) so it is about 25% less operations using std::minmax_element

std :: minmax_element的复杂度为max(floor(3/2(N-1)),0)其中std :: max_element和std :: min_element都是max(N-1,0)所以大约是25使用std :: minmax_element减少运算次数

There is also a difference where std::minmax_element finds the last largest element while std::max_element finds the first largest.

std :: minmax_element找到最后一个最大元素,而std :: max_element找到第一个最大元素,这也是一个区别。

So if you need to find the min and max of a range then you should use std::minmax_element. If you only need the min or max then you should use the specialized version. Dealing with the return from std::minmax_element will get even easier with the upcoming C++17 standard and structured bindings. You will be able to write

因此,如果您需要找到范围的最小值和最大值,那么您应该使用std :: minmax_element。如果您只需要最小值或最大值,则应使用专用版本。使用即将推出的C ++ 17标准和结构化绑定,处理std :: minmax_element的返回将变得更加容易。你将能够写作

auto [min, max] = std::minmax_element(...);

and now the first element of the pair is stored in min and the second is stored in max.

现在该对的第一个元素存储在min中,第二个元素存储在max中。

#2


6  

The other answers are good. I wanted to add a little about how minmax_element necessarily works, however, which also helps to explain why it (usually) works better than running min_element and max_element separately, and talk about some specific cases where it doesn't perform better.

其他答案都很好。我想补充一点关于minmax_element必然如何工作,但是,这也有助于解释为什么它(通常)比分别运行min_element和max_element更好,并讨论一些不能更好地执行的特定情况。

If we think about a naive implementation, you would maintain a max-value and min-value (and their corresponding iterators) and simply iterate through the range, comparing each value with both the min-value and max-value and adjusting either as necessary. However, this would give you a total of 2N comparisons; while it might well perform better than iterating through the list twice (due to better use of locality), the specification calls for (roughly) 3/2 N comparisons. How is that possible then?

如果我们考虑一个简单的实现,你将保持最大值和最小值(及其相应的迭代器)并简单地遍历范围,将每个值与最小值和最大值进行比较并根据需要进行调整。但是,这将给你总共2N的比较;虽然它可能比在列表中迭代两次(由于更好地使用局部性)表现更好,但规范要求(大致)3/2 N比较。那怎么可能呢?

It works by processing pairs rather than individual items. Taking the first two items in the range (#0 and #1), we can compare them and assign the largest to max-value and the smallest to min-value. Then, we compare the next two items (#3 and #4) to decide which of them is larger; we compare the larger one to max-value, and the smaller one to min-value, and update max-value/min-value as necessary. We then repeat this process with each additional pair (#5 and #6, then #7 and #8 and so on).

它通过处理对而不是单个项来工作。取范围(#0和#1)中的前两项,我们可以比较它们并将最大值分配给最大值,将最小值分配给最小值。然后,我们比较接下来的两个项目(#3和#4)来决定哪个项目更大;我们将较大的值与最大值进行比较,将较小的值与最小值进行比较,并根据需要更新max-value / min-value。然后我们用另外一对(#5和#6,然后#7和#8等)重复这个过程。

So, each pair requires three comparisons - with each other, then the highest with the current maximum, and the lowest with the current minimum. This cuts down the number of comparisons required to 3/2 N!

因此,每对需要三次比较 - 相互之间,然后是当前最大值的最高值,最低值是当前最小值。这减少了3/2 N所需的比较次数!

As per comments below, however, it should be noted that this "improved" algorithm tends to produce worse performance on modern processors than the naive version when using a type (or comparator) where the comparison is cheap - notably, with a range over a vector<int> or similar: the comparison between the two elements of each pair has an unpredictable result, leading to branch prediction failure in the processor (though this is only true if the data is more-or-less randomly ordered); current compilers will not always convert branches into conditional transfers, as they potentially could. Also, it is harder for a compiler to vectorise the more complex algorithm.

然而,根据下面的评论,应该注意的是,当使用比较便宜的类型(或比较器)时,这种“改进的”算法往往会在现代处理器上产生比天真版本更差的性能 - 特别是在范围超过vector 或类似:每对中两个元素之间的比较具有不可预测的结果,导致处理器中的分支预测失败(尽管只有在数据或多或少随机排序时才会出现这种情况);当前的编译器并不总是将分支转换为条件转移,因为它们可能。此外,编译器更难以矢量化更复杂的算法。

In theory, I think, a C++ library implementation could provide an overload for the minmax_element function which used the naive algorithm for primitive (int etc) element types with the default comparator. While the standard mandates a bound to the number of comparisons, if the effect of those comparisons cannot be observed then the number actually computed is unimportant, so long as the time complexity is the same (which it is - O(N) in both cases). However, while this might give better performance with randomly ordered data, it might produce worse performance when the data is ordered.

理论上,我认为,C ++库实现可以为minmax_element函数提供重载,该函数使用带有默认比较器的原始(int等)元素类型的朴素算法。虽然标准强制要求比较次数,但如果无法观察到这些比较的影响,那么实际计算的数量并不重要,只要时间复杂度相同(在两种情况下都是 - O(N) )。但是,虽然这可能会提供随机排序数据的更好性能,但在订购数据时可能会产生更差的性能。

With all the above in mind, a simple test case (below) shows an interesting result: for randomly ordered data, using min_element and max_element separately can in fact be slightly faster than using minmax_element. However, for sorted data, minmax_element is much faster than using min_element and max_element separately. On my system (Haswell processor) the below (compiled with gcc -O3 -std=c++11 -march=native, GCC version 5.4) a sample run shows 692 milliseconds for min/max separately and 848 for minmax combined. There is of course some variation between runs but these values seem typical.

考虑到上述所有情况,一个简单的测试用例(下面)显示了一个有趣的结果:对于随机排序的数据,分别使用min_element和max_element可能比使用minmax_element稍快。但是,对于排序数据,minmax_element比分别使用min_element和max_element要快得多。在我的系统(Haswell处理器)下面(使用gcc -O3 -std = c ++ 11 -march = native,GCC版本5.4编译),样本运行分别显示692毫秒的最小值/最大值,848表示最小值组合。当然,运行之间存在一些差异,但这些值似乎很典型。

Note that:

  • The difference in performance is small enough that it is unlikely to be a dominating factor in a real-world program;
  • 性能差异很小,不太可能成为现实世界计划的主导因素;

  • The difference is dependent on compiler optimisation; in the future, the results may well reverse;
  • 差异取决于编译器优化;在未来,结果可能会逆转;

  • for more complex data types (or rather for more complex comparators) the results will likely reverse, since in that case the fewer comparisons will likely be a significant improvement.
  • 对于更复杂的数据类型(或者更复杂的比较器),结果可能会逆转,因为在这种情况下,较少的比较可能是一个显着的改进。

  • when the sample data is ordered instead of random (replace v.push_back(r(gen)) with v.push_back(i) in the program below) the performance is very different: for separate min/max, it's around 728 milliseconds, whereas for combined minmax, it goes down to 246 milliseconds.
  • 当样本数据被排序而不是随机(用下面的程序中的v.push_back(i)代替v.push_back(r(gen))时,性能是非常不同的:对于单独的最小值/最大值,它大约是728毫秒,而对于组合的minmax,它下降到246毫秒。

The code:

#include <iostream>
#include <vector>
#include <algorithm>
#include <random>
#include <chrono>

constexpr int numEls = 100000000;

void recresult(std::vector<int> *v, int min, int max)
{
   // Make sure the compiler doesn't optimize out the values: 
   __asm__ volatile (
       ""
       :
       : "rm"(v), "rm"(min), "rm"(max)
   );
}

int main(int argc, char **argv)
{
    using namespace std;

    std::mt19937 gen(0);
    uniform_int_distribution<> r(0, 100000);


    vector<int> v;
    for (int i = 0; i < numEls; i++) {
        v.push_back(r(gen));
    }

    // run once for warmup
    int min = *min_element(v.begin(), v.end());
    int max = *max_element(v.begin(), v.end());
    recresult(&v, min, max);

    // min/max separately:
    {
        auto starttime = std::chrono::high_resolution_clock::now();
        for (int i = 0; i < 5; i++) {
        int min = *min_element(v.begin(), v.end());
            int max = *max_element(v.begin(), v.end());
            recresult(&v, min, max);
        }
        auto endtime = std::chrono::high_resolution_clock::now();
        auto millis = std::chrono::duration_cast<std::chrono::milliseconds>(endtime - starttime).count();

        cout << "min/max element: " << millis << " milliseconds." << endl;
    }

    // run once for warmup
    auto minmaxi = minmax_element(v.begin(), v.end());
    recresult(&v, *(minmaxi.first), *(minmaxi.second));

    // minmax together:
    {
        auto starttime = std::chrono::high_resolution_clock::now();
        for (int i = 0; i < 5; i++) {
        minmaxi = minmax_element(v.begin(), v.end());
        recresult(&v, *(minmaxi.first), *(minmaxi.second));
        }
        auto endtime = std::chrono::high_resolution_clock::now();
        auto millis = std::chrono::duration_cast<std::chrono::milliseconds>(endtime - starttime).count();

        cout << "minmax element: " << millis << " milliseconds." << endl;
    }

    return 0;
}

#3


5  

Yes. You're iterating over the range only once instead of doing it twice.

是。你只需要在范围内迭代一次而不是两次。

#4


3  

std::minmax_element complexity:

At most max(floor(3/2(N−1)), 0) applications of the predicate, where N = std::distance(first, last).

至多max(floor(3/2(N-1)),0)谓词的应用,其中N = std :: distance(first,last)。

std::min_element complexity (same as max_element):

std :: min_element复杂度(与max_element相同):

Exactly max(N-1,0) comparisons, where N = std::distance(first, last).

恰好是最大(N-1,0)比较,其中N = std :: distance(first,last)。

Ignoring max and floor, we get:

忽略最大值和最大值,我们得到:

(N-1) * 2 vs 3/2 (N-1)

So by using minmax_element you get 3/4 of the comparisons you would need by using max_element + min_element, or better.

因此,通过使用minmax_element,您可以使用max_element + min_element或更好的方式获得3/4的比较。

minmax_element uses the transitivity of the < operator, it knows that if it is updating the minimum it does not need to compare for the maximum by comparing two elements at once, i.e. if a < b then we only need to check min(a, current_min) and max(b, current_max), and vice-versa.

minmax_element使用 <运算符的传递性,它知道如果它正在更新最小值,则不需要通过一次比较两个元素来比较最大值,即如果a 则我们只需要检查min(a,current_min)>

Also worth noting:

另外值得注意的是:

This algorithm is different from std::make_pair(std::min_element(), std::max_element()), not only in efficiency, but also in that this algorithm finds the last biggest element while std::max_element finds the first biggest element.

这个算法不同于std :: make_pair(std :: min_element(),std :: max_element()),不仅在效率上,而且在这个算法找到最后一个最大的元素,而std :: max_element找到第一个最大的元素元件。

#1


11  

You do not have to worry about std::minmax_element doing any sorting. It leaves the range in the exact way it was traversed. The reason it is more efficient is it can find both the max and min in a single pass where when looking for max and min separately you have to do two full traversals.

您不必担心std :: minmax_element正在进行任何排序。它以精确的方式离开范围。它更有效的原因是它可以在单个通道中找到最大值和最小值,而在分别查找最大值和最小值时,您必须执行两次完整遍历。

std::minmax_element has the complexity of max(floor(3/2(N−1)), 0) where as std::max_element and std::min_element each are max(N-1,0) so it is about 25% less operations using std::minmax_element

std :: minmax_element的复杂度为max(floor(3/2(N-1)),0)其中std :: max_element和std :: min_element都是max(N-1,0)所以大约是25使用std :: minmax_element减少运算次数

There is also a difference where std::minmax_element finds the last largest element while std::max_element finds the first largest.

std :: minmax_element找到最后一个最大元素,而std :: max_element找到第一个最大元素,这也是一个区别。

So if you need to find the min and max of a range then you should use std::minmax_element. If you only need the min or max then you should use the specialized version. Dealing with the return from std::minmax_element will get even easier with the upcoming C++17 standard and structured bindings. You will be able to write

因此,如果您需要找到范围的最小值和最大值,那么您应该使用std :: minmax_element。如果您只需要最小值或最大值,则应使用专用版本。使用即将推出的C ++ 17标准和结构化绑定,处理std :: minmax_element的返回将变得更加容易。你将能够写作

auto [min, max] = std::minmax_element(...);

and now the first element of the pair is stored in min and the second is stored in max.

现在该对的第一个元素存储在min中,第二个元素存储在max中。

#2


6  

The other answers are good. I wanted to add a little about how minmax_element necessarily works, however, which also helps to explain why it (usually) works better than running min_element and max_element separately, and talk about some specific cases where it doesn't perform better.

其他答案都很好。我想补充一点关于minmax_element必然如何工作,但是,这也有助于解释为什么它(通常)比分别运行min_element和max_element更好,并讨论一些不能更好地执行的特定情况。

If we think about a naive implementation, you would maintain a max-value and min-value (and their corresponding iterators) and simply iterate through the range, comparing each value with both the min-value and max-value and adjusting either as necessary. However, this would give you a total of 2N comparisons; while it might well perform better than iterating through the list twice (due to better use of locality), the specification calls for (roughly) 3/2 N comparisons. How is that possible then?

如果我们考虑一个简单的实现,你将保持最大值和最小值(及其相应的迭代器)并简单地遍历范围,将每个值与最小值和最大值进行比较并根据需要进行调整。但是,这将给你总共2N的比较;虽然它可能比在列表中迭代两次(由于更好地使用局部性)表现更好,但规范要求(大致)3/2 N比较。那怎么可能呢?

It works by processing pairs rather than individual items. Taking the first two items in the range (#0 and #1), we can compare them and assign the largest to max-value and the smallest to min-value. Then, we compare the next two items (#3 and #4) to decide which of them is larger; we compare the larger one to max-value, and the smaller one to min-value, and update max-value/min-value as necessary. We then repeat this process with each additional pair (#5 and #6, then #7 and #8 and so on).

它通过处理对而不是单个项来工作。取范围(#0和#1)中的前两项,我们可以比较它们并将最大值分配给最大值,将最小值分配给最小值。然后,我们比较接下来的两个项目(#3和#4)来决定哪个项目更大;我们将较大的值与最大值进行比较,将较小的值与最小值进行比较,并根据需要更新max-value / min-value。然后我们用另外一对(#5和#6,然后#7和#8等)重复这个过程。

So, each pair requires three comparisons - with each other, then the highest with the current maximum, and the lowest with the current minimum. This cuts down the number of comparisons required to 3/2 N!

因此,每对需要三次比较 - 相互之间,然后是当前最大值的最高值,最低值是当前最小值。这减少了3/2 N所需的比较次数!

As per comments below, however, it should be noted that this "improved" algorithm tends to produce worse performance on modern processors than the naive version when using a type (or comparator) where the comparison is cheap - notably, with a range over a vector<int> or similar: the comparison between the two elements of each pair has an unpredictable result, leading to branch prediction failure in the processor (though this is only true if the data is more-or-less randomly ordered); current compilers will not always convert branches into conditional transfers, as they potentially could. Also, it is harder for a compiler to vectorise the more complex algorithm.

然而,根据下面的评论,应该注意的是,当使用比较便宜的类型(或比较器)时,这种“改进的”算法往往会在现代处理器上产生比天真版本更差的性能 - 特别是在范围超过vector 或类似:每对中两个元素之间的比较具有不可预测的结果,导致处理器中的分支预测失败(尽管只有在数据或多或少随机排序时才会出现这种情况);当前的编译器并不总是将分支转换为条件转移,因为它们可能。此外,编译器更难以矢量化更复杂的算法。

In theory, I think, a C++ library implementation could provide an overload for the minmax_element function which used the naive algorithm for primitive (int etc) element types with the default comparator. While the standard mandates a bound to the number of comparisons, if the effect of those comparisons cannot be observed then the number actually computed is unimportant, so long as the time complexity is the same (which it is - O(N) in both cases). However, while this might give better performance with randomly ordered data, it might produce worse performance when the data is ordered.

理论上,我认为,C ++库实现可以为minmax_element函数提供重载,该函数使用带有默认比较器的原始(int等)元素类型的朴素算法。虽然标准强制要求比较次数,但如果无法观察到这些比较的影响,那么实际计算的数量并不重要,只要时间复杂度相同(在两种情况下都是 - O(N) )。但是,虽然这可能会提供随机排序数据的更好性能,但在订购数据时可能会产生更差的性能。

With all the above in mind, a simple test case (below) shows an interesting result: for randomly ordered data, using min_element and max_element separately can in fact be slightly faster than using minmax_element. However, for sorted data, minmax_element is much faster than using min_element and max_element separately. On my system (Haswell processor) the below (compiled with gcc -O3 -std=c++11 -march=native, GCC version 5.4) a sample run shows 692 milliseconds for min/max separately and 848 for minmax combined. There is of course some variation between runs but these values seem typical.

考虑到上述所有情况,一个简单的测试用例(下面)显示了一个有趣的结果:对于随机排序的数据,分别使用min_element和max_element可能比使用minmax_element稍快。但是,对于排序数据,minmax_element比分别使用min_element和max_element要快得多。在我的系统(Haswell处理器)下面(使用gcc -O3 -std = c ++ 11 -march = native,GCC版本5.4编译),样本运行分别显示692毫秒的最小值/最大值,848表示最小值组合。当然,运行之间存在一些差异,但这些值似乎很典型。

Note that:

  • The difference in performance is small enough that it is unlikely to be a dominating factor in a real-world program;
  • 性能差异很小,不太可能成为现实世界计划的主导因素;

  • The difference is dependent on compiler optimisation; in the future, the results may well reverse;
  • 差异取决于编译器优化;在未来,结果可能会逆转;

  • for more complex data types (or rather for more complex comparators) the results will likely reverse, since in that case the fewer comparisons will likely be a significant improvement.
  • 对于更复杂的数据类型(或者更复杂的比较器),结果可能会逆转,因为在这种情况下,较少的比较可能是一个显着的改进。

  • when the sample data is ordered instead of random (replace v.push_back(r(gen)) with v.push_back(i) in the program below) the performance is very different: for separate min/max, it's around 728 milliseconds, whereas for combined minmax, it goes down to 246 milliseconds.
  • 当样本数据被排序而不是随机(用下面的程序中的v.push_back(i)代替v.push_back(r(gen))时,性能是非常不同的:对于单独的最小值/最大值,它大约是728毫秒,而对于组合的minmax,它下降到246毫秒。

The code:

#include <iostream>
#include <vector>
#include <algorithm>
#include <random>
#include <chrono>

constexpr int numEls = 100000000;

void recresult(std::vector<int> *v, int min, int max)
{
   // Make sure the compiler doesn't optimize out the values: 
   __asm__ volatile (
       ""
       :
       : "rm"(v), "rm"(min), "rm"(max)
   );
}

int main(int argc, char **argv)
{
    using namespace std;

    std::mt19937 gen(0);
    uniform_int_distribution<> r(0, 100000);


    vector<int> v;
    for (int i = 0; i < numEls; i++) {
        v.push_back(r(gen));
    }

    // run once for warmup
    int min = *min_element(v.begin(), v.end());
    int max = *max_element(v.begin(), v.end());
    recresult(&v, min, max);

    // min/max separately:
    {
        auto starttime = std::chrono::high_resolution_clock::now();
        for (int i = 0; i < 5; i++) {
        int min = *min_element(v.begin(), v.end());
            int max = *max_element(v.begin(), v.end());
            recresult(&v, min, max);
        }
        auto endtime = std::chrono::high_resolution_clock::now();
        auto millis = std::chrono::duration_cast<std::chrono::milliseconds>(endtime - starttime).count();

        cout << "min/max element: " << millis << " milliseconds." << endl;
    }

    // run once for warmup
    auto minmaxi = minmax_element(v.begin(), v.end());
    recresult(&v, *(minmaxi.first), *(minmaxi.second));

    // minmax together:
    {
        auto starttime = std::chrono::high_resolution_clock::now();
        for (int i = 0; i < 5; i++) {
        minmaxi = minmax_element(v.begin(), v.end());
        recresult(&v, *(minmaxi.first), *(minmaxi.second));
        }
        auto endtime = std::chrono::high_resolution_clock::now();
        auto millis = std::chrono::duration_cast<std::chrono::milliseconds>(endtime - starttime).count();

        cout << "minmax element: " << millis << " milliseconds." << endl;
    }

    return 0;
}

#3


5  

Yes. You're iterating over the range only once instead of doing it twice.

是。你只需要在范围内迭代一次而不是两次。

#4


3  

std::minmax_element complexity:

At most max(floor(3/2(N−1)), 0) applications of the predicate, where N = std::distance(first, last).

至多max(floor(3/2(N-1)),0)谓词的应用,其中N = std :: distance(first,last)。

std::min_element complexity (same as max_element):

std :: min_element复杂度(与max_element相同):

Exactly max(N-1,0) comparisons, where N = std::distance(first, last).

恰好是最大(N-1,0)比较,其中N = std :: distance(first,last)。

Ignoring max and floor, we get:

忽略最大值和最大值,我们得到:

(N-1) * 2 vs 3/2 (N-1)

So by using minmax_element you get 3/4 of the comparisons you would need by using max_element + min_element, or better.

因此,通过使用minmax_element,您可以使用max_element + min_element或更好的方式获得3/4的比较。

minmax_element uses the transitivity of the < operator, it knows that if it is updating the minimum it does not need to compare for the maximum by comparing two elements at once, i.e. if a < b then we only need to check min(a, current_min) and max(b, current_max), and vice-versa.

minmax_element使用 <运算符的传递性,它知道如果它正在更新最小值,则不需要通过一次比较两个元素来比较最大值,即如果a 则我们只需要检查min(a,current_min)>

Also worth noting:

另外值得注意的是:

This algorithm is different from std::make_pair(std::min_element(), std::max_element()), not only in efficiency, but also in that this algorithm finds the last biggest element while std::max_element finds the first biggest element.

这个算法不同于std :: make_pair(std :: min_element(),std :: max_element()),不仅在效率上,而且在这个算法找到最后一个最大的元素,而std :: max_element找到第一个最大的元素元件。