STL算法:为什么没有容器的附加接口(除了迭代器对)?

时间:2022-07-27 18:19:19

I'm wondering why the STL doesn't overload their algorithm functions such that I can call them by simply providing a container and not taking the more verbose way to pass begin + end iterators. I of course understand why we also want to use an iterator pair for processing subsequences of a container / array, however, almost all calls to these methods are using a whole container:

我想知道为什么STL不会重载它们的算法函数,这样我就可以通过简单地提供一个容器来调用它们,而不是采用更冗长的方式来传递begin + end迭代器。我当然理解为什么我们也想使用迭代器对来处理容器/数组的子序列,但是,几乎所有对这些方法的调用都使用整个容器:

std::for_each(myVector.begin(), myVector.end(), doSomething);

I'd find it more convenient, readable and maintainable to just write

我发现它只是更方便,可读和可维护

std::for_each(myVector, doSomething);

Is there a reason STL doesn't provide these overloads? [EDIT: I don't mean to replace the interface with this restricted one but to also provide a container-based iterface!] Do they introduce ambiguity? I'm thinking about something like this:

有没有理由STL不提供这些重载? [编辑:我不是要用这个受限制的接口替换接口,而是提供基于容器的接口!]它们是否会引入歧义?我在考虑这样的事情:

template<typename _Container, typename _Funct>
inline _Funct for_each(_Container c, _Funct f) {
    return for_each(begin(c), end(c), f);
}

Am I missing something?

我错过了什么吗?

7 个解决方案

#1


17  

They do introduce ambiguity for many algorithms. A lot of <algorithm> looks like

它们确实为许多算法引入了歧义。很多 <算法> 看起来像

template<class iterator>
void do_something(iterator, iterator);

template<class iterator, class funct>
void do_something(iterator, iterator, funct);

If you add additional overloads

如果添加其他重载

template<class container, class funct>
void do_something(container, funct);

the compiler will have some trouble figuring out what do_something(x, y) means. If x and y are of the same type, it will match both iterator = type and container = type, funct = type. *)

编译器在确定do_something(x,y)的含义时会遇到一些麻烦。如果x和y的类型相同,则它将匹配iterator = type和container = type,funct = type。 *)

C++11 tried to solve this with "concepts" that could recognize the difference between a container and an iterator. However, these "concepts" turned out to be too complicated to make it into the standard, so neither did these overloads.

C ++ 11试图用“概念”来解决这个问题,这些概念可以识别容器和迭代器之间的区别。然而,这些“概念”变得太复杂而无法使其成为标准,因此这些超载也没有。

*)compilers disagree here, the Comeau compiler claims that it is ambiguous, g++ 4.5 and MSVC 10 calls the first function.

*)编译器在这里不同意,Comeau编译器声称它不明确,g ++ 4.5和MSVC 10调用第一个函数。


After an extremely long discussion in the comments, here is one example where it doesn't work as expected - using a container adapter that can also double as a predicate.

在评论中进行了长时间的讨论后,这里有一个例子,它没有按预期工作 - 使用一个也可以兼作谓词的容器适配器。

#include <iostream>
#include <vector>

template<class iterator>
void test(iterator, iterator)
{
   std::cout << "test iterator\n";
}

template<class iterator, class predicate>
void test(iterator, iterator, predicate)
{
   std::cout << "test iterator, predicate\n";
}

template<class container, class predicate>
void test(const container& cont, predicate compare)
{
   std::cout << "test container, predicate\n";

   test(cont.begin(), cont.end(), compare);
}

template<class container>
class adapter
{
public:
   typedef typename container::iterator   iterator;

   adapter(container* cont) : cont(cont)
   { }

   iterator begin() const
   { return cont->begin(); }

   iterator end() const
   { return cont->end(); }

   bool operator()(const iterator& one, const iterator& two)
   { return *one < *two; }

private:
   container* cont;
};

int main()
{
   std::vector<int>   v;

   adapter<std::vector<int>>   a(&v);

   test(a, a);

}

Output:

test iterator

http://ideone.com/wps2tZ

#2


10  

Unfortunately, this is a far more generic problem; namely, that iterators were designed to beat those crappy C APIs and Java-style "Put the algorithms as methods of each individual container" solutions. They are the first-generation generic solution and there's no surprise that, on reflection, they were not as good as other possible generic solutions obtainable after we spend twenty years thinking about it.

不幸的是,这是一个更普遍的问题;也就是说,迭代器旨在击败那些糟糕的C API和Java风格的“将算法作为每个单独容器的方法”解决方案。它们是第一代通用解决方案,毫无疑问,经过反思,它们不如我们花了二十年时间考虑它后可获得的其他可能的通用解决方案一样好。

Adding these container overloads would be just band-aiding over a tiny part of the problem space; and it might even make things worse in the future. The solution is ranges, which C++ is looking to introduce ASAP.

添加这些容器重载只会对问题空间的一小部分进行创可贴;甚至可能在将来使事情变得更糟。解决方案是范围,C ++希望尽快引入。

#3


3  

To understand that I think one must have to understand the philosophy of C++ algorithms. Lets ask this question first:

要理解我认为必须要理解C ++算法的哲学。让我们先问这个问题:

Why C++ algorithms are implemented as free functions instead of member functions?

为什么C ++算法是作为*函数而不是成员函数实现的?

Well the answer is pretty much simple : to avoid implementation explosions. Suppose you have M containers and N algorithms, and if you implement them as members of the containers, then there will be M*N implementations. There are two (related) problems in this approach:

答案很简单:避免实施爆炸。假设您有M个容器和N个算法,如果将它们实现为容器的成员,那么将有M * N个实现。这种方法有两个(相关的)问题:

  • First, it doesn't make use of code reuse. Most of the implementations will be repeated.
  • 首先,它没有使用代码重用。大多数实现都将重复进行。

  • Second, implementation explosions, which is a direct consequence of the above.
  • 第二,实施爆炸,这是上述的直接后果。

C++ solves these issues by implementing them as free functions, so that you have only N implementations. Each of the algorithm that operates on a container takes a pair of iterators, that defines the range. If you want overloads that take container, instead of pair of iterators, then the Standard have to provide such overloads for each of the algorithm, and there will be 2*N implementations which pretty much defeat the very purpose why C++ has separated the algorithms from the containers in the first place, and half of these functions don't do anything which cannot be done by the other half.

C ++通过将它们实现为*函数来解决这些问题,因此您只有N个实现。在容器上运行的每个算法都使用一对迭代器来定义范围。如果你想要带容器的重载而不是迭代器对,那么标准必须为每个算法提供这样的重载,并且将有2 * N的实现,这几乎击败了为什么C ++将算法与首先是容器,这些功能的一半不做任何另一半无法完成的事情。

So I don't think it is that much an issue. Just to avoid one single argument, why implement N more functions (which also impose some restriction on its usage such as you cannot pass pointers to it)? However, if programmers want such functions in their utility, they can implement them anytime along with many others based on the Standard algorithm!

所以我认为这不是一个问题。为了避免单个参数,为什么要实现N个更多的函数(这也对它的使用施加了一些限制,比如你不能将指针传递给它)?但是,如果程序员希望在他们的实用程序中使用这些函数,他们可以随时使用基于标准算法的许多其他函数来实现它们!


You commented:

Well, the 2*N implementations are in fact only N implementations. The other N ones are inlined overloads which directly call the "real" version of the algorithm, so they are a header-only thing. Providing container overloads doesn't defeat the purpose to separate algorithms from containers, as (as you can see in my example) they can use templates to handle all types of containers.

嗯,2 * N实现实际上只有N个实现。其他N个是内联重载,直接调用算法的“真实”版本,因此它们只是一个标题。提供容器重载并不会破坏将算法与容器分开的目的,因为(正如您在我的示例中所见),它们可以使用模板来处理所有类型的容器。

Based on this logic, one can very well argue for M*N algorithms. So make them member functions too (and call the free functions internally)? I'm sure many OOP guys would prefer

基于这种逻辑,人们可以很好地争论M * N算法。那么也要将它们作为成员函数(并在内部调用*函数)?我相信很多OOP人都会喜欢

auto result = container.accumulate(val);

over

auto result = std::accumulate(container.begin(), container.end(), val);

#4


3  

Here is a relevant answer from Herb Sutter's blog: Why no container-based algorithms. It shows counterexamples just like Bo Persson did in his answer above.

以下是Herb Sutter博客的相关答案:为什么没有基于容器的算法。它显示了反例,就像博佩尔森在上面的回答中所做的那样。

#5


2  

There is a Range Operators library with intention to fix that. Verbosity was cut several times over.

有一个Range Operators库有意修复它。详细程度被削减了几次。

Your example would look something like this:

你的例子看起来像这样:

auto newVector = myVector * doSomething;

Yes, doSomething - is without parenthesis.

是的,doSomething - 没有括号。

Familiar idiom from shell (with std algorithm):

来自shell的常见习语(使用std算法):

auto t = vector<int>{3,2,1,4} | sort | unique; 

#6


0  

It should be pointed out that it's very easy to define your own trivial wrappers to add containerized versions.

应该指出的是,定义自己的普通包装器以添加容器化版本非常容易。

For example:

template<typename Container, typename Func>
Func for_each(Container& c, Func f) {
    return std::for_each(c.begin(), c.end(), f);
}

Now you can make the simple call you want. There's no ambiguity because your wrappers aren't in the std namespace. You can define overloads that take const Container&. If you want versions that call the C++-11 const iterator methods (e.g. cbegin()), I think you'll need to name the wrapper differently. I use for_each_const.

现在,您可以进行所需的简单通话。没有歧义,因为你的包装器不在std命名空间中。您可以定义带有const Container&的重载。如果你想要调用C ++ - 11 const迭代器方法的版本(例如cbegin()),我认为你需要以不同的方式命名包装器。我使用for_each_const。

#7


0  

Obviously, as other users mentioned, it's a tricky problem, so unfortunately it's been a long time and there's still no solution in the standard library. There are, however, already range libraries available such as Boost::Range and the one in the Adobe Source Libraries that provide not only the simplicity of the interface you describe in your question, but some fancier features as well.

显然,正如其他用户提到的那样,这是一个棘手的问题,所以不幸的是,这已经很长时间了,标准库中仍然没有解决方案。但是,已有范围库可用,例如Boost :: Range和Adobe源库中的一个,它们不仅提供了您在问题中描述的界面的简单性,而且还提供了一些更高级的功能。

Your example works perfectly with Boost (we are using namespace boost::range::adaptors below):

您的示例与Boost完美配合(我们在下面使用命名空间boost :: range :: adapters):

boost::for_each(myVector, doSomething);

We can also slice myVector quickly and easily:

我们还可以快速轻松地切片myVector:

boost::for_each(myVector | sliced(10, 20), doSomething)

boost :: for_each(myVector |切片(10,20),doSomething)

We can even zip myVector with another, filter by a predicate, and sample every other element of the resulting pairs in a single, simple statement (this requires that you unpack in doSomethingElse the tuples produced by boost::combined):

我们甚至可以将myVector压缩到另一个,通过谓词进行过滤,并在一个简单的语句中对结果对的每个其他元素进行采样(这需要您在doSomethingElse中解压缩由boost :: combined生成的元组):

boost::for_each( boost::combined(myVector, myOtherVector) | strided(2), doSomethingElse)

boost :: for_each(boost :: combined(myVector,myOtherVector)| strided(2),doSomethingElse)

#1


17  

They do introduce ambiguity for many algorithms. A lot of <algorithm> looks like

它们确实为许多算法引入了歧义。很多 <算法> 看起来像

template<class iterator>
void do_something(iterator, iterator);

template<class iterator, class funct>
void do_something(iterator, iterator, funct);

If you add additional overloads

如果添加其他重载

template<class container, class funct>
void do_something(container, funct);

the compiler will have some trouble figuring out what do_something(x, y) means. If x and y are of the same type, it will match both iterator = type and container = type, funct = type. *)

编译器在确定do_something(x,y)的含义时会遇到一些麻烦。如果x和y的类型相同,则它将匹配iterator = type和container = type,funct = type。 *)

C++11 tried to solve this with "concepts" that could recognize the difference between a container and an iterator. However, these "concepts" turned out to be too complicated to make it into the standard, so neither did these overloads.

C ++ 11试图用“概念”来解决这个问题,这些概念可以识别容器和迭代器之间的区别。然而,这些“概念”变得太复杂而无法使其成为标准,因此这些超载也没有。

*)compilers disagree here, the Comeau compiler claims that it is ambiguous, g++ 4.5 and MSVC 10 calls the first function.

*)编译器在这里不同意,Comeau编译器声称它不明确,g ++ 4.5和MSVC 10调用第一个函数。


After an extremely long discussion in the comments, here is one example where it doesn't work as expected - using a container adapter that can also double as a predicate.

在评论中进行了长时间的讨论后,这里有一个例子,它没有按预期工作 - 使用一个也可以兼作谓词的容器适配器。

#include <iostream>
#include <vector>

template<class iterator>
void test(iterator, iterator)
{
   std::cout << "test iterator\n";
}

template<class iterator, class predicate>
void test(iterator, iterator, predicate)
{
   std::cout << "test iterator, predicate\n";
}

template<class container, class predicate>
void test(const container& cont, predicate compare)
{
   std::cout << "test container, predicate\n";

   test(cont.begin(), cont.end(), compare);
}

template<class container>
class adapter
{
public:
   typedef typename container::iterator   iterator;

   adapter(container* cont) : cont(cont)
   { }

   iterator begin() const
   { return cont->begin(); }

   iterator end() const
   { return cont->end(); }

   bool operator()(const iterator& one, const iterator& two)
   { return *one < *two; }

private:
   container* cont;
};

int main()
{
   std::vector<int>   v;

   adapter<std::vector<int>>   a(&v);

   test(a, a);

}

Output:

test iterator

http://ideone.com/wps2tZ

#2


10  

Unfortunately, this is a far more generic problem; namely, that iterators were designed to beat those crappy C APIs and Java-style "Put the algorithms as methods of each individual container" solutions. They are the first-generation generic solution and there's no surprise that, on reflection, they were not as good as other possible generic solutions obtainable after we spend twenty years thinking about it.

不幸的是,这是一个更普遍的问题;也就是说,迭代器旨在击败那些糟糕的C API和Java风格的“将算法作为每个单独容器的方法”解决方案。它们是第一代通用解决方案,毫无疑问,经过反思,它们不如我们花了二十年时间考虑它后可获得的其他可能的通用解决方案一样好。

Adding these container overloads would be just band-aiding over a tiny part of the problem space; and it might even make things worse in the future. The solution is ranges, which C++ is looking to introduce ASAP.

添加这些容器重载只会对问题空间的一小部分进行创可贴;甚至可能在将来使事情变得更糟。解决方案是范围,C ++希望尽快引入。

#3


3  

To understand that I think one must have to understand the philosophy of C++ algorithms. Lets ask this question first:

要理解我认为必须要理解C ++算法的哲学。让我们先问这个问题:

Why C++ algorithms are implemented as free functions instead of member functions?

为什么C ++算法是作为*函数而不是成员函数实现的?

Well the answer is pretty much simple : to avoid implementation explosions. Suppose you have M containers and N algorithms, and if you implement them as members of the containers, then there will be M*N implementations. There are two (related) problems in this approach:

答案很简单:避免实施爆炸。假设您有M个容器和N个算法,如果将它们实现为容器的成员,那么将有M * N个实现。这种方法有两个(相关的)问题:

  • First, it doesn't make use of code reuse. Most of the implementations will be repeated.
  • 首先,它没有使用代码重用。大多数实现都将重复进行。

  • Second, implementation explosions, which is a direct consequence of the above.
  • 第二,实施爆炸,这是上述的直接后果。

C++ solves these issues by implementing them as free functions, so that you have only N implementations. Each of the algorithm that operates on a container takes a pair of iterators, that defines the range. If you want overloads that take container, instead of pair of iterators, then the Standard have to provide such overloads for each of the algorithm, and there will be 2*N implementations which pretty much defeat the very purpose why C++ has separated the algorithms from the containers in the first place, and half of these functions don't do anything which cannot be done by the other half.

C ++通过将它们实现为*函数来解决这些问题,因此您只有N个实现。在容器上运行的每个算法都使用一对迭代器来定义范围。如果你想要带容器的重载而不是迭代器对,那么标准必须为每个算法提供这样的重载,并且将有2 * N的实现,这几乎击败了为什么C ++将算法与首先是容器,这些功能的一半不做任何另一半无法完成的事情。

So I don't think it is that much an issue. Just to avoid one single argument, why implement N more functions (which also impose some restriction on its usage such as you cannot pass pointers to it)? However, if programmers want such functions in their utility, they can implement them anytime along with many others based on the Standard algorithm!

所以我认为这不是一个问题。为了避免单个参数,为什么要实现N个更多的函数(这也对它的使用施加了一些限制,比如你不能将指针传递给它)?但是,如果程序员希望在他们的实用程序中使用这些函数,他们可以随时使用基于标准算法的许多其他函数来实现它们!


You commented:

Well, the 2*N implementations are in fact only N implementations. The other N ones are inlined overloads which directly call the "real" version of the algorithm, so they are a header-only thing. Providing container overloads doesn't defeat the purpose to separate algorithms from containers, as (as you can see in my example) they can use templates to handle all types of containers.

嗯,2 * N实现实际上只有N个实现。其他N个是内联重载,直接调用算法的“真实”版本,因此它们只是一个标题。提供容器重载并不会破坏将算法与容器分开的目的,因为(正如您在我的示例中所见),它们可以使用模板来处理所有类型的容器。

Based on this logic, one can very well argue for M*N algorithms. So make them member functions too (and call the free functions internally)? I'm sure many OOP guys would prefer

基于这种逻辑,人们可以很好地争论M * N算法。那么也要将它们作为成员函数(并在内部调用*函数)?我相信很多OOP人都会喜欢

auto result = container.accumulate(val);

over

auto result = std::accumulate(container.begin(), container.end(), val);

#4


3  

Here is a relevant answer from Herb Sutter's blog: Why no container-based algorithms. It shows counterexamples just like Bo Persson did in his answer above.

以下是Herb Sutter博客的相关答案:为什么没有基于容器的算法。它显示了反例,就像博佩尔森在上面的回答中所做的那样。

#5


2  

There is a Range Operators library with intention to fix that. Verbosity was cut several times over.

有一个Range Operators库有意修复它。详细程度被削减了几次。

Your example would look something like this:

你的例子看起来像这样:

auto newVector = myVector * doSomething;

Yes, doSomething - is without parenthesis.

是的,doSomething - 没有括号。

Familiar idiom from shell (with std algorithm):

来自shell的常见习语(使用std算法):

auto t = vector<int>{3,2,1,4} | sort | unique; 

#6


0  

It should be pointed out that it's very easy to define your own trivial wrappers to add containerized versions.

应该指出的是,定义自己的普通包装器以添加容器化版本非常容易。

For example:

template<typename Container, typename Func>
Func for_each(Container& c, Func f) {
    return std::for_each(c.begin(), c.end(), f);
}

Now you can make the simple call you want. There's no ambiguity because your wrappers aren't in the std namespace. You can define overloads that take const Container&. If you want versions that call the C++-11 const iterator methods (e.g. cbegin()), I think you'll need to name the wrapper differently. I use for_each_const.

现在,您可以进行所需的简单通话。没有歧义,因为你的包装器不在std命名空间中。您可以定义带有const Container&的重载。如果你想要调用C ++ - 11 const迭代器方法的版本(例如cbegin()),我认为你需要以不同的方式命名包装器。我使用for_each_const。

#7


0  

Obviously, as other users mentioned, it's a tricky problem, so unfortunately it's been a long time and there's still no solution in the standard library. There are, however, already range libraries available such as Boost::Range and the one in the Adobe Source Libraries that provide not only the simplicity of the interface you describe in your question, but some fancier features as well.

显然,正如其他用户提到的那样,这是一个棘手的问题,所以不幸的是,这已经很长时间了,标准库中仍然没有解决方案。但是,已有范围库可用,例如Boost :: Range和Adobe源库中的一个,它们不仅提供了您在问题中描述的界面的简单性,而且还提供了一些更高级的功能。

Your example works perfectly with Boost (we are using namespace boost::range::adaptors below):

您的示例与Boost完美配合(我们在下面使用命名空间boost :: range :: adapters):

boost::for_each(myVector, doSomething);

We can also slice myVector quickly and easily:

我们还可以快速轻松地切片myVector:

boost::for_each(myVector | sliced(10, 20), doSomething)

boost :: for_each(myVector |切片(10,20),doSomething)

We can even zip myVector with another, filter by a predicate, and sample every other element of the resulting pairs in a single, simple statement (this requires that you unpack in doSomethingElse the tuples produced by boost::combined):

我们甚至可以将myVector压缩到另一个,通过谓词进行过滤,并在一个简单的语句中对结果对的每个其他元素进行采样(这需要您在doSomethingElse中解压缩由boost :: combined生成的元组):

boost::for_each( boost::combined(myVector, myOtherVector) | strided(2), doSomethingElse)

boost :: for_each(boost :: combined(myVector,myOtherVector)| strided(2),doSomethingElse)