stl中的一些容器没有查找功能

时间:2022-07-20 10:06:48

Some of the STL containers such as std::list and std::vector don't have find() method as a member function. Why is that? I know that there is the alternative of using std::find from <algorithm> but still this use isn't 100% natural.

某些STL容器(如std :: list和std :: vector)没有find()方法作为成员函数。这是为什么?我知道有一种替代方法可以使用 中的std :: find,但这种用法仍然不是100%自然。

7 个解决方案

#1


43  

The general design principle is to use std::find where possible, and implement find member functions when it is more efficient.

一般设计原则是尽可能使用std :: find,并在更高效时实现find成员函数。

The containers that do have a find member are containers which have a more efficient element look-up mechanism then the linear search performed in std::find. For example, binary search trees such as std::set and std::map, or hash tables such as their unordered counterparts.

具有查找成员的容器是具有更有效的元素查找机制的容器,然后是在std :: find中执行的线性搜索。例如,二进制搜索树,例如std :: set和std :: map,或散列表,例如它们的无序对应物。

#2


13  

find, lower_bound and upper_bound member functions are only provided when more efficient than using the non-member equivalents, or when the non-members couldn't operate given the container's public API

find,lower_bound和upper_bound成员函数仅在比使用非成员等效项更有效时提供,或者在非成员无法使用容器的公共API操作时提供

Note in particular that std::string has a find function which provides std::find()-like linear search facilities for character searches and std::search()-like facilities for sub-string searches: while the non-member versions may have the same big-O efficiency, they may well be less efficient given dedicated machine code instructions are often available for "string" searching. There are also historical, convenience, and ease-of-porting factors.

特别要注意的是std :: string有一个find函数,它提供std :: find() - 类似于字符搜索的线性搜索工具和std :: search() - 类似于子字符串搜索的工具:而非成员版本由于专用机器代码指令通常可用于“字符串”搜索,因此它们可能具有相同的大O效率,它们可能效率较低。还有历史,便利和易于移植的因素。

Quite apart from the question of efficiency, it's noteworthy that some containers:

除了效率问题,值得注意的是一些容器:

  • are inherently either sorted (multi-set, map) or unsorted (unordered_map, unordered_set), typically unsorted (e.g. std::string), or easily either (std::vector)

    本质上是排序(多集,映射)或未排序(unordered_map,unordered_set),通常是未排序的(例如std :: string),或者很容易(std :: vector)

  • publicly support forward iteration and/or random access

    公开支持前向迭代和/或随机访问

  • possibly privately support binary search

    可能私下支持二进制搜索

  • have such a specialised public API for element access that potential reuse of the algorithm is relatively limited (e.g. unordered_map::bucket / ::begin(n) et al)

    有这样一个专门的元素访问公共API,可能重用的算法相对有限(例如unordered_map :: bucket / :: begin(n)et al)

It's also of interest that searching in a vector can be done using a great many algorithms:

同样有趣的是,使用大量算法可以在向量中进行搜索:

  • std::find does a brute force linear O(n) search which will "find" lower-index elements first,

    std :: find做一个强力线性O(n)搜索,它将首先“找到”低级索引元素,

  • std::binary_search requires a sorted vector but jumps around to achieve O(log2n) complexity.

    std :: binary_search需要一个已排序的向量,但跳转到达到O(log2n)复杂度。

  • other options like extrapolation search and hashing might be applicable

    其他选项如外推搜索和散列可能适用

How would you pick which to implement and add as members? Seems a bit arbitrary. Still, the choice of which to use can be important performance-wise: for a million elements, find averages half-a-million element comparisons before a match and the full million whenever the element's not there, while binary_search typically takes ~20 comparisons either way.

你会如何选择实施和添加成员?似乎有点武断。尽管如此,使用哪种选择在性能方面可能是重要的:对于一百万个元素,在匹配之前找到平均50万个元素比较,并且每当元素不存在时找到满百万个,而binary_search通常需要~20个比较办法。

The containers with find don't typically provide such flexibility, and the find and/or lower_bound/upper_bound they provide can be seen as replacements for the non-member equivalents, and likely the only reasonable way to search the containers.

具有find的容器通常不提供这样的灵活性,并且它们提供的find和/或lower_bound / upper_bound可以被视为非成员等价物的替代,并且可能是搜索容器的唯一合理方式。

#3


10  

Because there's the std::find function from algorithm that applies for them.

因为算法中的std :: find函数适用于它们。

Generally, containers like std::vector and std::list have linear search time complexity. As such attaching to them a member find function is a redundancy because there's already std::find. For other containers (e.g., std::set or std::map etc.) there's a better way (i.e., faster than linear complexity) to implement searching. As such the implementers implemented these faster searching algorithm as member functions.

通常,像std :: vector和std :: list这样的容器具有线性搜索时间复杂度。因此附加到它们的成员查找功能是冗余,因为已经有std :: find。对于其他容器(例如,std :: set或std :: map等),实现搜索有更好的方法(即,比线性复杂度更快)。因此,实现者将这些更快的搜索算法实现为成员函数。

#4


3  

Containers which have a search-by-key like feature will have the find method integrated (e.g. map which is internally implemented with a binary tree which can be looked up efficiently).

具有按键搜索功能的容器将集成查找方法(例如,内部使用可以有效查找的二叉树实现的映射)。

Others, like the ones you cited, will allow a range search with the std::find but don't have a featured find function because it would have no algorithmic advantage over the std::find (except in sorted/special cases)

其他人,比如你引用的那些,将允许使用std :: find进行范围搜索,但没有特色的find函数,因为它没有std :: find的算法优势(除了在排序/特殊情况下)

#5


2  

Such containers as std::vector, std::list, std::forward_list and some others are sequential containers. There is nothing better than sequential search that can be applied to these containers. So there is no need to rewrite the sequential search for each sequential container if it is the same for all these containers.

诸如std :: vector,std :: list,std :: forward_list和其他一些容器是顺序容器。没有比顺序搜索更好的方法可以应用于这些容器了。因此,如果所有这些容器都相同,则无需为每个顺序容器重写顺序搜索。

The exception is class std::basic_string that initially simulates C-strings that already have special search functions as strchr, strstr and others.

例外是std :: basic_string类,它最初模拟已经具有strchr,strstr等特殊搜索功能的C字符串。

#6


2  

Using the same function for various containers makes for a clearer API, you don't have to learn the peculiarities of each of the containers, just how to apply one function that you use for all of them.

对各种容器使用相同的功能可以获得更清晰的API,您不必了解每个容器的特性,只需要如何应用一个用于所有容器的功能。

It's also for code reusability - you use the algorithm that takes iterators from any of the containers that provide them, so the algorithm doesn't have to rely on the container being a std::vector, std::list etc.

它也用于代码可重用性 - 你使用从任何提供它们的容器获取迭代器的算法,因此算法不必依赖于容器是std :: vector,std :: list等。

#7


0  

As mentioned in other comments, the design rationale is that vector::find() can be as efficiently implemented as a non-member function std::find(). The benefits of using the latter is that it decouples the data-structures and the operators acting on the data-structure, which increases maintainability (this is advantageous for the developers of the library).

正如其他评论中所提到的,设计原理是vector :: find()可以像非成员函数std :: find()一样有效地实现。使用后者的好处是它将数据结构和操作符分离到数据结构上,从而提高了可维护性(这对于库的开发人员来说是有利的)。

However, the benefits of the former is that it would make the API between all containers consistent, and make client code less verbose. This would increase learnability and readability (this is advantageous for the users of the library). Also, consistent API would allow to write generic code. Consider this:

但是,前者的好处是它可以使所有容器之间的API保持一致,并使客户端代码更简洁。这将增加可学习性和可读性(这对于库的用户是有利的)。此外,一致的API将允许编写通用代码。考虑一下:

template <typename Container, typename T>
void foo(const Container& c, const T& x) {
    if (std::find(c.begin(), c.end(), x) != c.end()) {
        // ...
    }
}

The above is inefficient when Container is a std::map or std::set. To make it efficient, we'd need to do:

当Container是std :: map或std :: set时,上面的效率很低。为了提高效率,我们需要做:

template <typename Container, typename T>
void foo(const Container& c, const T& x) {
    if (c.find(x) != c.end()) {
        // ...
    }
}

But then it doesn't compile for std::vector and std::list. This places the burden on users of the library to write their own generic function manually specialized/overloaded for each type they want to support:

但是它不会为std :: vector和std :: list编译。这会给库的用户带来负担,为他们想要支持的每种类型编写自己专用/重载的通用函数:

template <typename T>
bool contains(const std::vector<T>& c, const T& x) {
    return std::find(c.begin(), c.end(), x) != c.end();
}

template <typename T>
bool contains(const std::set<T>& c, const T& x) {
    return c.find(x) != c.end();
}

template <typename Container, typename T>
void foo(const Container& c, const T& x) {
    if (contains(c, x)) {
        // ...
    }
}

I acknowledge that making these types of design decisions is hard, but my opinion is that designers of the STL made a mistake here. The very tiny maintainability burden seems quite largely worth the better API and consistency for users. In a nutshell, since find must be a member function for some containers (for performance), then find should be a member function for all containers (for consistency). Note that I'm totally okay with other algorithms being non-member functions.

我承认制作这些类型的设计决策很难,但我认为STL的设计师在这里犯了一个错误。非常小的可维护性负担似乎非常值得用户提供更好的API和一致性。简而言之,由于find必须是某些容器的成员函数(用于性能),因此find应该是所有容器的成员函数(为了一致性)。请注意,我完全可以使用其他算法作为非成员函数。

(I mean, come on, a container is by definition something that contains stuff. It should be trivial for users to write a generic and efficient "contains" function. In fact, I'd argue it should be added to the Container concept, but I digress.)

(我的意思是,来吧,一个容器根据定义包含东西。用户编写一个通用且高效的“包含”函数应该是微不足道的。事实上,我认为它应该被添加到Container概念中,但我离题了。)

#1


43  

The general design principle is to use std::find where possible, and implement find member functions when it is more efficient.

一般设计原则是尽可能使用std :: find,并在更高效时实现find成员函数。

The containers that do have a find member are containers which have a more efficient element look-up mechanism then the linear search performed in std::find. For example, binary search trees such as std::set and std::map, or hash tables such as their unordered counterparts.

具有查找成员的容器是具有更有效的元素查找机制的容器,然后是在std :: find中执行的线性搜索。例如,二进制搜索树,例如std :: set和std :: map,或散列表,例如它们的无序对应物。

#2


13  

find, lower_bound and upper_bound member functions are only provided when more efficient than using the non-member equivalents, or when the non-members couldn't operate given the container's public API

find,lower_bound和upper_bound成员函数仅在比使用非成员等效项更有效时提供,或者在非成员无法使用容器的公共API操作时提供

Note in particular that std::string has a find function which provides std::find()-like linear search facilities for character searches and std::search()-like facilities for sub-string searches: while the non-member versions may have the same big-O efficiency, they may well be less efficient given dedicated machine code instructions are often available for "string" searching. There are also historical, convenience, and ease-of-porting factors.

特别要注意的是std :: string有一个find函数,它提供std :: find() - 类似于字符搜索的线性搜索工具和std :: search() - 类似于子字符串搜索的工具:而非成员版本由于专用机器代码指令通常可用于“字符串”搜索,因此它们可能具有相同的大O效率,它们可能效率较低。还有历史,便利和易于移植的因素。

Quite apart from the question of efficiency, it's noteworthy that some containers:

除了效率问题,值得注意的是一些容器:

  • are inherently either sorted (multi-set, map) or unsorted (unordered_map, unordered_set), typically unsorted (e.g. std::string), or easily either (std::vector)

    本质上是排序(多集,映射)或未排序(unordered_map,unordered_set),通常是未排序的(例如std :: string),或者很容易(std :: vector)

  • publicly support forward iteration and/or random access

    公开支持前向迭代和/或随机访问

  • possibly privately support binary search

    可能私下支持二进制搜索

  • have such a specialised public API for element access that potential reuse of the algorithm is relatively limited (e.g. unordered_map::bucket / ::begin(n) et al)

    有这样一个专门的元素访问公共API,可能重用的算法相对有限(例如unordered_map :: bucket / :: begin(n)et al)

It's also of interest that searching in a vector can be done using a great many algorithms:

同样有趣的是,使用大量算法可以在向量中进行搜索:

  • std::find does a brute force linear O(n) search which will "find" lower-index elements first,

    std :: find做一个强力线性O(n)搜索,它将首先“找到”低级索引元素,

  • std::binary_search requires a sorted vector but jumps around to achieve O(log2n) complexity.

    std :: binary_search需要一个已排序的向量,但跳转到达到O(log2n)复杂度。

  • other options like extrapolation search and hashing might be applicable

    其他选项如外推搜索和散列可能适用

How would you pick which to implement and add as members? Seems a bit arbitrary. Still, the choice of which to use can be important performance-wise: for a million elements, find averages half-a-million element comparisons before a match and the full million whenever the element's not there, while binary_search typically takes ~20 comparisons either way.

你会如何选择实施和添加成员?似乎有点武断。尽管如此,使用哪种选择在性能方面可能是重要的:对于一百万个元素,在匹配之前找到平均50万个元素比较,并且每当元素不存在时找到满百万个,而binary_search通常需要~20个比较办法。

The containers with find don't typically provide such flexibility, and the find and/or lower_bound/upper_bound they provide can be seen as replacements for the non-member equivalents, and likely the only reasonable way to search the containers.

具有find的容器通常不提供这样的灵活性,并且它们提供的find和/或lower_bound / upper_bound可以被视为非成员等价物的替代,并且可能是搜索容器的唯一合理方式。

#3


10  

Because there's the std::find function from algorithm that applies for them.

因为算法中的std :: find函数适用于它们。

Generally, containers like std::vector and std::list have linear search time complexity. As such attaching to them a member find function is a redundancy because there's already std::find. For other containers (e.g., std::set or std::map etc.) there's a better way (i.e., faster than linear complexity) to implement searching. As such the implementers implemented these faster searching algorithm as member functions.

通常,像std :: vector和std :: list这样的容器具有线性搜索时间复杂度。因此附加到它们的成员查找功能是冗余,因为已经有std :: find。对于其他容器(例如,std :: set或std :: map等),实现搜索有更好的方法(即,比线性复杂度更快)。因此,实现者将这些更快的搜索算法实现为成员函数。

#4


3  

Containers which have a search-by-key like feature will have the find method integrated (e.g. map which is internally implemented with a binary tree which can be looked up efficiently).

具有按键搜索功能的容器将集成查找方法(例如,内部使用可以有效查找的二叉树实现的映射)。

Others, like the ones you cited, will allow a range search with the std::find but don't have a featured find function because it would have no algorithmic advantage over the std::find (except in sorted/special cases)

其他人,比如你引用的那些,将允许使用std :: find进行范围搜索,但没有特色的find函数,因为它没有std :: find的算法优势(除了在排序/特殊情况下)

#5


2  

Such containers as std::vector, std::list, std::forward_list and some others are sequential containers. There is nothing better than sequential search that can be applied to these containers. So there is no need to rewrite the sequential search for each sequential container if it is the same for all these containers.

诸如std :: vector,std :: list,std :: forward_list和其他一些容器是顺序容器。没有比顺序搜索更好的方法可以应用于这些容器了。因此,如果所有这些容器都相同,则无需为每个顺序容器重写顺序搜索。

The exception is class std::basic_string that initially simulates C-strings that already have special search functions as strchr, strstr and others.

例外是std :: basic_string类,它最初模拟已经具有strchr,strstr等特殊搜索功能的C字符串。

#6


2  

Using the same function for various containers makes for a clearer API, you don't have to learn the peculiarities of each of the containers, just how to apply one function that you use for all of them.

对各种容器使用相同的功能可以获得更清晰的API,您不必了解每个容器的特性,只需要如何应用一个用于所有容器的功能。

It's also for code reusability - you use the algorithm that takes iterators from any of the containers that provide them, so the algorithm doesn't have to rely on the container being a std::vector, std::list etc.

它也用于代码可重用性 - 你使用从任何提供它们的容器获取迭代器的算法,因此算法不必依赖于容器是std :: vector,std :: list等。

#7


0  

As mentioned in other comments, the design rationale is that vector::find() can be as efficiently implemented as a non-member function std::find(). The benefits of using the latter is that it decouples the data-structures and the operators acting on the data-structure, which increases maintainability (this is advantageous for the developers of the library).

正如其他评论中所提到的,设计原理是vector :: find()可以像非成员函数std :: find()一样有效地实现。使用后者的好处是它将数据结构和操作符分离到数据结构上,从而提高了可维护性(这对于库的开发人员来说是有利的)。

However, the benefits of the former is that it would make the API between all containers consistent, and make client code less verbose. This would increase learnability and readability (this is advantageous for the users of the library). Also, consistent API would allow to write generic code. Consider this:

但是,前者的好处是它可以使所有容器之间的API保持一致,并使客户端代码更简洁。这将增加可学习性和可读性(这对于库的用户是有利的)。此外,一致的API将允许编写通用代码。考虑一下:

template <typename Container, typename T>
void foo(const Container& c, const T& x) {
    if (std::find(c.begin(), c.end(), x) != c.end()) {
        // ...
    }
}

The above is inefficient when Container is a std::map or std::set. To make it efficient, we'd need to do:

当Container是std :: map或std :: set时,上面的效率很低。为了提高效率,我们需要做:

template <typename Container, typename T>
void foo(const Container& c, const T& x) {
    if (c.find(x) != c.end()) {
        // ...
    }
}

But then it doesn't compile for std::vector and std::list. This places the burden on users of the library to write their own generic function manually specialized/overloaded for each type they want to support:

但是它不会为std :: vector和std :: list编译。这会给库的用户带来负担,为他们想要支持的每种类型编写自己专用/重载的通用函数:

template <typename T>
bool contains(const std::vector<T>& c, const T& x) {
    return std::find(c.begin(), c.end(), x) != c.end();
}

template <typename T>
bool contains(const std::set<T>& c, const T& x) {
    return c.find(x) != c.end();
}

template <typename Container, typename T>
void foo(const Container& c, const T& x) {
    if (contains(c, x)) {
        // ...
    }
}

I acknowledge that making these types of design decisions is hard, but my opinion is that designers of the STL made a mistake here. The very tiny maintainability burden seems quite largely worth the better API and consistency for users. In a nutshell, since find must be a member function for some containers (for performance), then find should be a member function for all containers (for consistency). Note that I'm totally okay with other algorithms being non-member functions.

我承认制作这些类型的设计决策很难,但我认为STL的设计师在这里犯了一个错误。非常小的可维护性负担似乎非常值得用户提供更好的API和一致性。简而言之,由于find必须是某些容器的成员函数(用于性能),因此find应该是所有容器的成员函数(为了一致性)。请注意,我完全可以使用其他算法作为非成员函数。

(I mean, come on, a container is by definition something that contains stuff. It should be trivial for users to write a generic and efficient "contains" function. In fact, I'd argue it should be added to the Container concept, but I digress.)

(我的意思是,来吧,一个容器根据定义包含东西。用户编写一个通用且高效的“包含”函数应该是微不足道的。事实上,我认为它应该被添加到Container概念中,但我离题了。)