在迭代时从STL集合中删除元素

时间:2021-05-03 15:40:28

I need to go through a set and remove elements that meet a predefined criteria.

我需要通过一个集合并删除满足预定义条件的元素。

This is the test code I wrote:

这是我写的测试代码:

#include <set>
#include <algorithm>

void printElement(int value) {
    std::cout << value << " ";
}

int main() {
    int initNum[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    std::set<int> numbers(initNum, initNum + 10);
    // print '0 1 2 3 4 5 6 7 8 9'
    std::for_each(numbers.begin(), numbers.end(), printElement);

    std::set<int>::iterator it = numbers.begin();

    // iterate through the set and erase all even numbers
    for (; it != numbers.end(); ++it) {
        int n = *it;
        if (n % 2 == 0) {
            // wouldn't invalidate the iterator?
            numbers.erase(it);
        }
    }

    // print '1 3 5 7 9'
    std::for_each(numbers.begin(), numbers.end(), printElement);

    return 0;
}

At first, I thought that erasing an element from the set while iterating through it would invalidate the iterator, and the increment at the for loop would have undefined behavior. Even though, I executed this test code and all went well, and I can't explain why.

首先,我认为在迭代时从集合中删除元素将使迭代器无效,并且for循环的增量将具有未定义的行为。尽管如此,我还是执行了这个测试代码,并且一切顺利,我无法解释为什么。

My question: Is this the defined behavior for std sets or is this implementation specific? I am using gcc 4.3.3 on ubuntu 10.04 (32-bit version), by the way.

我的问题是:这是std集合的定义行为,还是这个实现是特定的?顺便说一下,我正在ubuntu 10.04(32位版本)上使用gcc 4.3.3。

Thanks!

谢谢!

Proposed solution:

建议解决方案:

Is this a correct way to iterate and erase elements from the set?

这是迭代和从集合中删除元素的正确方法吗?

while(it != numbers.end()) {
    int n = *it;
    if (n % 2 == 0) {
        // post-increment operator returns a copy, then increment
        numbers.erase(it++);
    } else {
        // pre-increment operator increments, then return
        ++it;
    }
}

Edit: PREFERED SOLUTION

编辑:优先解决方案

I came around a solution that seems more elegant to me, even though it does exactly the same.

我想到了一个对我来说更优雅的解决方案,尽管它做的完全一样。

while(it != numbers.end()) {
    // copy the current iterator then increment it
    std::set<int>::iterator current = it++;
    int n = *current;
    if (n % 2 == 0) {
        // don't invalidate iterator it, because it is already
        // pointing to the next element
        numbers.erase(current);
    }
}

If there are several test conditions inside the while, each one of them must increment the iterator. I like this code better because the iterator is incremented only in one place, making the code less error-prone and more readable.

如果内部有几个测试条件,那么每个测试条件都必须增加迭代器。我更喜欢这段代码,因为迭代器只在一个地方递增,这使得代码更不容易出错,可读性更好。

5 个解决方案

#1


131  

This is implementation dependent:

这是依赖于实现的:

Standard 23.1.2.8:

标准23.1.2.8:

The insert members shall not affect the validity of iterators and references to the container, and the erase members shall invalidate only iterators and references to the erased elements.

插入成员不应影响迭代器和对容器的引用的有效性,而擦除成员应仅使迭代器和对被擦除元素的引用无效。

Maybe you could try this -- this is standard conforming:

也许你可以试试这个——这是标准一致性:

for (it = numbers.begin(); it != numbers.end(); ) {
    if (*it % 2 == 0) {
        numbers.erase(it++);
    }
    else {
        ++it;
    }
}

Note that it++ is postfix, hence it passes the old position to erase, but first jumps to a newer one due to the operator.

注意,它++是后缀,因此它通过旧的位置来擦除,但是由于操作符,首先跳转到一个较新的位置。

2015.10.27 update: C++11 has resolved the defect. iterator erase (const_iterator position); return an iterator to the element that follows the last element removed (or set::end, if the last element was removed). So C++11 style is:

2015.10.27更新:c++ 11解决了缺陷。迭代器消除(const_iterator位置);返回一个迭代器到元素,该元素紧跟上一个删除的元素(或set::end,如果最后一个元素被删除)。所以c++ 11风格是:

for (it = numbers.begin(); it != numbers.end(); ) {
    if (*it % 2 == 0) {
        it = numbers.erase(it);
    }
    else {
        ++it;
    }
}

#2


18  

If you run your program through valgrind, you'll see a bunch of read errors. In other words, yes, the iterators are being invalidated, but you're getting lucky in your example (or really unlucky, as you're not seeing the negative effects of undefined behavior). One solution to this is to create a temporary iterator, increment the temp, delete the target iterator, then set the target to the temp. For example, re-write your loop as follows:

如果你在valgrind中运行你的程序,你会看到很多读取错误。换句话说,迭代器是无效的,但是您在示例中很幸运(或者非常不幸,因为您没有看到未定义行为的负面影响)。一种解决方法是创建一个临时迭代器,增加临时迭代器,删除目标迭代器,然后将目标设置为临时迭代器。

std::set<int>::iterator it = numbers.begin();                               
std::set<int>::iterator tmp;                                                

// iterate through the set and erase all even numbers                       
for ( ; it != numbers.end(); )                                              
{                                                                           
    int n = *it;                                                            
    if (n % 2 == 0)                                                         
    {                                                                       
        tmp = it;                                                           
        ++tmp;                                                              
        numbers.erase(it);                                                  
        it = tmp;                                                           
    }                                                                       
    else                                                                    
    {                                                                       
        ++it;                                                               
    }                                                                       
} 

#3


5  

You misunderstand what "undefined behavior" means. Undefined behavior does not mean "if you do this, your program will crash or produce unexpected results." It means "if you do this, your program could crash or produce unexpected results", or do anything else, depending on your compiler, your operating system, the phase of the moon, etc.

你误解了“未定义行为”的含义。未定义的行为并不意味着“如果您这样做,您的程序将崩溃或产生意想不到的结果”。它的意思是“如果你这样做,你的程序可能会崩溃或产生意想不到的结果”,或者做任何其他的事情,这取决于你的编译器、你的操作系统、月球的相位等等。

If something executes without crashing and behaves as you expect it to, that is not proof that it is not undefined behavior. All it proves is that its behavior happened to be as observed for that particular run after compiling with that particular compiler on that particular operating system.

如果某个东西执行时没有崩溃,并且按照预期的那样运行,这并不能证明它不是未定义的行为。它所证明的是,它的行为恰好是在特定的操作系统上与特定的编译器一起编译后所观察到的。

Erasing an element from a set invalidates the iterator to the erased element. Using an invalidated iterator is undefined behavior. It just so happened that the observed behavior was what you intended in this particular instance; it does not mean that the code is correct.

从集合中擦除一个元素会使迭代器失效,从而使被擦除的元素失效。使用无效迭代器是未定义的行为。碰巧的是观察到的行为就是你在这个特定的情况下想要的;这并不意味着代码是正确的。

#4


2  

Just to warn, that in case of a deque container, all solutions that check for the deque iterator equality to numbers.end() will likely fail on gcc 4.8.4. Namely, erasing an element of the deque generally invalidates pointer to numbers.end():

提醒一下,对于deque容器,检查deque迭代器与number .end()相等的所有解决方案在gcc 4.8.4上可能会失败。也就是说,删除deque的一个元素通常会使指向number的指针无效。

#include <iostream>
#include <deque>

using namespace std;
int main() 
{

  deque<int> numbers;

  numbers.push_back(0);
  numbers.push_back(1);
  numbers.push_back(2);
  numbers.push_back(3);
  //numbers.push_back(4);

  deque<int>::iterator  it_end = numbers.end();

  for (deque<int>::iterator it = numbers.begin(); it != numbers.end(); ) {
    if (*it % 2 == 0) {
      cout << "Erasing element: " << *it << "\n";
      numbers.erase(it++);
      if (it_end == numbers.end()) {
    cout << "it_end is still pointing to numbers.end()\n";
      } else {
    cout << "it_end is not anymore pointing to numbers.end()\n";
      }
    }
    else {
      cout << "Skipping element: " << *it << "\n";
      ++it;
    }
  }
}

Output:

输出:

Erasing element: 0
it_end is still pointing to numbers.end()
Skipping element: 1
Erasing element: 2
it_end is not anymore pointing to numbers.end()

Note that while the deque transformation is correct in this particular case, the end pointer has been invalidated along the way. With the deque of a different size the error is more apparent:

注意,虽然deque转换在这个特定的情况下是正确的,但是最终指针在整个过程中都是无效的。对于不同尺寸的deque,误差更明显:

int main() 
{

  deque<int> numbers;

  numbers.push_back(0);
  numbers.push_back(1);
  numbers.push_back(2);
  numbers.push_back(3);
  numbers.push_back(4);

  deque<int>::iterator  it_end = numbers.end();

  for (deque<int>::iterator it = numbers.begin(); it != numbers.end(); ) {
    if (*it % 2 == 0) {
      cout << "Erasing element: " << *it << "\n";
      numbers.erase(it++);
      if (it_end == numbers.end()) {
    cout << "it_end is still pointing to numbers.end()\n";
      } else {
    cout << "it_end is not anymore pointing to numbers.end()\n";
      }
    }
    else {
      cout << "Skipping element: " << *it << "\n";
      ++it;
    }
  }
}

Output:

输出:

Erasing element: 0
it_end is still pointing to numbers.end()
Skipping element: 1
Erasing element: 2
it_end is still pointing to numbers.end()
Skipping element: 3
Erasing element: 4
it_end is not anymore pointing to numbers.end()
Erasing element: 0
it_end is not anymore pointing to numbers.end()
Erasing element: 0
it_end is not anymore pointing to numbers.end()
...
Segmentation fault (core dumped)

Here is one of the ways to fix this:

下面是解决这个问题的方法之一:

#include <iostream>
#include <deque>

using namespace std;
int main() 
{

  deque<int> numbers;
  bool done_iterating = false;

  numbers.push_back(0);
  numbers.push_back(1);
  numbers.push_back(2);
  numbers.push_back(3);
  numbers.push_back(4);

  if (!numbers.empty()) {
    deque<int>::iterator it = numbers.begin();
    while (!done_iterating) {
      if (it + 1 == numbers.end()) {
    done_iterating = true;
      } 
      if (*it % 2 == 0) {
    cout << "Erasing element: " << *it << "\n";
      numbers.erase(it++);
      }
      else {
    cout << "Skipping element: " << *it << "\n";
    ++it;
      }
    }
  }
}

#5


1  

This behaviour is implementation specific. To guarantee the correctness of the iterator you should use "it = numbers.erase(it);" statement if you need to delete the element and simply incerement iterator in other case.

这种行为是特定于实现的。要保证迭代器的正确性,您应该使用“it = number .erase(it)”;如果您需要删除元素,在其他情况下只需输入迭代器。

#1


131  

This is implementation dependent:

这是依赖于实现的:

Standard 23.1.2.8:

标准23.1.2.8:

The insert members shall not affect the validity of iterators and references to the container, and the erase members shall invalidate only iterators and references to the erased elements.

插入成员不应影响迭代器和对容器的引用的有效性,而擦除成员应仅使迭代器和对被擦除元素的引用无效。

Maybe you could try this -- this is standard conforming:

也许你可以试试这个——这是标准一致性:

for (it = numbers.begin(); it != numbers.end(); ) {
    if (*it % 2 == 0) {
        numbers.erase(it++);
    }
    else {
        ++it;
    }
}

Note that it++ is postfix, hence it passes the old position to erase, but first jumps to a newer one due to the operator.

注意,它++是后缀,因此它通过旧的位置来擦除,但是由于操作符,首先跳转到一个较新的位置。

2015.10.27 update: C++11 has resolved the defect. iterator erase (const_iterator position); return an iterator to the element that follows the last element removed (or set::end, if the last element was removed). So C++11 style is:

2015.10.27更新:c++ 11解决了缺陷。迭代器消除(const_iterator位置);返回一个迭代器到元素,该元素紧跟上一个删除的元素(或set::end,如果最后一个元素被删除)。所以c++ 11风格是:

for (it = numbers.begin(); it != numbers.end(); ) {
    if (*it % 2 == 0) {
        it = numbers.erase(it);
    }
    else {
        ++it;
    }
}

#2


18  

If you run your program through valgrind, you'll see a bunch of read errors. In other words, yes, the iterators are being invalidated, but you're getting lucky in your example (or really unlucky, as you're not seeing the negative effects of undefined behavior). One solution to this is to create a temporary iterator, increment the temp, delete the target iterator, then set the target to the temp. For example, re-write your loop as follows:

如果你在valgrind中运行你的程序,你会看到很多读取错误。换句话说,迭代器是无效的,但是您在示例中很幸运(或者非常不幸,因为您没有看到未定义行为的负面影响)。一种解决方法是创建一个临时迭代器,增加临时迭代器,删除目标迭代器,然后将目标设置为临时迭代器。

std::set<int>::iterator it = numbers.begin();                               
std::set<int>::iterator tmp;                                                

// iterate through the set and erase all even numbers                       
for ( ; it != numbers.end(); )                                              
{                                                                           
    int n = *it;                                                            
    if (n % 2 == 0)                                                         
    {                                                                       
        tmp = it;                                                           
        ++tmp;                                                              
        numbers.erase(it);                                                  
        it = tmp;                                                           
    }                                                                       
    else                                                                    
    {                                                                       
        ++it;                                                               
    }                                                                       
} 

#3


5  

You misunderstand what "undefined behavior" means. Undefined behavior does not mean "if you do this, your program will crash or produce unexpected results." It means "if you do this, your program could crash or produce unexpected results", or do anything else, depending on your compiler, your operating system, the phase of the moon, etc.

你误解了“未定义行为”的含义。未定义的行为并不意味着“如果您这样做,您的程序将崩溃或产生意想不到的结果”。它的意思是“如果你这样做,你的程序可能会崩溃或产生意想不到的结果”,或者做任何其他的事情,这取决于你的编译器、你的操作系统、月球的相位等等。

If something executes without crashing and behaves as you expect it to, that is not proof that it is not undefined behavior. All it proves is that its behavior happened to be as observed for that particular run after compiling with that particular compiler on that particular operating system.

如果某个东西执行时没有崩溃,并且按照预期的那样运行,这并不能证明它不是未定义的行为。它所证明的是,它的行为恰好是在特定的操作系统上与特定的编译器一起编译后所观察到的。

Erasing an element from a set invalidates the iterator to the erased element. Using an invalidated iterator is undefined behavior. It just so happened that the observed behavior was what you intended in this particular instance; it does not mean that the code is correct.

从集合中擦除一个元素会使迭代器失效,从而使被擦除的元素失效。使用无效迭代器是未定义的行为。碰巧的是观察到的行为就是你在这个特定的情况下想要的;这并不意味着代码是正确的。

#4


2  

Just to warn, that in case of a deque container, all solutions that check for the deque iterator equality to numbers.end() will likely fail on gcc 4.8.4. Namely, erasing an element of the deque generally invalidates pointer to numbers.end():

提醒一下,对于deque容器,检查deque迭代器与number .end()相等的所有解决方案在gcc 4.8.4上可能会失败。也就是说,删除deque的一个元素通常会使指向number的指针无效。

#include <iostream>
#include <deque>

using namespace std;
int main() 
{

  deque<int> numbers;

  numbers.push_back(0);
  numbers.push_back(1);
  numbers.push_back(2);
  numbers.push_back(3);
  //numbers.push_back(4);

  deque<int>::iterator  it_end = numbers.end();

  for (deque<int>::iterator it = numbers.begin(); it != numbers.end(); ) {
    if (*it % 2 == 0) {
      cout << "Erasing element: " << *it << "\n";
      numbers.erase(it++);
      if (it_end == numbers.end()) {
    cout << "it_end is still pointing to numbers.end()\n";
      } else {
    cout << "it_end is not anymore pointing to numbers.end()\n";
      }
    }
    else {
      cout << "Skipping element: " << *it << "\n";
      ++it;
    }
  }
}

Output:

输出:

Erasing element: 0
it_end is still pointing to numbers.end()
Skipping element: 1
Erasing element: 2
it_end is not anymore pointing to numbers.end()

Note that while the deque transformation is correct in this particular case, the end pointer has been invalidated along the way. With the deque of a different size the error is more apparent:

注意,虽然deque转换在这个特定的情况下是正确的,但是最终指针在整个过程中都是无效的。对于不同尺寸的deque,误差更明显:

int main() 
{

  deque<int> numbers;

  numbers.push_back(0);
  numbers.push_back(1);
  numbers.push_back(2);
  numbers.push_back(3);
  numbers.push_back(4);

  deque<int>::iterator  it_end = numbers.end();

  for (deque<int>::iterator it = numbers.begin(); it != numbers.end(); ) {
    if (*it % 2 == 0) {
      cout << "Erasing element: " << *it << "\n";
      numbers.erase(it++);
      if (it_end == numbers.end()) {
    cout << "it_end is still pointing to numbers.end()\n";
      } else {
    cout << "it_end is not anymore pointing to numbers.end()\n";
      }
    }
    else {
      cout << "Skipping element: " << *it << "\n";
      ++it;
    }
  }
}

Output:

输出:

Erasing element: 0
it_end is still pointing to numbers.end()
Skipping element: 1
Erasing element: 2
it_end is still pointing to numbers.end()
Skipping element: 3
Erasing element: 4
it_end is not anymore pointing to numbers.end()
Erasing element: 0
it_end is not anymore pointing to numbers.end()
Erasing element: 0
it_end is not anymore pointing to numbers.end()
...
Segmentation fault (core dumped)

Here is one of the ways to fix this:

下面是解决这个问题的方法之一:

#include <iostream>
#include <deque>

using namespace std;
int main() 
{

  deque<int> numbers;
  bool done_iterating = false;

  numbers.push_back(0);
  numbers.push_back(1);
  numbers.push_back(2);
  numbers.push_back(3);
  numbers.push_back(4);

  if (!numbers.empty()) {
    deque<int>::iterator it = numbers.begin();
    while (!done_iterating) {
      if (it + 1 == numbers.end()) {
    done_iterating = true;
      } 
      if (*it % 2 == 0) {
    cout << "Erasing element: " << *it << "\n";
      numbers.erase(it++);
      }
      else {
    cout << "Skipping element: " << *it << "\n";
    ++it;
      }
    }
  }
}

#5


1  

This behaviour is implementation specific. To guarantee the correctness of the iterator you should use "it = numbers.erase(it);" statement if you need to delete the element and simply incerement iterator in other case.

这种行为是特定于实现的。要保证迭代器的正确性,您应该使用“it = number .erase(it)”;如果您需要删除元素,在其他情况下只需输入迭代器。