According to the C++ standard (3.7.3.2/4), using an invalid pointer in any way (not only dereferencing it, but also copying it, casting it, or anything else) is undefined behavior (in case of doubt, also see this question). Now, the typical code to traverse an STL container looks like this:
std::vector<int> toTraverse;
//populate the vector
for( std::vector<int>::iterator it = toTraverse.begin(); it != toTraverse.end(); ++it ) {
//process( *it );
}
std::vector::end() is an iterator to the hypothetical element beyond the last element of the container. There's no element there, therefore using a pointer through that iterator is undefined behavior.
Now how does the != end() work then? I mean, in order to do the comparison, an iterator needs to be constructed wrapping an invalid address, and then that invalid address has to be used in a comparison, which again is undefined behavior. Is such a comparison legal, and why?
8 answers
#1
9
You're right that an invalid pointer can't be used, but you're wrong that a pointer to an element one past the last element in an array is an invalid pointer - it's valid.
The C standard, section 6.5.6, paragraph 8, says that it's well defined and valid:
...if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object...
but cannot be dereferenced:
...if the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated...
#2
25
The only requirement for end() is that ++(--end()) == end(). The end() could simply be a special state the iterator is in. There is no reason the end() iterator has to correspond to a pointer of any kind.
Besides, even if it were a pointer, comparing two pointers doesn't require any sort of dereference anyway. Consider the following:
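char a[5] = {'a', 'b', 'c', 'd', 'e'};
char* end = a + 5;  // one past the last element: valid to form and to compare
for (char* it = a; it != end; ++it);  // end is compared, never dereferenced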
That code will work just fine, and it mirrors your vector code.
#3
5
One past the end is not an invalid value (neither with regular arrays nor with iterators). You can't dereference it, but it can be used for comparisons.
std::vector<X>::iterator it;
This is a singular iterator. You can only assign a valid iterator to it.
std::vector<X>::iterator it = vec.end();
This is a perfectly valid iterator. You can't dereference it but you can use it for comparisons and decrement it (assuming the container has a sufficient size).
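To make that concrete, here is a minimal sketch of the two things you can do with end(): compare against it and step back from it. The helper names last_via_end and count_by_walking are made up for this illustration and are not part of any library.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the last element by stepping back from end().
int last_via_end(const std::vector<int>& vec) {
    assert(!vec.empty());  // decrementing end() needs at least one element
    std::vector<int>::const_iterator it = vec.end();  // valid, not dereferenceable
    --it;                  // now refers to the last element
    return *it;
}

// Hypothetical helper: counts elements by comparing against end() on every
// step. end() itself is never dereferenced.
int count_by_walking(const std::vector<int>& vec) {
    int n = 0;
    for (std::vector<int>::const_iterator it = vec.begin(); it != vec.end(); ++it) {
        ++n;
    }
    return n;
}
```

Calling count_by_walking on an empty vector is fine too, since begin() == end() there and the loop body never runs.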
#4
3
Huh? There's no rule that says that iterators need to be implemented using nothing but a pointer.
It could have a boolean flag in there, which gets set when the increment operation sees that it passes the end of the valid data, for instance.
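A rough sketch of what such a flag-based iterator might look like. CountingIterator and sum_range are hypothetical names invented for this example, and no standard library is claimed to work this way. The point is only that the end state involves no pointer at all.

```cpp
#include <cassert>

// Hypothetical iterator over the integers first, first+1, ..., last-1.
// The past-the-end state is a plain boolean flag, not an address.
class CountingIterator {
public:
    // Default-constructed iterator represents "past the end".
    CountingIterator() : value_(0), last_(0), at_end_(true) {}

    CountingIterator(int first, int last)
        : value_(first), last_(last), at_end_(first >= last) {}

    int operator*() const {
        assert(!at_end_);  // dereferencing the end state is a bug
        return value_;
    }

    CountingIterator& operator++() {
        assert(!at_end_);
        if (++value_ >= last_) at_end_ = true;  // flag set on passing the last value
        return *this;
    }

    friend bool operator==(const CountingIterator& a, const CountingIterator& b) {
        // Two end iterators are equal; an end and a non-end never are.
        if (a.at_end_ || b.at_end_) return a.at_end_ == b.at_end_;
        return a.value_ == b.value_;
    }
    friend bool operator!=(const CountingIterator& a, const CountingIterator& b) {
        return !(a == b);
    }

private:
    int value_;
    int last_;
    bool at_end_;
};

// Sums first..last-1 with the usual "it != end" loop shape.
int sum_range(int first, int last) {
    int total = 0;
    for (CountingIterator it(first, last); it != CountingIterator(); ++it) {
        total += *it;
    }
    return total;
}
```

The comparison it != CountingIterator() only inspects the flag, so nothing resembling an invalid address is ever touched.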
#5
1
Simple. Iterators aren't (necessarily) pointers.
They have some similarities (e.g. you can dereference them), but that's about it.
#6
1
The implementation of a standard library container's end() iterator is, well, implementation-defined, so the implementation can play tricks it knows the platform supports.
If you implement your own iterators, you can do whatever you want, so long as it conforms to the standard. For example, your iterator, if storing a pointer, could store a NULL pointer to indicate an end iterator. Or it could contain a boolean flag or whatnot.
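As a sketch of the NULL-pointer variant, assuming a hand-rolled singly linked list: Node, ListIterator and sum_list are hypothetical names used only here. A null pointer is a perfectly valid pointer value to copy and compare, so the end iterator raises no "invalid pointer" concern.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical singly linked list node, for illustration only.
struct Node {
    int value;
    Node* next;
};

// Forward-style iterator that stores a Node*; NULL means "past the end".
class ListIterator {
public:
    explicit ListIterator(Node* node = NULL) : node_(node) {}

    int& operator*() const {
        assert(node_ != NULL);  // the end iterator must not be dereferenced
        return node_->value;
    }

    ListIterator& operator++() {
        assert(node_ != NULL);
        node_ = node_->next;    // becomes NULL after the last node
        return *this;
    }

    friend bool operator==(const ListIterator& a, const ListIterator& b) {
        return a.node_ == b.node_;
    }
    friend bool operator!=(const ListIterator& a, const ListIterator& b) {
        return !(a == b);
    }

private:
    Node* node_;
};

// Sums all values reachable from head; works for an empty list (head == NULL).
int sum_list(Node* head) {
    int total = 0;
    for (ListIterator it(head); it != ListIterator(); ++it) {
        total += *it;
    }
    return total;
}
```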
#7
0
Besides what was already said (iterators need not be pointers), I'd like to point out the rule you cite
According to C++ standard (3.7.3.2/4) using (not only dereferencing, but also copying, casting, whatever else) an invalid pointer is undefined behavior
wouldn't apply to the end() iterator anyway. Basically, when you have an array, all the pointers to its elements, plus one pointer past the end, are valid. A pointer before the start of the array is not: merely computing arr-1 is undefined behavior, just like computing a pointer further than one past the end. That means:
int arr[5];
int *p=0;
p==arr+4; // OK
p==arr+5; // past-the-end, but OK
p==arr-1; // not OK: forming arr-1 is already undefined behavior
p==arr+123456; // not OK: forming the pointer is already undefined behavior
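For the valid cases, a small sketch of how a one-past-the-end pointer is typically used as a bound. The function names sum_array and length_of are invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// Sums the half-open range [first, last). last may be one past the end of
// the array: it is compared against, but never dereferenced.
int sum_array(const int* first, const int* last) {
    int total = 0;
    for (const int* p = first; p != last; ++p) {
        total += *p;
    }
    return total;
}

// Pointer subtraction is also defined when last is one past the end,
// as long as both pointers refer to the same array.
std::ptrdiff_t length_of(const int* first, const int* last) {
    return last - first;
}
```

With int arr[5], the call sum_array(arr, arr + 5) stays entirely within the rules, because arr + 5 is formed and compared but never read through.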
#8
0
I answer here since the other answers are now out of date; besides, they did not quite address the question.
First, C++14 has changed the rules mentioned in the question. Indirection through an invalid pointer value, or passing an invalid pointer value to a deallocation function, is still undefined behavior, but other operations are now implementation-defined; see Documentation of "invalid pointer value" conversion in C++ implementations.
Second, words count. You can't bypass the definitions while applying the rules. The key point here is the definition of "invalid". For iterators, this is defined in [iterator.requirements]. In fact, even though pointers are iterators, the meanings of "invalid" for the two are subtly different. The rules for pointers treat "invalid" as "do not indirect through an invalid value", which is a special case of "not dereferenceable" for iterators; however, "not dereferenceable" does not imply "invalid" for iterators. An "invalid" iterator is explicitly defined as one that "may be singular", while a "singular" value is defined as "not associated with any sequence" (in the same paragraph as the definition of "dereferenceable"). That paragraph even explicitly defines "past-the-end values".
From the text of the standard in [iterator.requirements], it is clear that:
- Past-the-end values are not assumed to be dereferenceable (at least by the standard library), as the standard states.
- Dereferenceable values are not singular, since they are associated with a sequence.
- Past-the-end values are not singular, since they are associated with a sequence.
- An iterator is not invalid if it is definitely not singular (by negating the definition of "invalid iterator"). In other words, if an iterator is associated with a sequence, it is not invalid.
The value of end() is a past-the-end value, which is associated with a sequence until it is invalidated. So it is actually valid by definition. Even if "invalid" is misread literally, the rules for pointers are not applicable here.
The rules allowing == comparison on such values are in the input iterator requirements, which are inherited by some other iterator categories (forward, bidirectional, etc.). More specifically, valid iterators are required to be comparable with == within the domain of the iterator. Further, the forward iterator requirements specify that the domain is the underlying sequence. And the container requirements specify that the iterator and const_iterator member types belong to an iterator category that meets the forward iterator requirements. Thus, == on end() and an iterator over the same container is required to be well-defined. As a standard container, vector<int> also obeys these requirements. That's the whole story.
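The chain of requirements above can be checked on a given implementation with a small sketch like the following. It assumes C++11 or later for std::is_base_of, and the helper names has_forward_iterators and empty_by_comparison are made up for this example.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical helper: true if Container::iterator's category is
// forward_iterator_tag or a tag derived from it (bidirectional, random access).
template <typename Container>
bool has_forward_iterators() {
    typedef typename std::iterator_traits<
        typename Container::iterator>::iterator_category Category;
    return std::is_base_of<std::forward_iterator_tag, Category>::value;
}

// Hypothetical helper: compares begin() with end() directly. For an empty
// container both are past-the-end values, and == is still well-defined.
bool empty_by_comparison(const std::vector<int>& v) {
    return v.begin() == v.end();
}
```

This only inspects the category tag and performs one comparison; it does not prove conformance, but it shows that the comparison in question is ordinary, supported usage.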
Third, even when end() is a pointer value (this is likely to happen with an optimized implementation of the iterator of a vector instance), the rules in the question are still not applicable. The reason is mentioned above (and in some other answers): "invalid" is concerned with * (indirection), not comparison. A one-past-the-end value is explicitly allowed by the standard to be compared in specified ways. Also note that ISO C++ is not ISO C; they subtly differ (e.g. for < on pointer values not in the same array: unspecified vs. undefined), though they have similar rules here.