Why doesn't C++ support range-based for loops over dynamic arrays?

Time: 2022-05-31 21:22:14

Why doesn't C++ support range-based for loops over dynamic arrays? That is, something like this:

int* array = new int[len];
for[] (int i : array) {};

I just invented the for[] statement to rhyme with new[] and delete[]. As far as I understand, the runtime has the size of the array available (otherwise delete[] could not work), so in theory a range-based for loop could also be made to work. What is the reason that it's not made to work?

6 answers

#1


2  

int* array = new int[len];
for[] (int i : array) {}

There are several points which must be addressed; I'll tackle them one at a time.

Does the runtime know the size of the array?

Under certain conditions, it must. As you pointed out, a call to delete[] will call the destructor of each element (in reverse order) and therefore must know how many there are.

However, by not specifying that the number of elements must be known and accessible, the C++ standard allows an implementation to omit it whenever no destructor call is required (that is, when std::is_trivially_destructible<T>::value evaluates to true).
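As a rough illustration of that distinction, the sketch below uses std::is_trivially_destructible to ask whether destroying an array of T requires running any destructors. The types Plain and Tracked and the helper destructors_must_run are hypothetical names made up for this example; whether a given implementation actually stores an element count in each case is implementation-specific.

```cpp
#include <string>
#include <type_traits>

// Hypothetical types, for illustration only.
struct Plain { int x; double y; };     // trivially destructible
struct Tracked { std::string name; };  // has a non-trivial destructor

// True when delete[] has to run a destructor for every element,
// which means the element count must be recoverable at that point.
template <typename T>
constexpr bool destructors_must_run() {
    return !std::is_trivially_destructible<T>::value;
}

static_assert(!destructors_must_run<int>(), "int needs no destructor calls");
static_assert(!destructors_must_run<Plain>(), "Plain needs no destructor calls");
static_assert(destructors_must_run<Tracked>(), "Tracked is destroyed element by element");
```

For int and Plain, an implementation is free to hand the block straight back to the allocator without ever knowing how many elements it held.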

Can the runtime distinguish between a pointer and an array?

In general, no.

When you have a pointer, it could point to anything:

  • a single item, or an item in an array,

  • the first item in an array, or any other,

  • an array on the stack, or an array on the heap,

  • just an array, or an array that is part of a larger object.

This is the reason why delete[] exists, and why using delete here would be incorrect. With delete[], you, the user, state: this pointer points to the first item of a heap-allocated array.

The implementation can then assume that, for example, in the 8 bytes preceding this first item it can find the size of the array. Without you guaranteeing this, those 8 bytes could be anything.

Then, why not go all the way and create for[] (int i : array)?

There are two reasons:

  1. As mentioned, today an implementation can omit the stored size for a number of element types; with this new for[] syntax, that would no longer be possible on a per-type basis.

  2. It's not worth it.

Let us be honest, new[] and delete[] are relics of an older time. They are incredibly awkward:

  • the number of elements has to be known in advance, and cannot be changed,

  • the elements must be default constructible, or otherwise C-ish,

and unsafe to use:

  • the number of elements is inaccessible to the user.

There is generally no reason to use new[] and delete[] in modern C++. Most of the time a std::vector should be preferred; in the few instances where the spare capacity is superfluous, a fixed-size container that still keeps track of its size is better (std::dynarray was proposed for that role, but it never made it into the standard).
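For comparison, here is a minimal sketch of the same kind of loop written against std::vector. The function name fill_and_sum is made up for this example; the point is only that the container carries its own size, so the range-based for loop needs nothing extra.

```cpp
#include <cstddef>
#include <vector>

// Fills a vector with 1..len and returns the sum of its elements.
// `fill_and_sum` is a hypothetical helper name.
int fill_and_sum(std::size_t len) {
    std::vector<int> values(len);  // len elements, all zero-initialized
    int next = 1;
    for (int& v : values) {        // works: vector provides begin() and end()
        v = next++;
    }
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;                  // storage is released automatically
}
```

There is no delete[] to forget and no separate len variable to keep in sync with the allocation.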

Therefore, without a valid reason to keep using these statements, there is no motivation to include new semantic constructs specifically dedicated to handling them.

And should anyone be motivated enough to make such a proposal:

  • the inhibition of the current optimization, a violation of the C++ philosophy of "You don't pay for what you don't use", would likely be held against them,

  • the inclusion of new syntax, when modern C++ proposals have gone to great lengths to avoid it as much as possible (to the point of having a library-defined std::variant), would also likely be held against them.

I recommend that you simply use std::vector.

#2


12  

What is the reason that it's not made to work?

A range based loop like

 for(auto a : y) {
     // ...
 }

is roughly syntactic sugar for the following code

 auto endit = std::end(y);
 for(auto it = std::begin(y); it != endit; ++it) {
     auto a = *it;
     // ...
 }

Since std::begin() and std::end() cannot be used with a plain pointer, this cannot be applied to a pointer obtained from new[].
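To see where the line falls, here is a small sketch (the names sum_range and sum_examples are invented for this example). std::begin and std::end work for a built-in array and for a container, but there is no overload that accepts a bare int*.

```cpp
#include <iterator>
#include <vector>

// Sums any range for which std::begin and std::end are available.
template <typename Range>
int sum_range(const Range& r) {
    int total = 0;
    for (auto it = std::begin(r), last = std::end(r); it != last; ++it) {
        total += *it;
    }
    return total;
}

int sum_examples() {
    int fixed[3] = {1, 2, 3};           // type is int[3]: the size is part of the type
    std::vector<int> dynamic = {4, 5};  // the container stores its own size

    int* raw = new int[2]{6, 7};        // type is int*: no size anywhere in the type
    // sum_range(raw);                  // would not compile: no std::begin/std::end for int*
    int raw_sum = raw[0] + raw[1];      // the length has to be supplied by hand
    delete[] raw;

    return sum_range(fixed) + sum_range(dynamic) + raw_sum;
}
```

The commented-out call is the same situation as the for loop in the question: the type int* gives the compiler nothing to compute an end from.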

As far as I understand, the runtime has the size of the array available (otherwise delete[] could not work)

How delete[] keeps track of the memory block that was allocated with new[] (which isn't necessarily the same size as the user requested) is a completely different matter, and the compiler most probably doesn't even know exactly how this is implemented.

#3


6  

When you have this:

int* array = new int[len];

The problem here is that your variable called array is not an array at all. It is a pointer. That means it only contains the address of one object (in this case the first element of the array created using new).

For a range-based for loop to work, the compiler needs two addresses: the beginning and the end of the array.

So the problem is the compiler does not have enough information to do this:

// array is only a pointer and does not have enough information
for(int i : array)
{
} 
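If you do have to keep the new[] allocation, one possible workaround is to give the compiler the missing end address yourself. The sketch below uses a hypothetical helper struct, IntRange, that pairs the pointer with a length; it is only an illustration, not a recommendation over std::vector.

```cpp
#include <cstddef>

// Hypothetical helper: pairs a pointer with a length so the compiler
// can find both ends of the sequence.
struct IntRange {
    int* first;
    std::size_t count;
    int* begin() const { return first; }
    int* end() const { return first + count; }
};

int sum_dynamic(std::size_t len) {
    int* array = new int[len];
    for (std::size_t i = 0; i < len; ++i) {
        array[i] = static_cast<int>(i) + 1;
    }
    int total = 0;
    for (int v : IntRange{array, len}) {  // the length comes from us, not from the pointer
        total += v;
    }
    delete[] array;
    return total;
}
```

Note that the length still comes from the programmer; nothing checks that it matches the allocation. In C++20, std::span plays a similar role.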

#4


3  

This is not specific to dynamic arrays; it is more general. Of course, for dynamic arrays the size exists somewhere so that destructors can be called (but remember that the standard doesn't say anything about that, just that calling delete [] works as intended).

The problem is with pointers in general: given a pointer, you can't tell whether it corresponds to any kind of... what?

Arrays decay to pointers, but given a pointer, what can you say?
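A short sketch of what is lost in the decay (element_count, sum_n and decay_example are names invented here): a function template can recover the size from a real array because the size is part of its type, but once the array has decayed to a pointer, the length has to be passed alongside it.

```cpp
#include <cstddef>

// Accepts only a real array: N is deduced from the array's type.
template <std::size_t N>
std::size_t element_count(int (&)[N]) {
    return N;
}

// Accepts a pointer: an array argument decays on the way in, and the
// length must travel separately because the type no longer carries it.
int sum_n(const int* data, std::size_t len) {
    int total = 0;
    for (std::size_t i = 0; i < len; ++i) {
        total += data[i];
    }
    return total;
}

int decay_example() {
    int fixed[4] = {1, 2, 3, 4};
    std::size_t n = element_count(fixed);  // 4, known at compile time
    int* p = fixed;                        // decay: p is just the address of fixed[0]
    // element_count(p);                   // would not compile: int* is not an array
    return sum_n(p, n);
}
```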

#5


1  

array is not an array but a pointer, and there's no information about the size of the "array". So the compiler cannot deduce the begin and end of this array.

See how a range-based for loop is expanded:

{
    auto && __range = range_expression;
    for (auto __begin = begin_expr, __end = end_expr;
         __begin != __end; ++__begin) {
        range_declaration = *__begin;
        loop_statement
    }
}

range_expression - any expression that represents a suitable sequence (either an array or an object for which begin and end member functions or free functions are defined, see below) or a braced-init-list.

auto is resolved at compile time, so the types of begin_expr and end_expr are not deduced at runtime.

#6


1  

The reason is that, given only the value of the pointer array, the compiler (and your code) has no information about what it points at. The only thing known is that array has a value which is the address of a single int.

It could point at the first element of a statically allocated array. It could point at an element in the middle of a dynamically allocated array. It could point at a member of a data structure. It could point at an element of an array that is within a data structure. The list goes on.

Your code will make ASSUMPTIONS about what the pointer points at. It may assume it is an array of 50 elements. Your code may access the value of len, and assume array points at the (first element of) an array of len elements. If your code gets it right, all works as intended. If your code gets it wrong (e.g. accessing the 50th element of an array with 5 elements) then the behaviour is simply undefined. It is undefined because the possibilities are endless - the book-keeping to keep track of what an arbitrary pointer ACTUALLY points at (beyond the information that there is an int at that address) would be enormous.

You're starting with the ASSUMPTION that array points at the result from new int[len]. But that information is not stored in the value of array itself, so the compiler has no way to work back to a value of len. That would be needed for your "range based" approach to work.

Yes, given array = new int[len], the machinery invoked by delete [] array will work out that array has len elements and release them. But delete [] array also has undefined behaviour if array results from something other than a new [] expression. Even

  int *array = new int;
  delete [] array;

gives undefined behaviour. The "runtime" is not required to work out, in this case, that array is actually the address of a single dynamically allocated int (and not an actual array). So it is not required to cope with that.
