Confusing two-dimensional array construction in C++

I am working on a lab for a class I'm taking and I'm reading the code provided by the teacher... I'm wondering if anyone here can make sense of this array construction (which is completely unrelated to the rest of the program, so no, you're not doing my homework for me :)

float **b = new float*[n]; //create local matrix
b[0] = new float[n*nCols];
for (int j = 1; j < n; j++) b[j] = b[j-1] + n;

To me, it looks like this will create a bunch of garbage based on whatever was in memory before b was created... There is no initialization of b[0][0] so it seems like this will generate unpredictable output.

Is that correct?

Also, as far as I can tell, this is the "shape" of the b array of array pointers:

b= {
     *{ ?, ?, ?, ?, ?, ?, ?, .... nCols unknown floats },
     *{ ?, ?, ?, ?, ?, ?, ?, .... nCols unknown floats },
     *{ ?, ?, ?, ?, ?, ?, ?, .... nCols unknown floats },
     *{ ?, ?, ?, ?, ?, ?, ?, .... nCols unknown floats },
     *{ ?, ?, ?, ?, ?, ?, ?, .... nCols unknown floats },
     ...
     n
   }

So b is an array of size n pointers to arrays of size nCols

Is this correct?

Doesn't this code seem a little more confusing than it should be, and probably incorrect (since it's using default-initialized garbage values)?

I'm new to C++ and I just want to make sure I'm understanding what I'm reading here. The teacher is used to Fortran and we're learning about parallel systems, so this isn't really related to class.

Thanks.

EDIT (update)

user2079303 pointed out that this line: for (int j = 1; j < n; j++) b[j] = b[j-1] + n; (which should be: for (int j = 1; j < n; j++) b[j] = b[j-1] + nCols; ) sets up each b[x], where x > 0, as a pointer to an element of the n*nCols matrix (stored at address b[0]) by doing pointer arithmetic with b[j-1]...

Now it makes sense to me why this array contains nothing but garbage. All this construction is doing is creating a bunch of pointers to a space in memory that will be used later.

In fact, as user2079303 surmised, the data is not read through these pointers until it has been set by other processes: MPI_Scatter(a[0],n*nCols,MPI_FLOAT,b[0],n*nCols,MPI_FLOAT,0,MPI_COMM_WORLD);

In that function the pointer to b[0] is a receive buffer! So the garbage is overwritten, not ever used, but the pointers are useful for later reading after the buffer is overwritten.

Thanks everyone.

4 Answers

#1

To me, it looks like this will create a bunch of garbage based on whatever was in memory before b was created...

Correct, the value of the floats in the array are not initialized.

There is no initialization of b[0][0]

None of the b[0][y] for all y in [0,n*nCols) are initialized.

so it seems like this will generate unpredictable output.

Only if you don't initialize those values before you use them. There is no output at all in the code that you show. In fact, the values are not even read in the shown code.

Well, there appears to be a typo in the original code:

b[j] = b[j-1] + n;

Should be

b[j] = b[j-1] + nCols;

The code is correct assuming that bug is fixed. Otherwise, if n is greater than nCols, there will be a buffer overflow.

So b is an array of size n pointers to arrays of size nCols

Not quite. b[0] points to an array that contains all n*nCols values. The remaining b[x], for x in [1,n), point to different locations within that same array. But indeed, you can use the pointers as if they pointed to arrays of size nCols.

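To make the overflow concrete, here is a minimal sketch that only does the index arithmetic and allocates nothing. The helper names reach and fits are hypothetical, and the sketch assumes each row is read for nCols elements through its row pointer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: one past the highest float index reachable through
// row pointers built as b[j] = b[j-1] + stride, assuming each row is read
// for nCols elements. Pure arithmetic; no memory is touched.
std::size_t reach(std::size_t n, std::size_t nCols, std::size_t stride) {
    assert(n > 0);  // the construction needs at least one row
    return (n - 1) * stride + nCols;
}

// True when every row stays inside the single n * nCols block.
bool fits(std::size_t n, std::size_t nCols, std::size_t stride) {
    return reach(n, nCols, stride) <= n * nCols;
}
```

With n = 3 and nCols = 2, fits(3, 2, 2) holds for the corrected stride, while fits(3, 2, 3) fails because the last row would end at index 8 of a 6-float block. When n is smaller than nCols the original stride does not overflow, but the rows overlap each other instead.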

#2

The code is both confusing, and probably (it's hard to tell the intent) incorrect.

// we create an array of float*
float **b = new float*[n];

// assume n is 3
//    b[0] is a float*, value not explicitly set
//    b[1] is a float*, value not explicitly set
//    b[2] is a float*, value not explicitly set

// b[0] then creates an array of float, assume nCols is 2
b[0] = new float[n*nCols]; 

//    b[0] is a float*, pointing to an array of 6 (3*2) floats
//    b[1] is a float*, value not explicitly set
//    b[2] is a float*, value not explicitly set

// we then iterate the rest and assign values (apparently incorrectly)
for (int j = 1; j < n; j++)
{
    // b[1] = b[1-1] + 3
    b[j] = b[j-1] + n;
}

The canonical way to do this in C++ is to use a std::vector.

std::vector<std::vector<float>> matrix;

matrix.resize(n);  // resize to the number of rows

for(auto& row : matrix)
{
    row.resize(nCols);  // resize each row to the number of columns
                        // std::vector::resize value-initializes
                        // the floats created here (each is 0.0f)
}

Now you can access it as you expect:

matrix[2][1] = 6;

In this case, the memory layout is non-contiguous, which can be costly for large matrices, because each row may have to be fetched from memory that is not in cache. It basically looks like this. The "addresses" are just there to indicate that the row vectors themselves are contiguous, and the values within each row are contiguous, but the structure as a whole is NOT.

matrix[0] - some address, assume 0
   row[0] - some address, assume 64
   row[1] - some address, assume 72
   row[2] - some address, assume 80
matrix[1] - some address, assume 8
   row[0] - some address, assume 128
   ...
matrix[2] - some address, assume 16
   row[0] - some address, assume 256

For a true matrix (where both the rows and columns are fixed), you can use a single vector and use multiplication to get/set the appropriate value.

A 2*3 matrix will have 6 elements.

  std::vector<float> matrix(6);  // create a vector with 6 floats

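  const size_t nCols = 3;        // columns per row of the 2*3 matrix
  size_t row = 1, col = 2;
  matrix[row * nCols + col] = 3; // element (1, 2) lives at flat index 5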

In this case, all of the data is kept in contiguous memory.

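The flat-vector idea can be wrapped so the index calculation lives in one place. This is only a sketch: FlatMatrix and its at member are hypothetical names, and row-major order (row * cols + col) is an assumption that matches the row-pointer layout in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper around a single contiguous vector, row-major order.
struct FlatMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<float> data;  // rows * cols floats, value-initialized to 0.0f

    FlatMatrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c) {}

    // Element (r, c) lives at flat index r * cols + c.
    float& at(std::size_t r, std::size_t c) {
        assert(r < rows && c < cols);
        return data[r * cols + c];
    }
};
```

For a 2*3 matrix, at(1, 2) refers to data[5], the last element. data.data() yields one contiguous buffer, which is the property that a call like MPI_Scatter relies on.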

#3

It's difficult to try to predict the intention of the teacher just from this piece of code. Here are some thoughts about arrays that may help you in your understanding.

Dynamic arrays are indistinguishable from pointers. A pointer points to a memory address. A dynamic array is a pointer to the beginning of the array, and an associated length (which you have to know and keep somewhere).

In your provided code sample, where you have:

float **b = new float*[n];

You are defining a new array of memory. The size of the array in bytes, is the size of float* (a pointer to a float) times n, and the address of that array is stored in b, which can hold addresses to float pointers. None of the positions of the new array are initialized yet, and contain garbage as you mentioned.

Afterwards where you have:

b[0] = new float[n*nCols];

The thing happening there is that you are requesting a new array of memory. The size of this new array in bytes, is the size of a float times n*nCols, and the address of that array is stored in b[0]. None of the positions of this new array are initialized yet.

Finally, the rest of the code provided initializes the rest of positions of b, using the values of previously initialized positions. So b[1] is initialized to something that depends on b[0], b[2] is initialized to something that depends on b[1], and so on.

for (int j = 1; j < n; j++) b[j] = b[j-1] + n;

Now let's drill down into that a little. b[j-1] + n adds the integer value n to the expression b[j-1], which is of type float*. When you add an integer to a pointer, you get a new address equal to the original address plus the integer value times the size of the pointed-to type (here, sizeof(float)), not the size of the pointer itself. This is known as pointer arithmetic.

b[1] = b[0] + n

This makes b[1] point at a shifted part of the array in b[0]. So &b[1][X] == &b[0][n+X] holds for any in-range X, because b[1] holds the same address as b[0] + n.

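The two claims about pointer arithmetic can be checked with a small sketch. The function names byte_step and row_aliases_block are hypothetical, and the second function uses the corrected stride nCols rather than n; the aliasing argument is the same for either stride.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of bytes between p and p + k for a float pointer p.
std::size_t byte_step(std::size_t k) {
    std::vector<float> block(k + 1);
    float* p = block.data();
    float* q = p + k;  // still inside the block
    return static_cast<std::size_t>(reinterpret_cast<char*>(q) -
                                    reinterpret_cast<char*>(p));
}

// Checks that a row pointer set to b0 + nCols aliases b0[nCols + x].
bool row_aliases_block(std::size_t nCols, std::size_t x) {
    assert(x < nCols);
    std::vector<float> block(2 * nCols);  // two rows of nCols floats
    float* b0 = block.data();
    float* b1 = b0 + nCols;               // start of the second row
    b1[x] = 7.0f;                         // write through the row pointer
    return &b1[x] == &b0[nCols + x] && b0[nCols + x] == 7.0f;
}
```

byte_step(k) equals k * sizeof(float), which shows that the step is scaled by the element size and not by the size of a pointer.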

I don't know if the code provided by your professor is correct. We would need the whole code and the description of the problem being solved. The fragment of the code you provided only created 2 arrays, and initialized just 1 of them. The second (the big array of floats) remains un-initialized with this code.

Hope this helps you in your understanding.

#4

The code is fine (apart from the + n stride, which should be + nCols), but yes, it is a little confusing. Instead of making an allocation call for every subarray, your teacher allocated the memory once and distributed it across the different rows.

The shape of b that you drew is fine, but yes, the allocated memory contains garbage. To initialize it with zeros you can do the following:

float **b = new float*[n]; //create local matrix
b[0] = new float[n*nCols] {0.0f}; // this line changed
for (int j = 1; j < n; j++) b[j] = b[j-1] + nCols; // stride is nCols, not n

Or you can use the following to value-initialize the array (which means zero for float):

float **b = new float*[n]; //create local matrix
b[0] = new float[n*nCols](); // this line changed
for (int j = 1; j < n; j++) b[j] = b[j-1] + nCols; // stride is nCols, not n

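As a rough check of the value-initialized variant, the sketch below builds the matrix, reads every element through the row pointers, and then releases the memory. The function name zeroed_and_freed is hypothetical. Because there are exactly two new[] expressions, the sketch uses exactly two delete[] calls: one for the float block and one for the pointer array.

```cpp
#include <cassert>
#include <cstddef>

// Builds an n x nCols matrix in one zeroed block, verifies the zeros
// through the row pointers, and frees both allocations.
bool zeroed_and_freed(std::size_t n, std::size_t nCols) {
    assert(n > 0 && nCols > 0);
    float** b = new float*[n];
    b[0] = new float[n * nCols]();  // () value-initializes: every float is 0.0f
    for (std::size_t j = 1; j < n; ++j) b[j] = b[j - 1] + nCols;

    bool all_zero = true;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t c = 0; c < nCols; ++c)
            if (b[i][c] != 0.0f) all_zero = false;

    delete[] b[0];  // the single block holding all floats
    delete[] b;     // the array of row pointers
    return all_zero;
}
```

Calling delete[] on b[1] or any later row pointer would be an error, since those pointers were never returned by new[].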