I'm trying to gather arrays of different sizes using MPI_Gatherv, but for some reason it only gathers the first object from the first processor. When I run the for loop shown below, I get the correct values from xPos and yPos, but when I gather the data into the xFinal and yFinal arrays and print the values, I only get the first x and y. Basically, the first object has an (x,y) of (0,0), I have 10 objects, and every object prints as (0,0) even though the object it should reference has a different (x,y).
To be clear, counts[rank] and displs are definitely right, because I used them with MPI_Scatterv previously.
Am I using MPI_Gatherv incorrectly? Or am I printing incorrectly?
for (a = 0; a < size; a++) {
    if (rank == a) {
        for (i = 0; i < counts[rank]; i++) {
            printf("from procs %d: %lE %lE\n", rank, xPos[i], yPos[i]);
        }
    }
}
MPI_Gatherv(&xPos, counts[rank], MPI_DOUBLE, &xFinal, counts, displs, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Gatherv(&yPos, counts[rank], MPI_DOUBLE, &yFinal, counts, displs, MPI_DOUBLE, 0, MPI_COMM_WORLD);
MPI_Finalize();
FILE* f = fopen("universe.out", "wt");
for (i = 0; i < N; i++)
    fprintf(f, "%lE %lE\n", xFinal[i], yFinal[i]);
fclose(f);
2 Solutions
#1
2
It seems that you are writing the file from all ranks simultaneously. You should put the file-writing code within if (rank == 0) { ... } so that only rank 0 writes:
if (rank == 0)
{
    FILE* f = fopen("universe.out", "wt");
    for (i = 0; i < N; i++)
        fprintf(f, "%lE %lE\n", xFinal[i], yFinal[i]);
    fclose(f);
}
Otherwise every rank writes to the same file, and since MPI_Gatherv fills xFinal and yFinal only on the root (rank 0 here), the content of the file could be anything.
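The write loop above reads xFinal[0] through xFinal[N-1] on rank 0, so it relies on counts and displs covering exactly N elements. As a rough serial sketch (plain C with no MPI calls; the helper names build_displs and place_block are hypothetical and not part of MPI), the root-side layout can be modeled like this: the block from rank r, counts[r] values long, lands at element offset displs[r] in the receive buffer.

```c
#include <assert.h>

/* Hypothetical helper: fill displs[] with the exclusive prefix sum of
 * counts[] and return the total number of elements (should equal N). */
int build_displs(const int *counts, int nprocs, int *displs)
{
    int total = 0;
    assert(nprocs >= 0);
    for (int r = 0; r < nprocs; r++) {
        assert(counts[r] >= 0);
        displs[r] = total;      /* rank r's block starts where the previous one ended */
        total += counts[r];
    }
    return total;
}

/* Hypothetical serial model of what the root ends up with for one rank:
 * copy rank r's block into recvbuf starting at displs[r]. */
void place_block(double *recvbuf, const double *block, int r,
                 const int *counts, const int *displs)
{
    for (int i = 0; i < counts[r]; i++)
        recvbuf[displs[r] + i] = block[i];
}
```

For example, with 10 objects split as counts = {4, 3, 3}, build_displs yields displs = {0, 4, 7} and returns 10. If the returned total differs from N, the loop over N would read entries that were never gathered.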
#2
0
In my usual role as an MPI-IO advocate: please consider MPI-IO for this kind of problem. You may be able to skip the gather entirely by having each process write to the file. Furthermore, if N is large, you do not want N file operations. Let MPI (either directly or through a library) make your life easier.
First off, is a text-based output format really what you (and your collaborators) want? If universe.out gets large enough that you want to read it in parallel, you are going to have a challenge decomposing the file across processors. Consider Parallel HDF5 (phdf5), Parallel-NetCDF (pnetcdf), or any other of the higher-level libraries with a self-describing, portable file format.
Here's an example of how you would write out all the x values then all the y values.
#include <stdio.h>
#include <mpi.h>
#include <unistd.h> /* getpid */
#include <stdlib.h> /* srandom, random, malloc, free */

#define MAX_PARTS 10

int main(int argc, char **argv) {
    int rank, nprocs;
    MPI_Offset nparts; /* more than 2 billion particles possible */
    int i;
    double *x, *y;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Offset start=0, x_end;
    MPI_Info info;
    MPI_File fh;
    MPI_Status status;

    srandom(getpid());
    /* demonstrate this works even when particles are not evenly
     * distributed over the processes */
    nparts=((double)random()/RAND_MAX)*MAX_PARTS;
    x = malloc(nparts*sizeof(*x));
    y = malloc(nparts*sizeof(*y));
    for (i = 0; i< nparts; i++) {
        /* just some bogus data to see what happens */
        x[i] = rank*100+i;
        y[i] = rank*200+i;
    }

    /* not using this now. might tune later if needed */
    MPI_Info_create(&info);
    MPI_File_open(MPI_COMM_WORLD, "universe.out",
                  MPI_MODE_CREATE|MPI_MODE_WRONLY, info, &fh);
    /* etype is MPI_DOUBLE, so the offsets below count doubles, not bytes */
    MPI_File_set_view(fh, 0, MPI_DOUBLE, MPI_DOUBLE, "native", info);

    /* MPI_Scan is an inclusive prefix reduction: afterwards start holds
     * the sum of nparts over ranks 0..rank */
    MPI_Scan(&nparts, &start, 1, MPI_OFFSET, MPI_SUM, MPI_COMM_WORLD);
    x_end = start; /* only the last rank's value (the total) is used in the bcast below */
    start -= nparts; /* remove our own contribution */
    MPI_Bcast(&x_end, 1, MPI_OFFSET, nprocs-1, MPI_COMM_WORLD);

    /* all x values first, then all y values starting at x_end */
    MPI_File_write_at_all(fh, start, x, nparts, MPI_DOUBLE, &status);
    MPI_File_write_at_all(fh, start+x_end, y, nparts, MPI_DOUBLE, &status);

    MPI_Info_free(&info);
    MPI_File_close(&fh);
    free(x);
    free(y);
    MPI_Finalize();
    return 0;
}
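The offset arithmetic in the program above can be checked without MPI. The following is a minimal serial sketch, assuming the per-rank particle counts are available in one array; the function name block_offsets and its parameters are hypothetical. It reproduces what the MPI_Scan, the subtraction, and the MPI_Bcast compute together.

```c
#include <assert.h>

/* Hypothetical serial model of the offset computation.
 * nparts[r] : number of particles on rank r
 * start[r]  : offset, in doubles, at which rank r writes its x values
 * returns   : x_end, the total particle count, which is where the y
 *             values begin in the file */
long long block_offsets(const long long *nparts, int nprocs, long long *start)
{
    long long inclusive = 0;   /* the MPI_SUM scan result on rank r */
    assert(nprocs > 0);
    for (int r = 0; r < nprocs; r++) {
        inclusive += nparts[r];
        start[r] = inclusive - nparts[r];   /* remove rank r's own contribution */
    }
    return inclusive;          /* the last rank's value, broadcast as x_end */
}
```

With counts of 3, 0 and 5 particles, the starts come out as 0, 3 and 3, and x_end is 8. Rank 2 then writes its x values at offsets 3 through 7 and its y values at offsets 11 through 15. A rank with zero particles simply contributes nothing.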
Writing out the x array and y array as pairs of (x,y) values is possible, but a bit more complicated (you would have to make an MPI datatype).
Parallel-NetCDF has an "operation combining" optimization that does this for you. Parallel-HDF5 has a "multi-dataset i/o" optimization in the works for the next release. With those optimizations you could define a 3d array, with one dimension for x and y and the third for "particle identifier". Then you could post the operation for the x values, the operation for the y values, and let the library stitch all that together into one call.
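For the (x,y)-pair layout mentioned above, one simpler alternative to a derived MPI datatype is to pack the pairs into a temporary buffer before writing. This is a different technique from the datatype approach and costs an extra copy. A minimal sketch in plain C follows; the names interleave_xy and pair_offset are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pack separate x[] and y[] arrays into xy[] as
 * x0 y0 x1 y1 ...; xy must have room for 2*n doubles. */
void interleave_xy(const double *x, const double *y, size_t n, double *xy)
{
    for (size_t i = 0; i < n; i++) {
        xy[2 * i]     = x[i];
        xy[2 * i + 1] = y[i];
    }
}

/* Hypothetical helper: offset, in doubles, at which a rank whose first
 * particle has global index `start` writes its packed block. */
long long pair_offset(long long start)
{
    assert(start >= 0);
    return 2 * start;   /* every particle occupies two doubles */
}
```

Assuming the same file view as in the program above (etype MPI_DOUBLE), each rank could then issue a single MPI_File_write_at_all of 2*nparts doubles at offset pair_offset(start), and the x_end broadcast would no longer be needed.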