如何在MPI中实现链表或二叉树的结构?

时间:2021-01-21 07:18:16

In C we define structure for linked list or binary tree like that:

在C中,我们定义链表或二叉树的结构,如下所示:

struct list{
  int val;
  list *next;
};

OR

要么

struct tree_node{
  int val;
  tree_node *left, *right;
};

we can easily assign pointer of next memory location in serial programming. My question is how do I handle pointer in MPI where multiple processor has its local memory? How do I keep track it? How to implement linked list/binary tree in MPI? I know about MPI_Graph. But it is not useful in my scenario.

我们可以在串行编程中轻松分配下一个存储单元的指针。我的问题是如何在多处理器有本地内存的MPI中处理指针?我该如何跟踪它?如何在MPI中实现链表/二叉树?我知道MPI_Graph。但它在我的场景中没用。

I appreciate your answer. Thanks in advance.

我很感激你的回答。提前致谢。

1 个解决方案

#1


2  

I'll discuss a linked list, but all of this applies to a binary tree just as easily with a little extra work.

我将讨论链接列表,但所有这些都适用于二叉树,只需要一些额外的工作即可。

Implementing a linked list in the classical sense isn't exactly possible in MPI because, as you said, each process has its own local memory which won't be consistent on other processes. So that essentially limits using something simple like point to point messaging unless you want to do a lot of work that wouldn't really make sense.

在MPI中实现链接列表在MPI中是不可能的,因为正如您所说,每个进程都有自己的本地内存,这在其他进程中是不一致的。所以这基本上限制了使用像点对点消息传递这样简单的东西,除非你想做很多没有意义的工作。

However, it is possible to do something using one sided communication, or RMA. In fact, there's some example code here. The basic idea of RMA is that each rank exposes a region of memory to the other processes. Then, with the appropriate accessors and synchronization calls, each process can get data from and put data into the other processes memory.

但是,可以使用单面通信或RMA执行某些操作。实际上,这里有一些示例代码。 RMA的基本思想是每个等级将一个内存区域暴露给其他进程。然后,通过适当的访问器和同步调用,每个进程都可以从其他进程内存中获取数据并将数据放入其他进程内存中。

The example uses a dynamic window to allow the application to allocate memory as needed, but it's also possible to statically allocate all your memory up front and point each process to it at the beginning of the application, which might make it a little easier to understand.

该示例使用动态窗口允许应用程序根据需要分配内存,但也可以预先静态分配所有内存并在应用程序开始时将每个进程指向它,这可能使它更容易理解。

Whether or not all of this is efficient or the right thing to do is a different argument. For sufficiently large lists, this can be powerful because you can store more data that you would be able to in a single node's memory. However, for small data structures, the costs of traversing the list become rather high, so it's pretty inefficient to distribute the list and it might be more practical to replicate it on each node.

所有这些都是有效的还是正确的做法是一个不同的论点。对于足够大的列表,这可能很强大,因为您可以在单个节点的内存中存储更多数据。但是,对于小型数据结构,遍历列表的成本相当高,因此分发列表的效率非常低,在每个节点上复制它可能更实际。

#1


2  

I'll discuss a linked list, but all of this applies to a binary tree just as easily with a little extra work.

我将讨论链接列表,但所有这些都适用于二叉树,只需要一些额外的工作即可。

Implementing a linked list in the classical sense isn't exactly possible in MPI because, as you said, each process has its own local memory which won't be consistent on other processes. So that essentially limits using something simple like point to point messaging unless you want to do a lot of work that wouldn't really make sense.

在MPI中实现链接列表在MPI中是不可能的,因为正如您所说,每个进程都有自己的本地内存,这在其他进程中是不一致的。所以这基本上限制了使用像点对点消息传递这样简单的东西,除非你想做很多没有意义的工作。

However, it is possible to do something using one sided communication, or RMA. In fact, there's some example code here. The basic idea of RMA is that each rank exposes a region of memory to the other processes. Then, with the appropriate accessors and synchronization calls, each process can get data from and put data into the other processes memory.

但是,可以使用单面通信或RMA执行某些操作。实际上,这里有一些示例代码。 RMA的基本思想是每个等级将一个内存区域暴露给其他进程。然后,通过适当的访问器和同步调用,每个进程都可以从其他进程内存中获取数据并将数据放入其他进程内存中。

The example uses a dynamic window to allow the application to allocate memory as needed, but it's also possible to statically allocate all your memory up front and point each process to it at the beginning of the application, which might make it a little easier to understand.

该示例使用动态窗口允许应用程序根据需要分配内存,但也可以预先静态分配所有内存并在应用程序开始时将每个进程指向它,这可能使它更容易理解。

Whether or not all of this is efficient or the right thing to do is a different argument. For sufficiently large lists, this can be powerful because you can store more data that you would be able to in a single node's memory. However, for small data structures, the costs of traversing the list become rather high, so it's pretty inefficient to distribute the list and it might be more practical to replicate it on each node.

所有这些都是有效的还是正确的做法是一个不同的论点。对于足够大的列表,这可能很强大,因为您可以在单个节点的内存中存储更多数据。但是,对于小型数据结构,遍历列表的成本相当高,因此分发列表的效率非常低,在每个节点上复制它可能更实际。