InfiniBand addressing - hostnames to IB addresses without IPoIB

Time: 2022-11-24 22:42:54

I've just started getting familiar with InfiniBand, and I want to understand the methods you can use to address InfiniBand nodes.

Based on the example code from "RDMA read and write with IB verbs", I can address individual nodes by IP or hostname using IPoIB.

Another way is to use a port GUID address directly, but it looks like you'd have to look those up, and it is more similar to Ethernet MAC addressing.

Then there is something called an LID address, a 16-bit local address assigned by the fabric manager. How do I use and determine an LID address at runtime? For example, I run ibaddr and get: GID fe80::1a:4bff:ff0c:34e5 LID start 0x6 end 0x6

Basically, if you're not using IPoIB how do you convert host names to addresses or similar? Is there a hosts file or some equivalent?

1 Answer

#1


There is a basic difference between the various addressing methods that you are listing:

  1. Addressing with pure IB verbs
  2. Addressing with some level of abstraction

When a packet is "injected" into the IB fabric, it is routed by LID only; the LID is part of the packet's Local Routing Header. LID stands for Local ID: it is 16 bits wide and is assigned by the subnet manager, OpenSM for example (there's also the case of the GID and the Global Routing Header, but let's leave that aside - it won't make the explanation any easier, and you obviously don't need it at this point).

This means that if you're writing your application using pure IB verbs, you will need to address the endpoints by LID. You can obtain the LID of a local port with ibv_query_port() - it is the lid field of the struct ibv_port_attr that the call fills in.
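
For a quick look at the local LID from a script, without going through the verbs API, something like the following Python sketch might do. It is not the ibv_query_port() route described above: it parses an ibaddr output line in the format quoted in the question, and alternatively reads the lid file that Linux exposes under /sys/class/infiniband. The function names and the device name mlx5_0 are made up for illustration, and the sysfs layout is an assumption about a typical Linux RDMA stack.

```python
import re
from pathlib import Path
from typing import Tuple

# Matches a line such as:
#   GID fe80::1a:4bff:ff0c:34e5 LID start 0x6 end 0x6
_IBADDR_RE = re.compile(
    r"GID\s+(?P<gid>[0-9a-fA-F:]+)\s+"
    r"LID start\s+(?P<start>0x[0-9a-fA-F]+)\s+"
    r"end\s+(?P<end>0x[0-9a-fA-F]+)"
)


def parse_ibaddr(line: str) -> Tuple[str, int, int]:
    """Return (gid, lid_start, lid_end) parsed from one ibaddr output line."""
    match = _IBADDR_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognised ibaddr output: {line!r}")
    return match["gid"], int(match["start"], 16), int(match["end"], 16)


def read_sysfs_lid(
    device: str,
    port: int = 1,
    root: Path = Path("/sys/class/infiniband"),
) -> int:
    """Read the LID of a local port from sysfs (assumed layout, hex text)."""
    text = (root / device / "ports" / str(port) / "lid").read_text()
    return int(text.strip(), 16)


if __name__ == "__main__":
    sample = "GID fe80::1a:4bff:ff0c:34e5 LID start 0x6 end 0x6"
    print(parse_ibaddr(sample))
    # On a host with an HCA you could try, with your own device name:
    # print(hex(read_sysfs_lid("mlx5_0", port=1)))
```

Either way you only learn the LID of a local port; the remote side's LID still has to reach you by some other channel.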

But you don't have to do all the dirty work yourself - you can use abstraction libraries such as librdmacm (RDMA Connection Manager) to create the connection between endpoints (and by "endpoints" I mean RC QPs), and then use pure verbs to actually send/receive your data.

Basically, if you're not using IPoIB how do you convert host names to addresses or similar? Is there a hosts file or some equivalent?

You can't, and there isn't :( If you go through the earlier post on that blog that you linked to, you'll see that you need to:

  • Determine the queue pair’s address.
  • Communicate the address to the other node (through some out-of-band mechanism).

The key item here is "out-of-band". For instance, MPI exchanges all these addresses over SSH (which, BTW, can also run on top of IPoIB), and once this info is exchanged and all the QPs are connected, data starts flowing via these RC QPs.
