Linux 4.4 PCIe DMA into user-space pages not working - can highmem not be used for DMA?

Posted: 2022-01-08 16:30:04

I am updating an older linux driver that transfers data via DMA to userspace pages which are passed down from the application via get_user_pages().

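For reference, a minimal sketch of how that pinning step typically looks on a 4.4-era kernel, here using get_user_pages_fast() (a close relative of get_user_pages() with a simpler signature); the function name vme_pin_user_buf and its error handling are made up for illustration, not the actual driver code:

    /* Hedged sketch (~4.4 API): pin user pages so the device can DMA into them. */
    #include <linux/kernel.h>
    #include <linux/mm.h>
    #include <linux/slab.h>

    static int vme_pin_user_buf(unsigned long uaddr, size_t len,
                                struct page ***pages_out, int *npages_out)
    {
        int npages = DIV_ROUND_UP(offset_in_page(uaddr) + len, PAGE_SIZE);
        struct page **pages;
        int pinned, i;

        pages = kcalloc(npages, sizeof(*pages), GFP_KERNEL);
        if (!pages)
            return -ENOMEM;

        /* write = 1: the device will write into these pages */
        pinned = get_user_pages_fast(uaddr, npages, 1, pages);
        if (pinned < npages) {
            for (i = 0; i < pinned; i++)
                put_page(pages[i]);
            kfree(pages);
            return -EFAULT;
        }

        *pages_out = pages;
        *npages_out = npages;
        return 0;
    }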

My hardware is a new x86 Xeon based board with 12GB of RAM.


The driver gets data from a VME-to-PCIe FPGA and is supposed to write it into main memory. I do a dma_map_page() for each page, check it with dma_mapping_error(), and write the returned DMA address into the buffer descriptors of the DMA controller. Then I kick off the DMA. (We can also see the transfer start in the FPGA tracer.)

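A rough sketch of that per-page mapping step, assuming one buffer descriptor per page; vme_map_one_page and the u32 descriptor field are hypothetical, not the driver's real structures:

    #include <linux/dma-mapping.h>
    #include <linux/kernel.h>

    /* Hedged sketch: map one pinned page for device-to-memory DMA and
     * hand the bus address to a (hypothetical) 32-bit buffer descriptor. */
    static int vme_map_one_page(struct device *dev, struct page *page,
                                unsigned long offset, size_t len,
                                u32 *bd_dma_addr)
    {
        dma_addr_t dma = dma_map_page(dev, page, offset, len, DMA_FROM_DEVICE);

        if (dma_mapping_error(dev, dma))
            return -EIO;

        /* The FPGA core is 32 bit wide, so the descriptor only holds the
         * low 32 bits; with a 32-bit DMA mask the mapping should stay
         * below 4 GB anyway. */
        *bd_dma_addr = lower_32_bits(dma);
        return 0;
    }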

However, when I get the DMA-finished IRQ I see no data. As a cross-check, I have the same VME address space accessible via PIO mode, and that works. I also tried writing values to page_address(page) of the user pages, and the application can see them. All OK.


Digging deeper into the matter I checked the usual documentation like DMA-API.txt, but I could not find any other approach, nor did I find one in other drivers.


My kernel is a self-compiled 4.4.59 64-bit build with all kinds of debug options (DMA-API debugging etc.) set to yes.


I also tried to dig through drivers/iommu/ to look for debug possibilities, but there are only a few pr_debug()s there.


The interesting thing: I have another driver, an ethernet driver, which supports a NIC connected to PCI. This one works without problems!


When dumping and comparing the retrieved dma_addr_t values, I see this:


The NIC driver allocates memory for buffer descriptors etc. via dma_alloc_coherent(); its addresses are in the "lower 4 GB":


 [ 3127.800567] dma_alloc_coherent: memVirtDma = ffff88006eeab000, memPhysDma = 000000006eeab000
 [ 3127.801041] dma_alloc_coherent: memVirtDma = ffff880035d9b000, memPhysDma = 0000000035d9b000
 [ 3127.801373] dma_alloc_coherent: memVirtDma = ffff88006ecd4000, memPhysDma = 000000006ecd4000
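(For comparison, the kind of code that would produce those lines is presumably something like the following sketch; BD_RING_BYTES, nic_alloc_bd_ring and the exact print format are guesses, not the NIC driver's actual source:)

    #include <linux/pci.h>
    #include <linux/dma-mapping.h>

    #define BD_RING_BYTES 4096   /* hypothetical descriptor ring size */

    static void *nic_alloc_bd_ring(struct pci_dev *pdev, dma_addr_t *phys)
    {
        void *virt = dma_alloc_coherent(&pdev->dev, BD_RING_BYTES,
                                        phys, GFP_KERNEL);

        if (virt)
            dev_info(&pdev->dev,
                     "dma_alloc_coherent: memVirtDma = %p, memPhysDma = %pad\n",
                     virt, phys);
        return virt;
    }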

In the VME driver, which dma_map_page()s the user-space pages, the pages themselves are above 4 GB and the returned DMA address looks different: 0xffffe010 (with an offset from the application).


pageAddr=ffff88026b4b1000 off=10 dmaAddr=00000000ffffe010 length=100

DMA_BIT_MASK(32) is set in both drivers; our FPGA cores are 32 bit wide.

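(Both drivers set the mask roughly like this; a sketch, with dev being the PCI device's struct device:)

    #include <linux/dma-mapping.h>

    /* Sketch: restrict the device to 32-bit bus addresses, as both drivers
     * do; dma_set_mask_and_coherent() is available on 4.4. */
    static int set_32bit_dma_mask(struct device *dev)
    {
        return dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
    }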

Question: are there special prerequisites for this DMA to work? I read that highmem memory cannot be used for DMA; is this still the case?


Part of dmesg:


[    0.539839] debug: unmapping init [mem 0xffff880037576000-0xffff880037ab2fff]
[    0.549502] DMA-API: preallocated 65536 debug entries
[    0.549509] DMA-API: debugging enabled by kernel config
[    0.549545] DMAR: Host address width 46
[    0.549550] DMAR: DRHD base: 0x000000fbffc000 flags: 0x1
[    0.549573] DMAR: dmar0: reg_base_addr fbffc000 ver 1:0 cap     8d2078c106f0466 ecap f020df
[    0.549580] DMAR: RMRR base: 0x0000007bc14000 end: 0x0000007bc23fff
[    0.549585] DMAR: ATSR flags: 0x0
[    0.549590] DMAR: RHSA base: 0x000000fbffc000 proximity domain: 0x0
[    0.549779] DMAR: dmar0: Using Queued invalidation
[    0.549784] DMAR: dmar0: Number of Domains supported <65536>
[    0.549796] DMAR: Setting RMRR:
[    0.549809] DMAR: Set context mapping for 00:14.0
[    0.549812] DMAR: Setting identity map for device 0000:00:14.0     [0x7bc14000 - 0x7bc23fff]
[    0.549820] DMAR: Mapping reserved region 7bc14000-7bc23fff
[    0.549829] DMAR: Set context mapping for 00:1d.0
[    0.549831] DMAR: Setting identity map for device 0000:00:1d.0     [0x7bc14000 - 0x7bc23fff]
[    0.549838] DMAR: Mapping reserved region 7bc14000-7bc23fff
[    0.549845] DMAR: Prepare 0-16MiB unity mapping for LPC
[    0.549853] DMAR: Set context mapping for 00:1f.0
[    0.549855] DMAR: Setting identity map for device 0000:00:1f.0 [0x0 -     0xffffff]
[    0.549861] DMAR: Mapping reserved region 0-ffffff
[    0.549892] DMAR: Intel(R) Virtualization Technology for Directed I/O
...
[    0.551725] iommu: Adding device 0000:00:00.0 to group 10
[    0.551753] iommu: Adding device 0000:00:01.0 to group 11
[    0.551780] iommu: Adding device 0000:00:01.1 to group 12
[    0.551806] iommu: Adding device 0000:00:02.0 to group 13
[    0.551833] iommu: Adding device 0000:00:02.2 to group 14
[    0.551860] iommu: Adding device 0000:00:03.0 to group 15
[    0.551886] iommu: Adding device 0000:00:03.2 to group 16
[    0.551962] iommu: Adding device 0000:00:05.0 to group 17
[    0.551995] iommu: Adding device 0000:00:05.1 to group 17
[    0.552027] iommu: Adding device 0000:00:05.2 to group 17
[    0.552059] iommu: Adding device 0000:00:05.4 to group 17
[    0.552083] iommu: Adding device 0000:00:14.0 to group 18
[    0.552134] iommu: Adding device 0000:00:16.0 to group 19
[    0.552166] iommu: Adding device 0000:00:16.1 to group 19
[    0.552191] iommu: Adding device 0000:00:19.0 to group 20
[    0.552216] iommu: Adding device 0000:00:1d.0 to group 21
[    0.552272] iommu: Adding device 0000:00:1f.0 to group 22
[    0.552305] iommu: Adding device 0000:00:1f.3 to group 22
[    0.552332] iommu: Adding device 0000:01:00.0 to group 23
[    0.552360] iommu: Adding device 0000:03:00.0 to group 24
[    0.552437] iommu: Adding device 0000:04:00.0 to group 25
[    0.552473] iommu: Adding device 0000:04:00.1 to group 25
[    0.552510] iommu: Adding device 0000:04:00.2 to group 25
[    0.552546] iommu: Adding device 0000:04:00.3 to group 25
[    0.552575] iommu: Adding device 0000:05:00.0 to group 26
[    0.552605] iommu: Adding device 0000:05:00.1 to group 27

1 Answer

#1



For completeness, here is the answer; we found it. The cause was something totally different: a PCIe protocol bug in the FPGA PCIe core...

