Disabling Command Completion Coalescing (CCC) in SATA AHCI

I'm working with version 2.6.35.9 of the Linux kernel and am trying to disable Command Completion Coalescing (CCC).

The output of lspci is as shown below:

00:00.0 Host bridge: Intel Corporation 82P965/G965 Memory Controller Hub (rev 02)
00:01.0 PCI bridge: Intel Corporation 82P965/G965 PCI Express Root Port (rev 02)
00:19.0 Ethernet controller: Intel Corporation 82566DC Gigabit Network Connection (rev 02)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #5 (rev 02)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #2 (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 5 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev f2)
00:1f.0 ISA bridge: Intel Corporation 82801HH (ICH8DH) LPC Interface Controller (rev 02)
00:1f.2 RAID bus controller: Intel Corporation 82801 SATA RAID Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation G72 [GeForce 7300 LE] (rev a1)
04:03.0 Mass storage controller: Promise Technology, Inc. PDC20268 (Ultra100 TX2) (rev 02)

I have Native Command Queuing enabled on my drives.

I was looking at the Serial ATA AHCI 1.3 Specification and found the following on page 115:

The CCC feature is only in use when CCC_CTL.EN is set to ‘1’. If CCC_CTL.EN is set to ‘0’, no CCC interrupts shall be generated.

Next, I had a look at the relevant code (namely, the files concerning AHCI) for this version of the kernel but wasn't able to make any progress. I found the enum value HOST_CAP_CCC = (1 << 7) in drivers/ata/ahci.h, but I'm not sure how this should be modified to disable command coalescing.

Can someone please assist me in identifying how CCC can be disabled? Thank you!

In response to gby's comment:

I conducted an experiment where I issued requests of size 64KB from my driver code. 64KB corresponds to 128 sectors (each sector = 512 bytes).

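The sector arithmetic above can be sketched in C as follows. This is only an illustration: the helper names are hypothetical, and it assumes the requests are issued back to back starting at sector 0, which is what the sector numbers in the measurements (127, 255, 383, ...) suggest.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE_BYTES 512u

/* Number of 512-byte sectors covered by one request (hypothetical helper). */
static uint64_t sectors_per_request(uint64_t request_bytes)
{
    /* The sketch only handles requests that are a whole number of sectors. */
    assert(request_bytes % SECTOR_SIZE_BYTES == 0);
    return request_bytes / SECTOR_SIZE_BYTES;
}

/*
 * Last sector written by the k-th request (k starts at 0), assuming the
 * requests are contiguous and the first one starts at sector 0.
 */
static uint64_t last_sector_of_request(uint64_t k, uint64_t request_bytes)
{
    return (k + 1) * sectors_per_request(request_bytes) - 1;
}
```

With 64KB requests, sectors_per_request() gives 128, so consecutive requests end at sectors 127, 255, 383, and so on.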

When I look at the response timestamp differences, here is what I find:

Timestamp at | Timestamp at | Difference
sector       | sector       | (microseconds)
255          | 127          | 510
383          | 255          | 3068
511          | 383          | 22
639          | 511          | 22
767          | 639          | 12
895          | 767          | 19
1023         | 895          | 13
1151         | 1023         | 402

As you can see, the response timestamp differences suggest that several write completions are being batched and reported with a single interrupt, which would explain the very small differences of only tens of microseconds.

Also, when conducting this experiment, the on-disk write cache was disabled using hdparm.

Clearly, there is some interrupt batching involved here which I need to disable so that an interrupt is raised for each and every write request.

UPDATE: Here is another experiment that I tried.

I create a bio structure in my driver and call the __make_request() function of the lower-level driver. Only one 2560-byte write request is sent from my driver.

Once this write is serviced, an interrupt is generated and handled via do_IRQ(). Eventually, the function blk_complete_request() is called. Keep in mind that we are still in the top half of the interrupt handler (i.e., interrupt context, not process context). Now, we compose another struct bio in blk_complete_request() and call the __make_request() function of the lower-level driver. We record a timestamp at this point (say T_0). When the request completion callback is invoked, we record another timestamp (call it T_1). The difference T_1 - T_0 is always above 1 millisecond. This experiment was repeated numerous times, and each time the destination sector affected the difference T_1 - T_0. We observed that if the destination sectors are separated by approximately 350 sectors, the time difference is about 1.2 milliseconds for requests of size 2560 bytes.

Every time, the next write request is sent only when the previous request has been serviced. So, all these requests are chained and the disk has to service only one request at a time.

My understanding is that, since the destination sectors of consecutive requests are separated by a fairly large gap, by the time the next request is issued the requested sector should be almost under the disk head. The write should therefore happen almost immediately, and T_1 - T_0 should be small (less than 1 millisecond).

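To sanity-check that reasoning, the rotational timing can be estimated with a short C sketch. Everything here is an assumption for illustration: the function names are hypothetical, the spindle speed must be supplied by the caller (7200 RPM is the commonly listed figure for this drive family, but it is not stated in the question), and the number of sectors per track is unknown and varies across the platter.

```c
#include <assert.h>

/* Time for one full platter revolution, in microseconds. */
static double revolution_time_us(double rpm)
{
    assert(rpm > 0.0);
    return 60.0 * 1e6 / rpm;
}

/*
 * Time for `sectors` sectors to pass under the head on a track that
 * holds `sectors_per_track` sectors (an assumed, drive-specific value).
 */
static double rotate_past_us(double sectors, double sectors_per_track,
                             double rpm)
{
    assert(sectors_per_track > 0.0);
    return revolution_time_us(rpm) * sectors / sectors_per_track;
}
```

Under the 7200 RPM assumption, one revolution takes roughly 8.3 milliseconds, so whether a 350-sector gap plus a 5-sector (2560-byte) write fits in under 1 millisecond depends entirely on the sectors-per-track value plugged in. An estimate like this may help separate pure rotational delay from any delay added by interrupt coalescing.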

The Serial ATA AHCI 1.3 Specification (page 114) states that:

When a software specified number of commands have completed or a software specified timeout has expired, an interrupt is generated by hardware to allow software to process completed commands.

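The behaviour quoted from the specification can be illustrated with a small software model in C. This is a deliberately simplified sketch, not the hardware's actual state machine: the struct and function names are hypothetical, the timer is advanced in 1-millisecond ticks, and the choice to raise a timeout interrupt only when at least one completion is pending is an assumption of this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of command completion coalescing (hypothetical names). */
struct ccc_model {
    uint32_t cc_threshold; /* completions required to raise an interrupt */
    uint32_t timeout_ms;   /* timeout after which an interrupt is forced */
    uint32_t completions;  /* completions seen since the last interrupt */
    uint32_t elapsed_ms;   /* milliseconds elapsed since the last interrupt */
};

static void ccc_model_init(struct ccc_model *m, uint32_t cc, uint32_t tv_ms)
{
    assert(m != NULL && cc > 0 && tv_ms > 0);
    m->cc_threshold = cc;
    m->timeout_ms = tv_ms;
    m->completions = 0;
    m->elapsed_ms = 0;
}

/* Record one command completion; returns true if an interrupt is raised. */
static bool ccc_on_completion(struct ccc_model *m)
{
    m->completions++;
    if (m->completions >= m->cc_threshold) {
        m->completions = 0;
        m->elapsed_ms = 0;
        return true;
    }
    return false;
}

/* Advance the timer by 1 ms; returns true if the timeout raises an interrupt. */
static bool ccc_on_tick_ms(struct ccc_model *m)
{
    m->elapsed_ms++;
    if (m->elapsed_ms >= m->timeout_ms) {
        /* Modeling assumption: only interrupt if something is pending. */
        bool pending = m->completions > 0;
        m->completions = 0;
        m->elapsed_ms = 0;
        return pending;
    }
    return false;
}
```

In this model, a workload with only one request outstanding at a time and a completion threshold greater than 1 never reaches the threshold, so every completion is reported only when the timer expires.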

My guess is that this timer may be the reason why the latency of each request is above 1 millisecond. That's why I need to disable CCC.

I did mail the author - Jeff Garzik - but I haven't heard from him yet. Is he a registered user on *? If yes, I could PM him...

The HDD we are using is: WD Caviar Black (Model number - WD1001FALS).

Anyone? :-(

1 solution

#1

AFAIK, bit 7 of the HBA capabilities register (CCC supported) is read-only, so check it first to see whether CCC is supported. Then, per the spec, you can disable CCC by clearing CCC_CTL.EN (setting it to '0'), since that bit is read/write.

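A minimal sketch of that check-then-clear sequence in C is shown below. It runs against a plain array standing in for the HBA's memory-mapped registers so that it can be tried in user space. The macro and function names are hypothetical (the kernel header mentioned in the question only defines HOST_CAP_CCC), and the offsets and bit positions are taken from the AHCI generic host control register layout as I understand it, so verify them against the specification before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed register offsets and bits (hypothetical macro names). */
#define AHCI_REG_CAP     0x00u      /* HBA capabilities */
#define AHCI_REG_CCC_CTL 0x14u      /* CCC control */
#define AHCI_CAP_CCCS    (1u << 7)  /* CCC supported, read-only */
#define AHCI_CCC_CTL_EN  (1u << 0)  /* CCC enable, read/write */

/* Read a 32-bit register at a byte offset from the register base. */
static uint32_t reg_read(const volatile uint32_t *mmio, uint32_t offset)
{
    return mmio[offset / 4];
}

/* Write a 32-bit register at a byte offset from the register base. */
static void reg_write(volatile uint32_t *mmio, uint32_t offset, uint32_t value)
{
    mmio[offset / 4] = value;
}

/*
 * Clear CCC_CTL.EN if the HBA reports CCC support and CCC is enabled.
 * Returns 1 if CCC was enabled and has been disabled, 0 if there was
 * nothing to do (CCC unsupported or already disabled).
 */
static int ccc_disable(volatile uint32_t *mmio)
{
    assert(mmio != NULL);

    if (!(reg_read(mmio, AHCI_REG_CAP) & AHCI_CAP_CCCS))
        return 0;

    uint32_t ctl = reg_read(mmio, AHCI_REG_CCC_CTL);
    if (!(ctl & AHCI_CCC_CTL_EN))
        return 0;

    /* Clear only the enable bit and leave the other fields untouched. */
    reg_write(mmio, AHCI_REG_CCC_CTL, ctl & ~AHCI_CCC_CTL_EN);
    return 1;
}
```

In kernel code the array accesses would presumably be replaced by readl() and writel() on the driver's mapped register base. Note that HOST_CAP_CCC only names the capability bit, so changing that definition would not by itself disable coalescing. If ccc_disable() returns 0 on real hardware, CCC was not active in the first place and the batching seen in the measurements would have to come from somewhere else.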

Have you tried clearing it and then rerunning your experiment?
