Notes to self, kept for future reference.
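I. Preparation
1. Kernel boot parameters
Edit /etc/default/grub and add intel_iommu=on to the GRUB_CMDLINE_LINUX parameter. The change takes effect only after the GRUB configuration has been regenerated and the host rebooted.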
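After the reboot it can be worth confirming that the parameter actually reached the kernel. The helper below is only a sketch: has_kernel_param is a hypothetical name, and the optional second argument (a file to read instead of /proc/cmdline) exists purely so the function can be tried against sample input.

```shell
#!/bin/sh
# has_kernel_param PARAM [CMDLINE_FILE]
# Succeeds if PARAM appears as a whole word on the kernel command line.
# CMDLINE_FILE defaults to /proc/cmdline (the override is an assumption
# added for testing, not something the kernel provides).
has_kernel_param() {
    # Put each space-separated word on its own line, then look for an
    # exact, fixed-string match of the requested parameter.
    tr -s ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

Typical use would be: has_kernel_param intel_iommu=on && echo "IOMMU requested on the command line".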
2. Load the modules
modprobe vfio
modprobe vfio-pci
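A quick way to check that both modules are present is to look for their directories under /sys/module, where dashes in module names show up as underscores. This is a minimal sketch; module_loaded is a hypothetical helper name, and the second argument only exists to point the check at a test directory.

```shell
#!/bin/sh
# module_loaded NAME [SYS_MODULE_DIR]
# Succeeds if a directory for the module exists under /sys/module.
# The kernel lists vfio-pci as vfio_pci, so dashes are mapped to underscores.
module_loaded() {
    name=$(printf '%s' "$1" | sed 's/-/_/g')
    [ -d "${2:-/sys/module}/$name" ]
}
```

For example: module_loaded vfio-pci || modprobe vfio-pci.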
3. Prepare a virtual machine
e.g. a virtual machine named vfio_test
4. Enable the IOMMU in the BIOS
Navigate through each BIOS screen using the arrow keys and find the "IOMMU", "I/O Memory Management Unit", or "Intel® VT-d" setting (usually located under the "Advanced" or "Chipset/Northbridge/Tylersburg IOH/Intel VT for Directed I/O Configuration" settings menu). Move the cursor over the setting selection box using the arrow keys and press "Page Up", "Page Down", or the key specified on screen to select "Enabled".
II. An iommu_group containing a single device
1. Pick the PCI device to pass through; here its PCI address is 0000:0b:00.0
# lspci | grep -i fibre
0b:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 03)
2. Look up its iommu_group. The group is 19, and 0000:0b:00.0 is the only device in it
# readlink /sys/bus/pci/devices/0000\:0b\:00.0/iommu_group
../../../../kernel/iommu_groups/19
# ls /sys/bus/pci/devices/0000\:0b\:00.0/iommu_group/devices/
0000:0b:00.0
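The two lookups above can be wrapped in small helpers. This is a sketch under assumptions: iommu_group_id and iommu_group_devices are hypothetical names, and the optional second argument replaces /sys/bus/pci/devices so the functions can be exercised against a mock directory tree.

```shell
#!/bin/sh
# iommu_group_id ADDR [PCI_DEVICES_DIR]
# Prints the IOMMU group number of a PCI device, taken from the last
# component of the iommu_group symlink target.
iommu_group_id() {
    basename "$(readlink "${2:-/sys/bus/pci/devices}/$1/iommu_group")"
}

# iommu_group_devices ADDR [PCI_DEVICES_DIR]
# Prints every PCI address that shares the IOMMU group of ADDR,
# one per line (ADDR itself is included).
iommu_group_devices() {
    ls "${2:-/sys/bus/pci/devices}/$1/iommu_group/devices/"
}
```

On the host above, iommu_group_id 0000:0b:00.0 would be expected to print the same group number that readlink showed.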
3. Unbind the devices in the iommu_group from their current driver
echo 0000:0b:00.0 > /sys/bus/pci/devices/0000\:0b\:00.0/driver/unbind
Note: once this command has run, the driver entry under /sys/bus/pci/devices/0000\:0b\:00.0/ disappears.
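Because the driver entry vanishes after the first unbind, running the same command twice fails. A guarded variant might look like the sketch below; unbind_device is a hypothetical name, and the optional second argument stands in for /sys/bus/pci/devices during testing.

```shell
#!/bin/sh
# unbind_device ADDR [PCI_DEVICES_DIR]
# Unbinds a PCI device from its current driver through the device's own
# driver/unbind file. Does nothing if the device has no driver bound.
unbind_device() {
    dev="${2:-/sys/bus/pci/devices}/$1"
    if [ -e "$dev/driver" ]; then
        echo "$1" > "$dev/driver/unbind"
    fi
}
```

Going through each device's own driver link also avoids depending on the order in which devices of a shared group are unbound.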
4. Register the device's vendor ID and device ID with vfio-pci through new_id
# lspci -n -s 0000:0b:00.0
0b:00.0 0c04: 1077:2432 (rev 03)
# echo 1077 2432 > /sys/bus/pci/drivers/vfio-pci/new_id
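The same two IDs are also exposed as the vendor and device files in the device's sysfs directory, which avoids parsing lspci output. A minimal sketch, assuming those files hold values such as 0x1077 and 0x2432; pci_ids is a hypothetical helper, and its second argument is only there to allow a mock directory.

```shell
#!/bin/sh
# pci_ids ADDR [PCI_DEVICES_DIR]
# Prints "vendor device" (hex, without the 0x prefix) for a PCI device,
# in the form that is written to vfio-pci/new_id above.
pci_ids() {
    dev="${2:-/sys/bus/pci/devices}/$1"
    vendor=$(cat "$dev/vendor") || return 1
    device=$(cat "$dev/device") || return 1
    echo "${vendor#0x} ${device#0x}"
}
```

The output could then be redirected: pci_ids 0000:0b:00.0 > /sys/bus/pci/drivers/vfio-pci/new_id.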
5. Check that the iommu_group was bound successfully: an entry named after the group number, 19, now appears under /dev/vfio
# ls /dev/vfio
19 vfio
6. Write the XML and attach the device to the virtual machine
# cat net2.xml
<hostdev mode='subsystem' type='pci' managed='yes'>
<driver name='vfio'/>
<source>
<address type='pci' domain='0x0000' bus='0x0b' slot='0x00' function='0x0'/>
</source>
</hostdev>
# virsh attach-device vfio_test net2.xml --config
Device attached successfully
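The address element is just the PCI address split into its four fields, so the XML can be generated from the address string. The function below is a sketch only; hostdev_xml is a hypothetical name and it assumes the full domain:bus:slot.function form (for example 0000:0b:00.0).

```shell
#!/bin/sh
# hostdev_xml ADDR
# Prints a libvirt <hostdev> element for a PCI address given in the
# full DDDD:BB:SS.F form.
hostdev_xml() {
    addr=$1
    domain=${addr%%:*}      # text before the first colon, e.g. 0000
    rest=${addr#*:}         # e.g. 0b:00.0
    bus=${rest%%:*}         # e.g. 0b
    slotfn=${rest#*:}       # e.g. 00.0
    slot=${slotfn%.*}       # e.g. 00
    func=${slotfn#*.}       # e.g. 0
    cat <<EOF
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address type='pci' domain='0x$domain' bus='0x$bus' slot='0x$slot' function='0x$func'/>
  </source>
</hostdev>
EOF
}
```

For instance, hostdev_xml 0000:0b:00.0 > net2.xml would produce a file equivalent to the one shown above.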
7. Start the virtual machine
# virsh start vfio_test
Domain vfio_test started
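III. An iommu_group containing several devices (for example, a dual-port NIC)
# ls /sys/bus/pci/devices/0000\:02\:00.0/iommu_group/devices/
0000:02:00.0 0000:02:00.1
This case is similar to the single-device one, except that every device in the iommu_group has to be unbound. For example, to pass through 0000:02:00.0:
echo 0000:02:00.1 > /sys/bus/pci/devices/0000:02:00.0/driver/unbind
echo 0000:02:00.0 > /sys/bus/pci/devices/0000:02:00.0/driver/unbind
Note the order. Both commands go through the driver link of 0000:02:00.0, which works because the two ports are bound to the same driver, so 0000:02:00.1 must be unbound first. Once 0000:02:00.0 is unbound, …/devices/0000:02:00.0/driver disappears, and 0000:02:00.1 can no longer be unbound through that path.
After that, 02:00.0 can be passed through in the same way as before.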
The binding procedure can be carried out with a shell script:
#!/bin/sh
# Unbind every device in the IOMMU group of the given PCI address and
# register each device's vendor/device ID with vfio-pci.
# Assumes all group members are bound to the same driver as pciaddr.
if [ $# -ne 1 ]
then
    echo "usage: $(basename "$0") pciaddr"
    exit 1
fi
pciaddr=$1
prefullpath="/sys/bus/pci/devices/$pciaddr"
fullpath="$prefullpath/iommu_group/devices"
driverfullpath="$prefullpath/driver/unbind"

# Print "vendor device" for a PCI address, parsed from "lspci -n" output
# such as "0b:00.0 0c04: 1077:2432 (rev 03)".
vendor_device() {
    /usr/sbin/lspci -n -s "$1" | /usr/bin/sed 's/:/ /g' | /usr/bin/awk '{ print $4 " " $5 }'
}

for i in $(ls "$fullpath")
do
    if [ "$i" != "$pciaddr" ]
    then
        # unbind
        echo "$i" > "$driverfullpath"
        # new_id
        vendor_device "$i" > /sys/bus/pci/drivers/vfio-pci/new_id
    fi
done
# unbind and new_id itself (last, because its driver link disappears)
echo "$pciaddr" > "$driverfullpath"
vendor_device "$pciaddr" > /sys/bus/pci/drivers/vfio-pci/new_id
exit 0
IV. An iommu_group that contains not only several devices but also bridges
# ls /sys/bus/pci/devices/0000\:03\:00.0/iommu_group/devices/
0000:00:1c.0 0000:00:1c.4 0000:02:00.2 0000:03:00.0 0000:04:00.0
0000:00:1c.2 0000:02:00.0 0000:02:00.4 0000:03:00.1 0000:04:00.1
# lspci -s 0000:00:1c.0
00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 1
# lspci -s 0000:00:1c.2
00:1c.2 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 3
# lspci -s 0000:03:00.0
03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
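The main difference is in the unbind and new_id steps: each device in the iommu_group has to be unbound through its own driver/unbind. (The kernel's VFIO documentation notes that vfio-pci does not support PCI bridges, so the bridges in the group may not need this treatment.)
For example:
for i in $(ls /sys/kernel/iommu_groups/8/devices/)
do
echo $i > /sys/kernel/iommu_groups/8/devices/$i/driver/unbind
ven=$(cat /sys/kernel/iommu_groups/8/devices/$i/vendor)
dev=$(cat /sys/kernel/iommu_groups/8/devices/$i/device)
echo $ven $dev > /sys/bus/pci/drivers/vfio-pci/new_id
done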
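If the bridges in such a group are to be left on their host driver, the loop needs a list of group members with the bridges filtered out. The sketch below does that by reading each device's class file and skipping class 0x0604 (PCI-to-PCI bridge). group_endpoints is a hypothetical name, the optional second argument replaces /sys/kernel/iommu_groups for testing, and whether skipping bridges is appropriate on a given host is an assumption to verify.

```shell
#!/bin/sh
# group_endpoints GROUP [IOMMU_GROUPS_DIR]
# Prints the PCI addresses in an IOMMU group whose class is not
# 0x0604xx (PCI-to-PCI bridge), one per line.
group_endpoints() {
    for dev in "${2:-/sys/kernel/iommu_groups}/$1/devices"/*; do
        [ -e "$dev/class" ] || continue
        case $(cat "$dev/class") in
            0x0604*) ;;                 # bridge: skip it
            *) basename "$dev" ;;       # anything else: report it
        esac
    done
}
```

The result can feed the unbind and new_id loop, e.g. for i in $(group_endpoints 8); do ...; done.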
V. Problems encountered
1. Error when starting the virtual machine
error: internal error: process exited while connecting to monitor: 2015-08-11T06:11:06.627255Z qemu-kvm: -device vfio-pci,host=0e:00.0,id=hostdev0,bus=pci.2,addr=0x6: vfio: failed to open /dev/vfio/vfio: Operation not permitted
2015-08-11T06:11:06.627315Z qemu-kvm: -device vfio-pci,host=0e:00.0,id=hostdev0,bus=pci.2,addr=0x6: vfio: failed to setup container for group 19
2015-08-11T06:11:06.627331Z qemu-kvm: -device vfio-pci,host=0e:00.0,id=hostdev0,bus=pci.2,addr=0x6: vfio: failed to get group 19
2015-08-11T06:11:06.627351Z qemu-kvm: -device vfio-pci,host=0e:00.0,id=hostdev0,bus=pci.2,addr=0x6: Device initialization failed.
2015-08-11T06:11:06.627371Z qemu-kvm: -device vfio-pci,host=0e:00.0,id=hostdev0,bus=pci.2,addr=0x6: Device 'vfio-pci' could not be initialized
Solution:
In /etc/libvirt/qemu.conf, add the entry "/dev/vfio/vfio" to the cgroup_device_acl list, then restart libvirtd.
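As an illustration, the edited list might look like the fragment below. Only the "/dev/vfio/vfio" entry comes from these notes; the other entries are assumed to resemble typical defaults, so keep whatever your own qemu.conf already lists and just append the new path.

```conf
# /etc/libvirt/qemu.conf
cgroup_device_acl = [
    "/dev/null", "/dev/full", "/dev/zero",
    "/dev/random", "/dev/urandom",
    "/dev/ptmx", "/dev/kvm",
    "/dev/vfio/vfio"
]
```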
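2. A graphics card was passed through and the virtual machine would not start
pci 0000:08:00.0 is not assignable
Cause: by default, devices that are not isolated by ACS (Access Control Services) cannot be passed through. To pass such devices through, the check has to be relaxed in the qemu driver configuration.
In /etc/libvirt/qemu.conf, enable the option relaxed_acs_check = 1; after that these devices can be passed through to the virtual machine.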
3. The virtual machine would not start: failed to set iommu for Container
Possible cause: vfio_iommu_type1.allow_unsafe_interrupts is not enabled.
After enabling vfio_iommu_type1.allow_unsafe_interrupts, the virtual machine started.
The option can also be enabled at system boot.
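Two common ways to set the option at boot are sketched below. The file name vfio.conf is an assumption (any .conf file under /etc/modprobe.d should do), and the command-line form applies when the parameter is passed through GRUB_CMDLINE_LINUX as in the preparation step.

```conf
# /etc/modprobe.d/vfio.conf  (hypothetical file name)
options vfio_iommu_type1 allow_unsafe_interrupts=1

# Alternatively, on the kernel command line:
# vfio_iommu_type1.allow_unsafe_interrupts=1
```

On kernels that expose the parameter as writable, it can likely also be changed at runtime through /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts.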
4. The virtual machine would not start: /dev/vfio/16 Operation not permitted
In /etc/libvirt/qemu.conf, add the entry "/dev/vfio/16" to the cgroup_device_acl list.
VI. References
http://www.redhat.com/archives/libvir-list/2013-March/msg00514.html