KVM PCI Passthrough and Omni-Path

A KVM guest can use Omni-Path Architecture (OPA) hardware when the host is configured for PCI passthrough. This document focuses on OPA and Debian, but the concepts should apply to other Linux host operating systems and PCI devices.

BIOS Settings

  1. Intel VT (CPU virtualization) must be enabled.
  2. Integrated IO / Intel VT (VT-d, which provides the IOMMU) must be enabled.

Kernel Command Line

Add this to the host's kernel command line and reboot the host:

intel_iommu=on iommu=pt

When the IOMMU is configured properly, /sys/kernel/iommu_groups/ contains many numbered subdirectories, one per IOMMU group. If that directory is empty, the IOMMU is not enabled.
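The two checks described above can be scripted. This is a minimal sketch rather than a definitive test: the function names count_iommu_groups and cmdline_has are illustrative, and both accept a path argument only so they can be pointed at something other than the real system files.

```shell
# Count the IOMMU groups exposed by the kernel. The optional argument is
# the directory to inspect and defaults to the real sysfs path.
count_iommu_groups() {
    local dir="${1:-/sys/kernel/iommu_groups}"
    local n=0 entry
    for entry in "$dir"/*/; do
        [ -d "$entry" ] && n=$((n + 1))
    done
    echo "$n"
}

# Succeed if every kernel parameter given after the file name appears on
# the command line stored in that file (normally /proc/cmdline).
cmdline_has() {
    local file="$1" param
    shift
    for param in "$@"; do
        grep -qw -- "$param" "$file" || return 1
    done
}
```

On the host, "cmdline_has /proc/cmdline intel_iommu=on iommu=pt" should succeed after the reboot, and count_iommu_groups should print a number greater than zero.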

Install KVM

$ sudo apt install qemu-kvm libvirt-clients libvirt-daemon-system virtinst libosinfo-bin 
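
After installation, one quick sanity check is whether the KVM device node exists, since it only appears once the kvm kernel modules have loaded successfully. A minimal sketch; the function name kvm_available is illustrative.

```shell
# Succeed if the KVM device node exists as a character device.
# The optional argument allows pointing the check at another path.
kvm_available() {
    [ -c "${1:-/dev/kvm}" ]
}
```

For example: kvm_available || echo "KVM is not available; check the BIOS settings"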

Disable hfi1 on host

To use PCI passthrough, the hfi1 driver must not be loaded on the host machine. Blacklist it in /etc/modprobe.d/hfi1.conf:

blacklist hfi1

Also, there is no reason to have IFS installed on the host. The host machine should have no OPA functionality enabled.
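
To confirm after a reboot that the blacklist took effect, the loaded-module list can be inspected. A minimal sketch; the function name module_loaded is illustrative, and the second argument exists only so the check can read a file other than /proc/modules.

```shell
# Succeed if the named module appears in the loaded-module list.
# The second argument defaults to the real /proc/modules.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}
```

For example: module_loaded hfi1 && echo "hfi1 is still loaded on the host"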

Configure PCI Passthrough

The hfi1 device must be set up for PCI passthrough. Find the device's port (its PCI address) in the output of lspci:

$ lspci -vnn | grep Omni | cut -d' ' -f1

For the script below, prefix the port with the PCI domain 0000:, giving for example "0000:80:02.0".
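
The prefixing step can be sketched as a small helper. The function name full_pci_addr is illustrative, and it assumes the device lives in PCI domain 0000, as in the example above. Running lspci with the -D option prints the domain directly and is an alternative.

```shell
# Prefix an lspci-style address (bus:slot.function) with PCI domain 0000.
# Addresses that already carry a domain are returned unchanged.
full_pci_addr() {
    case "$1" in
        *:*:*) echo "$1" ;;
        *)     echo "0000:$1" ;;
    esac
}
```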

Run the following script as root, replacing the value of PCI_PORT with the port of the hfi1 device:

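#!/bin/bash
# Rebind the hfi1 device to vfio-pci. Run as root.

PCI_PORT=0000:80:02.0   # PCI address of the hfi1 device
DEV_VENDOR=8086         # PCI vendor ID
DEV_MODEL=24f0          # PCI device ID

# Unload vfio; rmmod complains harmlessly if a module is not loaded.
rmmod vfio_pci
rmmod vfio

# Unbind the device from its current driver, if it has one.
if [ -e "/sys/bus/pci/devices/$PCI_PORT/driver" ]; then
    echo "$PCI_PORT" > "/sys/bus/pci/devices/$PCI_PORT/driver/unbind"
fi

modprobe vfio
modprobe vfio_pci

# Have vfio-pci claim devices with this vendor and device ID.
echo "$DEV_VENDOR $DEV_MODEL" > /sys/bus/pci/drivers/vfio-pci/new_id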
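
After the script has run, it can be useful to confirm which driver now owns the device. A minimal sketch; the function name bound_driver is illustrative, and the second argument exists only so the check can be pointed at a directory other than the real sysfs one.

```shell
# Print the name of the driver bound to a PCI device. Prints nothing and
# fails if the device has no driver. The second argument is the sysfs
# devices directory and defaults to the real one.
bound_driver() {
    local link="${2:-/sys/bus/pci/devices}/$1/driver"
    [ -L "$link" ] && basename "$(readlink "$link")"
}
```

For the example port used above, "bound_driver 0000:80:02.0" should print vfio-pci once the device has been claimed.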

Create Guest

An unprivileged user can manage guests, but in the default configuration those guests get a non-functional network setup. Run virsh and virt-install as root.

$ systemctl start libvirtd
$ virt-install --name GUEST_NAME \
    --vcpus=4 --virt-type kvm --cdrom $HOME/kvm-guest/debian-8.7.0-amd64-DVD-1.iso \
    -v --os-variant debian8 \
    --disk path=PATH_TO_CREATE_DISK,size=16 --memory 4096 --graphics vnc

Connect a VNC client to the guest's console through an SSH tunnel to the host.

From the workstation:

$ ssh -L5910:localhost:5900 YOU@HOST

Now connect a VNC client to localhost:5910 and complete the install.
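
The tunnel above assumes the guest's console is on VNC display 0 (port 5900), which is typical for the first guest started. If other guests are running, the display may differ; virsh vncdisplay GUEST_NAME reports it. A minimal sketch for turning that display into a port number; the function name vnc_port is illustrative.

```shell
# Convert a VNC display string such as ":0" or "127.0.0.1:1" to its TCP
# port. VNC display N listens on port 5900 + N.
vnc_port() {
    local display="${1##*:}"
    echo $((5900 + display))
}
```

For example, vnc_port "$(virsh vncdisplay GUEST_NAME)" prints the port to use in place of 5900 in the ssh -L option.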

Import Existing Disk to New Guest

To import an existing guest disk image, use the following command:

$ sudo virt-install --virt-type kvm --name GUEST_NAME \
    --vcpus=4 --import \
    -v --os-variant debian8 \
    --disk PATH_TO_DISK_IMAGE,device=disk,bus=virtio --memory 4096 --graphics vnc

Connect to Guest, Configure DNS

The default network for KVM is 192.168.122.0/24, and the guest should be assigned a DHCP address when it boots. Use the VNC connection to run `ip addr` in the guest and note its address. ssh should then be able to connect to the guest from the host.
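
As an alternative to reading the address over VNC, the host can usually report the guest's DHCP lease with virsh domifaddr GUEST_NAME. The sketch below filters that output down to the bare IPv4 address. It assumes the usual four-column layout (Name, MAC address, Protocol, Address), and the function name guest_ipv4 is illustrative.

```shell
# Read `virsh domifaddr` output on stdin and print each IPv4 address
# without its prefix length.
guest_ipv4() {
    awk '$3 == "ipv4" { sub(/\/.*/, "", $4); print $4 }'
}
```

For example: ssh "$(virsh domifaddr GUEST_NAME | guest_ipv4)"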

Unfortunately, dnsmasq doesn't appear to set the search domain properly. On Debian, configure a search domain in the guest's /etc/network/interfaces:

allow-hotplug eth0
iface eth0 inet dhcp
    dns-search MYDOMAIN
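
After the interface has been brought up again, the resolver configuration in the guest shows whether the search domain was applied. A minimal sketch; the function name search_domains is illustrative, and the argument exists only so the check can read a file other than /etc/resolv.conf.

```shell
# Print the search domains listed in a resolv.conf-style file, one per
# line. The argument defaults to the real /etc/resolv.conf.
search_domains() {
    awk '$1 == "search" { for (i = 2; i <= NF; i++) print $i }' \
        "${1:-/etc/resolv.conf}"
}
```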

Configure Guest for PCI Passthrough

Shut down the guest if it is running:

$ virsh shutdown GUEST_NAME

Find the PCI device in virsh's device tree. Look for a pci device that matches the port found via lspci:

$ virsh nodedev-list --tree 
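
libvirt names PCI node devices after their address, with the separators replaced by underscores, so the name to look for in the tree can be derived from the lspci port. A minimal sketch; the function name nodedev_name is illustrative.

```shell
# Convert a PCI address such as 0000:81:00.0 to the name libvirt uses for
# the same device, e.g. pci_0000_81_00_0.
nodedev_name() {
    echo "pci_$(echo "$1" | tr ':.' '__')"
}
```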

Detach the device. Use the child device of the one that matches the device you found via lspci.

$ virsh nodedev-detach pci_0000_81_00_0

Dump the device info.

$ virsh nodedev-dumpxml pci_0000_81_00_0

Convert the bus, slot and function values from decimal to hex. The printf utility can do this:

$ printf '%x\n' VALUE

Edit the guest and add a hostdev section:

$ virsh edit GUEST_NAME

<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x81' slot='0x0' function='0x0'/>
  </source>
</hostdev>
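
The hex conversion and the address element can be produced in one step. This is a minimal sketch that takes the decimal bus, slot and function values from the nodedev-dumpxml output. The function name hostdev_address is illustrative, and PCI domain 0 is assumed unless a fourth argument is given.

```shell
# Print a libvirt <address> element from decimal bus, slot and function
# numbers. Domain 0 is assumed unless a fourth argument is given.
hostdev_address() {
    printf "<address domain='0x%04x' bus='0x%02x' slot='0x%02x' function='0x%x'/>\n" \
        "${4:-0}" "$1" "$2" "$3"
}
```

For example, "hostdev_address 129 0 0" prints an address element with bus='0x81', matching the device detached above.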

Boot the guest:

$ virsh start GUEST_NAME

Upon booting the guest, the passthrough device should be present in the guest's lspci output. The passthrough device should be usable by the guest's kernel drivers.
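
Inside the guest, the presence of the device can be checked by its vendor and device ID (8086 and 24f0 in the script earlier in this document). A minimal sketch that filters lspci -nn output; the function name has_pci_id is illustrative.

```shell
# Read `lspci -nn` output on stdin and succeed if any line carries the
# given vendor:device ID in square brackets.
has_pci_id() {
    grep -qiF "[$1]"
}
```

For example: lspci -nn | has_pci_id 8086:24f0 && echo "passthrough device is visible"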

Note: the PCI device may expose different capabilities in the VM than it does on the physical host, and the guest driver needs to take this into account. See https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4c009af473b2026caaa26107e34d7cc68dad7756 for a patch that fixes one such problem in hfi1.

Using a Bridged Network

Configuring a bridged network allows the KVM guest to reside on the same network as the bridged interface. See https://jamielinux.com/docs/libvirt-networking-handbook/bridged-network.html for a good overview of how this is configured.

References

  1. https://wiki.debian.org/KVM
  2. https://jamielinux.com/docs/libvirt-networking-handbook/nat-based-network.html
  3. https://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM
  4. https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF
  5. https://wiki.debian.org/VGAPassthrough

Brian T. Smith
Senior Technical Staff
System Fabric Works, Inc.
bsmith@systemfabricworks.com