[pve-devel] applied: [PATCH docs] add documentation for pci passthrough and sr-iov

Thomas Lamprecht t.lamprecht at proxmox.com
Tue Nov 13 09:29:52 CET 2018


On 11/12/18 4:00 PM, Dominik Csapak wrote:
> explain what it is and how to use it, especially the steps necessary
> on the host and the various options under one chapter
> 
> most of this is also found on the wiki in the Pci_passthrough
> article
> 
> we may want to condense the information there and link it as
> 'notes and examples'

Applied and added a followup fixing a few typos, formatting and a bit of rewording.

Thank you very much for this!

> 
> Signed-off-by: Dominik Csapak <d.csapak at proxmox.com>
> ---
>  qm-pci-passthrough.adoc | 237 ++++++++++++++++++++++++++++++++++++++++++++++++
>  qm.adoc                 |   3 +
>  2 files changed, 240 insertions(+)
>  create mode 100644 qm-pci-passthrough.adoc
> 
> diff --git a/qm-pci-passthrough.adoc b/qm-pci-passthrough.adoc
> new file mode 100644
> index 0000000..95e4ae1
> --- /dev/null
> +++ b/qm-pci-passthrough.adoc
> @@ -0,0 +1,237 @@
> +[[qm_pci_passthrough]]
> +PCI(e) Passthrough
> +------------------
> +
> +PCI(e) passthrough is a mechanism to give a virtual machine control over
> +a PCI device that is usually only available to the host. This can have some
> +advantages over using virtualized hardware, for example lower latency,
> +higher performance, or more features (e.g., offloading).
> +
> +If you pass through a device to a virtual machine, you cannot use that
> +device anymore on the host or in any other VM.
> +
> +General Requirements
> +~~~~~~~~~~~~~~~~~~~~
> +
> +Since passthrough is a feature which also needs hardware support, there are
> +some requirements and steps before it can work.
> +
> +Hardware
> +^^^^^^^^
> +
> +Your hardware has to support IOMMU interrupt remapping. This includes the
> +CPU and the mainboard.
> +
> +Generally Intel systems with VT-d, and AMD systems with AMD-Vi support this,
> +but it is not guaranteed that everything will work out of the box, due
> +to bad hardware implementation or missing/low quality drivers.
> +
> +In most cases, server grade hardware has better support than consumer grade
> +hardware, but even then, many modern systems support this.
> +
> +Please check with your hardware vendor whether this feature is supported
> +under Linux.
> +
> +Configuration
> +^^^^^^^^^^^^^
> +
> +To enable PCI(e) passthrough, some configuration is needed.
> +
> +First, the IOMMU has to be activated on the kernel command line.
> +The easiest way is to enable it in */etc/default/grub*. Just add
> +
> + intel_iommu=on
> +
> +or if you have AMD hardware:
> +
> + amd_iommu=on
> +
> +to GRUB_CMDLINE_LINUX_DEFAULT.
> +
> +After that, make sure you run 'update-grub' to update grub.
> +
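
Not part of the patch: a minimal sketch of the GRUB step above. `add_kernel_param` is a hypothetical helper, not an existing tool. It assumes that GRUB_CMDLINE_LINUX_DEFAULT is a single double-quoted line, that GNU sed is available, and that the parameter contains no characters special to sed.

```shell
#!/bin/sh
# Hypothetical helper: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT
# unless it is already present on that line.
add_kernel_param() {
    param=$1
    file=${2:-/etc/default/grub}
    # Nothing to do if the parameter is already on the line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=\".*${param}" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing quote.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 ${param}\"/" "$file"
}
```

On an Intel host this could be called as `add_kernel_param intel_iommu=on`, followed by `update-grub`.
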
> +Second, you have to make sure the following modules are loaded.
> +This can be achieved by adding them to */etc/modules*:
> +
> + vfio
> + vfio_iommu_type1
> + vfio_pci
> + vfio_virqfd
> +
> +After changing anything module-related, you need to refresh your
> +initramfs with
> +
> +----
> +update-initramfs -u -k all
> +----
> +
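
Not part of the patch: a small sketch of the module step, assuming one module name per line in the target file. `ensure_modules` is a hypothetical helper that only appends names that are not listed yet, so it can be re-run safely.

```shell
#!/bin/sh
# Hypothetical helper: make sure each given module name is listed in a
# modules file (one name per line), without adding duplicates.
ensure_modules() {
    file=$1
    shift
    for mod in "$@"; do
        # -F: fixed string, -x: whole-line match, -q: quiet
        grep -Fqx "$mod" "$file" 2>/dev/null || printf '%s\n' "$mod" >> "$file"
    done
}
```

For example: `ensure_modules /etc/modules vfio vfio_iommu_type1 vfio_pci vfio_virqfd`, followed by the `update-initramfs -u -k all` call quoted above.
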
> +Finally, reboot and check that it is indeed enabled:
> +
> +----
> +dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
> +----
> +
> +This should display that IOMMU, Directed I/O or Interrupt Remapping is enabled.
> +(The exact message can vary, depending on hardware and kernel version.)
> +
> +It is also important that the device(s) you want to pass through
> +are in a separate IOMMU group. This can be checked with:
> +
> +----
> +find /sys/kernel/iommu_groups/ -type l
> +----
> +
> +It is okay if the device is in an IOMMU group together with its functions
> +(e.g. a GPU with the HDMI Audio device) or with its root port or PCI(e) bridge.
> +
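
Not part of the patch: the raw `find` output can be hard to read, so here is a sketch that prints one "group N: ADDRESS" line per device. `list_iommu_groups` is a hypothetical helper. The optional argument for the sysfs root exists only so the function can be tried against a test directory.

```shell
#!/bin/sh
# Hypothetical helper: print "group N: PCI-ADDRESS" for every device found
# below the IOMMU groups directory.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded glob if no groups exist.
        [ -e "$dev" ] || continue
        group=${dev#"$root"/}   # strip the root prefix -> N/devices/ADDRESS
        group=${group%%/*}      # keep only the group number
        printf 'group %s: %s\n' "$group" "${dev##*/}"
    done
}
```

Devices that print the same group number share an IOMMU group. `lspci -nns ADDRESS` can then be used to see what each address is.
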
> +.PCI(e) slots
> +[NOTE]
> +====
> +Some platforms handle their PCI(e) slots differently, so if you
> +do not get the desired IOMMU group separation, it may be helpful to
> +try to put the card in another PCI(e) slot.
> +====
> +
> +.Unsafe interrupts
> +[NOTE]
> +====
> +For some platforms, it may be necessary to allow unsafe interrupts.
> +This can most easily be enabled by adding the following line
> +to a .conf file in */etc/modprobe.d/*:
> +
> + options vfio_iommu_type1 allow_unsafe_interrupts=1
> +
> +Please be aware that this option can make your system unstable.
> +====
> +
> +Host Device Passthrough
> +~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The most used variant of PCI(e) passthrough is to pass through a whole
> +PCI(e) card, for example a GPU or network card.
> +
> +Host Configuration
> +^^^^^^^^^^^^^^^^^^
> +
> +In this case, the host must not use the card. This can be achieved in two
> +ways:
> +
> +Either add the device IDs to the options of the vfio-pci module, by
> +adding
> +
> + options vfio-pci ids=1234:5678,4321:8765
> +
> +to a .conf file in */etc/modprobe.d/*, where 1234:5678 and 4321:8765 are
> +the vendor and device IDs obtained by:
> +
> +----
> +lspci -nn
> +----
> +
> +Or simply blacklist the driver completely on the host with
> +
> + blacklist DRIVERNAME
> +
> +also in a .conf file in */etc/modprobe.d/*. Again, update the initramfs
> +and reboot after that.
> +
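
Not part of the patch: a sketch for generating the `options vfio-pci` line from a list of vendor:device IDs, with a basic format check. `vfio_ids_line` is a hypothetical helper, and the file name in the usage note is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: build the modprobe options line for vfio-pci from
# one or more vendor:device IDs (four hex digits each).
vfio_ids_line() {
    for id in "$@"; do
        printf '%s\n' "$id" | grep -Eq '^[0-9a-fA-F]{4}:[0-9a-fA-F]{4}$' || {
            echo "invalid id: $id" >&2
            return 1
        }
    done
    # Join all IDs with commas, then drop the trailing comma.
    ids=$(printf '%s,' "$@")
    printf 'options vfio-pci ids=%s\n' "${ids%,}"
}
```

For example: `vfio_ids_line 1234:5678 4321:8765 > /etc/modprobe.d/vfio.conf`, then update the initramfs and reboot as described.
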
> +VM Configuration
> +^^^^^^^^^^^^^^^^
> +
> +To pass through the device, set the *hostpciX* option on the VM with:
> +
> +----
> +qm set VMID -hostpci0 00:02.0
> +----
> +
> +If your device has multiple functions, you can pass them through all together
> +with the shortened syntax
> +
> + 00:02
> +
> +There are some options which may be necessary, depending on the device
> +and guest OS.
> +
> +* *x-vga=on|off* marks the PCI(e) device as the primary GPU of the VM.
> +With this enabled, the *vga* parameter of the config will be ignored.
> +* *pcie=on|off* tells {pve} to use a PCIe or PCI port. Some guest/device
> +combinations require PCIe rather than PCI (only available for q35 machine types).
> +* *rombar=on|off* makes the firmware ROM visible for the guest. Default is on.
> +Some PCI(e) devices need this disabled.
> +* *romfile=<path>* is an optional path to a ROM file for the device to use.
> +This is a relative path under */usr/share/kvm/*.
> +
> +An example of PCIe passthrough with a GPU set to primary:
> +
> +----
> +qm set VMID -hostpci0 02:00,pcie=on,x-vga=on
> +----
> +
> +Other considerations
> +^^^^^^^^^^^^^^^^^^^^
> +
> +When passing through a GPU, the best compatibility is reached when using
> +q35 as machine type, OVMF instead of SeaBIOS and PCIe instead of PCI.
> +Note that if you want to use OVMF for GPU passthrough, the GPU needs
> +to have an EFI capable ROM, otherwise use SeaBIOS instead.
> +
> +SR-IOV
> +~~~~~~
> +
> +Another variant of passing through PCI(e) devices is to use the hardware
> +virtualization features of your devices.
> +
> +SR-IOV (Single-Root Input/Output Virtualization) enables a single device
> +to provide multiple VFs (virtual functions) to the system, so that each
> +VF can be used in a different VM, with full hardware features, better
> +performance and lower latency than software virtualized devices.
> +
> +The most used devices for this are NICs with SR-IOV which can provide
> +multiple VFs per physical port, allowing features such as
> +checksum offloading, etc. to be used inside a VM, reducing CPU overhead.
> +
> +Host Configuration
> +^^^^^^^^^^^^^^^^^^
> +
> +Generally, there are two methods for enabling virtual functions on a device.
> +
> +In some cases there is an option for the driver module, e.g. for some
> +Intel drivers:
> +
> + max_vfs=4
> +
> +which could be put in a .conf file in */etc/modprobe.d/*.
> +(Do not forget to update your initramfs after that.)
> +
> +Please refer to your driver module documentation for the exact
> +parameters and options.
> +
> +The second (more generic) approach is via sysfs.
> +If a device and driver support this, you can change the number of VFs on
> +the fly. For example, to set up 4 VFs on device 0000:01:00.0:
> +
> +----
> +echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
> +----
> +
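
Not part of the patch: a sketch that wraps the sysfs write with a check against `sriov_totalvfs`, the attribute through which the kernel exposes the maximum number of VFs a device supports. `set_numvfs` is a hypothetical helper. It resets the count to 0 before changing an already non-zero value, on the assumption that the kernel rejects a direct change from one non-zero value to another. The third argument is only there so the function can be tried against a test directory.

```shell
#!/bin/sh
# Hypothetical helper: set the number of VFs on a device after checking the
# limit the device reports in sriov_totalvfs.
set_numvfs() {
    dev=$1
    want=$2
    path=${3:-/sys/bus/pci/devices}/$dev
    if [ ! -e "$path/sriov_totalvfs" ]; then
        echo "$dev: no SR-IOV attributes found" >&2
        return 1
    fi
    total=$(cat "$path/sriov_totalvfs")
    if [ "$want" -gt "$total" ]; then
        echo "$dev: requested $want VFs, device supports $total" >&2
        return 1
    fi
    current=$(cat "$path/sriov_numvfs")
    if [ "$current" -eq "$want" ]; then
        return 0
    fi
    # Assumption: disable the existing VFs first if the count is non-zero.
    if [ "$current" -ne 0 ]; then
        echo 0 > "$path/sriov_numvfs"
    fi
    echo "$want" > "$path/sriov_numvfs"
}
```

For example: `set_numvfs 0000:01:00.0 4`.
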
> +To make this change persistent, you can use sysfsutils.
> +Just install it via
> +
> +----
> +apt install sysfsutils
> +----
> +
> +and configure it via */etc/sysfs.conf* or */etc/sysfs.d/*.
> +
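
Not part of the patch: entries in */etc/sysfs.conf* use the form `attribute = value`, with the attribute path given relative to */sys*. The sketch below prints such an entry for the VF count. `sysfs_conf_entry` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print a sysfs.conf entry that sets the number of VFs
# for a PCI device. $1 is the full PCI address, $2 the number of VFs.
sysfs_conf_entry() {
    printf 'bus/pci/devices/%s/sriov_numvfs = %s\n' "$1" "$2"
}
```

For example: `sysfs_conf_entry 0000:01:00.0 4 >> /etc/sysfs.conf`.
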
> +VM Configuration
> +^^^^^^^^^^^^^^^^
> +
> +After creating VFs, you should see them as separate PCI(e) devices, which
> +can be passed through like a normal PCI(e) device.
> +
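
Not part of the patch: a sketch for finding the PCI addresses of the VFs that belong to a physical function, using the `virtfnN` symlinks the kernel creates in the device's sysfs directory. `list_vfs` is a hypothetical helper. The optional second argument is only there so it can be tried against a test directory.

```shell
#!/bin/sh
# Hypothetical helper: print the PCI address of every VF of a physical
# function by following its virtfnN symlinks.
list_vfs() {
    pf=$1
    root=${2:-/sys/bus/pci/devices}
    for link in "$root/$pf"/virtfn*; do
        # Skip the unexpanded glob if the device has no VFs.
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        printf '%s\n' "${target##*/}"
    done
}
```

The addresses are printed with the leading PCI domain (for example `0000:`). The `qm set` examples in this patch use the short form without it.
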
> +Other considerations
> +^^^^^^^^^^^^^^^^^^^^
> +
> +For this feature, platform support is especially important. It may be necessary
> +to enable this feature in the BIOS or to use a specific PCI(e) port for it
> +to work. If in doubt, consult the manual of the platform or contact the vendor.
> diff --git a/qm.adoc b/qm.adoc
> index 5cf672d..0d453c8 100644
> --- a/qm.adoc
> +++ b/qm.adoc
> @@ -1021,6 +1021,9 @@ ifndef::wiki[]
>  include::qm-cloud-init.adoc[]
>  endif::wiki[]
>  
> +ifndef::wiki[]
> +include::qm-pci-passthrough.adoc[]
> +endif::wiki[]
>  
>  
>  Managing Virtual Machines with `qm`
> 




