[pve-devel] [PATCH common/qemu-server/manager] adapt to nvidia vgpu api changes

Christoph Heiss c.heiss at proxmox.com
Thu Oct 17 12:16:30 CEST 2024


Tested this entire series (+ the one prerequisite patch) using an RTX
A5000. Everything applied cleanly on latest master on each respective
repo. Tested on (latest) kernel 6.8.12-2-pve for reference.

vGPU datacenter resource mapping and VM PCI device setup worked as
expected, once I had the drivers etc. properly set up.

Verified the available vGPU PCI devices both on the host and inside the
VM, as well as via `nvidia-smi`. Further, I ran some CUDA workloads in
the VM to ensure everything was set up properly.

The only "regression" I've noticed is that the Nvidia devices now have
an empty description in the mdev device list, but that is not critical
and can be improved in the future, as duly noted in the code.

Also looked through the code; it looks good IMO. Having to special-case
"normal" mdevs vs. nvidia is rather unfortunate, but (having also
talked a bit off-list with Dominik about it) implementing it as a
plugin system would be far more work than is justifiable in this case.

So please consider, for the entire series:

Tested-by: Christoph Heiss <c.heiss at proxmox.com>
Reviewed-by: Christoph Heiss <c.heiss at proxmox.com>

On Tue, Aug 06, 2024 at 02:21:57PM GMT, Dominik Csapak wrote:
> For many new cards, nvidia changed the kernel interface since kernel
> version 6.8. Instead of using mediated devices, they provide their own
> api.
>
> This series adapts to that, with no required change to the vm config,
> and only minimal changes to our api.
>
> The biggest change is that the mdev types can now be queried on
> /nodes/NODE/hardware/pci/<pciid-or-mapping>/mdev, either via a pci id
> (like it was before) or via the name of a pci mapping (which now
> checks all local devices from that mapping).
>
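[Inline aside, purely for illustration: accepting either a pci id or a
mapping name in the same path component means the handler has to tell
the two apart first. A minimal sketch of such a dispatch; the accepted
pci-id formats and the `classify` helper are my assumptions, not taken
from the series.]

```python
import re

# Sketch: distinguish a PCI address (e.g. "0000:01:00.0" or "01:00.0")
# from a mapping name when interpreting the <pciid-or-mapping> path
# component. The exact accepted formats are an assumption.
PCI_ID_RE = re.compile(
    r'^(?:[0-9a-fA-F]{4}:)?'      # optional PCI domain
    r'[0-9a-fA-F]{2}:'            # bus
    r'[0-9a-fA-F]{2}'             # device
    r'\.[0-7]$'                   # function
)

def classify(arg: str) -> str:
    """Return 'pciid' if arg looks like a PCI address, else 'mapping'."""
    return 'pciid' if PCI_ID_RE.match(arg) else 'mapping'
```

Either way, the result set is the same shape: a list of available types
for the resolved local device(s).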
> A thing to improve could be to parse the available vgpu types from
> nvidia-smi instead of the sysfs, since the sysfs does not always
> contain all types (see the common patch 1/2 for details).
>
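[Inline aside: a parser for that could look roughly like the sketch
below. The assumed layout (a "GPU <pci-addr>" header line followed by
indented type names, as `nvidia-smi vgpu -s` roughly prints) is my
guess and not verified against real output here.]

```python
# Sketch: group vGPU type names per GPU from "nvidia-smi vgpu -s"-style
# output. Layout assumption: "GPU <pci-addr>" headers, with the type
# names for that GPU on the indented lines that follow.
def parse_vgpu_list(text: str) -> dict[str, list[str]]:
    gpus: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        if line.startswith("GPU "):
            # header line: remember which GPU the following types belong to
            current = line.split(maxsplit=1)[1]
            gpus[current] = []
        elif line.startswith((" ", "\t")) and current is not None:
            gpus[current].append(line.strip())
    return gpus
```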
> We could abstract the code that deals with different types probably a
> bit more, but for me it seems Ok for now, and finding a good API for
> that is hard with only 3 modes that are very different from each other
> (raw/mdev/nvidia).
>
> The qemu-server patches depend on the common patches, but the manager
> patch does not rely on any other in this series. It is required,
> though, for the user to be able to select types (under certain
> conditions).
>
> note that this series requires my previous patch to the sysfstools to
> improve write reliability[0], otherwise the cleanup or creation may
> fail.
>
> 0: https://lists.proxmox.com/pipermail/pve-devel/2024-July/064814.html
>
> pve-common:
>
> Dominik Csapak (2):
>   SysFSTools: handle new nvidia sysfs api as mdev
>   SysFSTools: lspci: move mdev and iommugroup check outside of verbose
>
>  src/PVE/SysFSTools.pm | 83 ++++++++++++++++++++++++++-----------------
>  1 file changed, 51 insertions(+), 32 deletions(-)
>
> qemu-server:
>
> Dominik Csapak (3):
>   pci: choose devices: don't reserve pciids when vm is already running
>   pci: remove pci reservation: optionally give list of ids to remove
>   pci: mdev: adapt to nvidia interface with kernel >= 6.8
>
>  PVE/QemuServer.pm                | 30 +++++++++--
>  PVE/QemuServer/PCI.pm            | 92 +++++++++++++++++++++++++++++---
>  test/run_config2command_tests.pl |  8 ++-
>  3 files changed, 118 insertions(+), 12 deletions(-)
>
> pve-manager:
>
> Dominik Csapak (1):
>   api/ui: improve mdev listing for pci mappings
>
>  PVE/API2/Hardware/PCI.pm     | 45 +++++++++++++++++++++++++++++-------
>  www/manager6/qemu/PCIEdit.js | 12 +---------
>  2 files changed, 38 insertions(+), 19 deletions(-)
>
> --
> 2.39.2
>
>