[pve-devel] [PATCH common/qemu-server/manager] improve vGPU (mdev) usage for NVIDIA

Tue Aug 9 10:39:06 CEST 2022

On 8/9/22 09:59, DERUMIER, Alexandre wrote:
> On 26/07/22 at 08:55, Dominik Csapak wrote:
>> so maybe someone can look at that and give some feedback?
>> my idea there would be to allow multiple device mappings per node
>> (instead of one only) and the qemu code would select one automatically
> Hi Dominik,
> 
> do you want to create some kind of pool of pci devices in your "add cluster-wide hardware device mapping" patch series?
> 
> Maybe in hardwaremap, allow defining multiple pci addresses on the same node?
> 
> Then, for mdev, check whether an mdev already exists on one of the devices.
> If not, try to create the mdev on one device; if that fails (max number of mdevs reached), try to create the mdev on the next device, ...
> 
> If not mdev, choose a pci device in the pool that is not yet detached from the host.
> 

yes i plan to do this in my next iteration of the mapping series
(basically what you describe)

my (rough) idea:

have a list of pci paths in the mapping (e.g. 01:00.0;01:00.4;...)
(should be enough, i don't think grouping unrelated devices (different vendor/product) makes much sense?)

* non mdev:
   qemu-server checks the pci reservations (which we already have)
   and takes the first not yet reserved path

* mdev
   qemu-server iterates over the devices until it finds one
   with the given mdev type available

if none is found, error out
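
to make the selection logic above concrete, here is a minimal sketch in python (qemu-server itself is perl, so this is illustration only, not the planned implementation). all names here (parse_mapping, pick_device, the reserved set and the mdev_available callback) are hypothetical; the callback stands in for whatever check qemu-server would actually do, which might for example read the available_instances file of the mdev type in sysfs.

```python
from typing import Callable, Iterable, Optional, Set


def parse_mapping(spec: str) -> list:
    """Split a ';'-separated list of PCI paths, e.g. '01:00.0;01:00.4'."""
    return [p.strip() for p in spec.split(";") if p.strip()]


def pick_device(
    paths: Iterable[str],
    mdev_type: Optional[str],
    reserved: Set[str],
    mdev_available: Callable[[str, str], bool],
) -> str:
    """Return the first usable PCI path from the mapping.

    Hypothetical helper: `reserved` models the existing PCI reservations,
    `mdev_available(path, mdev_type)` models the per-device mdev check.
    """
    for path in paths:
        if mdev_type is None:
            # non mdev: take the first path that is not yet reserved
            if path not in reserved:
                return path
        elif mdev_available(path, mdev_type):
            # mdev: take the first device that still offers the given type
            return path
    # if none is found, error out
    raise RuntimeError("no usable device found in mapping")


if __name__ == "__main__":
    paths = parse_mapping("01:00.0;01:00.4")
    # plain passthrough with 01:00.0 already reserved
    print(pick_device(paths, None, {"01:00.0"}, lambda p, t: False))
```

passing the reservation state and the mdev check in as arguments keeps the sketch independent of how those lookups are really done on the host.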

(relevant bug for this: https://bugzilla.proxmox.com/show_bug.cgi?id=3574)