[pve-devel] [PATCH guest-common/qemu-server/manager/docs v5 0/3] implement experimental vgpu live migration

Eneko Lacunza elacunza at binovo.es
Wed Mar 5 11:34:55 CET 2025


Hi Dominik,

It is very likely we'll have access to a suitable cluster to test this 
before summer, provided these patches are in published packages.

I can test and report back if that's helpful.

Regards

On 20/1/25 at 15:51, Dominik Csapak wrote:
> and some useful cleanups
>
> This is implemented for mapped resources. It requires driver and
> hardware support, but aside from NVIDIA vGPUs there don't seem to be
> many drivers (if any) that support it.
>
> qemu already supports that for vfio-pci devices, so nothing to be
> done there besides actively enabling it.
>
> Since we currently can't properly test it here and it very much depends on
> hardware/driver support, mark it as experimental everywhere (docs/api/gui).
> (though i tested the live-migration part manually here by using
> "exec:cat > /tmp/test" for the migration target, and "exec: cat
> /tmp/test" as the 'incoming' parameter for a new vm start, which worked ;) )
>
> i opted for marking them migratable at the mapping level, but we could
> theoretically also put it in the hostpciX config instead.
> (though imho it fits better in the cluster-wide resource mapping config)
>
> also the naming/texts could probably be improved, but i think
> 'live-migration-capable' is very descriptive and i didn't want to
> use an overly short name for it (which can be confusing, see the
> 'shared' flag for storages)
>
> functionality/code-wise this should mostly be the same as v4, but it
> changed a bit due to the recent nvidia changes on our side, so it probably
> warrants a closer look in any case
>
> changes from v4:
> * rebased on master (some work due to the recent nvidia changes)
> * incorporated Thomas' and Alexander's feedback from v4
>
> changes from v3:
> * rebased on master
> * split first guest-common patch into 3
> * instead of merging keys, just write all expected keys into expected_props
> * made $cfg optional so it does not break callers that don't call it
> * added patch to fix the cfg2cmd tests for mdev check
> * added patch to show vfio state transferred for migration
> * incorporated Fiona's feedback (mostly minor stuff)
>
> for more details see the individual patches
>
> changes from v2:
> * rebased on master
> * rework the rework of the properties check (pve-guest-common 1/4)
> * properly check mdev in the gui (pve-manager 1/5)
>
> manager patches depend on pve-guest-common/qemu-server patches
> qemu-server depends on pve-guest-common patches
>
> guest-common 3/3 breaks older qemu-server versions until
> qemu-server patches 1 & 2 are applied
>
> pve-guest-common:
>
> Dominik Csapak (3):
>    mapping: pci: check the mdev configuration on the device too
>    mapping: pci: add 'live-migration-capable' flag to mappings
>    mapping: remove find_on_current_node
>
>   src/PVE/Mapping/PCI.pm | 27 +++++++++++++++------------
>   src/PVE/Mapping/USB.pm | 10 ----------
>   2 files changed, 15 insertions(+), 22 deletions(-)
>
> qemu-server:
>
> Dominik Csapak (11):
>    usb: mapping: move implementation of find_on_current_node here
>    pci: mapping: move implementation of find_on_current_node here
>    pci: mapping: check mdev config against hardware
>    vm stop-cleanup: allow callers to decide error behavior
>    migrate: call vm_stop_cleanup after stopping in phase3_cleanup
>    pci: set 'enable-migration' to on for live-migration marked mapped
>      devices
>    check_local_resources: add more info per mapped device and return as
>      hash
>    api: enable live migration for marked mapped pci devices
>    api: include not mapped resources for running vms in migrate
>      preconditions
>    tests: cfg2cmd: fix mdev tests
>    migration: show vfio state transferred too
>
>   PVE/API2/Qemu.pm                 | 55 ++++++++++++++++++++------------
>   PVE/CLI/qm.pm                    |  2 +-
>   PVE/QemuMigrate.pm               | 44 +++++++++++++++++--------
>   PVE/QemuServer.pm                | 30 ++++++++++-------
>   PVE/QemuServer/PCI.pm            | 24 ++++++++++++--
>   PVE/QemuServer/USB.pm            | 17 ++++++++--
>   test/MigrationTest/Shared.pm     |  3 ++
>   test/run_config2command_tests.pl |  2 +-
>   8 files changed, 123 insertions(+), 54 deletions(-)
>
> pve-manager:
>
> Dominik Csapak (5):
>    mapping: pci: include mdev in config checks
>    bulk migrate: improve precondition checks
>    bulk migrate: include checks for live-migratable local resources
>    ui: adapt migration window to precondition api change
>    fix #5175: ui: allow configuring and live migration of mapped pci
>      resources
>
>   PVE/API2/Cluster/Mapping/PCI.pm   |  2 +-
>   PVE/API2/Nodes.pm                 | 27 ++++++++++++++--
>   www/manager6/dc/PCIMapView.js     |  6 ++++
>   www/manager6/window/Migrate.js    | 51 ++++++++++++++++++++-----------
>   www/manager6/window/PCIMapEdit.js | 12 ++++++++
>   5 files changed, 76 insertions(+), 22 deletions(-)
>
> pve-docs:
>
> Dominik Csapak (2):
>    qm: resource mapping: add description for `mdev` option
>    qm: resource mapping: document `live-migration-capable` setting
>
>   qm.adoc | 18 ++++++++++++++++++
>   1 file changed, 18 insertions(+)
>

Eneko Lacunza
Technical Director
Binovo IT Human Project

Tel. +34 943 569 206 | https://www.binovo.es
Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun

https://www.youtube.com/user/CANALBINOVO
https://www.linkedin.com/company/37269706/



