[pve-devel] [PATCH guest-common/qemu-server/manager/docs v4] implement experimental vgpu live migration
Dominik Csapak
d.csapak at proxmox.com
Thu Jun 6 11:21:55 CEST 2024
and some useful cleanups
This is implemented for mapped resources. It requires driver and
hardware support, but aside from nvidia vgpus there don't seem to be
many drivers (if any) that support it.
qemu already supports that for vfio-pci devices, so nothing to be
done there besides actively enabling it.
Since we currently can't properly test it here and it very much depends on
hardware/driver support, mark it as experimental everywhere (docs/api/gui).
(though i tested the live-migration part manually here by using
"exec:cat > /tmp/test" for the migration target, and "exec: cat
/tmp/test" as the 'incoming' parameter for a new vm start, which worked ;) )
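The manual test above can be sketched as follows. This is only an illustration of the two strings involved when using QEMU's `exec:` migration transport. The helper names are hypothetical and not part of qemu-server; the `/tmp/test` path is the one from the test above.

```python
import shlex
from typing import List


def outgoing_migrate_uri(state_file: str) -> str:
    """Migration target for the source VM: the stream is piped into a file."""
    return f"exec:cat > {shlex.quote(state_file)}"


def incoming_args(state_file: str) -> List[str]:
    """Extra QEMU arguments for the new VM: read the saved stream back in."""
    return ["-incoming", f"exec:cat {shlex.quote(state_file)}"]


if __name__ == "__main__":
    print(outgoing_migrate_uri("/tmp/test"))
    print(incoming_args("/tmp/test"))
```

The first string is what would be passed as the migration target on the running VM, the second is what a freshly started VM would get on its command line.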
i opted for marking them migratable at the mapping level, but we could
theoretically also put it in the hostpciX config instead.
(though imho it fits better in the cluster-wide resource mapping config)
also the naming/texts could probably be improved, but i think
'live-migration-capable' is very descriptive and i didn't want to
use an overly short name for it (which can be confusing, see the
'shared' flag for storages)
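To illustrate the mapping-level approach, here is a minimal sketch (in Python for brevity, the actual code is Perl) of how a per-mapping flag could drive both the migration precondition check and the vfio-pci device option. The data layout, mapping names and function names are assumptions made up for this example; only the 'live-migration-capable' flag and the 'enable-migration' device property come from this series.

```python
from typing import Dict, List

# Hypothetical cluster-wide mapping config: mapping name -> options.
MAPPINGS: Dict[str, Dict[str, bool]] = {
    "vgpu0": {"mdev": True, "live-migration-capable": True},
    "nic0": {"mdev": False},
}


def is_live_migratable(name: str, mappings: Dict[str, Dict[str, bool]] = MAPPINGS) -> bool:
    """A mapping counts as migratable only if the flag is explicitly set."""
    return bool(mappings.get(name, {}).get("live-migration-capable", False))


def blocking_devices(hostpci: Dict[str, str],
                     mappings: Dict[str, Dict[str, bool]] = MAPPINGS) -> List[str]:
    """Return the hostpciX keys whose mapping is not marked as migratable."""
    return sorted(k for k, name in hostpci.items()
                  if not is_live_migratable(name, mappings))


def vfio_device_opts(name: str, mappings: Dict[str, Dict[str, bool]] = MAPPINGS) -> str:
    """Build a (deliberately incomplete) vfio-pci option string for a mapping."""
    opts = "vfio-pci"
    if is_live_migratable(name, mappings):
        opts += ",enable-migration=on"
    return opts
```

With a VM config like `{"hostpci0": "vgpu0", "hostpci1": "nic0"}`, only `hostpci1` would block a live migration in this sketch, and only the device for `vgpu0` would get `enable-migration=on`.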
guest-common 6/6 is optional and breaks qemu-server versions without
qemu-server patches 1&2
guest-common 1-4; qemu-server 1-6; pve-manager 1,2
are mostly preparations/cleanups and could be applied independently
changes from v3:
* rebased on master
* split first guest-common patch into 3
* instead of merging keys, just write all expected keys into expected_props
* made $cfg optional so it does not break callers that don't pass it
* added patch to fix the cfg2cmd tests for mdev check
* added patch to show vfio state transferred for migration
* incorporated Fiona's feedback (mostly minor stuff)
for more details see the individual patches
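As a rough idea of what "write all expected keys into expected_props" means for the properties check, here is a minimal sketch (in Python for brevity, the actual code is Perl). The property names and values in the usage note and the error wording are assumptions for illustration, not the code from the patches.

```python
from typing import Dict


def check_expected_props(expected: Dict[str, str], configured: Dict[str, str]) -> None:
    """Compare every expected property against the configured mapping entry.

    All keys that should be checked are listed explicitly in `expected`,
    so no key can be skipped by accident.
    """
    for key, want in expected.items():
        if key not in configured:
            raise ValueError(f"missing expected property '{key}'")
        got = configured[key]
        if got != want:
            raise ValueError(
                f"'{key}' does not match: configured '{got}', expected '{want}'"
            )
```

For example, with hypothetical values, `check_expected_props({"id": "10de:2235", "mdev": "1"}, {"id": "10de:2235", "mdev": "0"})` would raise because the `mdev` value differs, while identical dictionaries pass silently.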
changes from v2:
* rebased on master
* rework the rework of the properties check (pve-guest-common 1/4)
* properly check mdev in the gui (pve-manager 1/5)
pve-guest-common:
Dominik Csapak (6):
mapping: pci: assert_valid: rename cfg to mapping
mapping: pci: assert_valid: reword error messages
mapping: pci: make sure all desired properties are checked
mapping: pci: check the mdev configuration on the device too
mapping: pci: add 'live-migration-capable' flag to mappings
mapping: remove find_on_current_node
src/PVE/Mapping/PCI.pm | 60 ++++++++++++++++++++++++------------------
src/PVE/Mapping/USB.pm | 10 -------
2 files changed, 34 insertions(+), 36 deletions(-)
qemu-server:
Dominik Csapak (12):
usb: mapping: move implementation of find_on_current_node here
pci: mapping: move implementation of find_on_current_node here
pci: mapping: check mdev config against hardware
stop cleanup: remove unnecessary tpmstate cleanup
vm_stop_cleanup: add noerr parameter
migrate: call vm_stop_cleanup after stopping in phase3_cleanup
pci: set 'enable-migration' to on for live-migration marked mapped devices
check_local_resources: add more info per mapped device and return as hash
api: enable live migration for marked mapped pci devices
api: include not mapped resources for running vms in migrate preconditions
tests: cfg2cmd: fix mdev tests
migration: show vfio state transferred too
PVE/API2/Qemu.pm | 55 ++++++++++++++++++++------------
PVE/CLI/qm.pm | 2 +-
PVE/QemuMigrate.pm | 44 +++++++++++++++++--------
PVE/QemuServer.pm | 38 +++++++++++-----------
PVE/QemuServer/PCI.pm | 14 ++++++--
PVE/QemuServer/USB.pm | 5 ++-
test/MigrationTest/Shared.pm | 3 ++
test/run_config2command_tests.pl | 2 +-
8 files changed, 104 insertions(+), 59 deletions(-)
pve-manager:
Dominik Csapak (5):
mapping: pci: include mdev in config checks
bulk migrate: improve precondition checks
bulk migrate: include checks for live-migratable local resources
ui: adapt migration window to precondition api change
fix #5175: ui: allow configuring and live migration of mapped pci resources
PVE/API2/Cluster/Mapping/PCI.pm | 2 +-
PVE/API2/Nodes.pm | 27 ++++++++++++++--
www/manager6/dc/PCIMapView.js | 6 ++++
www/manager6/window/Migrate.js | 51 ++++++++++++++++++++-----------
www/manager6/window/PCIMapEdit.js | 12 ++++++++
5 files changed, 76 insertions(+), 22 deletions(-)
pve-docs:
Dominik Csapak (2):
qm: resource mapping: add description for `mdev` option
qm: resource mapping: document `live-migration-capable` setting
qm.adoc | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
--
2.39.2