[pve-devel] [PATCH v2 qemu-server] dbus-vmstate: fix method call on dbus object resolving to wrong instance
Fiona Ebner
f.ebner at proxmox.com
Wed Dec 10 13:19:11 CET 2025
As reported in the community forum [0] and then later by Thomas,
who provided the relevant system logs, parallel migration with
'--with-conntrack-state' of multiple VMs may currently lead to a
crash upon handover:
> kvm: Unknown savevm section or instance 'dbus-vmstate/dbus-vmstate' 0.
> Make sure that your current VM setup matches your saved VM setup,
> including any hotplugged devices
> kvm: load of migration failed: Invalid argument
In particular, the following sequence (on my test node)
pvesh create /nodes/pve9a1/qemu/104/dbus-vmstate --action start
pvesh create /nodes/pve9a1/qemu/105/dbus-vmstate --action start
pvesh create /nodes/pve9a1/qemu/105/dbus-vmstate --action stop
results in the wrong service being shut down (note the unexpected ID
in the last line!):
Dec 10 10:07:40 pve9a1 pvesh[30453]: starting dbus-vmstate helper for VM 104
Dec 10 10:07:40 pve9a1 systemd[1]: Starting pve-dbus-vmstate@104.service - PVE DBus VMState Helper (VM 104)...
Dec 10 10:07:41 pve9a1 dbus-vmstate[30456]: pve-vmstate-104 listening on :1.55
Dec 10 10:07:41 pve9a1 systemd[1]: Started pve-dbus-vmstate@104.service - PVE DBus VMState Helper (VM 104).
Dec 10 10:07:44 pve9a1 pvesh[30511]: starting dbus-vmstate helper for VM 105
Dec 10 10:07:44 pve9a1 systemd[1]: Starting pve-dbus-vmstate@105.service - PVE DBus VMState Helper (VM 105)...
Dec 10 10:07:45 pve9a1 dbus-vmstate[30573]: pve-vmstate-105 listening on :1.58
Dec 10 10:07:45 pve9a1 systemd[1]: Started pve-dbus-vmstate@105.service - PVE DBus VMState Helper (VM 105).
Dec 10 10:07:48 pve9a1 pvesh[30595]: stopping dbus-vmstate helper for VM 105
Dec 10 10:07:48 pve9a1 dbus-vmstate[30456]: shutting down gracefully ..
Dec 10 10:07:48 pve9a1 systemd[1]: pve-dbus-vmstate@104.service: Deactivated successfully.
So the dbus-vmstate object is removed from the wrong VM before loading
the migration state. Note that the crash is still racy: if the
dbus-vmstate object is removed on the source side for the same wrong VM
before the migration handover, the QEMU objects for both instances will
still match.
To fix the issue, introduce a dbus_call_method() helper similar to the
already existing dbus_get_property() one. Like this, the owner is
respected even if there are multiple (queued) owners on the DBus.
[0]: https://forum.proxmox.com/threads/176821/post-820775
Reported-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner at proxmox.com>
---
Changes in v2:
* Introduce a helper for calling dbus methods which respects the owner
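
For reviewers, a minimal sketch of why addressing the owner explicitly
matters. This is a toy model in plain Perl, not the Net::DBus API: the
ToyBus package, its methods and the name 'example.SharedName' are
hypothetical and only for illustration. It assumes that both helpers are
queued as owners of one shared bus name, with the unique names taken from
the log above. A message sent to the shared name lands at the primary
owner, while a message sent to the unique name lands at the intended
helper.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy model of a message bus, NOT the Net::DBus API. All names here are
# hypothetical and chosen for illustration only.
package ToyBus;

sub new { return bless { queue => {}, conns => {} }, shift }

# A connection requests a well-known name. Later requesters are queued
# behind the primary owner instead of replacing it.
sub request_name {
    my ($self, $name, $unique) = @_;
    push @{ $self->{queue}->{$name} }, $unique;
    $self->{conns}->{$unique} = 1;
}

# Resolve a destination to the connection that receives the message.
sub resolve {
    my ($self, $dest) = @_;
    # A unique name always resolves to exactly that connection.
    return $dest if $self->{conns}->{$dest};
    # A well-known name resolves to the first (primary) owner in the queue.
    my $owners = $self->{queue}->{$dest} or die "no owner for '$dest'\n";
    return $owners->[0];
}

package main;

my $bus = ToyBus->new();
$bus->request_name('example.SharedName', ':1.55');    # helper for VM 104
$bus->request_name('example.SharedName', ':1.58');    # helper for VM 105

# Addressing the shared name reaches the helper for VM 104 ...
my $by_name = $bus->resolve('example.SharedName');
# ... while addressing the unique name reaches the helper for VM 105.
my $by_owner = $bus->resolve(':1.58');

print "via well-known name: $by_name\n";
print "via unique name:     $by_owner\n";
```

In this model, setting the destination to the owner's unique name plays
the role of the set_destination() call in dbus_call_method().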
src/PVE/QemuServer/DBusVMState.pm | 27 ++++++++++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)
diff --git a/src/PVE/QemuServer/DBusVMState.pm b/src/PVE/QemuServer/DBusVMState.pm
index a72d6dd2..f1766035 100644
--- a/src/PVE/QemuServer/DBusVMState.pm
+++ b/src/PVE/QemuServer/DBusVMState.pm
@@ -39,6 +39,30 @@ my sub dbus_get_property {
return $reply[0];
}
+# Call a method for an object from a specific interface name.
+# In contrast to calling the method directly by using $obj->Method(), this
+# actually respects the owner of the object and thus can be used for interfaces
+# which might have multiple (queued) owners on the DBus.
+my sub dbus_call_method {
+ my ($obj, $interface, $method, $params, $timeout) = @_;
+
+ $timeout = 10 if !$timeout;
+
+ my $con = $obj->{service}->get_bus()->get_connection();
+
+ my $call = $con->make_method_call_message(
+ $obj->{service}->get_service_name(),
+ $obj->{object_path},
+ $interface,
+ $method,
+ );
+
+ $call->set_destination($obj->get_service()->get_owner_name());
+ $call->append_args_list($params->@*) if $params;
+
+ return $con->send_with_reply_and_block($call, $timeout * 1000)->get_args_list();
+}
+
# Starts the dbus-vmstate helper D-Bus service daemon and adds the needed
# object to the appropriate QEMU instance for the specified VM.
sub qemu_add_dbus_vmstate {
@@ -114,7 +138,8 @@ sub qemu_del_dbus_vmstate {
$num_entries = eval {
dbus_get_property($object, 'com.proxmox.VMStateHelper', 'NumMigratedEntries');
};
- eval { $object->Quit() };
+ # Quit() does QMP object-del which has a timeout of 60 seconds
+ eval { dbus_call_method($object, 'com.proxmox.VMStateHelper', 'Quit', [], 70); };
if (my $err = $@) {
syslog('warn', "failed to call quit on dbus-vmstate for VM $vmid: $err\n")
if !$params{quiet};
--
2.47.3