[pve-devel] [PATCH qemu-server 2/3] migration: conntrack: work around systemd issue where scope for VM might become blocked

Fiona Ebner f.ebner at proxmox.com
Mon Sep 29 14:24:47 CEST 2025


Because of a systemd issue [0], when a service that is 'PartOf' a scope
fails, the scope itself might be left over, even after all processes in
the scope have exited. In particular, this can happen for the
'$vmid.scope' when the 'pve-dbus-vmstate@$vmid.service' fails.

Doing a 'reset-failed' of the failed 'PartOf' service leads to the
left-over scope being cleaned up too. Without that, users in this
situation would get a hard-to-interpret "timeout waiting on systemd"
error message.

[0]: https://github.com/systemd/systemd/issues/39141

Signed-off-by: Fiona Ebner <f.ebner at proxmox.com>
---
 src/PVE/QemuServer.pm | 6 ++++++
 1 file changed, 6 insertions(+)
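
Note, not part of the commit: a minimal sketch for inspecting or replaying
the cleanup order by hand. The helper name 'vm_cleanup_cmds' is hypothetical
and only for illustration; it prints the three systemctl invocations in the
order vm_start_nolock issues them after this patch, with the 'PartOf'
service reset first.

```shell
# Hypothetical helper (an assumption for illustration, not part of qemu-server):
# print the cleanup commands for a given VMID, one per line.
vm_cleanup_cmds() {
    vmid="$1"
    # Reset the failed PartOf service first, see systemd GH #39141.
    printf '%s\n' "/bin/systemctl reset-failed pve-dbus-vmstate@${vmid}.service"
    # Then reset and stop the scope itself, as before this patch.
    printf '%s\n' "/bin/systemctl reset-failed ${vmid}.scope"
    printf '%s\n' "/bin/systemctl stop ${vmid}.scope"
}
```

Running 'vm_cleanup_cmds 100' only prints the commands. Piping the output
to 'sh' would execute them one after another without aborting on failure,
which roughly mirrors the error-ignoring 'eval' blocks in the Perl code.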

diff --git a/src/PVE/QemuServer.pm b/src/PVE/QemuServer.pm
index 7d5ab718..8e2f03dc 100644
--- a/src/PVE/QemuServer.pm
+++ b/src/PVE/QemuServer.pm
@@ -5802,6 +5802,12 @@ sub vm_start_nolock {
     }
 
     my %silence_std_outs = (outfunc => sub { }, errfunc => sub { });
+    eval { # See systemd GH #39141, need to reset failed PartOf units too, or scope might be blocked
+        run_command(
+            ['/bin/systemctl', 'reset-failed', "pve-dbus-vmstate\@$vmid.service"],
+            %silence_std_outs,
+        );
+    };
     eval { run_command(['/bin/systemctl', 'reset-failed', "$vmid.scope"], %silence_std_outs) };
     eval { run_command(['/bin/systemctl', 'stop', "$vmid.scope"], %silence_std_outs) };
     # Issues with the above 'stop' not being fully completed are extremely rare, a very low
-- 
2.47.3