[pve-devel] [PATCH qemu-server] drive mirror: prevent wrongly logging success when completion fails differently

Fiona Ebner f.ebner at proxmox.com
Tue Jul 23 14:07:59 CEST 2024


Currently, when completing a drive mirror job, only errors matching
"cannot be completed" are handled. Any other error is ignored, and a
misleading message claiming that the job completed successfully is
printed to the log. An instance of this popped up in the community
forum [0].

The QMP command used for completing the job is either
'block-job-complete' or 'block-job-cancel'. The former causes the VM
to switch to the target drive, the latter doesn't. For example,
migration uses the latter so that the source instance is not switched
over to the target drive. The 'block-job-cancel' command doesn't even
have the same "cannot be completed" message, but returns immediately.

The timeout for both 'block-job-cancel' and 'block-job-complete' is
set to 10 minutes in the QMPClient module, which should be enough.

[0]: https://forum.proxmox.com/threads/151518/

Signed-off-by: Fiona Ebner <f.ebner at proxmox.com>
---
 PVE/QemuServer.pm | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
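
For reviewers, the three-way error handling described in the commit
message can be sketched as a standalone script. This is only an
illustration under assumptions: classify_completion_error() and
fake_mon_cmd() are hypothetical helpers that do not exist in
qemu-server, and the sample error strings are made up rather than
taken from a real QMP response.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of qemu-server): map the error string left in
# $@ by an eval'd completion command to one of three outcomes.
sub classify_completion_error {
    my ($err) = @_;

    return 'success' if !$err; # the eval did not die
    return 'retry' if $err =~ m/cannot be completed/; # job not ready yet
    return 'fatal'; # anything else must not be reported as success
}

# Hypothetical stand-in for mon_cmd(): dies with the given message, or
# returns normally if no message is passed.
sub fake_mon_cmd {
    my ($message) = @_;
    die "$message\n" if defined($message);
    return;
}

# Made-up sample errors, for illustration only.
for my $message (undef, "block job cannot be completed", "got timeout") {
    eval { fake_mon_cmd($message) };
    my $err = $@; # copy immediately, a later eval would overwrite $@
    printf "%s => %s\n", $message // '(no error)', classify_completion_error($err);
}
```

Before this patch, the third case would have fallen through to the
success branch, because only the "cannot be completed" pattern was
checked.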

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index bf59b091..beabb6df 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -8112,10 +8112,13 @@ sub qemu_drive_mirror_monitor {
 			    die "invalid completion value: $completion\n";
 			}
 			eval { mon_cmd($vmid, $op, device => $job_id) };
-			if ($@ =~ m/cannot be completed/) {
+			my $err = $@;
+			if ($err && $err =~ m/cannot be completed/) {
 			    print "$job_id: block job cannot be completed, trying again.\n";
 			    $err_complete++;
-			}else {
+			} elsif ($err) {
+			    die "$job_id: block job cannot be completed - $err\n";
+			} else {
 			    print "$job_id: Completed successfully.\n";
 			    $jobs->{$job_id}->{complete} = 1;
 			}
-- 
2.39.2




