[pve-devel] [PATCH qemu-server v2] Catch qmp socket connections errors, so we can output a more specific error message
Fabian Grünbichler
f.gruenbichler at proxmox.com
Mon Jul 31 15:00:35 CEST 2017
On Thu, Jul 27, 2017 at 11:25:41AM +0200, Emmanuel Kasper wrote:
> It can happen that the qmp connection gets lost while mirroring a disk.
> In that case the current block job gets cancelled, but the real cause of the failure
> is lost, because we die() at a later step with the generic message
> die "$job: mirroring has been cancelled\n"
I am not quite sure I can follow, see below.
>
> example:
> ...
> drive-scsi0: transferred: 5524946944 bytes remaining: 918355968 bytes total: 6443302912 bytes progression: 85.75 % busy: 1 ready: 0
> drive-scsi0: Cancelling block job
> drive-scsi0: Done.
> 2017-07-26 15:39:56 ERROR: online migrate failure - mirroring error: drive-scsi0: mirroring has been cancelled
> 2017-07-26 15:39:56 aborting phase 2 - cleanup resources
> 2017-07-26 15:39:56 migrate_cancel
> ...
But this must come from the die in line 6054 (caught by the eval in
line 6030), not from a die in line 6036? That would mean that
query-block-jobs returned an empty array (or maybe undef?).
>
> after patch applied:
> 2017-07-27 09:43:37 ERROR: online migrate failure - mirroring error: lost connection to qemu machine protocol: VM 600 not running
> 2017-07-27 09:43:37 aborting phase 2 - cleanup resources
But this would mean vm_qmp_command (called by vm_mon_cmd) died in line
4798 because check_running returned false?
In that case I'd rather fix check_running returning false, because the
VM obviously IS running, isn't it? ;)
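To spell out how I read the "after patch" message, here is a minimal sketch
of the call chain. check_running_stub, vm_mon_cmd_stub and query_jobs are
made-up stand-ins for illustration, not the real PVE::QemuServer code; the
only assumption is that the running check fails and the inner die carries
the "VM 600 not running" text seen in the log above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for check_running(): always reports "not running"
# to reproduce the failure path discussed above.
sub check_running_stub {
    my ($vmid) = @_;
    return 0;
}

# Hypothetical stand-in for vm_mon_cmd()/vm_qmp_command(): dies before
# talking to the socket when the running check fails.
sub vm_mon_cmd_stub {
    my ($vmid, $cmd) = @_;
    die "VM $vmid not running\n" if !check_running_stub($vmid);
    return [];
}

# Same eval/die shape as the hunk in the patch.
sub query_jobs {
    my ($vmid) = @_;
    my $stats = eval { vm_mon_cmd_stub($vmid, "query-block-jobs"); };
    die "lost connection to qemu machine protocol socket: $@\n" if $@;
    return $stats;
}

my $res = eval { query_jobs(600) };
my $err = $@;
print $err;
```

In this sketch the prefix claims a lost socket connection although the
socket is never touched, so the message would point at the wrong cause if
check_running is what actually fails.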
> ---
> changes since v1:
> * declare and assign my $stats directly. No need to have three lines here
> when one is clear enough
> PVE/QemuServer.pm | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 1f34101..3086375 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -6033,7 +6033,8 @@ sub qemu_drive_mirror_monitor {
> while (1) {
> die "storage migration timed out\n" if $err_complete > 300;
>
> - my $stats = vm_mon_cmd($vmid, "query-block-jobs");
> + my $stats = eval { vm_mon_cmd($vmid, "query-block-jobs"); };
> + die "lost connection to qemu machine protocol socket: $@\n" if $@;
>
> my $running_mirror_jobs = {};
> foreach my $stat (@$stats) {
> --
> 2.11.0
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel