[pve-devel] applied: [PATCH qemu-server 2/2] qm nbdstop: cope graceful with errors

Fabian Grünbichler f.gruenbichler at proxmox.com
Thu Apr 30 08:45:16 CEST 2020


On April 29, 2020 4:24 pm, Thomas Lamprecht wrote:
> as the nbd server could have been stopped by something else.
> Further, it makes no sense to die and mark the migration thus as
> failed, just because of a NBD server stop issue.
> 
> At this point the migration hand off to the target was done already,
> so normally we're good, if it fails we have other (followup) problems
> anyway.

the second paragraph is not really true, nbd_stop is part of the 
hand off, not afterwards - the sequence is

- migration converges (source VM suspends)
- block mirror converges
- block mirror get's completed via cancel (NBD client connection closed)
- nbd stop on target (to release write blocker)
- resume on target (needs to obtain write blocker)

so if NBD was stopped earlier already it is very likely that something 
fishy is going one. we've passed the point of no return and can't go 
back to the source VM (especially not when we don't know in which error 
state we are ;)), so we might as well try to go ahead and attempt to 
resume, but I'd assume that the most likely reason for nbd_stop to fail 
is that the qemu process has crashed already (or a re-introduction of 
the bug from your first patch that we mistakenly assume an NBD server is 
running but haven't actually started one ;)).

so I guess dieing here would actually be the more safe choice (it's 
unlikely we'll add much stuff on the source side between the nbd_stop 
and the resume calls since it's THE critical section of the whole 
migraiton process, but if we do that might operate under the assumption 
of "every remote operation has worked or has died"). alternatively, move 
the eval to the other side to at least make it explicit that we ignore 
failure of nbd_stop and proceed anyway?

(this would all be nicer if we had a stateful mtunnel on the other end 
that would just know whether it started an NBD server or not :-P)

> 
> Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
> ---
>  PVE/CLI/qm.pm | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm
> index f99d401..282fa86 100755
> --- a/PVE/CLI/qm.pm
> +++ b/PVE/CLI/qm.pm
> @@ -274,7 +274,8 @@ __PACKAGE__->register_method ({
>  
>  	my $vmid = $param->{vmid};
>  
> -	PVE::QemuServer::nbd_stop($vmid);
> +	eval { PVE::QemuServer::nbd_stop($vmid) };
> +	warn $@ if $@;
>  
>  	return undef;
>      }});
> -- 
> 2.20.1
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 




More information about the pve-devel mailing list