[pve-devel] [PATCH v2 qemu-server++ 0/15] remote migration

Fabian Grünbichler f.gruenbichler at proxmox.com
Thu Dec 2 16:36:36 CET 2021

On November 30, 2021 3:06 pm, Fabian Ebner wrote:
> Am 11.11.21 um 15:07 schrieb Fabian Grünbichler:
>> this series adds remote migration for VMs.
>> both live and offline migration including NBD and storage-migrated disks
>> should work.
> Played around with it for a while. Biggest issue is that migration fails 
> if there is no 'meta' property in the config. Most other things I wish 
> for are better error handling, but it seems to be in good shape otherwise!

thanks for testing and reviewing!

> Error "storage does not exist" if the real issue is missing access 
> rights. But that error also appears if missing access for 
> /cluster/resources or if the target node does not exists.

hmm, yeah, those are from the "abort early" checks that I'd like to punt 
to some pre-condition API endpoint anyway, or to the client itself.. but 
"missing access" and "doesn't exist" can be rather similar from that 
perspective, so probably a more generic error message is best :)

> For the 'config' command, 'Sys.Modify' seems to be required
>      failed to handle 'config' command - 403 Permission check failed (/, 
> Sys.Modify)

did you maybe have 'startup' set in the VM config? that requires 
Sys.Modify on / to change, and we pass the full config (well, most of 
it) to update_vm_api.. I don't think there is an intentional check 
otherwise for that privilege, so it's likely from something in the 

> but it does create an empty configuration file, leading to
>      target_vmid: Guest with ID '5678' already exists on remote cluster
> on the next attempt.
> It also already allocates the disks, but doesn't clean them up, because 
> it gets the wrong lock (since the config is empty) and aborts the 'quit' 
> command.

yeah, that is related:
- we remove the lock prio to calling update_vm_api
- if updating fails, the config is therefor unlocked (and empty, since 
  it only contains the lock prior to this command)
- mtunnel still expects the create lock to be there and errors out

the fix is to not remove the lock, but set skiplock for the 
update_vm_api call (this is in an flocked context with the config lock 
checked after entering the locked scope, so it is as safe as can be).

that way if update_vm_api fails, the config still has the 'create' lock, 
and cleanup triggered by the client works.

> If the config is not recent enough to have a 'meta' property:
>      failed to handle 'config' command - unable to parse value of 'meta' 
> - got undefined value
> Same issue with disk+config cleanup as above.

yeah, that should be skipped if undef/false. the missing cleanup is 
fixed by the changes above.

> The local VM stayes locked with 'migrate'. Is that how it should be?

unless you pass `--delete`, yes. the idea was that this prevents 
starting the VM on source and target cluster by accident, but using a 
different lock value might be more appropriate to signify 'this guest 
moved to another cluster'.

> Also the __migration__ snapshot will stay around, resulting in an error 
> when trying to migrate again.

I'll take a look, that shouldn't happen and probably indicates some 
missing cleanup somewhere ;)

> For live migration I always got a (cosmetic?) "WS closed 
> unexpectedly"-error:
> tunnel: -> sending command "quit" to remote
> tunnel: <- got reply
> tunnel: Tunnel to 
> ticket=PVETUNNEL%3A<SNIP>&socket=%2Frun%2Fqemu-server%2F5678.mtunnel 
> failed - WS closed unexpectedly
> 2021-11-30 13:49:39 migration finished successfully (duration 00:01:02)
> UPID:pve701:0000D8AD:000CB782:61A61DA5:qmigrate:111:root at pam:

yeah, this is cosmetic. we send the quit, the remote end closes the 
socket, the tunnel doesn't know that the websocket closing is expected 
and prints that error. I'll think about whether we can handle this 
somehow with the existing commands, or we need to do a proper close 
command from the client side as well (I dropped the start of something 
like this in v2 ;))

-> quit
<- done (server expects WS to close)
-> close (ctrl cmd)
-> close WS


-> close (ctrl cmd, sets flag to expect WS to be closed)
-> quit
<- done
<- close WS

I'd like to avoid having the tunnel parse the quit command and having 
that serve a double-purpose, to keep the tunnel re-usable and agnostic.

> Fun fact: the identity storage mapping will be used for storages that 
> don't appear in the explicit mapping. E.g. it's possible to migrate a VM 
> that only has disks on storeA with --target-storage storeB:storeB (if 
> storeA exists on the target of course). But the explicit identity 
> mapping is prohibited.

thanks, that should be caught as well.

> When a target bridge is not present (should that be detected ahead of 
> starting the migration?) and likely for any other startup failure the 
> only error in the log is:

yeah, the targetbridge should probably be part of the initial checks 
like the storages (and then be re-checked anyway for the actual start 
command, but at least for non-misbehaving clients they get an early 

> 2021-11-30 14:43:10 ERROR: online migrate failure - error - tunnel 
> command '{"cmd":"star<SNIP>
> failed to handle 'start' command - start failed: QEMU exited with code 1
> For non-remote migration we are more verbose in this case and log the 
> QEMU output.

yeah, that should be easily improvable. the query-disk-import could 
similarly be extended with an optional 'msg' that we can use to print 
progress if there is any..

> Can/should an interrupt be handled more gracefully, so that remote 
> cleanup still happens?
> ^CCMD websocket tunnel died: command 'proxmox-websocket-tunnel' failed: 
> interrupted by signal
> 2021-11-30 14:39:07 ERROR: interrupted by signal
> 2021-11-30 14:39:07 aborting phase 1 - cleanup resources
> 2021-11-30 14:39:08 ERROR: writing to tunnel failed: broken pipe
> 2021-11-30 14:39:08 ERROR: migration aborted (duration 00:00:10): 
> interrupted by signal

yeah, we should probably leave it up to the regular error-handling and 
cleanup logic to cleanly close the tunnel, instead of it being directly 
terminated by the signal..

More information about the pve-devel mailing list