[pve-devel] [PATCH qemu-server] migrate: keep VM paused after migration if it was before
Fabian Ebner
f.ebner at proxmox.com
Thu Apr 21 09:44:42 CEST 2022
On 20.04.22 at 14:43, Fabian Grünbichler wrote:
> On March 18, 2022 8:51 am, Fabian Ebner wrote:
>> Also cannot issue a guest agent command in that case.
>>
>> Reported in the community forum:
>> https://forum.proxmox.com/threads/106618
>>
>> Signed-off-by: Fabian Ebner <f.ebner at proxmox.com>
>> ---
>>
>> Best viewed with -w.
>>
>> PVE/QemuMigrate.pm | 54 ++++++++++++++++++++++++++--------------------
>> 1 file changed, 31 insertions(+), 23 deletions(-)
>
> patch looks good to me - it might make sense to restructure the
> conditionals a bit to log that resuming/fstrim was skipped though to
> reduce confusion (user that paused VM and user doing the migration might
> not be the same entity after all)?
>
> one other thing I noticed (pre-existing, but the changes here made me
> look and my search came up short), inside phase2:
>
> - start block job(s) without autocompletion and wait for them to
> converge
> - start RAM/state migration without autocompletion and wait for it to
> converge
> X both source and target VMs are paused now with "identical" state,
> irrespective of the source being paused or not initially
> - cancel block job(s) (to close NBD writer(s) so that switchover can
> proceed in phase3_cleanup)
>
> if something happens after X in phase2, we enter phase2_cleanup, and
> attempt to cancel the migration
If migrate_cancel actually cancels the migration, the VM will be running
on the source node again :)
If migrate_cancel fails, resume might also fail?
There is an edge case however:
If the migration actually finished, but we aborted anyway, e.g. because
of too many query-migrate failures, then migrate_cancel will succeed
(because there is no active migration) and the VM will be left in
post-migrate state on the source node. Here, a resume would help.
> , remove the lock, cancel the block jobs
> again, clean up bitmaps, stop the target VM, clean up remote disks, tear
> down the tunnel, and effectively exit the migration at that point BUT -
> we don't handle the paused state? is there a resume source (with this
> patch, guarded by source was not paused) missing or am I missing
> something?
>
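
To make the cases above concrete, here is a rough sketch of the decision
phase2_cleanup would have to make after migrate_cancel. It is only a model
in Python, not the Perl code from PVE/QemuMigrate.pm: the function name and
both parameters are made up for illustration, and it assumes the source
status is the run state string reported by QEMU ("running", "paused",
"postmigrate", ...).

```python
def should_resume_source(source_status: str, was_paused_before: bool) -> bool:
    """Decide whether cleanup should try to resume the source VM.

    Hypothetical helper: source_status is assumed to be the run state
    reported by the source QEMU after migrate_cancel has been attempted.
    """
    # With the patch, a VM that was paused before migration stays paused.
    if was_paused_before:
        return False
    # If migrate_cancel really cancelled an active migration, the source is
    # running again and there is nothing to do. Only the edge case where the
    # migration had already finished leaves the source in post-migrate state.
    return source_status == "postmigrate"


# The three situations discussed above:
print(should_resume_source("running", False))      # cancel worked
print(should_resume_source("postmigrate", False))  # migration had finished
print(should_resume_source("postmigrate", True))   # VM was paused by the user
```

The case where migrate_cancel itself fails is not modelled here, since a
resume might fail in the same way.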