[pve-devel] Bug 1458 - PVE 5 live migration downtime degraded to several seconds (compared to PVE 4)
Alexandre DERUMIER
aderumier at odiso.com
Thu Jul 27 16:30:37 CEST 2017
Looking at the user's migration log:
Jul 24 18:12:37 start migrate command to unix:/run/qemu-server/100.migrate
Jul 24 18:12:39 migration speed: 256.00 MB/s - downtime 39 ms
It seems the VM has very little memory, since the migration takes only 2 seconds from start to end.
So maybe the usleep lowering is not kicking in here.
----- Original message -----
From: "aderumier" <aderumier at odiso.com>
To: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Thursday, 27 July 2017 16:08:35
Subject: Re: [pve-devel] Bug 1458 - PVE 5 live migration downtime degraded to several seconds (compared to PVE 4)
Thanks for the explanation, Fabian. (I always use insecure migration, so I didn't notice this bug.)
>>when live-migrating over a unix socket, PVE 5 takes up to a few seconds
>>between completing the RAM transfer and pausing the source VM, and
>>resuming the target VM. in PVE 4, the same migration has a downtime of
>>almost 0.
A few seconds seems huge... (the user talks about 4 s).
>>AFAICT, the reason for this is a bug fix in PVE 5's qemu-server which
>>was required to support storage live migration in Qemu 2.9.
Any commit reference?
>>originally in PVE 4, the target VM in a live migration was started in
>>incoming migration mode and NOT continued on startup (whereas VMs rolled
>>back to a RAM snapshot were started in the same mode, but immediately
>>continued).
>>in June 2016[3], migration over ssh-forwarded unix sockets was
>>implemented. the check for skipping the continue command on startup of
>>the target VM was overlooked, so now VMs migrated over unix sockets were
>>started in incoming migration mode, but continued on startup.
But this seems to be a bug that was fixed later, here?
https://git.proxmox.com/?p=qemu-server.git;a=commit;h=b37ecfe6ae7f7b557db7712ee6988cb0397306e9
>>I wonder whether going the "immediately cont" route for live migrations
>>without local storage can cause any issues besides the obvious "moving
>>the conf file failed and VM is now active on the wrong node" one?
I don't know whether it would be worth having some kind of temporary conf file on every node where a kvm process is running.
(That way we could see the VM on the source host with state running, and the VM on the target host with state migrating, for example.)
Then, if something bad happens at the end of the migration, the user could still stop the target kvm process from the GUI.
But maybe it's too complex to implement, I don't know...
>>if not, I propose doing just that. otherwise, we could think about lowering
>>the polling interval when waiting for RAM migration to complete (in
>>phase2) - that should shave off a bit of the downtime as well.
I wonder where exactly it takes so much time...
$downtime seems to be low, but since it comes from the migration status, maybe we are missing some query migrate calls.
Also, I think we already try to lower usleep at the end:
# reduce sleep if remaining memory is lower than the average transfer
$usleep = 300000 if $avglstat && $rem < $avglstat;
Maybe this doesn't work correctly?
I think a proper way would be to catch qemu events instead of polling the status (but that may require a lot of work).
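To make the concern about the quoted line concrete, here is a minimal Python sketch of the same condition. It is not the qemu-server Perl code: the function name and the 1-second default interval are assumptions for illustration, and only the 300000 µs value comes from the quoted line. It shows one way the reduction might never trigger: if the average transfer is still 0 (for example on a very small VM that finishes within the first polls), the condition is false and the longer sleep is kept.

```python
# ASSUMPTION: default poll interval; this value is not stated in the thread.
DEFAULT_SLEEP_US = 1_000_000
# Reduced interval taken from the quoted line ($usleep = 300000 ...).
FAST_SLEEP_US = 300_000


def next_sleep_us(rem, avglstat):
    """Python rendering of: $usleep = 300000 if $avglstat && $rem < $avglstat;

    rem      -- remaining memory to transfer
    avglstat -- average amount transferred per poll (0 if not yet known)
    """
    if avglstat and rem < avglstat:
        return FAST_SLEEP_US
    return DEFAULT_SLEEP_US


# No average yet: the short interval is never selected.
print(next_sleep_us(rem=50, avglstat=0))
# Average known and remaining memory below it: short interval.
print(next_sleep_us(rem=50, avglstat=500))
```

Whatever interval is selected also bounds how late the source side can notice that the migration has completed, which is why the interval matters for downtime when the target is not continued automatically.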
----- Original message -----
From: "Fabian Grünbichler" <f.gruenbichler at proxmox.com>
To: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Thursday, 27 July 2017 14:45:43
Subject: [pve-devel] Bug 1458 - PVE 5 live migration downtime degraded to several seconds (compared to PVE 4)
the following issue was reported on the forum[1] and as bug #1458[2],
moving this here for further discussion of potential fixes.
when live-migrating over a unix socket, PVE 5 takes up to a few seconds
between completing the RAM transfer and pausing the source VM, and
resuming the target VM. in PVE 4, the same migration has a downtime of
almost 0.
AFAICT, the reason for this is a bug fix in PVE 5's qemu-server which
was required to support storage live migration in Qemu 2.9.
originally in PVE 4, the target VM in a live migration was started in
incoming migration mode and NOT continued on startup (whereas VMs rolled
back to a RAM snapshot were started in the same mode, but immediately
continued).
in June 2016[3], migration over ssh-forwarded unix sockets was
implemented. the check for skipping the continue command on startup of
the target VM was overlooked, so now VMs migrated over unix sockets were
started in incoming migration mode, but continued on startup. this does
not change the behaviour on startup, as a VM in incoming migration mode
is not actually running until a migration has happened. this does mean
that the downtime is vastly reduced for such migrations, as Qemu will
continue the target VM automatically as soon as the migration job is
completed.
the only things that happen after this automatic resume are
- finish tunnel
- moving the conf file logically between nodes
- resuming on the target side (which is a no-op in this case)
so the risk for inconsistencies seems pretty small.
later on, we introduced live-storage migration. in those cases, we now
have the following scenario:
- start storage migration jobs
- start RAM migration
- wait for RAM to be completed
- finish tunnel
- finish block jobs
- update conf file
- move the conf file logically between the nodes
- resume on target node
so depending on whether the migration goes over tcp (OK) or over unix
(not so much) we have very different behaviour and risk for
inconsistencies.
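The difference between the two behaviours can be sketched in a few lines of Python. This is only a model of the step list above, not qemu-server code; the function name and the `auto_resume` flag are hypothetical. It lists which steps still run while the target VM is already executing.

```python
# Step order for a live migration with local storage, as listed above.
STEPS = [
    "start storage migration jobs",
    "start RAM migration",
    "wait for RAM to be completed",
    "finish tunnel",
    "finish block jobs",
    "update conf file",
    "move the conf file logically between the nodes",
    "resume on target node",
]


def steps_while_target_runs(auto_resume):
    """Return the steps executed after the target VM has started running.

    auto_resume=True  -- target was continued on startup, so Qemu resumes it
                         as soon as the RAM migration completes
    auto_resume=False -- target only runs after the explicit resume step
    """
    if auto_resume:
        first_running = STEPS.index("wait for RAM to be completed") + 1
    else:
        first_running = STEPS.index("resume on target node") + 1
    return STEPS[first_running:]


for mode in (True, False):
    print(mode, steps_while_target_runs(mode))
```

With automatic resume, the block jobs and both conf file steps run while the target is already active (the final resume is then a no-op); with manual resume, nothing is left to do once the target runs.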
with the introduction of PVE 5, this different behaviour was fixed /
made consistent, by adopting the "manual resume" stance. this was needed
because Qemu 2.9 does not allow the storage migration over NBD and the
target VM itself to have write access to the same disks at the same
time. this fix was not backported to PVE 4, which means that storage
live-migration is potentially buggy there, but live-migration over unix
sockets is faster.
I wonder whether going the "immediately cont" route for live migrations
without local storage can cause any issues besides the obvious "moving
the conf file failed and VM is now active on the wrong node" one? if
not, I propose doing just that. otherwise, we could think about lowering
the polling interval when waiting for RAM migration to complete (in
phase2) - that should shave off a bit of the downtime as well.
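The first proposal amounts to a small decision rule, sketched here in Python. All names are hypothetical and nothing here reflects how qemu-server actually structures this; it only restates the proposal: continue the target immediately unless local storage is being migrated.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MigrationPlan:
    """Hypothetical description of a live migration (illustration only)."""
    local_disks: List[str] = field(default_factory=list)


def target_resume_mode(plan):
    """Pick how the target VM is resumed under the proposal.

    - local disks present: keep the manual resume, since the storage
      migration over NBD and the target VM must not both have write
      access to the same disks
    - no local disks: continue immediately to keep downtime low
    """
    if plan.local_disks:
        return "manual-resume"
    return "immediate-cont"


print(target_resume_mode(MigrationPlan()))
print(target_resume_mode(MigrationPlan(local_disks=["local-lvm:vm-100-disk-1"])))
```

The remaining risk in the immediate case is the one named above: if moving the conf file fails, the VM is active on a node that does not own its configuration.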
in any case, I think we need to backport the fix that enforces manual
resume for live migration with local storage to PVE 4.
1: https://forum.proxmox.com/threads/pve-5-live-migration-downtime-degradation-2-4-sec.35890
2: https://bugzilla.proxmox.com/show_bug.cgi?id=1458
3: 1c9d54bfd05e0d017a6e2ac5524d75466b1a4455
_______________________________________________
pve-devel mailing list
pve-devel at pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel