[pve-devel] Bug 1458 - PVE 5 live migration downtime degraded to several seconds (compared to PVE 4)

Alexandre DERUMIER aderumier at odiso.com
Fri Jul 28 13:22:31 CEST 2017


pvesr through ssh
-----------------
root@kvmtest1 ~ # time /usr/bin/ssh -o 'BatchMode=yes' -o 'HostKeyAlias=kvmtest2' root@10.3.94.47 pvesr set-state 244 \''{}'\'

real	0m1.399s


locally
--------
root@kvmtest2:~# time pvesr set-state 244 {}
real	0m1.137s


So roughly 260 ms for ssh (1.399 s - 1.137 s), and 1.137 s for pvesr itself.

(I think we could simply skip the call if the state is empty, but reusing the ssh connection could help a little too.)
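
A minimal shell sketch of the "skip the call if the state is empty" idea. The real migration code is Perl, so this only illustrates the check; the function names state_is_empty and set_state_if_needed are hypothetical, and only the `pvesr set-state <vmid> <state>` invocation is taken from the timings above.

```shell
# Return success when the replication state JSON is an empty object.
# Whitespace is ignored, so "{}" and "{ }" both count as empty.
state_is_empty() {
    [ "$(printf '%s' "$1" | tr -d '[:space:]')" = "{}" ]
}

# Hypothetical wrapper: only start pvesr when there is state to transfer.
# $1: VM ID, $2: state as JSON
set_state_if_needed() {
    if state_is_empty "$2"; then
        echo "skip"
    else
        pvesr set-state "$1" "$2"
    fi
}

set_state_if_needed 244 '{}'   # prints "skip"
```

With an empty state this avoids starting pvesr at all, so neither the ssh round trip nor the pvesr start-up time is spent while the VM is paused.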


Also, a bare

# time pvesr
real	0m1.098s

takes about as long (same for qm or other commands).




>>that does not make sense - are you sure you haven't removed anything 
>>else? qemu does not know or care about pvesr, so why should it resume 
>>automatically? 

No, it does not resume automatically. This is the log of an external script that calls qmp status in a loop
to see how long the VM is really paused.
Removing the pvesr call in phase3 reduces the pause time (between the end of phase2 and qm resume).
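
For reference, the pause length can be read off the status log quoted further down by subtracting the first "paused" timestamp from the first "running" one. A small sketch, assuming both timestamps fall on the same day and use the HH:MM:SS,mmm format of that log; the helper names to_ms and pause_ms are made up for this example.

```shell
# Convert "HH:MM:SS,mmm" (the timestamp format of the status log) to milliseconds.
to_ms() {
    echo "$1" | awk -F'[:,]' '{ print (($1 * 60 + $2) * 60 + $3) * 1000 + $4 }'
}

# Print the elapsed milliseconds between two timestamps on the same day.
# $1: first "paused" timestamp, $2: first "running" timestamp
pause_ms() {
    echo $(( $(to_ms "$2") - $(to_ms "$1") ))
}

pause_ms 10:24:45,184 10:24:45,208   # run without the pvesr call, prints 24
```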





----- Original Message -----
From: "Fabian Grünbichler" <f.gruenbichler at proxmox.com>
To: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Friday, 28 July 2017 12:33:19
Subject: Re: [pve-devel] Bug 1458 - PVE 5 live migration downtime degraded to several seconds (compared to PVE 4)

On Fri, Jul 28, 2017 at 11:21:29AM +0200, Alexandre DERUMIER wrote: 
> >>I wonder whether reusing (/extending) the existing SSH tunnel for the 
> >>commands we run on the target node might reduce the overhead as well? 
> >>for cleanup in error cases opening a new connection is probably still 
> >>advisable. 
> 
> Yes, maybe. I don't know whether the time goes into forking the qm process, establishing the ssh tunnel, or getting the response. I'll try to add a timer for this. 

establishing an SSH connection takes about 1s here, so that would be 2s 
for both commands over SSH 

qm resume takes ~0.3s, so less than that until the VM is active again 

> 
> Another idea: why not use an HTTPS API call through pveproxy directly? 

that would take a while as well, reusing the already open SSH connection 
would be faster for sure. 
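
One generic way to try connection reuse from the command line is OpenSSH connection multiplexing (ControlMaster/ControlPath/ControlPersist). This is only a sketch for measuring the effect, not how the migration tunnel is implemented; the helper names mux_opts and run_on_target and the control socket location are assumptions.

```shell
# Build ssh options that enable OpenSSH connection multiplexing.
# $1: control socket path (hypothetical location, any private directory works)
mux_opts() {
    printf '%s ' -o BatchMode=yes -o ControlMaster=auto -o "ControlPath=$1" -o ControlPersist=30
}

# Run a command on the target through the shared connection.
# $1: target such as root@10.3.94.47, remaining arguments: remote command
run_on_target() {
    target=$1; shift
    # Word splitting of the option list is intended here.
    ssh $(mux_opts "$HOME/.ssh/mux-%r@%h-%p") "$target" "$@"
}

# Example (not executed here):
#   time run_on_target root@10.3.94.47 true                         # opens the master connection
#   time run_on_target root@10.3.94.47 pvesr set-state 244 "'{}'"   # reuses it
```

The first call pays for connection setup; calls made within the ControlPersist window go through the existing master connection, so comparing the two timings shows how much of the delay is ssh setup.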

> 
> I have verified with qmp status, 
> 
> without the pvesr call, around 20ms 
> 
> 2017-07-28 10:24:45,184 -- VM status: paused (inmigrate) 
> 2017-07-28 10:24:45,208 -- VM status: running 
> 
> 
> with the pvesr call, around 4s 
> 
> 2017-07-28 10:38:28,711 -- VM status: paused (inmigrate) 
> 2017-07-28 10:38:28,745 -- VM status: paused 
> 2017-07-28 10:38:28,799 -- VM status: paused 
> 2017-07-28 10:38:28,818 -- VM status: paused 
> 2017-07-28 10:38:28,837 -- VM status: paused 
> .... 
> 2017-07-28 10:38:33,912 -- VM status: running 

that does not make sense - are you sure you haven't removed anything 
else? qemu does not know or care about pvesr, so why should it resume 
automatically? 




