[pve-devel] HA: vm shutdown/stop behaviour and other HA questions

Wed Mar 6 16:33:06 CET 2019

> Is it possible to implement a true "vm stop" without shutdown?

>>it could be, but do you need this often? We assumed that HA VM/CTs
>>are more often wanted to be shut down gracefully.

It's more for cases such as a kernel panic, a filesystem hang, or an unresponsive VM, where the shutdown can take several minutes.

> Is it possible to reduce this sleep for manual actions? (Not sure if it's related to the watchdog?)

>>theoretically yes, but we then probably want to do nothing if there's no change
>>(e.g., no new CRM command) as some users already complained about writing out
>>the LRM status every 10 seconds (they thought it was bad for the pmxcfs DB backing
>>storage, but IMHO this really shouldn't do much for current gen hardware which is
>>able to write >100GB per day and still achieve lifespans of >5 years.)

Wouldn't it be possible to keep the default 10s and, when a manual action is done,
talk to the CRM/LRM (socket, API, ...) to execute these commands quickly?

(I'm just learning the code, so I really don't know if that would be possible.)
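
To make the idea concrete, here is a minimal sketch of a loop that keeps a 10s default interval but can be woken early. It is plain Python, not the actual CRM/LRM code, and every name in it (PollLoop, notify, the in-process Event standing in for a socket/API call) is hypothetical.

```python
import threading
import time


class PollLoop:
    """Hypothetical worker loop: one work cycle per `interval` seconds,
    unless notify() wakes it up earlier."""

    def __init__(self, interval=10.0):
        self.interval = interval
        self._wake = threading.Event()
        self._stop = threading.Event()
        self.runs = []  # monotonic timestamps, one per work cycle

    def notify(self):
        # Stand-in for "a manual action was queued" (socket, API, ...).
        self._wake.set()

    def stop(self):
        self._stop.set()
        self._wake.set()  # make sure a sleeping loop exits promptly

    def run(self):
        while not self._stop.is_set():
            # Placeholder for the real work of one cycle.
            self.runs.append(time.monotonic())
            # Sleep up to `interval`, but return early if notified.
            self._wake.wait(timeout=self.interval)
            self._wake.clear()
```

Running run() in a thread and calling notify() from the code path that handles a manual command would start the next cycle right away, while an idle loop still only wakes every 10 seconds.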

----- Original Message -----
From: "Thomas Lamprecht" <t.lamprecht at proxmox.com>
To: "pve-devel" <pve-devel at pve.proxmox.com>, "aderumier" <aderumier at odiso.com>
Sent: Wednesday, 6 March 2019 08:21:06
Subject: Re: [pve-devel] HA: vm shutdown/stop behaviour and other HA questions

Hi! 

On 3/6/19 7:59 AM, Alexandre DERUMIER wrote: 
> Hi, 
> 
> I'm finally going to use HA on my cluster once Proxmox 6.0 is released (waiting for corosync 3.X).

great. 

> 
> And I have noticed that shutdown and stop on a VM both call "HA stop", which calls "vm shutdown" and then stops HA.
> 
> 
> Is it possible to implement a true "vm stop" without shutdown?

it could be, but do you need this often? We assumed that HA VM/CTs
are more often wanted to be shut down gracefully.

> 
> Also, I have noticed that when we start/stop/migrate a VM manually, it can take 10-20 seconds between the HA action
> and the real VM action. (It seems to come from the 10s sleep in the CRM + LRM between each loop.)

yes, exactly, you have worst-case 10 seconds until the current Master (CRM) picks 
the migrate/relocate command up and after that, worst-case additional 10 seconds 
until the LRM sees the new state, coming in at ~20 seconds in the double worst-case. 
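
The arithmetic behind the ~20 seconds can be sketched as follows. This is a simplified model, not the real scheduler: it assumes both loops tick at a fixed 10s period, ignores processing time, and the function names and phase parameters are made up for illustration. Times are integer milliseconds to avoid rounding noise.

```python
PERIOD_MS = 10_000  # the 10 s sleep between loop iterations


def next_tick(t, phase, period=PERIOD_MS):
    """First tick at or after time t for a loop ticking at phase + k*period."""
    k = -((phase - t) // period)  # integer ceil((t - phase) / period)
    return phase + k * period


def action_delay(t_cmd, crm_phase=0, lrm_phase=0, period=PERIOD_MS):
    """Delay (ms) between queuing a command and the LRM seeing the new state."""
    crm_pick = next_tick(t_cmd, crm_phase, period)     # CRM picks the command up
    lrm_pick = next_tick(crm_pick, lrm_phase, period)  # LRM sees the new state
    return lrm_pick - t_cmd


# Command arrives just after a CRM tick, and the LRM ticked just before
# the CRM wrote the new state: close to the double worst case.
print(action_delay(1, crm_phase=0, lrm_phase=9_999))  # 19998
```

In this model the delay is always below 2 x 10 s; how close a real command gets to that bound depends on where it lands relative to the two loops.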

> Is it possible to reduce this sleep for manual actions? (Not sure if it's related to the watchdog?)

theoretically yes, but we then probably want to do nothing if there's no change
(e.g., no new CRM command) as some users already complained about writing out
the LRM status every 10 seconds (they thought it was bad for the pmxcfs DB backing
storage, but IMHO this really shouldn't do much for current gen hardware which is
able to write >100GB per day and still achieve lifespans of >5 years.)

> 
> 
> 
> In the future, I would like to add some kind of balancing/scheduling of VMs (memory/CPU balancing);
> I think this is the right place to do it?
> 
> I have looked at /usr/share/perl5/PVE/HA/Manager.pm, sub select_service_node;
> it seems pretty basic for now (try nodes by priority and by node ID, try_next if it fails).
> I think there is a lot of room for improvement here.

yes, this was somewhat prepared for that use case, we just didn't get around to actually doing it.
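
As a rough illustration of the selection order discussed above, and of where a load-based tie-break could plug in, here is a small Python sketch. It is not the code in Manager.pm: the function name, its parameters, the assumption that a higher number means higher priority, and the `load` metric are all hypothetical.

```python
def select_node(priorities, online, tried=(), load=None):
    """Pick a node for a service.

    priorities: dict node name -> priority (assumed: higher number wins)
    online:     set of node names currently usable
    tried:      nodes that already failed (the "try next" case)
    load:       optional dict node name -> load figure (hypothetical metric)
    """
    candidates = [n for n in priorities if n in online and n not in tried]
    if not candidates:
        return None
    top = max(priorities[n] for n in candidates)
    best = [n for n in candidates if priorities[n] == top]
    if load is not None:
        # Balancing variant: least-loaded node first, name as tie-break.
        return min(best, key=lambda n: (load.get(n, 0.0), n))
    # Basic variant: lowest node name within the top priority class.
    return min(best)


prio = {"node1": 2, "node2": 2, "node3": 1}
nodes = {"node1", "node2", "node3"}
print(select_node(prio, nodes))                                      # node1
print(select_node(prio, nodes, tried={"node1"}))                     # node2
print(select_node(prio, nodes, load={"node1": 0.9, "node2": 0.2}))   # node2
```

Keeping the priority filter first means a balancing metric only reorders nodes inside the highest available priority class, so configured priorities would still be respected.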