[PVE-User] Locking HA during UPS shutdown

Thu Mar 10 18:03:57 CET 2022

Thanks to everyone who took the time to respond, especially to Fabian for the detailed answer.
With your help I am starting to see the complexity and pitfalls of the “simple” approach I intended to take.

Timing and sequence are critical factors, it seems.
With just a few Linux VMs running I’ll probably follow dORSY’s advice:

1.	Let NUT on the VMs monitor the UPS and shut them down _before_ the hosts
2.	Shutdown sequence based on battery level
	X%	starting with VMs that might take longer to stop or are not essential (e.g. PBS)
	Y%	application VMs that depend on other VMs
	Z%	essential services (database, firewall/proxy, DNS, ..)
3.	Let NUT on the nodes monitor the UPS and shut them down when the battery runs low (they’ll power off quickly).
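
The battery-level staging in step 2 could be sketched as a small POSIX shell helper. The thresholds below are made-up stand-ins for X, Y and Z, `myups@nut-server` is a placeholder UPS name, and the sketch assumes `upsc` reports `battery.charge` as a whole number.

```shell
#!/bin/sh
# Hypothetical thresholds standing in for X, Y and Z (percent battery charge).
STAGE1_PCT=60   # X: slow or non-essential VMs (e.g. PBS)
STAGE2_PCT=40   # Y: application VMs that depend on other VMs
STAGE3_PCT=25   # Z: essential services (database, firewall/proxy, DNS)

# Print the highest stage whose threshold the charge has reached.
# Stage 0 means "keep running".
shutdown_stage() {
    charge=$1
    if   [ "$charge" -le "$STAGE3_PCT" ]; then echo 3
    elif [ "$charge" -le "$STAGE2_PCT" ]; then echo 2
    elif [ "$charge" -le "$STAGE1_PCT" ]; then echo 1
    else echo 0
    fi
}

# Succeed if a VM assigned to stage $1 should shut down at charge $2.
should_shut_down() {
    my_stage=$1
    charge=$2
    [ "$(shutdown_stage "$charge")" -ge "$my_stage" ]
}

# Example wiring on a stage-1 VM (not executed here):
#   charge=$(upsc myups@nut-server battery.charge)
#   should_shut_down 1 "$charge" && shutdown -h +0
```

Each VM only needs to know its own stage number, so the same script can be deployed everywhere.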

Thanks Fabian for the eye-opening details. At this stage and pace of the project I’m not in the mood for tinkering ;)

Regards

Stefan

> On Mar 10, 2022, at 17:24, Fabian Grünbichler <f.gruenbichler at proxmox.com> wrote:
> 
> On March 10, 2022 2:48 pm, admins at telehouse.solutions wrote:
>> That was actually really BAD ADVICE… when a node initiates maintenance mode it will try to migrate its hosted VMs … and eventually ends up in the same lock loop.
>> What you really need is to remove the started VMs from ha-manager, so that when the node initiates shutdown it first does a regular shutdown, VM by VM.
>> 
>> So, do something like below as first command in your NUT command sequence:
>> 
>> for sid in $(ha-manager status | awk '$1 == "service" && /started/ {print $2}'); do ha-manager remove "$sid"; done
> 
> what you should do is just change the policy to freeze or failover 
> before triggering the shutdown. and once power comes back up and your 
> cluster has booted, switch it back to migrate.
> 
> that way, the shutdown will just stop and freeze the resources, similar 
> to what happens when rebooting using the default conditional policy.
> 
> note that editing datacenter.cfg (where the shutdown_policy is 
> configured) is currently not exposed in any CLI tool, but you can update 
> it using pvesh or the API.
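
For illustration, the policy switch could be wrapped like this. It is only a sketch: it assumes `pvesh set /cluster/options` accepts the `ha` property string in the same form that datacenter.cfg stores it, and that `shutdown_policy` is the only `ha` sub-option in use (the call replaces the whole string). The function name and the `PVESH` override are hypothetical; try it on a test cluster first.

```shell
#!/bin/sh
# PVESH can be overridden (e.g. with "echo") for a dry run.
PVESH=${PVESH:-pvesh}

# Set the cluster-wide HA shutdown policy after validating its name.
set_shutdown_policy() {
    case "$1" in
        freeze|failover|migrate|conditional) ;;
        *) echo "unknown shutdown policy: $1" >&2; return 1 ;;
    esac
    "$PVESH" set /cluster/options --ha "shutdown_policy=$1"
}

# Before the UPS-triggered shutdown:  set_shutdown_policy freeze
# After the cluster is back up:       set_shutdown_policy migrate
```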
> 
> there is still one issue though - if the whole cluster is shut down at 
> the same time, at some point during the shutdown a non-quorate partition 
> will be all that's left, and at that point certain actions won't work 
> anymore and the node probably will get fenced. fixing this effectively 
> would require some sort of conditional delay at the right point in the 
> shutdown sequence that waits for all guests on all nodes(!) to stop 
> before proceeding with stopping the PVE services and corosync (nodes 
> still might get fenced if they take too long shutting down after the 
> last guest has exited, but that shouldn't cause many issues other than 
> noise). one way to do this would be for your NUT script to set a flag 
> file in /etc/pve, and some systemd service with the right Wants/After 
> settings that blocks the shutdown if the flag file exists and any guests 
> are still running. probably requires some tinkering, but can be safely 
> tested in a virtual cluster before moving to production ;)
> 
> this last problem is not related to HA though (other than HA introducing 
> another source of trouble courtesy of fencing being active) - you will 
> also potentially hit it with your approach. the 'stop all guests on 
> node' logic that PVE has on shutdown is for shutting down one node 
> without affecting quorum, it doesn't work reliably for full-cluster 
> shutdowns (you might not see problems if timing works out, but it's 
> based on chance).
> 
> an alternative approach would be to request all HA resources to be stopped 
> or disabled (`ha-manager set .. --state ..`), wait for that to be done 
> cluster-wide (e.g. by polling /cluster/resources API path), and then 
> trigger the shutdown. disadvantage of that is you have to remember the 
> pre-shutdown state and restore that afterwards for each resource..
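
Remembering and restoring the per-resource state could be sketched as follows. This assumes the usual section layout of /etc/pve/ha/resources.cfg (a `type: id` header followed by indented `key value` lines) and that a resource without an explicit `state` line counts as `started`. The function names and the `HA_MANAGER` override are hypothetical.

```shell
#!/bin/sh
# Read resources.cfg-style text on stdin, print one "sid state" pair per resource.
ha_states() {
    awk '
        /^[a-z]+:[ \t]/ {
            if (sid) print sid, state
            sid = $1 $2          # "vm:" + "100" -> "vm:100"
            state = "started"    # assumed default when no state line follows
        }
        /^[ \t]+state[ \t]/ { state = $2 }
        END { if (sid) print sid, state }
    '
}

# Read "sid state" pairs on stdin and request each state again.
restore_ha_states() {
    while read -r sid state; do
        "${HA_MANAGER:-ha-manager}" set "$sid" --state "$state"
    done
}

# Example (not executed here; the save path is a placeholder):
#   ha_states < /etc/pve/ha/resources.cfg > /root/ha-states.saved
#   ... stop resources, wait, shut down, power returns ...
#   restore_ha_states < /root/ha-states.saved
```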
> 
> https://pve.proxmox.com/pve-docs/pve-admin-guide.html#_node_maintenance
> 
>>> On Mar 10, 2022, at 2:48 PM, admins at telehouse.solutions wrote:
>>> 
>>> I don’t remember; search the man pages of pvecm and the other pve[tab][tab] commands 
>>> 
>>>> On Mar 10, 2022, at 2:19 PM, Stefan Radman <stefan.radman at me.com> wrote:
>>>> 
>>>> Hi Sto
>>>> 
>>>> Thanks for the suggestions.
>>>> 
>>>> The second option is what I was looking for.
>>>> 
>>>> How do I initiate “pve node maintenance mode”?
>>>> 
>>>> The “Node Maintenance” paragraph in the HA documentation is quite brief and does not refer to any command or GUI component.
>>>> 
>>>> Thank you
>>>> 
>>>> Stefan
>>>> 
>>>> 
>>>>> On Mar 10, 2022, at 14:50, admins at telehouse.solutions wrote:
>>>>> 
>>>>> Hi, 
>>>>> 
>>>>> here are two ideas: a shutdown sequence and a command sequence
>>>>> 1: you can achieve a shutdown sequence by setting NUT on each node to only monitor the UPS, then configuring each node to shut itself down at a different battery level, e.g. node1 at 15% battery, node2 at 10% battery and so on
>>>>> 2: you can set a command sequence that first puts the PVE node into maintenance mode and then executes the shutdown -> this way HA will not try to migrate VMs to a node in maintenance, and the chance of all nodes going into maintenance in exactly the same second does not seem to be a risk at all.
>>>>> 
>>>>> hope that's helpful.
>>>>> 
>>>>> Regards,
>>>>> Sto.
>>>>> 
>>>>>> On Mar 10, 2022, at 1:10 PM, Stefan Radman via pve-user <pve-user at lists.proxmox.com> wrote:
>>>>>> 
>>>>>> 
>>>>>> From: Stefan Radman <stefan.radman at me.com>
>>>>>> Subject: Locking HA during UPS shutdown
>>>>>> Date: March 10, 2022 at 1:10:09 PM GMT+2
>>>>>> To: PVE User List <pve-user at pve.proxmox.com>
>>>>>> 
>>>>>> 
>>>>>> Hi 
>>>>>> 
>>>>>> I am configuring a 3 node PVE cluster with integrated Ceph storage.
>>>>>> 
>>>>>> It is powered by 2 UPS that are monitored by NUT (Network UPS Tools).
>>>>>> 
>>>>>> HA is configured with 3 groups:
>>>>>> group pve1 nodes pve1:1,pve2,pve3
>>>>>> group pve2 nodes pve1,pve2:1,pve3
>>>>>> group pve3 nodes pve1,pve2,pve3:1
>>>>>> 
>>>>>> That will normally place the VMs in each group on the corresponding node, unless that node fails.
>>>>>> 
>>>>>> The cluster is configured to migrate VMs away from a node before shutting it down (Cluster=>Options=>HA Settings: shutdown_policy=migrate).
>>>>>> 
>>>>>> NUT is configured to shut down the servers once the last of the two UPS is running low on battery.
>>>>>> 
>>>>>> My problem:
>>>>>> When NUT starts shutting down the 3 nodes, HA will first try to live-migrate the VMs to another node.
>>>>>> That live migration process gets stuck because all the nodes are shutting down simultaneously.
>>>>>> It seems that the whole process runs into a timeout, finally “powers off” all the VMs and shuts down the nodes.
>>>>>> 
>>>>>> My question:
>>>>>> Is there a way to “lock” or temporarily de-activate HA before shutting down a node to avoid that deadlock?
>>>>>> 
>>>>>> Thank you
>>>>>> 
>>>>>> Stefan
>>>>>> 
>>>>>> _______________________________________________
>>>>>> pve-user mailing list
>>>>>> pve-user at lists.proxmox.com
>>>>>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>>>>> 
>>>>> 
>>>>> Best Regards,
>>>>> 
>>>>> Stoyan Stoyanov Sto | Solutions Manager
>>>>> | Telehouse.Solutions | ICT Department
>>>>> | phone/viber:  +359 894774934 <tel:+359 894774934>
>>>>> | telegram:  @prostoSto <https://mysignature.io/redirect/skype:prosto.sto?chat>
>>>>> | skype:  prosto.sto <https://mysignature.io/redirect/skype:prosto.sto?chat>
>>>>> | email:  sto at telehouse.solutions <mailto:sto at telehouse.solutions>
>>>>> | website: www.telehouse.solutions <https://mysig.io/MTRmMTg>
>>>>> | address: Telepoint #2, Sofia, Bulgaria
>>>>> <https://mysignature.io/editor/?utm_source=freepixel><356841.png>
>>>>> 
>>>>> <https://mysig.io/ZDNkNWY>
>>>>> Save paper. Don’t print
>>>> 
>>> 
>> 
>> 
> 