HA VMs - and timeout
abonilla at suse.com
Thu Dec 3 18:20:15 CET 2020
I’ve just set up an HA group and added my VMs as resources to be managed across my 3-node group. After some initial struggles with ha-manager (disabling and re-enabling resources, and unlocking VMs left locked by stuck migrations), I feel I can now clear the usual issues when VMs get stuck.
My question comes from the fact that I use Proxmox for my lab, so I script a few things and start my servers in the morning, but the HA VMs always come up in an error state, likely because Ceph or the cluster is not fully ready yet. I have configured a startup delay of 60 seconds, which used to be enough. Is this delay also respected when the VMs are managed as HA resources?
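For illustration, here is a minimal sketch of the kind of readiness gate a morning start script could use instead of a fixed delay. It is not a confirmed fix. It assumes that `pvecm status` prints a `Quorate: Yes` line and that `ceph health` prints a line starting with `HEALTH_OK` once the storage is ready; the function names, the retry count, and the interval are all made up for the example.

```python
import re
import subprocess
import time
from typing import Callable, Sequence


def run_command(argv: Sequence[str]) -> str:
    """Run a command and return its stdout ("" if it fails or is missing)."""
    try:
        result = subprocess.run(
            list(argv), capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return ""
    return result.stdout if result.returncode == 0 else ""


def is_quorate(pvecm_status: str) -> bool:
    """Assumption: `pvecm status` output contains a 'Quorate: Yes' line."""
    return re.search(r"^Quorate:\s*Yes\s*$", pvecm_status, re.MULTILINE) is not None


def ceph_is_healthy(ceph_health: str) -> bool:
    """Assumption: only HEALTH_OK counts as ready (HEALTH_WARN does not)."""
    return ceph_health.strip().startswith("HEALTH_OK")


def wait_until_ready(
    run: Callable[[Sequence[str]], str] = run_command,
    attempts: int = 30,
    interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll quorum and Ceph health; return True as soon as both look ready."""
    for attempt in range(attempts):
        if is_quorate(run(["pvecm", "status"])) and ceph_is_healthy(
            run(["ceph", "health"])
        ):
            return True
        if attempt < attempts - 1:
            sleep(interval)
    return False
```

Once `wait_until_ready()` returns True, the script could request the HA resources with something like `ha-manager set vm:100 --state started` (the ID `vm:100` is only a placeholder). A lab cluster that normally sits at HEALTH_WARN would need a looser check than the one assumed here.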
Which log can I check to find out why these VMs failed to start and ended up in the error state?
A separate question: is there an easier way to test or simulate a node failure than actually killing my hosts?