[PVE-User] HA VMs - and timeout
a.lauterer at proxmox.com
Fri Dec 4 08:56:05 CET 2020
On 12/3/20 6:20 PM, Alejandro Bonilla via pve-user wrote:
> Hello -
> I’ve just implemented an HA Group and then added my VMs as resources to be managed across my 3-node group. After struggling with ha-manager to disable/enable and unlocking VMs due to stuck migrations at first, I feel I can clear the usual issues as VMs get stuck.
> My question comes from the fact that I use Proxmox for my Lab, therefore I script a few things and start my servers in the morning but the HA VMs always come up in an error state - likely due to Ceph or the cluster not being fully ready. I have implemented a delay start of 60 seconds which used to be enough. Is this delay also respected when the HA resources/VMs are managed by HA?
Do you mean the startup delay that applies when guests are configured to start at boot of the host? I haven't tested it explicitly, but HA should not take that delay into account.
> Which log can I see to identify why these VMs never started and errored?
The task log should show a start task for each VM; if a VM fails to start, that is the first place to look. Otherwise, check the syslog.
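For the syslog part, something along these lines might help narrow things down. It is only a minimal, untested-on-a-real-cluster sketch: it assumes the HA services log under the tags `pve-ha-lrm` and `pve-ha-crm` and refer to guests as `vm:<vmid>`; the function name `ha_lines_for_vm` is made up for this example.

```python
import re

# Assumption: the HA local and cluster resource managers log under these tags.
HA_TAGS = ("pve-ha-lrm", "pve-ha-crm")


def ha_lines_for_vm(lines, vmid):
    """Return the log lines written by an HA service that mention vm:<vmid>."""
    # (?!\d) keeps vm:100 from also matching vm:1001.
    pattern = re.compile(rf"\bvm:{vmid}(?!\d)")
    return [
        line
        for line in lines
        if any(tag in line for tag in HA_TAGS) and pattern.search(line)
    ]
```

The lines could come from wherever the node keeps its syslog, for example by reading the file line by line and passing the result to `ha_lines_for_vm(lines, 100)`.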
> A separate question - is there an easier way to test/simulate a dead/node failure besides actually killing my hosts?
Take down or disconnect the interface over which corosync communicates. The isolated node will fence itself once it loses its connection to the quorate part of the cluster.
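If you want to script that test, a rough sketch could look like the following. It assumes corosync runs over a dedicated interface (the name `eth1` in the usage note is purely hypothetical) and uses the standard `ip link set dev <iface> up|down` command. The helper names are invented for this example, and it defaults to a dry run so nothing is taken down by accident.

```python
import subprocess


def link_command(iface, state):
    """Build the `ip link` command that sets an interface up or down."""
    if state not in ("up", "down"):
        raise ValueError("state must be 'up' or 'down'")
    return ["ip", "link", "set", "dev", iface, state]


def set_link(iface, state, dry_run=True):
    """Return the command; execute it only when dry_run is False (needs root)."""
    cmd = link_command(iface, state)
    if not dry_run:
        # Raises CalledProcessError if `ip` exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd
```

Calling `set_link("eth1", "down", dry_run=False)` on the node to be isolated would cut its corosync link. Run it from the console or a separate management network, since the node is expected to fence itself shortly afterwards.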