[PVE-User] HA VMs - and timeout

Aaron Lauterer a.lauterer at proxmox.com
Fri Dec 4 08:56:05 CET 2020


On 12/3/20 6:20 PM, Alejandro Bonilla via pve-user wrote:

> Hello -
>
> I’ve just implemented an HA group and added my VMs as resources to be managed across my 3-node group. After initially struggling with ha-manager (disabling/enabling resources and unlocking VMs after stuck migrations), I feel I can now clear the usual issues when VMs get stuck.
>
> My question comes from the fact that I use Proxmox for my lab, so I script a few things and start my servers in the morning, but the HA VMs always come up in an error state, likely because Ceph or the cluster is not fully ready. I have configured a start delay of 60 seconds, which used to be enough. Is this delay also respected when the VMs are managed by HA?


You mean the startup delay that you configure for guests that boot when the host starts? I haven't tested it explicitly, but HA should not take that delay into account.
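If the delay is indeed ignored, one possible workaround for a lab is to let your morning script wait until the cluster is quorate and Ceph reports healthy before it asks HA to start anything. The following is only a rough sketch, not something I have run in this form. It assumes that `pvecm status` prints a "Quorate: Yes" line and that `ceph health` prints a status starting with "HEALTH_OK"; the function names, the timeout and the interval are made up for illustration.

```python
import subprocess
import time


def is_quorate(pvecm_status: str) -> bool:
    """Return True if the given `pvecm status` output has a 'Quorate: Yes' line."""
    for line in pvecm_status.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Quorate":
            return value.strip().lower().startswith("yes")
    return False


def ceph_is_healthy(ceph_health: str) -> bool:
    """Return True if the given `ceph health` output starts with HEALTH_OK."""
    return ceph_health.strip().startswith("HEALTH_OK")


def run(cmd):
    """Run a command and return its stdout, or an empty string if it cannot be run."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    except OSError:
        return ""
    return result.stdout


def wait_until_ready(timeout=600, interval=10):
    """Poll until the cluster is quorate and Ceph is healthy, or the timeout expires.

    The timeout and interval (in seconds) are arbitrary example values.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        quorate = is_quorate(run(["pvecm", "status"]))
        healthy = ceph_is_healthy(run(["ceph", "health"]))
        if quorate and healthy:
            return True
        time.sleep(interval)
    return False
```

Once `wait_until_ready()` returns True, the script could request the start with `ha-manager set <sid> --state started` for each resource. This assumes the resources were put into a non-started request state before the hosts were shut down the evening before.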

>
> Which log can I see to identify why these VMs never started and errored?

The task log should show a start job for each VM; if a start failed, the output of that task is the first place to look. Otherwise, check the syslog.
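To narrow down the syslog, it can help to look only at the lines written by the two HA services, pve-ha-lrm and pve-ha-crm. Here is a small sketch of such a filter. It assumes the usual syslog layout where the program name is followed by the PID in square brackets; the function name and the optional VMID filter are just for illustration.

```python
import re

# Matches the program tag of the two HA services, e.g. " pve-ha-lrm[1234]:"
HA_TAG = re.compile(r"\spve-ha-(?:lrm|crm)\[\d+\]:")


def ha_lines(log_text, vmid=None):
    """Return the syslog lines written by the HA services.

    If vmid is given, keep only lines that mention the service ID vm:<vmid>.
    """
    selected = [line for line in log_text.splitlines() if HA_TAG.search(line)]
    if vmid is not None:
        sid = re.compile(r"\bvm:%d\b" % int(vmid))
        selected = [line for line in selected if sid.search(line)]
    return selected
```

You could feed it the contents of /var/log/syslog, or the output of `journalctl -u pve-ha-lrm -u pve-ha-crm` if you prefer the journal.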

>
> A separate question - is there an easier way to test/simulate a dead node or node failure besides actually killing my hosts?


Take down or disconnect the interface over which corosync communicates. The isolated node will fence itself once it loses its connection to the quorate part of the cluster.
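If you want to trigger this from a script, a minimal sketch could look like the following. It only wraps `ip link set dev <iface> down`; the function names are made up, and the interface name is whatever your corosync link runs on (the "eno2" in the usage note is a placeholder). It defaults to a dry run, because the node is expected to fence itself, which means a hard reset, once it has been isolated.

```python
import subprocess


def link_down_command(iface):
    """Build the `ip link` command that takes the given interface down."""
    if not iface or "/" in iface or any(c.isspace() for c in iface):
        raise ValueError("invalid interface name: %r" % (iface,))
    return ["ip", "link", "set", "dev", iface, "down"]


def isolate_node(iface, dry_run=True):
    """Take the corosync interface down and return the command as a string.

    With dry_run=True (the default) the command is only returned, not executed.
    """
    cmd = link_down_command(iface)
    if not dry_run:
        # Needs root; the node is expected to fence itself shortly afterwards.
        subprocess.run(cmd, check=True)
    return " ".join(cmd)
```

For example, `isolate_node("eno2")` only returns the command string, while `isolate_node("eno2", dry_run=False)` actually takes the link down. Run it on the node you want to fail, not on one you still need for watching the cluster.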

>
> Thanks