[PVE-User] Reboot on PSU failure in redundant setup

Mark Adams mark at openvs.co.uk
Fri Nov 8 16:22:08 CET 2019

Hi All,

This cluster is on Proxmox VE 5.4-11.

This is most probably a hardware issue, either with the UPS or the server
PSUs, but I wanted to check whether there is any default watchdog or
automatic reboot in a Proxmox HA cluster.

Explanation of what happened:

All servers have redundant PSUs, fed from separate UPSes in separate
racks on separate feeds. One of the UPSes went out, and when it did, all
nodes rebooted. They were functioning normally after the reboot, but I
wasn't expecting the reboot to occur.

When the UPS went down, it also took down the entire core network, because
the network's power was not connected in a redundant fashion. Ceph and
"LAN" traffic were blocked as a result. Did a watchdog reboot each node
because it lost contact with its cluster peers? I didn't configure it to do
this myself, so is this an automatic feature? Everything I have read says
it should be configured manually.

Thanks in advance.


