[PVE-User] Expected fencing behavior on a bifurcated 4-node HA cluster

Thomas Lamprecht t.lamprecht at proxmox.com
Wed May 3 09:41:54 CEST 2017


Hi,

On 05/02/2017 05:40 PM, Adam Carheden wrote:
> What's supposed to happen if two nodes in a 4-node HA cluster go offline?

If all of them have HA services configured, then a full cluster reset may happen.
If two nodes go offline, the whole cluster loses quorum, so all nodes 
with an active watchdog (i.e., all nodes which have, or recently had, 
active services) will reset.
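
To make the arithmetic concrete (just an illustration, not output of any 
PVE tool): with four nodes holding one vote each, quorum is 4/2 + 1 = 3, 
so after a split into two 2-node groups neither side reaches quorum:

    # illustrative quorum arithmetic only, matching the 4-node cluster below
    total_votes = 4
    quorum = total_votes // 2 + 1      # 3, as also shown by `pvecm status` below
    partition_a = partition_b = 2      # votes on each side of the split
    print(partition_a >= quorum)       # False -> inquorate, watchdog-armed nodes reset
    print(partition_b >= quorum)       # False -> inquorate as well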

For such a situation, where there is a tie, an external voting arbitrator 
would help; this could be a fifth (tiny) node or a corosync QDevice.
QDevices have the advantage that they can run on any newer Linux distro 
which ships corosync (2.4 and newer, AFAIK), independent of the PVE stack.
They can provide arbitrator votes to multiple clusters and have fewer 
constraints regarding network setup and latency, as the communication 
happens over TCP.
This is usable from PVE, but we haven't documented it yet; I started to 
do so and need to pick it up again soon.
Just a note for any other reader: while this can boost reliability and 
recovery in clusters with an even vote count (you can only 'win' there),
it can do the reverse in clusters with an uneven node count.
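
In case anyone wants to experiment before the documentation is there: a 
QDevice is wired into corosync roughly via a device section in 
corosync.conf, something like the following sketch (the host address is 
only a placeholder for the machine running corosync-qnetd; 'ffsplit' is 
the usual algorithm for even-sized clusters):

    quorum {
      provider: corosync_votequorum
      device {
        model: net
        votes: 1
        net {
          host: 192.0.2.10      # placeholder: external arbitrator running corosync-qnetd
          algorithm: ffsplit    # tie-breaker for a 50/50 split
          tls: on
        }
      }
    }

Besides that, the corosync-qdevice daemon has to run on every cluster 
node and corosync-qnetd on the external arbitrator.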

>
> I have a 4-node test cluster, two nodes are in one server room and the
> other two in another server room. I had HA inadvertently tested for me
> this morning due to an unexpected network issue and watchdog rebooted
> two of the nodes.
>
> I think this is the expected behavior, and certainly seems like what I
> want to happen. However, quorum is 3, not 2, so why didn't all 4 nodes
> reboot?

Because, if the `ha-manager status` output still mirrors the same setup 
(i.e., the same services configured on the same nodes) as when the 
network failure happened, I see that just one node has active services 
running.
We do not fence nodes which have no configured HA services, or whose 
configured HA services are all disabled, as we think that this would 
just lower reliability for non-HA services while bringing no increase 
in reliability for HA services.
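
To spell that rule out (a minimal sketch only, not the actual ha-manager 
code, which is part of the Perl HA stack):

    # Illustrative sketch of the self-fencing rule described above; not real ha-manager code.
    def should_self_fence(has_quorum: bool, lrm_was_active: bool) -> bool:
        # A node resets itself via its watchdog only when it has lost quorum
        # AND its local resource manager has (or recently had) active HA services.
        return (not has_quorum) and lrm_was_active

    # A node in an inquorate partition whose LRM is idle keeps running,
    # because its watchdog was never armed in the first place.
    assert should_self_fence(has_quorum=False, lrm_was_active=False) is False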

>
> # pvecm status
> Quorum information
> ------------------
> Date:             Tue May  2 09:35:23 2017
> Quorum provider:  corosync_votequorum
> Nodes:            4
> Node ID:          0x00000001
> Ring ID:          4/524
> Quorate:          Yes
>
> Votequorum information
> ----------------------
> Expected votes:   4
> Highest expected: 4
> Total votes:      4
> Quorum:           3
> Flags:            Quorate
>
> Membership information
> ----------------------
>      Nodeid      Votes Name
> 0x00000004          1 192.168.0.11
> 0x00000003          1 192.168.0.203
> 0x00000001          1 192.168.0.204 (local)
> 0x00000002          1 192.168.0.206
>
> # ha-manager status
> quorum OK
> master node3 (active, Tue May  2 09:35:24 2017)
> lrm node1 (idle, Tue May  2 09:35:27 2017)
> lrm node2 (active, Tue May  2 09:35:26 2017)
> lrm node3 (idle, Tue May  2 09:35:23 2017)
>
> Somehow proxmox was smart enough to keep two of the nodes online, but
> with a quorum of 3 neither group should have had quorum. How does it
> decide which group to keep online?

see above

Cheers,
Thomas



