[PVE-User] Expected fencing behavior on a bifurcated 4-node HA cluster

Thomas Lamprecht t.lamprecht at proxmox.com
Thu May 4 08:39:24 CEST 2017


On 05/03/2017 05:29 PM, Adam Carheden wrote:
> On 05/03/2017 01:46 AM, Alexandre DERUMIER wrote:
>> Maybe this is because a node only reboots if an HA VM is present on it?
>>
>>
>> If you had HA VMs on all 4 nodes, I think all nodes would have been rebooted by the watchdog (as you lost quorum on all 4 nodes).
> That must be it. I have one HA VM and a few non-HA VMs, all just
> testing. I think the takeaway is not to run an even number of HA nodes
> in production (or to use a corosync QDevice as Thomas suggests).

Yes, as my answer stated. ;-)

>
> Am I correct that all PVE nodes contribute to the quorum voting even if
> they're not part of an HA group?
Yes, because quorum is not just used for HA, quorum is there for all 
cluster activities as they all need to be consistent and reliably 
synchronized.
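
As a rough illustration of why neither half of a 2/2 split is quorate:
the default votequorum threshold is a strict majority of the expected
votes. The sketch below only mirrors that arithmetic (the function names
are made up for this example and are not part of any PVE or corosync
API).

```python
def quorum_threshold(expected_votes: int) -> int:
    """Strict majority: more than half of the expected votes."""
    return expected_votes // 2 + 1


def is_quorate(partition_votes: int, expected_votes: int) -> bool:
    """True if a partition holding `partition_votes` keeps quorum."""
    return partition_votes >= quorum_threshold(expected_votes)


# 4 nodes with 1 vote each, as in the `pvecm status` output quoted below.
print(quorum_threshold(4))  # 3
print(is_quorate(2, 4))     # False: neither half of a 2/2 split is quorate
print(is_quorate(3, 4))     # True
```

With an even number of votes, an even split always leaves both sides
below the threshold.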

>
> My production cluster will have 6 nodes (more redundancy, same
> datacenter, less network risk). To prevent cluster shutdown in
> production when I'll have lots more HA VMs, I can just add an old
> cheap-o box as 7th node for quorum and not put it in any HA groups?

Yes, that would be a good option. But you certainly need to put some
thought into where you place this machine.
If it is, for example, in room A and room B catches fire (just for the
example's sake :) ), then the nodes in room A
are still quorate, but not vice versa: if room A gets cut off, room
B will not have quorum.
An option would be to place it in a third room, so that it acts as an
independent arbitrator.
But that is naturally not an option for everyone.
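
To make the placement trade-off concrete, here is a small sketch that
checks which single-room failures leave the remaining nodes quorate. It
assumes 7 nodes with one vote each and that the surviving rooms can still
reach each other; the layouts and the function name are only illustrative.

```python
def survivors_quorate(layout: dict, failed_room: str) -> bool:
    """Do the rooms other than `failed_room` still hold a majority?"""
    total = sum(layout.values())
    threshold = total // 2 + 1
    surviving = total - layout[failed_room]
    return surviving >= threshold


# 7th node placed in room A (3 + 1 nodes) versus in a third room C.
same_room = {"A": 4, "B": 3}
third_room = {"A": 3, "B": 3, "C": 1}

for name, layout in (("same_room", same_room), ("third_room", third_room)):
    for room in layout:
        ok = survivors_quorate(layout, room)
        print(f"{name}: room {room} lost -> survivors quorate: {ok}")
```

With the extra node in room A, losing room A takes the cluster below
quorum, while with a third room any single room can be lost.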

>
> Alternatively, is there a way to exclude one of my 6 nodes from the HA
> quorum voting? In a CEPH cluster, quorum is determined by nodes running
> the monitor service, and not all nodes have to run the monitor service.
> Is there an equivalent "no monitor" configuration in PVE?

As Dietmar hinted: you can configure how many votes a node provides 
(must be >= 1).
This can be configured either on node addition or by editing the 
corosync configuration file in:
/etc/pve/corosync.conf

So you could just give one node in the 'more reliable' room two votes
and you achieve the same
as with an additional machine in that room.
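
A quick sketch of that vote arithmetic, assuming 6 nodes split 3/3 over
two rooms with one node in room A given two votes (in corosync.conf this
is the per-node quorum_votes setting). The node names are hypothetical.

```python
# Votes per node; "a1" is the node that was given two votes.
votes = {"a1": 2, "a2": 1, "a3": 1, "b1": 1, "b2": 1, "b3": 1}
rooms = {"A": ["a1", "a2", "a3"], "B": ["b1", "b2", "b3"]}

total_votes = sum(votes.values())   # 7 votes from 6 machines
threshold = total_votes // 2 + 1    # strict majority


def room_votes(room: str) -> int:
    """Sum of the votes held by the nodes in one room."""
    return sum(votes[node] for node in rooms[room])


for room in rooms:
    held = room_votes(room)
    print(f"room {room}: {held} of {total_votes} votes, "
          f"quorate alone: {held >= threshold}")
```

Room A on its own reaches the threshold and room B does not, which is the
same outcome as placing a 7th single-vote machine in room A.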

See:
http://pve.proxmox.com/pve-docs/chapter-pvecm.html#edit-corosync-conf
# man corosync.conf

cheers,
Thomas

> Thanks
>
>> ----- Mail original -----
>> De: "Adam Carheden" <carheden at ucar.edu>
>> À: "proxmoxve" <pve-user at pve.proxmox.com>
>> Envoyé: Mardi 2 Mai 2017 17:40:37
>> Objet: [PVE-User] Expected fencing behavior on a bifurcated 4-node HA cluster
>>
>> What's supposed to happen if two nodes in a 4-node HA cluster go offline?
>>
>>
>> I have a 4-node test cluster, two nodes are in one server room and the
>> other two in another server room. I had HA inadvertently tested for me
>> this morning due to an unexpected network issue and watchdog rebooted
>> two of the nodes.
>>
>> I think this is the expected behavior, and certainly seems like what I
>> want to happen. However, quorum is 3, not 2, so why didn't all 4 nodes
>> reboot?
>>
>> # pvecm status
>> Quorum information
>> ------------------
>> Date: Tue May 2 09:35:23 2017
>> Quorum provider: corosync_votequorum
>> Nodes: 4
>> Node ID: 0x00000001
>> Ring ID: 4/524
>> Quorate: Yes
>>
>> Votequorum information
>> ----------------------
>> Expected votes: 4
>> Highest expected: 4
>> Total votes: 4
>> Quorum: 3
>> Flags: Quorate
>>
>> Membership information
>> ----------------------
>> Nodeid Votes Name
>> 0x00000004 1 192.168.0.11
>> 0x00000003 1 192.168.0.203
>> 0x00000001 1 192.168.0.204 (local)
>> 0x00000002 1 192.168.0.206
>>
>> # ha-manager status
>> quorum OK
>> master node3 (active, Tue May 2 09:35:24 2017)
>> lrm node1 (idle, Tue May 2 09:35:27 2017)
>> lrm node2 (active, Tue May 2 09:35:26 2017)
>> lrm node3 (idle, Tue May 2 09:35:23 2017)
>> lrm node3 (idle, Tue May 2 09:35:23 2017)
>>
>> Somehow proxmox was smart enough to keep two of the nodes online, but
>> with a quorum of 3 neither group should have had quorum. How does it
>> decide which group to keep online?
>>
>> Thanks
>>
> _______________________________________________
> pve-user mailing list
> pve-user at pve.proxmox.com
> http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user





