[PVE-User] 2node cluster
Thomas Lamprecht
t.lamprecht at proxmox.com
Thu Jun 21 10:46:58 CEST 2018
On 6/21/18 10:04 AM, ronny+pve-user at aasen.cx wrote:
> On 20 June 2018 12:44, Tonči Stipičević wrote:
>> Hello to all
>>
>> I'm testing pve-storage-zsync and it works pretty well on my 3-node cluster. One VM from the origin node is replicating to the other two nodes, and when I shut down the origin node, I can start the VM from its replicated image on one of the other nodes without any issues. If I make some changes to this replicated image, then after starting the "dead" node back up again, replication overrides those changes, and so on ...
>>
>> So this is possible because we still have quorum. However, the real scenario will have only two nodes in the cluster, and the problem arises when one node shuts down and there is no quorum any more.
>>
>> Then we have to lower "votes" on the surviving node.
>>
>> Since I want to run a zfs-sync (disaster recovery) test with a 2-node cluster, the question would be: Is it possible to start the "dead" node back again and restore the "votes" value on the surviving node? Will the cluster accept this "dead" node back?
>>
>> I have never done this before and just want to be prepared to reinstall the dead node if it cannot get back into the cluster again ...
>>
>> Thank you very much in advance
>>
>
> Personally I would just run a Raspberry Pi or similar Proxmox node as a third tiebreaker node.
>
Would work. I often recommend qdevices [1] in this case,
instead of running a full PVE cluster stack on a Raspberry...
It's a simple daemon which gets polled when quorum changes,
to act as a tie breaker.
You would need to install corosync-qnetd on the tie breaker,
e.g. Raspberry (it's packaged in Debian Stretch).
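(The exact package names here are an assumption on my side, not verified in
this mail, but on Debian/PVE it should be roughly:
# apt install corosync-qnetd      # on the tie breaker
# apt install corosync-qdevice    # on each cluster node
)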
Then ensure you can access the external device (raspi) via ssh:
# ssh-copy-id root@<qdev-address>
Then you can setup the connection with:
# corosync-qdevice-net-certutil -Q -n <clustername> <qdev-address> <cluster-node1-addr> <cluster-node2-addr>
(with <xyz> replaced with the respective values)
Ensure the qnetd daemon is started on the external device:
# systemctl restart corosync-qnetd.service
Then edit corosync.conf [2] and add a device section to the quorum
section, e.g., my quorum section looks like:
[...]
quorum {
  device {
    model: net
    net {
      algorithm: ffsplit
      host: 192.168.30.15
      tls: on
    }
    votes: 1
  }
  provider: corosync_votequorum
}
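(As an aside, and not part of the original steps: on PVE that file lives on
the cluster file system, and if I recall correctly [2] suggests editing a
copy and bumping config_version in the totem section, roughly:
# cp /etc/pve/corosync.conf /etc/pve/corosync.conf.new
# editor /etc/pve/corosync.conf.new   # add the device section, increase config_version
# mv /etc/pve/corosync.conf.new /etc/pve/corosync.conf
)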
Then on all cluster nodes ensure that the qdevice service is running:
# systemctl restart corosync-qdevice.service
And you should be done.
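As a quick sanity check (commands not spelled out above, but both should be
available on the nodes), the quorum status should now list the Qdevice with
one extra vote:
# pvecm status
or
# corosync-quorumtool -s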
It may not look too easy to set up, but it is not hard either, especially if
you read through the man page a bit. Note that only clusters with an even node
count should do this; you will reduce reliability with an uneven count, as then
the ffsplit algorithm cannot be used and last-man-standing has a few very
non-ideal properties...
For an even node count you can only win, if set up correctly.
Just typing this up because I find it a nice workaround for 2 + 1 clusters,
e.g. 2 PVE nodes plus a Linux storage/network box or a non-powerful tie
breaker, and it does not seem to be that well known :)
cheers,
Thomas
[1]: https://www.mankier.com/8/corosync-qdevice
[2]: https://pve.proxmox.com/pve-docs/chapter-pvecm.html#edit-corosync-conf