[pve-devel] Corosync-qdevice
Thomas Lamprecht
t.lamprecht at proxmox.com
Tue Jul 18 06:58:50 CEST 2017
Hi,
On 07/17/2017 08:08 PM, Gilberto Nunes wrote:
> Sorry but, as far as I understanding, the qdevice still need a third part
> to work properly or I can use one of the nodes???
Yes, three real votes are always needed for an arbitrary quorum
service to work [1][2].
(Side note for other readers: yes, in specialized environments you can
mostly get by with two, depending on the software and its needs, but
uniform consensus paired with arbitrary resource control is not one of
them! The deciding factor is whether there is a sane merge strategy with
no bad outcomes in the case of a partitioning. See [3] for the corosync
approach to this problem; the reference section of [3] hosts some good
literature too.)
> I don't understand
> <qnetd-server>
> and
> * start the services everywhere (corosync-qdevice on the PVE nodes and the
> corosync-qnetd...
> on the qdevice serving host)
>
> What qdevice serving host?? Is it a separate server???
Yes, it's a separate server. The main "feature" here is that it can be
any Linux box around. There are also some corosync ports to BSD
variants, so even such a box could possibly do it.
This can be useful for a commonly found setup, where there are two PVE
cluster nodes hosting VMs/CTs and one big redundant storage box.
> So still is a 3 node cluster, if you need a third server to serve qdevice,
> right???
Yes, but not necessarily a Proxmox VE one.
Oh and yes, you could use a PVE node, but it would not make sense to use
a node of the cluster as qdevice host for that same cluster!
(The same can be done far more easily by increasing the vote count of
that node, with the same effect but no setup.)
For example, if you have two two-node clusters around your company, one
in building A and one in building B, you could configure each of them to
serve a qdevice for the other cluster. The nice thing is that this has
no negative implications, assuming that each cluster has an even node
count (e.g. two).
You can only win here:
Without qdevice:
  Expected Votes:    2
  Needed for Quorum: 2
With qdevice:
  Expected Votes:    3
  Needed for Quorum: 2 (stayed the same)
So in the worst case, where the quorum device fails, you're just as well
off as if no qdevice were configured at all: no loss.
But if just one node fails, the other one can keep operating (and even
recover HA services) - big win.
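
The vote arithmetic above can be sketched in a few lines of Python. This
is only an illustration, assuming the usual majority rule (more than
half of the expected votes) and a qdevice that contributes exactly one
vote; the function names are made up for this example and are not part
of corosync.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for quorum under a plain majority rule (assumption)."""
    return expected_votes // 2 + 1


def is_quorate(live_votes: int, expected_votes: int) -> bool:
    """True if the live votes reach the majority threshold."""
    return live_votes >= quorum_threshold(expected_votes)


# Two-node cluster without qdevice: 2 expected votes, 2 needed.
assert quorum_threshold(2) == 2
# Same cluster with a one-vote qdevice: 3 expected votes, still 2 needed.
assert quorum_threshold(3) == 2

# One node fails, no qdevice: the survivor holds 1 of 2 votes.
assert not is_quorate(live_votes=1, expected_votes=2)
# One node fails, qdevice reachable: survivor + qdevice hold 2 of 3 votes.
assert is_quorate(live_votes=2, expected_votes=3)
# Qdevice fails, both nodes up: 2 of 3 votes, as good as without qdevice.
assert is_quorate(live_votes=2, expected_votes=3)

# For even node counts the threshold does not change when one vote is added.
for nodes in (2, 4, 6):
    assert quorum_threshold(nodes) == quorum_threshold(nodes + 1)
```

With an odd node count the same formula gives a higher threshold once
the extra vote is added (e.g. 2 of 3 becomes 3 of 4), which is why the
"no loss" argument is stated for even node counts only.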
Also, QDevices operate over TCP and are stateless, thus they can get
away with much "harsher" conditions than a cluster node's corosync.
I.e., they may have latencies above 2 ms, can be outside of the LAN, ...
But they are not perfect: they may (currently) fail quite badly on
odd node counts (three, five, ... nodes).
> I am confuse here!
Then please wait for software helpers and our reference documentation;
they should make it a little bit easier, hopefully.
Otherwise use a virtual PVE setup and test it there. I advise you *not*
to deploy it in production if you're not sure about the effects or
implications.
Hope I could help.
cheers,
Thomas
--
[1] <https://en.wikipedia.org/wiki/Two_Generals%27_Problem>
[2] <https://en.wikipedia.org/wiki/Byzantine_fault_tolerance>
[3] <https://en.wikipedia.org/wiki/Paxos_(computer_science)>