[pve-devel] [PATCH docs] pvecm.adoc: qdevice: Adapt, update and make it clearer

Aaron Lauterer a.lauterer at proxmox.com
Tue Mar 24 15:52:55 CET 2020


Naming the whole mechanism and one of the daemons the same makes it easy
to mix up the two. This patch aims to make the QDevice and its parts easier
to understand by

* describing use cases at the beginning
* making the distinction between the QDevice mechanism and qdevice
daemon clear
* removing the `(N-1) qdevice votes` section in odd clusters because it
does not behave like that anymore. Only 1 vote is provided.
* adding two more items to the FAQ section to troubleshoot
* fixing small grammar and sentence structure issues.

Signed-off-by: Aaron Lauterer <a.lauterer at proxmox.com>
---

I had this patch in the pipeline for a while and finally got around to
fixing it up.

The (N-1) votes section no longer seems to be relevant. In my
tests with odd-sized clusters, it would add only 1 vote.
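
As a rough illustration of the vote arithmetic behind this (a sketch only,
not part of the patch): it assumes one vote per node, the usual majority
rule of floor(total / 2) + 1, and that the QDevice vote goes to the
surviving partition. The function names are made up for the example.

```python
def quorum(total_votes: int) -> int:
    """Votes needed for a majority: floor(total / 2) + 1."""
    return total_votes // 2 + 1


def tolerated_node_failures(nodes: int, qdevice_votes: int = 0) -> int:
    """Nodes that may fail while the survivors (plus the QDevice vote,
    assumed to go to them) still reach quorum. One vote per node."""
    total = nodes + qdevice_votes
    return total - quorum(total)


for n in (2, 3, 4):
    without = tolerated_node_failures(n)
    with_qdevice = tolerated_node_failures(n, qdevice_votes=1)
    print(f"{n} nodes: {without} failure(s) without, {with_qdevice} with QDevice")
```

Under these assumptions a single extra vote helps even-sized clusters but
gains nothing for a 3 node cluster, where it also makes a 2 versus 2 vote
split possible.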

Feedback regarding the understandability, grammar and spelling mistakes
is welcome :)

 pvecm.adoc | 161 +++++++++++++++++++++++++++--------------------------
 1 file changed, 82 insertions(+), 79 deletions(-)

diff --git a/pvecm.adoc b/pvecm.adoc
index f65f94d..42b0a8f 100644
--- a/pvecm.adoc
+++ b/pvecm.adoc
@@ -872,107 +872,85 @@ If you see a healthy cluster state, it means that your new link is being used.
 Corosync External Vote Support
 ------------------------------
 
-This section describes a way to deploy an external voter in a {pve} cluster.
-When configured, the cluster can sustain more node failures without
-violating safety properties of the cluster communication.
+It is possible to add an external voter to a {pve} cluster. This enables a
+cluster to suffer more node failures without losing quorum. There are two
+prominent use cases.
+
+The first is a small two-node cluster. If one node fails, the remaining node
+cannot know if the other host is really down and HA guests need to be started,
+or if the cluster communication is lost and a so-called 'split brain'
+footnote:[https://en.wikipedia.org/wiki/Split-brain_(computing)] situation has
+occurred. Adding an external voting device can help mitigate such a situation.
+
+The second use case is larger clusters with an even number of nodes. In case of
+problems with the cluster communication it is possible to have two partitions
+with the same number of nodes, a 'split brain' situation. Adding an external
+voting device to tip the number of possible votes to an odd number ensures that
+there will always be a majority in one of the partitions.
+
+QDevice technical overview
+~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-For this to work there are two services involved:
+Two parts form the 'QDevice' mechanism:
 
-* a so called qdevice daemon which runs on each {pve} node
+* The `qdevice` daemon runs on each {pve} node.
 
-* an external vote daemon which runs on an independent server.
+* The `qnetd` daemon runs on the external, independent server. It can deal with
+multiple clusters.
 
-As a result you can achieve higher availability even in smaller setups (for
-example 2+1 nodes).
+The `qdevice` and `qnetd` daemons use TCP/IP for their communication. Low
+latency is not as critical as it is for corosync itself. This means that the
+`qnetd` service can even be placed outside of the cluster's LAN.
 
-QDevice Technical Overview
-~~~~~~~~~~~~~~~~~~~~~~~~~~
+The 'QDevice' shows up as its own device with its own vote in the cluster when
+`pvecm status` is run. In case of a partitioned cluster the `qnetd` daemon
+decides which partition gets the 'QDevice' vote. All nodes in a partition must
+be reachable from the `qnetd` service on the external server to get the vote. At
+any time only one partition of a cluster gets the vote.
 
-The Corosync Quorum Device (QDevice) is a daemon which runs on each cluster
-node. It provides a configured number of votes to the clusters quorum
-subsystem based on an external running third-party arbitrator's decision.
-Its primary use is to allow a cluster to sustain more node failures than
-standard quorum rules allow. This can be done safely as the external device
-can see all nodes and thus choose only one set of nodes to give its vote.
-This will only be done if said set of nodes can have quorum (again) when
-receiving the third-party vote.
-
-Currently only 'QDevice Net' is supported as a third-party arbitrator. It is
-a daemon which provides a vote to a cluster partition if it can reach the
-partition members over the network. It will give only votes to one partition
-of a cluster at any time.
-It's designed to support multiple clusters and is almost configuration and
-state free. New clusters are handled dynamically and no configuration file
-is needed on the host running a QDevice.
-
-The external host has the only requirement that it needs network access to the
-cluster and a corosync-qnetd package available. We provide such a package
-for Debian based hosts, other Linux distributions should also have a package
-available through their respective package manager.
-
-NOTE: In contrast to corosync itself, a QDevice connects to the cluster over
-TCP/IP. The daemon may even run outside of the clusters LAN and can have longer
-latencies than 2 ms.
+NOTE: The naming of 'QDevice', the mechanism, and the `qdevice` daemon can be
+confusing at times. The `qdevice` daemon is the service running on each {pve}
+node and in combination with the `qnetd` daemon running on an external machine
+forms the 'QDevice' mechanism.
 
 Supported Setups
 ~~~~~~~~~~~~~~~~
 
-We support QDevices for clusters with an even number of nodes and recommend
-it for 2 node clusters, if they should provide higher availability.
-For clusters with an odd node count we discourage the use of QDevices
-currently. The reason for this, is the difference of the votes the QDevice
-provides for each cluster type. Even numbered clusters get single additional
-vote, with this we can only increase availability, i.e. if the QDevice
-itself fails we are in the same situation as with no QDevice at all.
-
-Now, with an odd numbered cluster size the QDevice provides '(N-1)' votes --
-where 'N' corresponds to the cluster node count. This difference makes
-sense, if we had only one additional vote the cluster can get into a split
-brain situation.
-This algorithm would allow that all nodes but one (and naturally the
-QDevice itself) could fail.
-There are two drawbacks with this:
-
-* If the QNet daemon itself fails, no other node may fail or the cluster
-  immediately loses quorum.  For example, in a cluster with 15 nodes 7
-  could fail before the cluster becomes inquorate. But, if a QDevice is
-  configured here and said QDevice fails itself **no single node** of
-  the 15 may fail. The QDevice acts almost as a single point of failure in
-  this case.
-
-* The fact that all but one node plus QDevice may fail sound promising at
-  first, but this may result in a mass recovery of HA services that would
-  overload the single node left. Also ceph server will stop to provide
-  services after only '((N-1)/2)' nodes are online.
-
-If you understand the drawbacks and implications you can decide yourself if
-you should use this technology in an odd numbered cluster setup.
+{pve} supports 'QDevices' for clusters with an even number of nodes to add one
+additional vote and avoid a 'split brain' situation.
+
+In a two node cluster an additional vote will allow it to stay operational if
+one of the two nodes is down, making it possible to provide high availability.
+
+The use in a cluster with an odd number of nodes is discouraged. Adding one more
+vote will result in an even number of votes and can lead to a 'split brain'
+situation.
 
 QDevice-Net Setup
 ~~~~~~~~~~~~~~~~~
 
 We recommend to run any daemon which provides votes to corosync-qdevice as an
-unprivileged user. {pve} and Debian provides a package which is already
+unprivileged user. {pve} and Debian provide a package which is already
 configured to do so.
 The traffic between the daemon and the cluster must be encrypted to ensure a
 safe and secure QDevice integration in {pve}.
 
-First install the 'corosync-qnetd' package on your external server and
+First install the 'corosync-qnetd' package on the external server and
 the 'corosync-qdevice' package on all cluster nodes.
 
-After that, ensure that all your nodes on the cluster are online.
+Next, ensure that all nodes in the cluster are online and can ping the
+external server.
 
-You can now easily set up your QDevice by running the following command on one
-of the {pve} nodes:
+To set up the QDevice, run the following command on one of the {pve} nodes:
 
 ----
 pve# pvecm qdevice setup <QDEVICE-IP>
 ----
 
-The SSH key from the cluster will be automatically copied to the QDevice. You
-might need to enter an SSH password during this step.
+The SSH key from the cluster will be automatically copied to the external
+server. You might need to enter an SSH password at this step.
 
-After you enter the password and all the steps are successfully completed, you
+After the password is entered and all the steps are successfully completed, you
 will see "Done". You can check the status now:
 
 ----
@@ -1006,8 +984,8 @@ Tie Breaking
 ^^^^^^^^^^^^
 
 In case of a tie, where two same-sized cluster partitions cannot see each other
-but the QDevice, the QDevice chooses randomly one of those partitions and
-provides a vote to it.
+but the QDevice, the QDevice will randomly choose one of the partitions and
+provide a vote to it.
 
 Possible Negative Implications
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -1020,20 +998,45 @@ Adding/Deleting Nodes After QDevice Setup
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 If you want to add a new node or remove an existing one from a cluster with a
-QDevice setup, you need to remove the QDevice first. After that, you can add or
-remove nodes normally. Once you have a cluster with an even node count again,
-you can set up the QDevice again as described above.
+QDevice set up, you need to remove the QDevice first. After that, you can add or
+remove nodes normally. You can set up the QDevice again should the cluster have
+an even node count after the changes.
 
 Removing the QDevice
 ^^^^^^^^^^^^^^^^^^^^
 
 If you used the official `pvecm` tool to add the QDevice, you can remove it
-trivially by running:
+by running:
 
 ----
 pve# pvecm qdevice remove
 ----
 
+SSH Password is not accepted
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If the external server does not accept the password during the setup
+phase, make sure that the SSH daemon on the external server is configured to
+allow root login with a password.
+
+QDevice Daemon does not start
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Should the `corosync-qdevice` daemon not start automatically and
+
+----
+pve# systemctl status corosync-qdevice.service
+----
+
+reports it as inactive and disabled,
+run the following commands:
+
+----
+pve# rm /etc/init.d/corosync-qdevice
+pve# systemctl enable corosync-qdevice.service
+pve# systemctl start corosync-qdevice.service
+----
+
 //Still TODO
 //^^^^^^^^^^
 //There is still stuff to add here
-- 
2.20.1




