[pve-devel] [PATCH docs v4] pvecm, network: add section on corosync over bonds
Mira Limbeck
m.limbeck at proxmox.com
Mon Aug 4 16:09:29 CEST 2025
On 7/30/25 10:59, Friedrich Weber wrote:
> Testing has shown that running corosync (only) over a bond can be
> problematic in some failure scenarios and for certain bond modes. The
> documentation only discourages bonds for corosync because corosync can
> switch between available networks itself, but does not mention other
> caveats when using bonds for corosync.
>
> Hence, extend the documentation with recommendations and caveats
> regarding bonds for corosync.
>
> Signed-off-by: Friedrich Weber <f.weber at proxmox.com>
> ---
>
> Notes:
> Aaron suggested we could expose the bond-lacp-rate in the GUI to
> make it easier to change the setting on the PVE side. I'd open a
> feature report for this.
>
> Changes since v3:
> - describe recommendations first, and further details for interested
> readers below. Consequently, rephrase failure scenario description
> (thx HD!)
>
> Changes since v2:
> - fix wording in the failure scenario description
> - explain that load-balancing bond modes are affected and why
> - clarify that the caveats apply whenever a bond is used for Corosync
> traffic (even if only as a redundant link)
>
> Changes since v1:
> - move to its own section under "Cluster Network"
> - reword remarks about bond-lacp-rate fast
> - reword remark under "Requirements"
>
> pve-network.adoc | 4 ++-
> pvecm.adoc | 68 +++++++++++++++++++++++++++++++++++++++++++++---
> 2 files changed, 67 insertions(+), 5 deletions(-)
>
> diff --git a/pve-network.adoc b/pve-network.adoc
> index 2dec882..b361f97 100644
> --- a/pve-network.adoc
> +++ b/pve-network.adoc
> @@ -495,7 +495,9 @@ use the active-backup mode.
>
> For the cluster network (Corosync) we recommend configuring it with multiple
> networks. Corosync does not need a bond for network redundancy as it can switch
> -between networks by itself, if one becomes unusable.
> +between networks by itself, if one becomes unusable. Some bond modes are known
> +to be problematic for Corosync, see
> +xref:pvecm_corosync_over_bonds[Corosync over Bonds].
>
> The following bond configuration can be used as distributed/shared
> storage network. The benefit would be that you get more speed and the
> diff --git a/pvecm.adoc b/pvecm.adoc
> index 312a26f..3af1a06 100644
> --- a/pvecm.adoc
> +++ b/pvecm.adoc
> @@ -89,10 +89,8 @@ NOTE: To ensure reliable Corosync redundancy, it is essential to have at least
> another link on a different physical network. This enables Corosync to keep the
> cluster communication alive should the dedicated network be down.
> +
> -NOTE: A single link backed by a bond is not enough to provide Corosync
> -redundancy. When a bonded interface fails and Corosync cannot fall back to
> -another link, it can lead to asymmetric communication in the cluster, which in
> -turn can lead to the cluster losing quorum.
> +NOTE: A single link backed by a bond can be problematic in certain failure
> +scenarios, see xref:pvecm_corosync_over_bonds[Corosync Over Bonds].
>
> * The root password of a cluster node is required for adding nodes.
>
> @@ -606,6 +604,68 @@ transport to `udp` or `udpu` in your xref:pvecm_edit_corosync_conf[corosync.conf
> but keep in mind that this will disable all cryptography and redundancy support.
> This is therefore not recommended.
>
> +[[pvecm_corosync_over_bonds]]
> +Corosync Over Bonds
> +~~~~~~~~~~~~~~~~~~~
> +
> +Recommendations
> +^^^^^^^^^^^^^^^
> +
> +We recommend at least one dedicated physical NIC for the primary Corosync link,
> +see xref:pvecm_cluster_requirements[Requirements].
> +xref:sysadmin_network_bond[Bonds] may be used as additional links for increased
> +redundancy. The following caveats apply *whenever a bond is used for Corosync
> +traffic*:
> +
> +* Bond mode *active-backup* may not provide the expected redundancy in certain
> + failure scenarios, see below for details.
> +
> +* We *advise against* using bond modes *balance-rr*, *balance-xor*,
> + *balance-tlb*, or *balance-alb* for Corosync traffic. They are known to be
> + problematic in certain failure scenarios, see below for details.
> +
> +* *IEEE 802.3ad (LACP)*: If LACP bonds are used for Corosync traffic, we
> + strongly recommend setting `bond-lacp-rate fast` *on the Proxmox VE node and
> + the switch*! With the default setting `bond-lacp-rate slow`, this mode is
Looking at the rendered version, having the `bond-lacp-rate fast` and
then the bold sentence afterwards seems a bit much. Maybe we could limit
the bold parts to just `Proxmox VE` and `switch` here instead?
> + known to be problematic in certain failure scenarios, see below for details.
> +
> +Background
> +^^^^^^^^^^
> +
> +Using a xref:sysadmin_network_bond[bond] as a Corosync link can be problematic
> +in certain failure scenarios. Consider the failure scenario where one of the
> +bonded interfaces fails and stops transmitting packets, but its link state
> +stays up, and there are no other Corosync links available. In this scenario,
> +some bond modes may cause a state of asymmetric connectivity where cluster
> +nodes can only communicate with different subsets of other nodes. Affected are
> +bond modes that provide load balancing, as these modes may still try to send
> +out a subset of packets via the failed interface. In case of asymmetric
> +connectivity, Corosync may not be able to form a stable quorum in the cluster.
> +If this state persists and HA is enabled, even nodes whose bond does not have
> +any issues may fence themselves. In the worst case, the whole cluster may fence
> +itself.
> +
> +The bond mode *active-backup* will not cause asymmetric connectivity in the
Maybe we can make the `not` here bold as well, to better differentiate
its behavior from the other bond modes?
> +failure scenario described above. However, the bond with the interface failure
> +may not switch over to the backup link. The node may lose connection to the
> +cluster and, if HA is enabled, fence itself.
> +
> +Bond modes *balance-rr*, *balance-xor*, *balance-tlb*, or *balance-alb* may
> +cause asymmetric connectivity in the failure scenario above, which can lead to
> +unexpected fencing if HA is enabled.
> +
> +Bond mode *IEEE 802.3ad (LACP)* can cause asymmetric connectivity in the
> +failure scenario above, but it can recover from this state, as each side of the
> +bond (Proxmox VE node and switch) can stop using a bonded interface if it has
> +not received three LACPDUs in a row on it. However, with default settings,
> +LACPDUs are only sent every 30 seconds, yielding a failover time of 90 seconds.
> +This is too long, as nodes with HA resources will fence themselves already
> +after roughly one minute without a stable quorum. If LACP bonds are used for
> +Corosync traffic, we recommend setting `bond-lacp-rate fast` on the Proxmox VE
> +node and the switch! Setting this option on one side requests the other side to
This should match the part above and be bold as well.
> +send an LACPDU every second. Setting this option on both sides can reduce the
> +failover time in the scenario above to 3 seconds and thus prevent fencing.
> +
> Separate Cluster Network
> ~~~~~~~~~~~~~~~~~~~~~~~~
>
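For readers following along, the LACP recommendation from the patch could
look roughly like the following in /etc/network/interfaces on the Proxmox VE
side. This is only a sketch: the interface names (eno1, eno2), the bond name
and the address are assumptions chosen for illustration, and the matching
fast LACP rate still has to be configured on the switch with whatever syntax
that vendor uses.

```
# Physical ports that become members of the bond (names are assumptions)
iface eno1 inet manual

iface eno2 inet manual

# LACP bond carrying Corosync traffic (address is a placeholder)
auto bond0
iface bond0 inet static
        address 10.10.10.2/24
        bond-slaves eno1 eno2
        bond-miimon 100
        bond-mode 802.3ad
        bond-xmit-hash-policy layer2+3
        # Request an LACPDU every second instead of every 30 seconds, so a
        # silently failed port is dropped after about 3 seconds, not 90
        bond-lacp-rate fast
```

With the default slow rate, three missed LACPDUs at 30 second intervals add
up to the 90 second failover time described in the patch, which is longer
than the roughly one minute after which HA nodes fence themselves.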
The changes look good to me, so consider this:
Reviewed-by: Mira Limbeck <m.limbeck at proxmox.com>