[pve-devel] [PATCH docs v3 1/1] ha: add documentation about ha resource affinity rules
Friedrich Weber
f.weber at proxmox.com
Wed Jul 9 08:19:02 CEST 2025
On 08/07/2025 18:08, Shannon Sterz wrote:
> On Fri Jul 4, 2025 at 8:20 PM CEST, Daniel Kral wrote:
>> Add documentation about HA Resource Affinity rules, what effects those
>> have on the CRS scheduler, and what users can expect when those are
>> changed.
>>
>> There are also a few points on the rule conflicts/errors list which
>> describe some conflicts that can arise from a mixed usage of HA Node
>> Affinity rules and HA Resource Affinity rules.
>>
>> Signed-off-by: Daniel Kral <d.kral at proxmox.com>
>> ---
>> Makefile | 1 +
>> gen-ha-rules-resource-affinity-opts.pl | 20 ++++
>> ha-manager.adoc | 133 +++++++++++++++++++++++++
>> ha-rules-resource-affinity-opts.adoc | 8 ++
>> 4 files changed, 162 insertions(+)
>> create mode 100755 gen-ha-rules-resource-affinity-opts.pl
>> create mode 100644 ha-rules-resource-affinity-opts.adoc
>>
>> diff --git a/Makefile b/Makefile
>> index c5e506e..4d9e2f0 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -51,6 +51,7 @@ GEN_SCRIPTS= \
>> gen-ha-resources-opts.pl \
>> gen-ha-rules-node-affinity-opts.pl \
>> gen-ha-rules-opts.pl \
>> + gen-ha-rules-resource-affinity-opts.pl \
>> gen-datacenter.cfg.5-opts.pl \
>> gen-pct.conf.5-opts.pl \
>> gen-pct-network-opts.pl \
>> diff --git a/gen-ha-rules-resource-affinity-opts.pl b/gen-ha-rules-resource-affinity-opts.pl
>> new file mode 100755
>> index 0000000..5abed50
>> --- /dev/null
>> +++ b/gen-ha-rules-resource-affinity-opts.pl
>> @@ -0,0 +1,20 @@
>> +#!/usr/bin/perl
>> +
>> +use lib '.';
>> +use strict;
>> +use warnings;
>> +use PVE::RESTHandler;
>> +
>> +use Data::Dumper;
>> +
>> +use PVE::HA::Rules;
>> +use PVE::HA::Rules::ResourceAffinity;
>> +
>> +my $private = PVE::HA::Rules::private();
>> +my $resource_affinity_rule_props = PVE::HA::Rules::ResourceAffinity::properties();
>> +my $properties = {
>> + resources => $private->{propertyList}->{resources},
>> + $resource_affinity_rule_props->%*,
>> +};
>> +
>> +print PVE::RESTHandler::dump_properties($properties);
>> diff --git a/ha-manager.adoc b/ha-manager.adoc
>> index ec26c22..8d06885 100644
>> --- a/ha-manager.adoc
>> +++ b/ha-manager.adoc
>> @@ -692,6 +692,10 @@ include::ha-rules-opts.adoc[]
>> | HA Rule Type | Description
>> | `node-affinity` | Places affinity from one or more HA resources to one or
>> more nodes.
>> +| `resource-affinity` | Places affinity between two or more HA resources. The
>> +affinity `separate` specifies that HA resources are to be kept on separate
>> +nodes, while the affinity `together` specifies that HA resources are to be kept
>> +on the same node.
>
> here it's called "together" (or "separate")...
>
>> |===========================================================
>>
>> [[ha_manager_node_affinity_rules]]
>> @@ -758,6 +762,88 @@ Node Affinity Rule Properties
>>
>> include::ha-rules-node-affinity-opts.adoc[]
>>
>> +[[ha_manager_resource_affinity_rules]]
>> +Resource Affinity Rules
>> +^^^^^^^^^^^^^^^^^^^^^^^
>> +
>> +Another common requirement is that two or more HA resources should run on
>> +either the same node, or should be distributed on separate nodes. These are
>> +also commonly called "Affinity/Anti-Affinity constraints".
>> +
>> +For example, suppose there is a lot of communication traffic between the HA
>> +resources `vm:100` and `vm:200`, e.g., a web server communicating with a
>
> nit: just a small heads up, we recommend avoiding "e.g." as it often gets
> confused with "i.e." [1]. you could use `for example` instead to make
> this a bit clearer (same below)
>
> [1]: https://pve.proxmox.com/wiki/Technical_Writing_Style_Guide#Abbreviations
>
>> +database server. If those HA resources are on separate nodes, this could
>> +potentially result in a higher latency and unnecessary network load. Resource
>> +affinity rules with the affinity `positive` implement the constraint to keep
>> +the HA resources on the same node:
>> +
>> +----
>> +# ha-manager rules add resource-affinity keep-together \
>> + --affinity positive --resources vm:100,vm:200
>> +----
>
> ... here it is specified as "positive"? did i miss something or is that
> incorrect?
Good catch, but I think it should be "positive"/"negative", so it's
"together" and "separate" that are outdated. A lot of the naming was
changed between v2 and v3, including "together"->"positive", and
"separate"->"negative" [1] so they're probably leftovers from before the
rename.
[1] https://lore.proxmox.com/pve-devel/7fb94369-d8b6-47c6-b36c-428db5bb85de@proxmox.com/
>
>> +
>> +NOTE: If there are two or more positive resource affinity rules, which have
>> +common HA resources, then these are treated as a single positive resource
>> +affinity rule. For example, if the HA resources `vm:100` and `vm:101` and the
>> +HA resources `vm:101` and `vm:102` are each in a positive resource affinity
>> +rule, then it is the same as if `vm:100`, `vm:101` and `vm:102` would have been
>> +in a single positive resource affinity rule.
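As I read this NOTE, positive rules that share HA resources behave like
connected components: any chain of overlapping rules collapses into one
group. Below is a minimal Python sketch of that reading. It is not the
actual PVE::HA::Rules implementation; the function name and the
"list of sets" rule representation are made up for illustration.

```python
from collections import defaultdict


def merge_positive_rules(rules):
    """Merge positive rules that share at least one HA resource.

    `rules` is a list of sets of resource IDs (hypothetical representation).
    Returns the merged groups as a list of sets, in a stable order.
    """
    parent = {}

    def find(resource):
        # Register unseen resources as their own group root.
        parent.setdefault(resource, resource)
        while parent[resource] != resource:
            parent[resource] = parent[parent[resource]]  # path halving
            resource = parent[resource]
        return resource

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for rule in rules:
        members = sorted(rule)
        if not members:
            continue
        find(members[0])  # make sure single-member rules are registered too
        for other in members[1:]:
            union(members[0], other)

    groups = defaultdict(set)
    for resource in list(parent):
        groups[find(resource)].add(resource)
    return sorted(groups.values(), key=sorted)


if __name__ == "__main__":
    # The example from the NOTE: two overlapping rules become one group.
    print(merge_positive_rules([{"vm:100", "vm:101"}, {"vm:101", "vm:102"}]))
```

If that matches the intended behavior, the NOTE reads fine to me as is.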
>> +
>> +However, suppose there are computationally expensive, and/or distributed
>> +programs running on the HA resources `vm:200` and `ct:300`, e.g., sharded
>> +database instances. In that case, running them on the same node could
>> +potentially result in pressure on the hardware resources of the node and will
>> +slow down the operations of these HA resources. Resource affinity rules with
>> +the affinity `negative` implement the constraint to spread the HA resources on
>> +separate nodes:
>> +
>> +----
>> +# ha-manager rules add resource-affinity keep-separate \
>> + --affinity negative --resources vm:200,ct:300
>> +----
>
> ... same here with "separate" or "negative"
>
>> +
>> +Other than node affinity rules, resource affinity rules are strict by default,
>> +i.e., if the constraints imposed by the resource affinity rules cannot be met
>> +for a HA resource, the HA Manager will put the HA resource in recovery state in
>> +case of a failover or in error state elsewhere.
>> +
>> +The above commands created the following rules in the rules configuration file:
>> +
>> +.Resource Affinity Rules Configuration Example (`/etc/pve/ha/rules.cfg`)
>> +----
>> +resource-affinity: keep-together
>> + resources vm:100,vm:200
>> + affinity positive
>> +
>> +resource-affinity: keep-separate
>> + resources vm:200,ct:300
>> + affinity negative
>> +----
>> +
>> +Interactions between Positive and Negative Resource Affinity Rules
>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> +
>> +If there are HA resources in a positive resource affinity rule, which are also
>> +part of a negative resource affinity rule, then all the other HA resources in
>> +the positive resource affinity rule are in negative affinity with the HA
>> +resources of these negative resource affinity rules as well.
>> +
>> +For example, if the HA resources `vm:100`, `vm:101`, and `vm:102` are in a
>> +positive resource affinity rule, and `vm:100` is in a negative resource affinity
>> +rule with the HA resource `ct:200`, then `vm:101` and `vm:102` are each in
>> +negative resource affinity with `ct:200` as well.
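To check my understanding of the inference described here, I wrote it
down as a small Python sketch. The function name and the set-based
arguments are hypothetical and only model the paragraph above, not the
real HA Manager code. I chose to raise an error when the two rules share
two or more resources, since the next paragraph says such rules conflict
and get disabled.

```python
def implied_negative_pairs(positive_rule, negative_rule):
    """Return the (positive member, negative member) pairs that must be
    kept apart because the two rules share exactly one HA resource.

    Both arguments are iterables of resource IDs (hypothetical model).
    """
    positive, negative = set(positive_rule), set(negative_rule)
    shared = positive & negative
    if len(shared) >= 2:
        # Assumption: mirrors the conflict case, where both rules are disabled.
        raise ValueError("rules conflict: share two or more HA resources")
    if not shared:
        return set()
    # Every member of the positive rule inherits the negative affinity
    # towards the remaining members of the negative rule.
    others = negative - positive
    return {(p, n) for p in positive for n in others}


if __name__ == "__main__":
    pairs = implied_negative_pairs(
        {"vm:100", "vm:101", "vm:102"}, {"vm:100", "ct:200"}
    )
    for pair in sorted(pairs):
        print(pair)
```

For the example in the text, this yields `ct:200` paired with each of
`vm:100`, `vm:101` and `vm:102`, which is what the paragraph says.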
>> +
>> +Note that if there are two or more HA resources in both a positive and negative
>> +resource affinity rule, then those will be disabled as they cause a conflict:
>> +Two or more HA resources cannot be kept on the same node and separated on
>> +different nodes at the same time. For more information on these cases, see the
>> +section about xref:ha_manager_rule_conflicts[rule conflicts and errors] below.
>> +
>> +Resource Affinity Rule Properties
>> ++++++++++++++++++++++++++++++++++
>> +
>> +include::ha-rules-resource-affinity-opts.adoc[]
>> +
>> [[ha_manager_rule_conflicts]]
>> Rule Conflicts and Errors
>> ~~~~~~~~~~~~~~~~~~~~~~~~~
>> @@ -774,6 +860,43 @@ Currently, HA rules are checked for the following feasibility tests:
>> total. If two or more HA node affinity rules specify the same HA resource,
>> these HA node affinity rules will be disabled.
>>
>> +* A HA resource affinity rule must specify at least two HA resources to be
>> + feasible. If a HA resource affinity rule does specify only one HA resource,
>
> nit: get rid of the "does", it makes this already very long and hard to
> parse sentence even harder to read.
>
>> + the HA resource affinity rule will be disabled.
>> +
>> +* A HA resource affinity rule must specify no more HA resources than there are
>> + nodes in the cluster. If a HA resource affinity rule does specify more HA
>
> same here
>
>> + resources than there are in the cluster, the HA resource affinity rule will be
>> + disabled.
>> +
>> +* A positive HA resource affinity rule cannot specify the same two or more HA
>> + resources as a negative HA resources affinity rule. That is, two or more HA
>> + resources cannot be kept together and separate at the same time. If any pair
>> + of positive and negative HA resource affinity rules do specify the same two or
>> + more HA resources, both HA resource affinity rules will be disabled.
>> +
>> +* A HA resource, which is already constrained by a HA node affinity rule, can
>> + only be referenced by a HA resource affinity rule, if the HA node affinity
>> + rule does only use a single priority group. That is, the specified nodes in
>
> and here
>
>> + the HA node affinity rule have the same priority. If one of the HA resources
>> + in a HA resource affinity rule is constrainted by a HA node affinity rule with
>
> typo: constrainted -> constrained
>
>> + multiple priority groups, the HA resource affinity rule will be disabled.
>> +
>> +* The HA resources of a positive HA resource affinity rule, which are
>> + constrained by HA node affinity rules, must have at least one common node,
>> + where the HA resources are allowed to run on. Otherwise, the HA resources
>> + could only run on separate nodes. In other words, if two or more HA resources
>> + of a positive HA resource affinity rule are constrained to different nodes,
>> + the positive HA resource affinity rule will be disabled.
>> +
>> +* The HA resources of a negative HA resource affinity rule, which are
>> + constrained by HA node affinity rules, must have at least enough nodes to
>> + separate these constrained HA resources on. Otherwise, the HA resources do not
>
> nit: the "on" here is not necessary.
>
>> + have enough nodes to be separated on. In other words, if two or more HA
>
> same here.
>
>> + resources of a negative HA resource affinity rule are constrained to less
>> + nodes than needed to separate them on, the negative HA resource affinity rule
>
> and here
>
>> + will be disabled.
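Since the list is fairly dense, here is how I would summarize the first
three resource affinity checks as a rough Python sketch. This is only my
reading of the text, not the actual feasibility checks in pve-ha-manager.
The function name, the dict layout and the reason strings are all made
up. One assumption to flag: I applied the "no more resources than nodes"
check to negative rules only, because that is where I would expect it to
matter; the quoted text does not restrict it that way.

```python
def check_resource_affinity_rules(rules, node_count):
    """Return {rule name: [reasons]} for rules that would be disabled.

    `rules` maps a rule name to a dict with the keys "affinity"
    ("positive" or "negative") and "resources" (a set of resource IDs).
    This layout is a hypothetical model of the rules configuration.
    """
    problems = {}

    def flag(name, reason):
        problems.setdefault(name, []).append(reason)

    for name, rule in rules.items():
        count = len(rule["resources"])
        if count < 2:
            flag(name, "needs at least two HA resources")
        # Assumption: only negative rules need one node per resource.
        if rule["affinity"] == "negative" and count > node_count:
            flag(name, "more HA resources than nodes")

    positives = [(n, r) for n, r in rules.items() if r["affinity"] == "positive"]
    negatives = [(n, r) for n, r in rules.items() if r["affinity"] == "negative"]
    for pos_name, pos in positives:
        for neg_name, neg in negatives:
            # Two or more shared resources cannot be together and apart.
            if len(pos["resources"] & neg["resources"]) >= 2:
                flag(pos_name, f"conflicts with negative rule {neg_name}")
                flag(neg_name, f"conflicts with positive rule {pos_name}")
    return problems


if __name__ == "__main__":
    example = {
        "keep-together": {"affinity": "positive", "resources": {"vm:100", "vm:200"}},
        "keep-separate": {"affinity": "negative", "resources": {"vm:200", "ct:300"}},
    }
    print(check_resource_affinity_rules(example, node_count=3))
```

With the two rules from the configuration example and three nodes, none
of these checks trigger, since the rules only share `vm:200`. The node
affinity interactions further down are not covered by this sketch.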
>> +
>> [[ha_manager_fencing]]
>> Fencing
>> -------
>> @@ -1205,6 +1328,16 @@ The CRS is currently used at the following scheduling points:
>> algorithm to ensure that these HA resources are assigned according to their
>> node and priority constraints.
>>
>> +** Positive resource affinity rules: If a positive resource affinity rule is
>> + created or HA resources are added to an existing positive resource affinity
>> + rule, the HA stack will use the CRS algorithm to ensure that these HA
>> + resources are moved to a common node.
>> +
>> +** Negative resource affinity rules: If a negative resource affinity rule is
>> + created or HA resources are added to an existing negative resource affinity
>> + rule, the HA stack will use the CRS algorithm to ensure that these HA
>> + resources are moved to separate nodes.
>> +
>> - HA service stopped -> start transition (opt-in). Requesting that a stopped
>> service should be started is an good opportunity to check for the best suited
>> node as per the CRS algorithm, as moving stopped services is cheaper to do
>> diff --git a/ha-rules-resource-affinity-opts.adoc b/ha-rules-resource-affinity-opts.adoc
>> new file mode 100644
>> index 0000000..596ec3c
>> --- /dev/null
>> +++ b/ha-rules-resource-affinity-opts.adoc
>> @@ -0,0 +1,8 @@
>> +`affinity`: `<negative | positive>` ::
>> +
>> +Describes whether the HA resources are supposed to be kept on the same node ('positive'), or are supposed to be kept on separate nodes ('negative').
>> +
>> +`resources`: `<type>:<name>{,<type>:<name>}*` ::
>> +
>> +List of HA resource IDs. This consists of a list of resource types followed by a resource specific name separated with a colon (example: vm:100,ct:101).
>> +
>