[pve-devel] [PATCH ha-manager v2 05/18] rules: add merged positive resource affinity info in global checks
Daniel Kral
d.kral at proxmox.com
Fri Oct 31 11:01:23 CET 2025
On Wed Sep 10, 2025 at 7:35 PM CEST, Thomas Lamprecht wrote:
> On 09.09.25 at 10:36, Daniel Kral wrote:
>> The node affinity and positive resource affinity rule subset is
>> checked for whether the HA resources in a positive resource affinity
>> rule are referenced by more than one node affinity rule in total.
>>
>> This check assumes that the resource sets of the positive resource
>> affinity rules are pairwise disjoint, but this is only ensured in the
>> later transformation stage, where positive resource affinity rules
>> with overlapping HA resources are merged into one rule.
>>
>> For example, the following inconsistent rules are not pruned:
>>
>> - positive resource affinity rule between vm:101 and vm:102
>> - positive resource affinity rule between vm:102 and vm:103
>> - node affinity rule for vm:101 on node1
>> - node affinity rule for vm:103 on node3
>
> This is only a real problem if both node affinity rules are configured
> to be strict. Your test case (and FWICT the code) acts that way, so this
> is mostly relevant for the commit message, to avoid potential confusion
> about which rules get/need to be pruned. Can be improved on applying
> though, no need for a v3 just for that; just wanted to note it to avoid
> forgetting it in case I do not get around to finishing the review here
> soonish.
I assumed this to be true too when I read it in September, but while
reviewing this again for the new revision of this series, I noticed
that it also prunes non-strict node affinity rules.
We follow the priority classes quite strictly for node affinity
rules/HA groups (for both non-strict and strict ones), only ever
respecting the nodes in the highest available priority class. Since the
non-member nodes of a non-strict node affinity rule are added with
priority -1, falling back to them depends on none of the
higher-priority nodes being online, and we cannot verify here whether
that will ever be the case.
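
To make this concrete, the node selection roughly behaves like the
following (a minimal, illustrative sketch only, not the actual
selection code in PVE::HA::Manager; the subroutine name is made up):

sub pick_highest_priority_class {
    my ($node_priorities, $online_nodes) = @_;

    # only nodes from the highest priority class that is currently
    # online are ever considered as candidates
    my ($best_prio, @candidates);
    for my $node (sort keys %$node_priorities) {
        next if !$online_nodes->{$node};
        my $prio = $node_priorities->{$node};
        if (!defined($best_prio) || $prio > $best_prio) {
            ($best_prio, @candidates) = ($prio, $node);
        } elsif ($prio == $best_prio) {
            push @candidates, $node;
        }
    }

    return \@candidates;
}

For a non-strict node affinity rule, the non-member nodes are simply
part of $node_priorities with priority -1, so they can only ever be
picked when none of the configured member nodes is online.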
A more revealing example would be a cluster with three nodes (node1,
node2, and node3) and the following rules (based on the example above):
- positive resource affinity rule between vm:101 and vm:102
- positive resource affinity rule between vm:102 and vm:103
- non-strict node affinity rule for vm:101 on node1:3,node2:2
- non-strict node affinity rule for vm:103 on node3:3,node2:2
This rule set would only be consistent while node1 and node3 are both
down, as only then do all resources agree on node2; vm:101 and vm:103
would never fall back to node3 and node1 respectively, since that would
require all of their higher-priority nodes to be down at once, i.e. the
whole cluster to be offline.
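
Spelled out with the effective priority hashes from the sketch above
(again purely illustrative):

# effective priorities as seen by the two non-strict rules above
my $vm101_prios = { node1 => 3, node2 => 2, node3 => -1 };
my $vm103_prios = { node3 => 3, node2 => 2, node1 => -1 };

# all nodes online: the rules disagree (node1 vs. node3)
my $online = { node1 => 1, node2 => 1, node3 => 1 };
# pick_highest_priority_class($vm101_prios, $online) -> [ 'node1' ]
# pick_highest_priority_class($vm103_prios, $online) -> [ 'node3' ]

# only with node1 and node3 down do both agree on node2
$online = { node2 => 1 };
# both calls now return [ 'node2' ]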
I'll clarify this in the commit message and test cases for the v3.
FWIW, it might be worth looking into loosening this behavior up a bit
by counting priorities as weights, as we briefly talked about off-list
as far as I can remember, but as I've seen at least a few users who
depend on the current behavior, we'd need to make this a (per-rule?)
flag.
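
Just to illustrate what I mean by that (purely hypothetical, nothing
like this is implemented): summing the priorities that all resources in
a merged positive affinity set assign to each node would make node2 the
best common choice in the example above:

sub score_nodes_by_weight {
    my ($online_nodes, @node_priorities) = @_;

    # hypothetical: score each node by the summed priorities of all
    # resources that must be kept together
    my %score;
    for my $prios (@node_priorities) {
        for my $node (keys %$prios) {
            $score{$node} += $prios->{$node} if $online_nodes->{$node};
        }
    }

    # example above: node2 => 4, while node1 => 2 and node3 => 2
    return \%score;
}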
>>
>> Therefore, build the same disjoint positive resource affinity
>> resource sets as the merge_connected_positive_resource_affinity_rules(...)
>> subroutine does, so that the inconsistency check has the necessary
>> information in advance.
>>
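
For reference, building those disjoint sets is essentially a union pass
over the rules, roughly like the following (an illustrative sketch, not
the actual merge_connected_positive_resource_affinity_rules(...) code):

sub build_disjoint_resource_sets {
    my (@positive_rules) = @_; # each rule: hashref of its resource ids

    my @sets;
    for my $rule (@positive_rules) {
        my %merged = %$rule;
        my @rest;
        for my $set (@sets) {
            # the collected sets are pairwise disjoint, so overlapping
            # with the merged result implies overlapping with the rule
            if (grep { $set->{$_} } keys %$rule) {
                %merged = (%merged, %$set);
            } else {
                push @rest, $set;
            }
        }
        @sets = (@rest, \%merged);
    }

    # e.g. { vm:101, vm:102 } and { vm:102, vm:103 } end up as one set
    return \@sets;
}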