[pve-devel] [PATCH cluster/docs/ha-manager/manager v3 00/20] HA Rules
Daniel Kral
d.kral at proxmox.com
Fri Jul 4 20:16:39 CEST 2025
RFC v1: https://lore.proxmox.com/pve-devel/20250325151254.193177-1-d.kral@proxmox.com/
RFC v2: https://lore.proxmox.com/pve-devel/20250620143148.218469-1-d.kral@proxmox.com/
I've separated the core HA Rules module and the transformation from HA
groups to HA Node Affinity rules (formerly known as HA Location rules)
into this patch series, to reduce the overhead for reviewers and to keep
a cleaner version history, as changing two things at a time is rather
confusing.
The main things that have changed since the last version (v2):
- split up the patch series (ofc)
- rebased on newest available master
- renamed "HA Location Rule" to "HA Node Affinity Rule"
- renamed every reference to an 'HA service' to 'HA resource' (e.g. the
  rules property 'services' is now 'resources')
- converted the tri-state property 'state' to a binary 'disable' flag on
  HA rules and exposed the 'contradictory' state through an 'errors' hash
- removed the "use-location-rules" feature flag and implemented a more
  straightforward HA groups migration (more on that below)
- removed any reference to HA groups from the web interface
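To illustrate the 'state' to 'disable' conversion from the list above, here is a minimal Python sketch. It is not code from the series (which is Perl). It assumes that the old tri-state values were 'enabled', 'disabled' and 'contradictory' (only the last one is named above), and the function name, the rule layout and the shape of the errors hash are hypothetical.

```python
def convert_rule_state(rule_id, rule):
    """Split a legacy tri-state 'state' into a 'disable' flag plus an errors hash.

    Assumption: legacy values are 'enabled', 'disabled', 'contradictory'.
    Returns (new_rule, errors); errors maps rule id -> list of messages.
    """
    # Copy every property except the legacy 'state' one.
    new_rule = {key: value for key, value in rule.items() if key != "state"}
    errors = {}

    state = rule.get("state", "enabled")
    if state == "disabled":
        # The only state the user sets explicitly becomes a binary flag.
        new_rule["disable"] = 1
    elif state == "contradictory":
        # Not stored on the rule anymore; reported next to it instead.
        errors[rule_id] = ["rule is contradictory"]

    return new_rule, errors


if __name__ == "__main__":
    print(convert_rule_state("rule1", {"nodes": "node1", "state": "disabled"}))
    print(convert_rule_state("rule2", {"nodes": "node2", "state": "contradictory"}))
```

The point of the split is that 'disable' is something a user configures, while 'contradictory' is something the checker derives, so the two no longer share one property.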
As before, HA groups are migrated to HA node affinity rules in each HA
Manager round in which the HA groups or HA resources config file has
changed, but this is now done unconditionally as soon as an HA Manager
runs with this version. It will also try to migrate them persistently,
but that will only succeed once all other nodes are upgraded (i.e. every
node can run at least the HA Manager version that can successfully parse
and apply the HA rules).
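The in-memory migration described above can be pictured roughly as follows. This is a hedged Python sketch, not the Perl code from the series: the group properties 'restricted' and 'nofailback', their mapping to 'strict' and the resource-level 'failback', the rule type name and the function name are all assumptions made for illustration.

```python
def migrate_group(group_id, group, resources):
    """Derive one node affinity rule from one HA group (illustrative only).

    Assumptions: a group has 'nodes' plus optional 'restricted' and
    'nofailback'; a resource references its group via a 'group' property.
    """
    # All resources that were members of the group end up in one rule.
    members = sorted(
        sid for sid, cfg in resources.items() if cfg.get("group") == group_id
    )
    rule = {
        "type": "node-affinity",
        "resources": members,
        "nodes": group["nodes"],
        # Assumed mapping: a restricted group becomes a strict rule.
        "strict": bool(group.get("restricted", False)),
    }

    # Assumed mapping: the group's 'nofailback' becomes an inverted
    # per-resource 'failback' flag, and the group reference is dropped.
    updated = {}
    for sid, cfg in resources.items():
        new_cfg = dict(cfg)
        if new_cfg.pop("group", None) == group_id:
            new_cfg["failback"] = not group.get("nofailback", False)
        elif "group" in cfg:
            new_cfg["group"] = cfg["group"]  # keep references to other groups
        updated[sid] = new_cfg

    return rule, updated


if __name__ == "__main__":
    group = {"nodes": "node1,node2", "restricted": 1}
    resources = {"vm:101": {"group": "g1"}, "vm:102": {}}
    print(migrate_group("g1", group, resources))
```

Because the function only returns new structures and never writes anything, it can be re-run in every manager round, which matches the in-memory half of the migration. The persistent half would additionally have to write the result back to the config files.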
There are still some things left to do, which I didn't get around to
for this revision:
- Testing, testing, testing
- I ran out of time on the persistent HA groups migration part, which
  has at least the two TODOs mentioned in the patch itself, and I
  haven't tested it on any real PVE upgrade yet; it's more of a draft
  of how the migration could work
- Also, the last patch for the persistent HA groups migration part will
  fail all tests except the two that have been added, because of the
  way the other tests are designed; that should be abstracted away in
  the HA environment, e.g., with a routine "have_groups_been_migrated"
  for PVE2/Sim.
- There might be a few too many in-memory group migrations on the HA
  Rules API side now, but better safe than sorry; maybe they can be
  removed later. However, these shouldn't overwrite the rules that come
  from the config, and I haven't checked that yet
- Should the HA Groups API (and the 'group' property in the HA Resources
  API) be removed now? Or should these stay, with any use of them
  triggering an auto-migration to HA rules?
As in the previous revisions, I've run a
    git rebase master --exec 'make clean && make deb'
on the series, so the tests should work for every patch.
cluster:
Daniel Kral (1):
cfs: add 'ha/rules.cfg' to observed files
src/PVE/Cluster.pm | 1 +
src/pmxcfs/status.c | 1 +
2 files changed, 2 insertions(+)
base-commit: 60e36c87b0fffe6dbdd5b1be72a9273b6f7cec2b
prerequisite-patch-id: 50b1021d35ecf86562d33dc6068c90e219557ab7
prerequisite-patch-id: 0374f409a039eebe9dd7587d6c018ef71ac2c67d
prerequisite-patch-id: d17849368da2aa61fcab9e08235f8673a2d0258e
ha-manager:
Daniel Kral (15):
tree-wide: make arguments for select_service_node explicit
manager: improve signature of select_service_node
introduce rules base plugin
rules: introduce node affinity rule plugin
config, env, hw: add rules read and parse methods
config: delete services from rules if services are deleted from config
manager: read and update rules config
test: ha tester: add test cases for future node affinity rules
resources: introduce failback property in ha resource config
manager: migrate ha groups to node affinity rules in-memory
manager: apply node affinity rules when selecting service nodes
test: add test cases for rules config
api: introduce ha rules api endpoints
cli: expose ha rules api endpoints to ha-manager cli
manager: persistently migrate ha groups to ha rules
.gitignore | 1 +
debian/pve-ha-manager.install | 3 +
src/PVE/API2/HA/Makefile | 2 +-
src/PVE/API2/HA/Resources.pm | 9 +
src/PVE/API2/HA/Rules.pm | 391 +++++++++++++++
src/PVE/API2/HA/Status.pm | 11 +-
src/PVE/CLI/ha_manager.pm | 32 ++
src/PVE/HA/Config.pm | 58 ++-
src/PVE/HA/Env.pm | 30 ++
src/PVE/HA/Env/PVE2.pm | 40 ++
src/PVE/HA/Groups.pm | 48 ++
src/PVE/HA/Makefile | 3 +-
src/PVE/HA/Manager.pm | 259 ++++++----
src/PVE/HA/Resources.pm | 9 +
src/PVE/HA/Resources/PVECT.pm | 1 +
src/PVE/HA/Resources/PVEVM.pm | 1 +
src/PVE/HA/Rules.pm | 455 ++++++++++++++++++
src/PVE/HA/Rules/Makefile | 6 +
src/PVE/HA/Rules/NodeAffinity.pm | 296 ++++++++++++
src/PVE/HA/Sim/Env.pm | 44 ++
src/PVE/HA/Sim/Hardware.pm | 44 ++
src/PVE/HA/Tools.pm | 46 ++
src/test/Makefile | 4 +-
.../defaults-for-node-affinity-rules.cfg | 22 +
...efaults-for-node-affinity-rules.cfg.expect | 60 +++
...e-resource-refs-in-node-affinity-rules.cfg | 31 ++
...rce-refs-in-node-affinity-rules.cfg.expect | 63 +++
src/test/test-group-migrate1/README | 10 +
src/test/test-group-migrate1/cmdlist | 3 +
src/test/test-group-migrate1/groups | 7 +
src/test/test-group-migrate1/hardware_status | 5 +
src/test/test-group-migrate1/log.expect | 306 ++++++++++++
src/test/test-group-migrate1/manager_status | 1 +
src/test/test-group-migrate1/service_config | 5 +
src/test/test-group-migrate2/README | 10 +
src/test/test-group-migrate2/cmdlist | 3 +
src/test/test-group-migrate2/groups | 7 +
src/test/test-group-migrate2/hardware_status | 5 +
src/test/test-group-migrate2/log.expect | 47 ++
src/test/test-group-migrate2/manager_status | 1 +
src/test/test-group-migrate2/service_config | 5 +
src/test/test-node-affinity-nonstrict1/README | 10 +
.../test-node-affinity-nonstrict1/cmdlist | 4 +
src/test/test-node-affinity-nonstrict1/groups | 2 +
.../hardware_status | 5 +
.../test-node-affinity-nonstrict1/log.expect | 40 ++
.../manager_status | 1 +
.../service_config | 3 +
src/test/test-node-affinity-nonstrict2/README | 12 +
.../test-node-affinity-nonstrict2/cmdlist | 4 +
src/test/test-node-affinity-nonstrict2/groups | 3 +
.../hardware_status | 5 +
.../test-node-affinity-nonstrict2/log.expect | 35 ++
.../manager_status | 1 +
.../service_config | 3 +
src/test/test-node-affinity-nonstrict3/README | 10 +
.../test-node-affinity-nonstrict3/cmdlist | 4 +
src/test/test-node-affinity-nonstrict3/groups | 2 +
.../hardware_status | 5 +
.../test-node-affinity-nonstrict3/log.expect | 56 +++
.../manager_status | 1 +
.../service_config | 5 +
src/test/test-node-affinity-nonstrict4/README | 14 +
.../test-node-affinity-nonstrict4/cmdlist | 4 +
src/test/test-node-affinity-nonstrict4/groups | 2 +
.../hardware_status | 5 +
.../test-node-affinity-nonstrict4/log.expect | 54 +++
.../manager_status | 1 +
.../service_config | 5 +
src/test/test-node-affinity-nonstrict5/README | 16 +
.../test-node-affinity-nonstrict5/cmdlist | 5 +
src/test/test-node-affinity-nonstrict5/groups | 2 +
.../hardware_status | 5 +
.../test-node-affinity-nonstrict5/log.expect | 66 +++
.../manager_status | 1 +
.../service_config | 3 +
src/test/test-node-affinity-nonstrict6/README | 14 +
.../test-node-affinity-nonstrict6/cmdlist | 5 +
src/test/test-node-affinity-nonstrict6/groups | 3 +
.../hardware_status | 5 +
.../test-node-affinity-nonstrict6/log.expect | 52 ++
.../manager_status | 1 +
.../service_config | 3 +
src/test/test-node-affinity-strict1/README | 10 +
src/test/test-node-affinity-strict1/cmdlist | 4 +
src/test/test-node-affinity-strict1/groups | 3 +
.../hardware_status | 5 +
.../test-node-affinity-strict1/log.expect | 40 ++
.../test-node-affinity-strict1/manager_status | 1 +
.../test-node-affinity-strict1/service_config | 3 +
src/test/test-node-affinity-strict2/README | 11 +
src/test/test-node-affinity-strict2/cmdlist | 4 +
src/test/test-node-affinity-strict2/groups | 4 +
.../hardware_status | 5 +
.../test-node-affinity-strict2/log.expect | 40 ++
.../test-node-affinity-strict2/manager_status | 1 +
.../test-node-affinity-strict2/service_config | 3 +
src/test/test-node-affinity-strict3/README | 10 +
src/test/test-node-affinity-strict3/cmdlist | 4 +
src/test/test-node-affinity-strict3/groups | 3 +
.../hardware_status | 5 +
.../test-node-affinity-strict3/log.expect | 74 +++
.../test-node-affinity-strict3/manager_status | 1 +
.../test-node-affinity-strict3/service_config | 5 +
src/test/test-node-affinity-strict4/README | 14 +
src/test/test-node-affinity-strict4/cmdlist | 4 +
src/test/test-node-affinity-strict4/groups | 3 +
.../hardware_status | 5 +
.../test-node-affinity-strict4/log.expect | 54 +++
.../test-node-affinity-strict4/manager_status | 1 +
.../test-node-affinity-strict4/service_config | 5 +
src/test/test-node-affinity-strict5/README | 16 +
src/test/test-node-affinity-strict5/cmdlist | 5 +
src/test/test-node-affinity-strict5/groups | 3 +
.../hardware_status | 5 +
.../test-node-affinity-strict5/log.expect | 66 +++
.../test-node-affinity-strict5/manager_status | 1 +
.../test-node-affinity-strict5/service_config | 3 +
src/test/test-node-affinity-strict6/README | 14 +
src/test/test-node-affinity-strict6/cmdlist | 5 +
src/test/test-node-affinity-strict6/groups | 4 +
.../hardware_status | 5 +
.../test-node-affinity-strict6/log.expect | 52 ++
.../test-node-affinity-strict6/manager_status | 1 +
.../test-node-affinity-strict6/service_config | 3 +
src/test/test_failover1.pl | 27 +-
src/test/test_rules_config.pl | 100 ++++
127 files changed, 3398 insertions(+), 95 deletions(-)
create mode 100644 src/PVE/API2/HA/Rules.pm
create mode 100644 src/PVE/HA/Rules.pm
create mode 100644 src/PVE/HA/Rules/Makefile
create mode 100644 src/PVE/HA/Rules/NodeAffinity.pm
create mode 100644 src/test/rules_cfgs/defaults-for-node-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/defaults-for-node-affinity-rules.cfg.expect
create mode 100644 src/test/rules_cfgs/multiple-resource-refs-in-node-affinity-rules.cfg
create mode 100644 src/test/rules_cfgs/multiple-resource-refs-in-node-affinity-rules.cfg.expect
create mode 100644 src/test/test-group-migrate1/README
create mode 100644 src/test/test-group-migrate1/cmdlist
create mode 100644 src/test/test-group-migrate1/groups
create mode 100644 src/test/test-group-migrate1/hardware_status
create mode 100644 src/test/test-group-migrate1/log.expect
create mode 100644 src/test/test-group-migrate1/manager_status
create mode 100644 src/test/test-group-migrate1/service_config
create mode 100644 src/test/test-group-migrate2/README
create mode 100644 src/test/test-group-migrate2/cmdlist
create mode 100644 src/test/test-group-migrate2/groups
create mode 100644 src/test/test-group-migrate2/hardware_status
create mode 100644 src/test/test-group-migrate2/log.expect
create mode 100644 src/test/test-group-migrate2/manager_status
create mode 100644 src/test/test-group-migrate2/service_config
create mode 100644 src/test/test-node-affinity-nonstrict1/README
create mode 100644 src/test/test-node-affinity-nonstrict1/cmdlist
create mode 100644 src/test/test-node-affinity-nonstrict1/groups
create mode 100644 src/test/test-node-affinity-nonstrict1/hardware_status
create mode 100644 src/test/test-node-affinity-nonstrict1/log.expect
create mode 100644 src/test/test-node-affinity-nonstrict1/manager_status
create mode 100644 src/test/test-node-affinity-nonstrict1/service_config
create mode 100644 src/test/test-node-affinity-nonstrict2/README
create mode 100644 src/test/test-node-affinity-nonstrict2/cmdlist
create mode 100644 src/test/test-node-affinity-nonstrict2/groups
create mode 100644 src/test/test-node-affinity-nonstrict2/hardware_status
create mode 100644 src/test/test-node-affinity-nonstrict2/log.expect
create mode 100644 src/test/test-node-affinity-nonstrict2/manager_status
create mode 100644 src/test/test-node-affinity-nonstrict2/service_config
create mode 100644 src/test/test-node-affinity-nonstrict3/README
create mode 100644 src/test/test-node-affinity-nonstrict3/cmdlist
create mode 100644 src/test/test-node-affinity-nonstrict3/groups
create mode 100644 src/test/test-node-affinity-nonstrict3/hardware_status
create mode 100644 src/test/test-node-affinity-nonstrict3/log.expect
create mode 100644 src/test/test-node-affinity-nonstrict3/manager_status
create mode 100644 src/test/test-node-affinity-nonstrict3/service_config
create mode 100644 src/test/test-node-affinity-nonstrict4/README
create mode 100644 src/test/test-node-affinity-nonstrict4/cmdlist
create mode 100644 src/test/test-node-affinity-nonstrict4/groups
create mode 100644 src/test/test-node-affinity-nonstrict4/hardware_status
create mode 100644 src/test/test-node-affinity-nonstrict4/log.expect
create mode 100644 src/test/test-node-affinity-nonstrict4/manager_status
create mode 100644 src/test/test-node-affinity-nonstrict4/service_config
create mode 100644 src/test/test-node-affinity-nonstrict5/README
create mode 100644 src/test/test-node-affinity-nonstrict5/cmdlist
create mode 100644 src/test/test-node-affinity-nonstrict5/groups
create mode 100644 src/test/test-node-affinity-nonstrict5/hardware_status
create mode 100644 src/test/test-node-affinity-nonstrict5/log.expect
create mode 100644 src/test/test-node-affinity-nonstrict5/manager_status
create mode 100644 src/test/test-node-affinity-nonstrict5/service_config
create mode 100644 src/test/test-node-affinity-nonstrict6/README
create mode 100644 src/test/test-node-affinity-nonstrict6/cmdlist
create mode 100644 src/test/test-node-affinity-nonstrict6/groups
create mode 100644 src/test/test-node-affinity-nonstrict6/hardware_status
create mode 100644 src/test/test-node-affinity-nonstrict6/log.expect
create mode 100644 src/test/test-node-affinity-nonstrict6/manager_status
create mode 100644 src/test/test-node-affinity-nonstrict6/service_config
create mode 100644 src/test/test-node-affinity-strict1/README
create mode 100644 src/test/test-node-affinity-strict1/cmdlist
create mode 100644 src/test/test-node-affinity-strict1/groups
create mode 100644 src/test/test-node-affinity-strict1/hardware_status
create mode 100644 src/test/test-node-affinity-strict1/log.expect
create mode 100644 src/test/test-node-affinity-strict1/manager_status
create mode 100644 src/test/test-node-affinity-strict1/service_config
create mode 100644 src/test/test-node-affinity-strict2/README
create mode 100644 src/test/test-node-affinity-strict2/cmdlist
create mode 100644 src/test/test-node-affinity-strict2/groups
create mode 100644 src/test/test-node-affinity-strict2/hardware_status
create mode 100644 src/test/test-node-affinity-strict2/log.expect
create mode 100644 src/test/test-node-affinity-strict2/manager_status
create mode 100644 src/test/test-node-affinity-strict2/service_config
create mode 100644 src/test/test-node-affinity-strict3/README
create mode 100644 src/test/test-node-affinity-strict3/cmdlist
create mode 100644 src/test/test-node-affinity-strict3/groups
create mode 100644 src/test/test-node-affinity-strict3/hardware_status
create mode 100644 src/test/test-node-affinity-strict3/log.expect
create mode 100644 src/test/test-node-affinity-strict3/manager_status
create mode 100644 src/test/test-node-affinity-strict3/service_config
create mode 100644 src/test/test-node-affinity-strict4/README
create mode 100644 src/test/test-node-affinity-strict4/cmdlist
create mode 100644 src/test/test-node-affinity-strict4/groups
create mode 100644 src/test/test-node-affinity-strict4/hardware_status
create mode 100644 src/test/test-node-affinity-strict4/log.expect
create mode 100644 src/test/test-node-affinity-strict4/manager_status
create mode 100644 src/test/test-node-affinity-strict4/service_config
create mode 100644 src/test/test-node-affinity-strict5/README
create mode 100644 src/test/test-node-affinity-strict5/cmdlist
create mode 100644 src/test/test-node-affinity-strict5/groups
create mode 100644 src/test/test-node-affinity-strict5/hardware_status
create mode 100644 src/test/test-node-affinity-strict5/log.expect
create mode 100644 src/test/test-node-affinity-strict5/manager_status
create mode 100644 src/test/test-node-affinity-strict5/service_config
create mode 100644 src/test/test-node-affinity-strict6/README
create mode 100644 src/test/test-node-affinity-strict6/cmdlist
create mode 100644 src/test/test-node-affinity-strict6/groups
create mode 100644 src/test/test-node-affinity-strict6/hardware_status
create mode 100644 src/test/test-node-affinity-strict6/log.expect
create mode 100644 src/test/test-node-affinity-strict6/manager_status
create mode 100644 src/test/test-node-affinity-strict6/service_config
create mode 100755 src/test/test_rules_config.pl
base-commit: 264dc2c58d145394219f82f25d41f4fc438c4dc4
prerequisite-patch-id: 530b875c25a6bded1cc2294960cf465d5c2bcbca
docs:
Daniel Kral (1):
ha: add documentation about ha rules and ha node affinity rules
Makefile | 2 +
gen-ha-rules-node-affinity-opts.pl | 20 ++++++
gen-ha-rules-opts.pl | 17 +++++
ha-manager.adoc | 103 +++++++++++++++++++++++++++++
ha-rules-node-affinity-opts.adoc | 18 +++++
ha-rules-opts.adoc | 12 ++++
pmxcfs.adoc | 1 +
7 files changed, 173 insertions(+)
create mode 100755 gen-ha-rules-node-affinity-opts.pl
create mode 100755 gen-ha-rules-opts.pl
create mode 100644 ha-rules-node-affinity-opts.adoc
create mode 100644 ha-rules-opts.adoc
base-commit: 7cc17ee5950a53bbd5b5ad81270352ccdb1c541c
prerequisite-patch-id: 92556cd6c1edfb88b397ae244d7dcd56876cd8fb
manager:
Daniel Kral (3):
api: ha: add ha rules api endpoints
ui: ha: remove ha groups from ha resource components
ui: ha: show failback flag in resources status view
PVE/API2/HAConfig.pm | 8 +++++++-
www/manager6/ha/ResourceEdit.js | 16 ++++++++++++----
www/manager6/ha/Resources.js | 17 +++--------------
www/manager6/ha/StatusView.js | 5 ++++-
4 files changed, 26 insertions(+), 20 deletions(-)
base-commit: c0cbe76ee90e7110934c50414bc22371cf13c01a
prerequisite-patch-id: ec6a39936719cfe38787fccb1a80af6378980723
Summary over all repositories:
140 files changed, 3599 insertions(+), 115 deletions(-)
--
Generated by git-murpp 0.8.0