[pve-devel] [PATCH cluster/docs/ha-manager/manager v3 00/20] HA Rules

Daniel Kral d.kral at proxmox.com
Fri Jul 4 20:16:39 CEST 2025


RFC v1: https://lore.proxmox.com/pve-devel/20250325151254.193177-1-d.kral@proxmox.com/
RFC v2: https://lore.proxmox.com/pve-devel/20250620143148.218469-1-d.kral@proxmox.com/

I've separated the core HA Rules module and the transformation from HA
groups to HA Node Affinity rules (formerly known as HA Location rules)
in this patch series, to reduce the overhead for reviewers and strive
for a better version history, as changing two things at a time is rather
confusing.

The main things that have changed since the last version (v2):

- split up the patch series (ofc)

- rebased on newest available master

- renamed "HA Location Rule" to "HA Node Affinity Rule"

- renamed every reference to an 'HA service' to 'HA resource' (e.g. the
  rules property 'services' is now 'resources')

- converted the tri-state property 'state' to a binary 'disable' flag on
  HA rules and exposed the 'contradictory' state through an 'errors' hash

- removed the "use-location-rules" feature flag and implemented a more
  straightforward HA groups migration (more on that below)

- removed any reference to HA groups from the web interface
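
To illustrate the 'state' -> 'disable' + 'errors' change from the list
above, here is a rough sketch in Python (not the Perl code from the
series). Only the names 'disable', 'errors' and 'contradictory' come from
the list; the helper `rule_status`, the dict layout and the precedence of
"disabled" over "contradictory" are assumptions made for illustration.

```python
def rule_status(rule, errors):
    """Derive a display state for a rule (hypothetical helper).

    rule   -- dict with an optional boolean 'disable' key
    errors -- dict mapping a property name to a list of error messages;
              an empty dict means the rule has no known conflicts
    """
    # Assumption: an explicitly disabled rule is reported as such,
    # regardless of whether it would also be contradictory.
    if rule.get("disable"):
        return "disabled"
    # The former third 'state' value is no longer stored on the rule;
    # it is implied by a non-empty errors hash.
    if errors:
        return "contradictory"
    return "enabled"
```

With this shape, the config only ever stores a boolean, and anything
derived from rule checks is reported separately.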



As before, HA groups are migrated to HA node affinity rules in each HA
Manager round in which the HA groups or HA resources config file has
changed, but this is now done unconditionally as soon as an HA Manager
runs with this version. It will also try to migrate them persistently,
but that will only succeed once all other nodes are upgraded (i.e. every
node runs at least the HA Manager version that can successfully parse
and apply the HA rules).
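
The in-memory part of that migration can be pictured roughly like the
following Python sketch (again, not the Perl code from the series). The
function name, the dict layouts, the "ha-group-" rule ID prefix and the
mapping of a group's 'restricted' flag to a rule's 'strict' flag are
assumptions for illustration; see the patches for the actual behavior.

```python
def migrate_groups_to_rules(groups, resources):
    """Build one node affinity rule per HA group that is actually in use.

    groups    -- {group_id: {"nodes": {node: priority}, "restricted": 0|1}}
    resources -- {sid: {"group": group_id}}; 'group' may be missing
    Returns {rule_id: rule} without modifying the inputs.
    """
    rules = {}
    for sid, resource in sorted(resources.items()):
        group_id = resource.get("group")
        # Resources without a group, or with an unknown group, get no rule.
        if group_id is None or group_id not in groups:
            continue
        group = groups[group_id]
        rule = rules.setdefault(
            f"ha-group-{group_id}",  # hypothetical rule ID scheme
            {
                "type": "node-affinity",
                "resources": set(),
                "nodes": dict(group["nodes"]),
                # Assumption: a restricted group becomes a strict rule.
                "strict": bool(group.get("restricted", 0)),
            },
        )
        rule["resources"].add(sid)
    return rules
```

Because the result is rebuilt from the groups and resources config
whenever they change, it stays valid even while the persistent migration
is still blocked by nodes that have not been upgraded yet.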


There are still some things left to do, which I didn't get around to
for this revision:

- Testing, testing, testing

- I ran out of time on the persistent HA groups migration part, which
  still has at least the two TODOs mentioned in the patch itself, and I
  haven't tested it on any real PVE upgrade yet; it's more of a draft of
  how the migration could work

- Also, the last patch for the persistent HA groups migration part will
  fail all tests except the two that have been added, because of the
  way the other tests are designed; that should be abstracted away in
  the HA environment, e.g. with a routine "have_groups_been_migrated"
  for PVE2/Sim.

- There might be a few too many in-memory group migrations on the HA
  Rules API side now, but better safe than sorry; maybe they can be
  removed later. However, these shouldn't overwrite the rules that come
  from the config, which I haven't checked yet.

- Should the HA Groups API (and the 'group' property in the HA
  Resources API) be removed now? Or should these stay, with any use of
  them triggering an auto-migration to HA rules?

As in the previous revisions, I've run a

    git rebase master --exec 'make clean && make deb'

on the series, so the tests should work for every patch.

cluster:

Daniel Kral (1):
  cfs: add 'ha/rules.cfg' to observed files

 src/PVE/Cluster.pm  | 1 +
 src/pmxcfs/status.c | 1 +
 2 files changed, 2 insertions(+)


base-commit: 60e36c87b0fffe6dbdd5b1be72a9273b6f7cec2b
prerequisite-patch-id: 50b1021d35ecf86562d33dc6068c90e219557ab7
prerequisite-patch-id: 0374f409a039eebe9dd7587d6c018ef71ac2c67d
prerequisite-patch-id: d17849368da2aa61fcab9e08235f8673a2d0258e

ha-manager:

Daniel Kral (15):
  tree-wide: make arguments for select_service_node explicit
  manager: improve signature of select_service_node
  introduce rules base plugin
  rules: introduce node affinity rule plugin
  config, env, hw: add rules read and parse methods
  config: delete services from rules if services are deleted from config
  manager: read and update rules config
  test: ha tester: add test cases for future node affinity rules
  resources: introduce failback property in ha resource config
  manager: migrate ha groups to node affinity rules in-memory
  manager: apply node affinity rules when selecting service nodes
  test: add test cases for rules config
  api: introduce ha rules api endpoints
  cli: expose ha rules api endpoints to ha-manager cli
  manager: persistently migrate ha groups to ha rules

 .gitignore                                    |   1 +
 debian/pve-ha-manager.install                 |   3 +
 src/PVE/API2/HA/Makefile                      |   2 +-
 src/PVE/API2/HA/Resources.pm                  |   9 +
 src/PVE/API2/HA/Rules.pm                      | 391 +++++++++++++++
 src/PVE/API2/HA/Status.pm                     |  11 +-
 src/PVE/CLI/ha_manager.pm                     |  32 ++
 src/PVE/HA/Config.pm                          |  58 ++-
 src/PVE/HA/Env.pm                             |  30 ++
 src/PVE/HA/Env/PVE2.pm                        |  40 ++
 src/PVE/HA/Groups.pm                          |  48 ++
 src/PVE/HA/Makefile                           |   3 +-
 src/PVE/HA/Manager.pm                         | 259 ++++++----
 src/PVE/HA/Resources.pm                       |   9 +
 src/PVE/HA/Resources/PVECT.pm                 |   1 +
 src/PVE/HA/Resources/PVEVM.pm                 |   1 +
 src/PVE/HA/Rules.pm                           | 455 ++++++++++++++++++
 src/PVE/HA/Rules/Makefile                     |   6 +
 src/PVE/HA/Rules/NodeAffinity.pm              | 296 ++++++++++++
 src/PVE/HA/Sim/Env.pm                         |  44 ++
 src/PVE/HA/Sim/Hardware.pm                    |  44 ++
 src/PVE/HA/Tools.pm                           |  46 ++
 src/test/Makefile                             |   4 +-
 .../defaults-for-node-affinity-rules.cfg      |  22 +
 ...efaults-for-node-affinity-rules.cfg.expect |  60 +++
 ...e-resource-refs-in-node-affinity-rules.cfg |  31 ++
 ...rce-refs-in-node-affinity-rules.cfg.expect |  63 +++
 src/test/test-group-migrate1/README           |  10 +
 src/test/test-group-migrate1/cmdlist          |   3 +
 src/test/test-group-migrate1/groups           |   7 +
 src/test/test-group-migrate1/hardware_status  |   5 +
 src/test/test-group-migrate1/log.expect       | 306 ++++++++++++
 src/test/test-group-migrate1/manager_status   |   1 +
 src/test/test-group-migrate1/service_config   |   5 +
 src/test/test-group-migrate2/README           |  10 +
 src/test/test-group-migrate2/cmdlist          |   3 +
 src/test/test-group-migrate2/groups           |   7 +
 src/test/test-group-migrate2/hardware_status  |   5 +
 src/test/test-group-migrate2/log.expect       |  47 ++
 src/test/test-group-migrate2/manager_status   |   1 +
 src/test/test-group-migrate2/service_config   |   5 +
 src/test/test-node-affinity-nonstrict1/README |  10 +
 .../test-node-affinity-nonstrict1/cmdlist     |   4 +
 src/test/test-node-affinity-nonstrict1/groups |   2 +
 .../hardware_status                           |   5 +
 .../test-node-affinity-nonstrict1/log.expect  |  40 ++
 .../manager_status                            |   1 +
 .../service_config                            |   3 +
 src/test/test-node-affinity-nonstrict2/README |  12 +
 .../test-node-affinity-nonstrict2/cmdlist     |   4 +
 src/test/test-node-affinity-nonstrict2/groups |   3 +
 .../hardware_status                           |   5 +
 .../test-node-affinity-nonstrict2/log.expect  |  35 ++
 .../manager_status                            |   1 +
 .../service_config                            |   3 +
 src/test/test-node-affinity-nonstrict3/README |  10 +
 .../test-node-affinity-nonstrict3/cmdlist     |   4 +
 src/test/test-node-affinity-nonstrict3/groups |   2 +
 .../hardware_status                           |   5 +
 .../test-node-affinity-nonstrict3/log.expect  |  56 +++
 .../manager_status                            |   1 +
 .../service_config                            |   5 +
 src/test/test-node-affinity-nonstrict4/README |  14 +
 .../test-node-affinity-nonstrict4/cmdlist     |   4 +
 src/test/test-node-affinity-nonstrict4/groups |   2 +
 .../hardware_status                           |   5 +
 .../test-node-affinity-nonstrict4/log.expect  |  54 +++
 .../manager_status                            |   1 +
 .../service_config                            |   5 +
 src/test/test-node-affinity-nonstrict5/README |  16 +
 .../test-node-affinity-nonstrict5/cmdlist     |   5 +
 src/test/test-node-affinity-nonstrict5/groups |   2 +
 .../hardware_status                           |   5 +
 .../test-node-affinity-nonstrict5/log.expect  |  66 +++
 .../manager_status                            |   1 +
 .../service_config                            |   3 +
 src/test/test-node-affinity-nonstrict6/README |  14 +
 .../test-node-affinity-nonstrict6/cmdlist     |   5 +
 src/test/test-node-affinity-nonstrict6/groups |   3 +
 .../hardware_status                           |   5 +
 .../test-node-affinity-nonstrict6/log.expect  |  52 ++
 .../manager_status                            |   1 +
 .../service_config                            |   3 +
 src/test/test-node-affinity-strict1/README    |  10 +
 src/test/test-node-affinity-strict1/cmdlist   |   4 +
 src/test/test-node-affinity-strict1/groups    |   3 +
 .../hardware_status                           |   5 +
 .../test-node-affinity-strict1/log.expect     |  40 ++
 .../test-node-affinity-strict1/manager_status |   1 +
 .../test-node-affinity-strict1/service_config |   3 +
 src/test/test-node-affinity-strict2/README    |  11 +
 src/test/test-node-affinity-strict2/cmdlist   |   4 +
 src/test/test-node-affinity-strict2/groups    |   4 +
 .../hardware_status                           |   5 +
 .../test-node-affinity-strict2/log.expect     |  40 ++
 .../test-node-affinity-strict2/manager_status |   1 +
 .../test-node-affinity-strict2/service_config |   3 +
 src/test/test-node-affinity-strict3/README    |  10 +
 src/test/test-node-affinity-strict3/cmdlist   |   4 +
 src/test/test-node-affinity-strict3/groups    |   3 +
 .../hardware_status                           |   5 +
 .../test-node-affinity-strict3/log.expect     |  74 +++
 .../test-node-affinity-strict3/manager_status |   1 +
 .../test-node-affinity-strict3/service_config |   5 +
 src/test/test-node-affinity-strict4/README    |  14 +
 src/test/test-node-affinity-strict4/cmdlist   |   4 +
 src/test/test-node-affinity-strict4/groups    |   3 +
 .../hardware_status                           |   5 +
 .../test-node-affinity-strict4/log.expect     |  54 +++
 .../test-node-affinity-strict4/manager_status |   1 +
 .../test-node-affinity-strict4/service_config |   5 +
 src/test/test-node-affinity-strict5/README    |  16 +
 src/test/test-node-affinity-strict5/cmdlist   |   5 +
 src/test/test-node-affinity-strict5/groups    |   3 +
 .../hardware_status                           |   5 +
 .../test-node-affinity-strict5/log.expect     |  66 +++
 .../test-node-affinity-strict5/manager_status |   1 +
 .../test-node-affinity-strict5/service_config |   3 +
 src/test/test-node-affinity-strict6/README    |  14 +
 src/test/test-node-affinity-strict6/cmdlist   |   5 +
 src/test/test-node-affinity-strict6/groups    |   4 +
 .../hardware_status                           |   5 +
 .../test-node-affinity-strict6/log.expect     |  52 ++
 .../test-node-affinity-strict6/manager_status |   1 +
 .../test-node-affinity-strict6/service_config |   3 +
 src/test/test_failover1.pl                    |  27 +-
 src/test/test_rules_config.pl                 | 100 ++++
 127 files changed, 3398 insertions(+), 95 deletions(-)
 create mode 100644 src/PVE/API2/HA/Rules.pm
 create mode 100644 src/PVE/HA/Rules.pm
 create mode 100644 src/PVE/HA/Rules/Makefile
 create mode 100644 src/PVE/HA/Rules/NodeAffinity.pm
 create mode 100644 src/test/rules_cfgs/defaults-for-node-affinity-rules.cfg
 create mode 100644 src/test/rules_cfgs/defaults-for-node-affinity-rules.cfg.expect
 create mode 100644 src/test/rules_cfgs/multiple-resource-refs-in-node-affinity-rules.cfg
 create mode 100644 src/test/rules_cfgs/multiple-resource-refs-in-node-affinity-rules.cfg.expect
 create mode 100644 src/test/test-group-migrate1/README
 create mode 100644 src/test/test-group-migrate1/cmdlist
 create mode 100644 src/test/test-group-migrate1/groups
 create mode 100644 src/test/test-group-migrate1/hardware_status
 create mode 100644 src/test/test-group-migrate1/log.expect
 create mode 100644 src/test/test-group-migrate1/manager_status
 create mode 100644 src/test/test-group-migrate1/service_config
 create mode 100644 src/test/test-group-migrate2/README
 create mode 100644 src/test/test-group-migrate2/cmdlist
 create mode 100644 src/test/test-group-migrate2/groups
 create mode 100644 src/test/test-group-migrate2/hardware_status
 create mode 100644 src/test/test-group-migrate2/log.expect
 create mode 100644 src/test/test-group-migrate2/manager_status
 create mode 100644 src/test/test-group-migrate2/service_config
 create mode 100644 src/test/test-node-affinity-nonstrict1/README
 create mode 100644 src/test/test-node-affinity-nonstrict1/cmdlist
 create mode 100644 src/test/test-node-affinity-nonstrict1/groups
 create mode 100644 src/test/test-node-affinity-nonstrict1/hardware_status
 create mode 100644 src/test/test-node-affinity-nonstrict1/log.expect
 create mode 100644 src/test/test-node-affinity-nonstrict1/manager_status
 create mode 100644 src/test/test-node-affinity-nonstrict1/service_config
 create mode 100644 src/test/test-node-affinity-nonstrict2/README
 create mode 100644 src/test/test-node-affinity-nonstrict2/cmdlist
 create mode 100644 src/test/test-node-affinity-nonstrict2/groups
 create mode 100644 src/test/test-node-affinity-nonstrict2/hardware_status
 create mode 100644 src/test/test-node-affinity-nonstrict2/log.expect
 create mode 100644 src/test/test-node-affinity-nonstrict2/manager_status
 create mode 100644 src/test/test-node-affinity-nonstrict2/service_config
 create mode 100644 src/test/test-node-affinity-nonstrict3/README
 create mode 100644 src/test/test-node-affinity-nonstrict3/cmdlist
 create mode 100644 src/test/test-node-affinity-nonstrict3/groups
 create mode 100644 src/test/test-node-affinity-nonstrict3/hardware_status
 create mode 100644 src/test/test-node-affinity-nonstrict3/log.expect
 create mode 100644 src/test/test-node-affinity-nonstrict3/manager_status
 create mode 100644 src/test/test-node-affinity-nonstrict3/service_config
 create mode 100644 src/test/test-node-affinity-nonstrict4/README
 create mode 100644 src/test/test-node-affinity-nonstrict4/cmdlist
 create mode 100644 src/test/test-node-affinity-nonstrict4/groups
 create mode 100644 src/test/test-node-affinity-nonstrict4/hardware_status
 create mode 100644 src/test/test-node-affinity-nonstrict4/log.expect
 create mode 100644 src/test/test-node-affinity-nonstrict4/manager_status
 create mode 100644 src/test/test-node-affinity-nonstrict4/service_config
 create mode 100644 src/test/test-node-affinity-nonstrict5/README
 create mode 100644 src/test/test-node-affinity-nonstrict5/cmdlist
 create mode 100644 src/test/test-node-affinity-nonstrict5/groups
 create mode 100644 src/test/test-node-affinity-nonstrict5/hardware_status
 create mode 100644 src/test/test-node-affinity-nonstrict5/log.expect
 create mode 100644 src/test/test-node-affinity-nonstrict5/manager_status
 create mode 100644 src/test/test-node-affinity-nonstrict5/service_config
 create mode 100644 src/test/test-node-affinity-nonstrict6/README
 create mode 100644 src/test/test-node-affinity-nonstrict6/cmdlist
 create mode 100644 src/test/test-node-affinity-nonstrict6/groups
 create mode 100644 src/test/test-node-affinity-nonstrict6/hardware_status
 create mode 100644 src/test/test-node-affinity-nonstrict6/log.expect
 create mode 100644 src/test/test-node-affinity-nonstrict6/manager_status
 create mode 100644 src/test/test-node-affinity-nonstrict6/service_config
 create mode 100644 src/test/test-node-affinity-strict1/README
 create mode 100644 src/test/test-node-affinity-strict1/cmdlist
 create mode 100644 src/test/test-node-affinity-strict1/groups
 create mode 100644 src/test/test-node-affinity-strict1/hardware_status
 create mode 100644 src/test/test-node-affinity-strict1/log.expect
 create mode 100644 src/test/test-node-affinity-strict1/manager_status
 create mode 100644 src/test/test-node-affinity-strict1/service_config
 create mode 100644 src/test/test-node-affinity-strict2/README
 create mode 100644 src/test/test-node-affinity-strict2/cmdlist
 create mode 100644 src/test/test-node-affinity-strict2/groups
 create mode 100644 src/test/test-node-affinity-strict2/hardware_status
 create mode 100644 src/test/test-node-affinity-strict2/log.expect
 create mode 100644 src/test/test-node-affinity-strict2/manager_status
 create mode 100644 src/test/test-node-affinity-strict2/service_config
 create mode 100644 src/test/test-node-affinity-strict3/README
 create mode 100644 src/test/test-node-affinity-strict3/cmdlist
 create mode 100644 src/test/test-node-affinity-strict3/groups
 create mode 100644 src/test/test-node-affinity-strict3/hardware_status
 create mode 100644 src/test/test-node-affinity-strict3/log.expect
 create mode 100644 src/test/test-node-affinity-strict3/manager_status
 create mode 100644 src/test/test-node-affinity-strict3/service_config
 create mode 100644 src/test/test-node-affinity-strict4/README
 create mode 100644 src/test/test-node-affinity-strict4/cmdlist
 create mode 100644 src/test/test-node-affinity-strict4/groups
 create mode 100644 src/test/test-node-affinity-strict4/hardware_status
 create mode 100644 src/test/test-node-affinity-strict4/log.expect
 create mode 100644 src/test/test-node-affinity-strict4/manager_status
 create mode 100644 src/test/test-node-affinity-strict4/service_config
 create mode 100644 src/test/test-node-affinity-strict5/README
 create mode 100644 src/test/test-node-affinity-strict5/cmdlist
 create mode 100644 src/test/test-node-affinity-strict5/groups
 create mode 100644 src/test/test-node-affinity-strict5/hardware_status
 create mode 100644 src/test/test-node-affinity-strict5/log.expect
 create mode 100644 src/test/test-node-affinity-strict5/manager_status
 create mode 100644 src/test/test-node-affinity-strict5/service_config
 create mode 100644 src/test/test-node-affinity-strict6/README
 create mode 100644 src/test/test-node-affinity-strict6/cmdlist
 create mode 100644 src/test/test-node-affinity-strict6/groups
 create mode 100644 src/test/test-node-affinity-strict6/hardware_status
 create mode 100644 src/test/test-node-affinity-strict6/log.expect
 create mode 100644 src/test/test-node-affinity-strict6/manager_status
 create mode 100644 src/test/test-node-affinity-strict6/service_config
 create mode 100755 src/test/test_rules_config.pl


base-commit: 264dc2c58d145394219f82f25d41f4fc438c4dc4
prerequisite-patch-id: 530b875c25a6bded1cc2294960cf465d5c2bcbca

docs:

Daniel Kral (1):
  ha: add documentation about ha rules and ha node affinity rules

 Makefile                           |   2 +
 gen-ha-rules-node-affinity-opts.pl |  20 ++++++
 gen-ha-rules-opts.pl               |  17 +++++
 ha-manager.adoc                    | 103 +++++++++++++++++++++++++++++
 ha-rules-node-affinity-opts.adoc   |  18 +++++
 ha-rules-opts.adoc                 |  12 ++++
 pmxcfs.adoc                        |   1 +
 7 files changed, 173 insertions(+)
 create mode 100755 gen-ha-rules-node-affinity-opts.pl
 create mode 100755 gen-ha-rules-opts.pl
 create mode 100644 ha-rules-node-affinity-opts.adoc
 create mode 100644 ha-rules-opts.adoc


base-commit: 7cc17ee5950a53bbd5b5ad81270352ccdb1c541c
prerequisite-patch-id: 92556cd6c1edfb88b397ae244d7dcd56876cd8fb

manager:

Daniel Kral (3):
  api: ha: add ha rules api endpoints
  ui: ha: remove ha groups from ha resource components
  ui: ha: show failback flag in resources status view

 PVE/API2/HAConfig.pm            |  8 +++++++-
 www/manager6/ha/ResourceEdit.js | 16 ++++++++++++----
 www/manager6/ha/Resources.js    | 17 +++--------------
 www/manager6/ha/StatusView.js   |  5 ++++-
 4 files changed, 26 insertions(+), 20 deletions(-)


base-commit: c0cbe76ee90e7110934c50414bc22371cf13c01a
prerequisite-patch-id: ec6a39936719cfe38787fccb1a80af6378980723

Summary over all repositories:
  140 files changed, 3599 insertions(+), 115 deletions(-)

-- 
Generated by git-murpp 0.8.0



