[pve-devel] [PATCH-SERIES proxmox-resource-scheduling/pve-ha-manager/etc] add static usage scheduler for HA manager
Fiona Ebner
f.ebner at proxmox.com
Thu Nov 10 15:37:39 CET 2022
Right now, the online node usage calculation for the HA manager only
considers the number of active services on each node. This patch
series allows switching to a 'static' scheduler mode, where static
usage information from the nodes and guest configurations is used
instead.
This also includes the remaining cgroup/cpuunits-related patches,
because the broadcasting of static information was done to include the
cgroup mode of the node.
With this version, the effect is limited to choosing nodes during
recovery, but the plan is to extend this.
As a next step, it would be nice to also have this for startup, but AFAICT
the issue is that the node selection only happens after the state is
already set to started and I think select_service_node() doesn't
currently know if a service has been newly started. I haven't looked
into it in too much detail though.
One idea for getting a balancer out of it is to:
1. (optionally) sort all services by badness (needs new backend function)
2. iterate scoring the nodes for each service, adding the usage to the
chosen node after each iteration. The current node can be kept if the
score compared to the best node doesn't differ too much.
3. record the chosen nodes and migrate the services accordingly.
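To make the three steps above a bit more concrete, here is a rough
sketch. It is not part of the series. All names (Service,
plan_migrations, tolerance) are made up for illustration, "badness"
is simply approximated by the service's usage, and the node score is
a plain relative-load stand-in rather than the TOPSIS scoring from
the crate.

```rust
/// A service with its current node (index) and a single usage figure.
/// All of this is hypothetical and only serves the illustration.
struct Service {
    id: &'static str,
    current: usize,
    usage: f64,
}

/// Returns (service id, chosen node index) for every service.
/// `capacity` holds one capacity value per node and must not be empty.
fn plan_migrations(
    capacity: &[f64],
    mut services: Vec<Service>,
    tolerance: f64,
) -> Vec<(&'static str, usize)> {
    // Step 1: handle the most demanding services first (stand-in for
    // sorting by badness).
    services.sort_by(|a, b| b.usage.total_cmp(&a.usage));

    // Usage accumulated on each node by the services placed so far.
    let mut used = vec![0.0; capacity.len()];
    let mut moves = Vec::new();

    for svc in &services {
        // Step 2: score each node by its relative load if the service
        // were placed there (lower is better).
        let score = |n: usize| (used[n] + svc.usage) / capacity[n];
        let best = (0..capacity.len())
            .min_by(|&a, &b| score(a).total_cmp(&score(b)))
            .expect("at least one node");

        // Keep the current node if it is not much worse than the best.
        let chosen = if score(svc.current) - score(best) <= tolerance {
            svc.current
        } else {
            best
        };

        // Add the usage to the chosen node before the next iteration.
        used[chosen] += svc.usage;
        // Step 3: record the decision.
        moves.push((svc.id, chosen));
    }
    moves
}
```

Entries where the chosen node differs from the current one are the
migrations to perform. A larger tolerance means fewer migrations at
the cost of a less even distribution.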
Unit tests for ha-manager itself are also still missing.
Almost all of the series is preparatory infrastructure, but the hope
is that much of it can be re-used for balancers and dynamic
scheduling in the future.
The proxmox-resource-scheduling Rust crate implements the TOPSIS
algorithm first suggested by Alexandre. It also models the static node
and service usages in PVE and allows scoring the nodes on which a new
or recovered service could be started. This is done by simulating
starting the service on each node and comparing the alternatives with
average and highest CPU and memory usage as criteria. Memory is
weighted much more heavily, as it is a more limited resource than CPU.
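For readers not familiar with TOPSIS, the following is a minimal,
self-contained sketch of the idea. It is not the code from the crate.
The struct and function names are made up, the weights used in any
call are placeholders, and all four criteria are treated as costs
(lower is better).

```rust
/// Static usage of a node (field names are hypothetical).
#[derive(Clone, Copy)]
struct Node { cpu: f64, maxcpu: f64, mem: f64, maxmem: f64 }

/// Static usage a service would add to a node (hypothetical).
#[derive(Clone, Copy)]
struct Service { cpu: f64, mem: f64 }

/// Simulate starting `svc` on each node in turn. Each row holds the
/// criteria [average CPU, highest CPU, average memory, highest memory]
/// over all nodes, as relative loads.
fn criteria_per_target(nodes: &[Node], svc: Service) -> Vec<Vec<f64>> {
    let len = nodes.len() as f64;
    (0..nodes.len())
        .map(|target| {
            let (mut cpu_sum, mut cpu_max) = (0.0_f64, 0.0_f64);
            let (mut mem_sum, mut mem_max) = (0.0_f64, 0.0_f64);
            for (i, n) in nodes.iter().enumerate() {
                let (add_cpu, add_mem) =
                    if i == target { (svc.cpu, svc.mem) } else { (0.0, 0.0) };
                let cpu = (n.cpu + add_cpu) / n.maxcpu;
                let mem = (n.mem + add_mem) / n.maxmem;
                cpu_sum += cpu;
                mem_sum += mem;
                cpu_max = cpu_max.max(cpu);
                mem_max = mem_max.max(mem);
            }
            vec![cpu_sum / len, cpu_max, mem_sum / len, mem_max]
        })
        .collect()
}

/// Euclidean distance between two points of equal dimension.
fn dist(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

/// TOPSIS with cost criteria only. Returns one score in [0, 1] per
/// alternative (row); higher means closer to the ideal alternative.
fn topsis_scores(matrix: &[Vec<f64>], weights: &[f64]) -> Vec<f64> {
    let n_crit = weights.len();
    // Normalize each criterion column by its Euclidean norm, then weight.
    let norms: Vec<f64> = (0..n_crit)
        .map(|j| matrix.iter().map(|r| r[j] * r[j]).sum::<f64>().sqrt())
        .collect();
    let weighted: Vec<Vec<f64>> = matrix
        .iter()
        .map(|r| {
            (0..n_crit)
                .map(|j| if norms[j] > 0.0 { weights[j] * r[j] / norms[j] } else { 0.0 })
                .collect()
        })
        .collect();
    // For cost criteria the ideal is the column minimum and the
    // anti-ideal is the column maximum.
    let best: Vec<f64> = (0..n_crit)
        .map(|j| weighted.iter().map(|r| r[j]).fold(f64::INFINITY, f64::min))
        .collect();
    let worst: Vec<f64> = (0..n_crit)
        .map(|j| weighted.iter().map(|r| r[j]).fold(f64::NEG_INFINITY, f64::max))
        .collect();
    weighted
        .iter()
        .map(|r| {
            let (d_best, d_worst) = (dist(r, &best), dist(r, &worst));
            // All alternatives identical: no preference either way.
            if d_best + d_worst == 0.0 { 0.5 } else { d_worst / (d_best + d_worst) }
        })
        .collect()
}
```

Calling topsis_scores(&criteria_per_target(&nodes, svc), &[1.0, 1.0,
3.0, 3.0]) would weight the two memory criteria three times as much
as the CPU ones. The node with the highest score is the preferred
target.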
I did not implement the criteria weighting process from AHP (yet),
which was also suggested by Alexandre. It computes averaged weights
and a bias score from a table of pairwise weights between criteria.
The downside is that one needs to guess n(n-1)/2 weights instead of n,
and the upside is that each weight only has to be judged pairwise
rather than relative to all others. But this can still be done in the
future if we want.
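As an illustration of what that would involve, here is a sketch of
the common column-normalization approximation of AHP. It is only my
reading of the method, not something from this series: the function
names are made up, and the consistency index computed at the end is
assumed to correspond to the "bias score" mentioned above.

```rust
/// Build the full reciprocal n x n pairwise matrix from the
/// n(n-1)/2 upper-triangle entries, given row by row.
/// `upper[k]` says how much more important criterion i is than j.
fn pairwise_matrix(n: usize, upper: &[f64]) -> Vec<Vec<f64>> {
    assert_eq!(upper.len(), n * n.saturating_sub(1) / 2);
    let mut m = vec![vec![1.0; n]; n];
    let mut k = 0;
    for i in 0..n {
        for j in (i + 1)..n {
            m[i][j] = upper[k];
            m[j][i] = 1.0 / upper[k];
            k += 1;
        }
    }
    m
}

/// Averaged weights: normalize each column to sum to 1, then take
/// the mean of each row.
fn ahp_weights(m: &[Vec<f64>]) -> Vec<f64> {
    let n = m.len();
    let col_sums: Vec<f64> = (0..n)
        .map(|j| m.iter().map(|r| r[j]).sum::<f64>())
        .collect();
    (0..n)
        .map(|i| (0..n).map(|j| m[i][j] / col_sums[j]).sum::<f64>() / n as f64)
        .collect()
}

/// Consistency index (lambda_max - n) / (n - 1), with lambda_max
/// estimated as the mean of (M w)_i / w_i. Requires n >= 2.
/// It is 0 for a perfectly consistent table.
fn consistency_index(m: &[Vec<f64>], w: &[f64]) -> f64 {
    let n = m.len();
    let lambda_max = (0..n)
        .map(|i| (0..n).map(|j| m[i][j] * w[j]).sum::<f64>() / w[i])
        .sum::<f64>()
        / n as f64;
    (lambda_max - n as f64) / (n as f64 - 1.0)
}
```

For three criteria, one would supply three pairwise judgments, for
example pairwise_matrix(3, &[2.0, 4.0, 2.0]), and get back weights in
the ratio 4:2:1 with a consistency index of zero.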
In proxmox-perl-rs, a class is provided for interfacing from Perl.
In pve-manager, the static node information is broadcast whenever
outdated. There also are the unrelated (but touching the same code)
cgroup/cpuunits patches.
In pve-cluster, a new crs (=cluster-resource-scheduler) option is
added, initially with a mode for HA.
In pve-ha-manager, the online node usage calculation is factored out
into a 'Usage' plugin system to ease adding the new static mode
without much cluttering. If not all nodes provide static service
information, we fall back to the 'basic' mode. If only the scoring
fails (but that /should/ be rather unlikely), there is no real
fallback implemented currently (the '|| $a cmp $b' in
select_service_node() destroys the random hash keys order again ;)).
We could change it to stay random or better, track the service count
in Usage::Static too and use that.
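The fallback behaviour described above can be sketched as follows.
The actual implementation is in Perl. This Rust version only
illustrates the logic, and every name in it (Mode, effective_mode,
select_basic) is hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Mode {
    Basic,
    Static,
}

/// Use the static mode only if it was requested and every online
/// node has static information available; otherwise fall back to
/// the basic mode.
fn effective_mode(
    requested: Mode,
    online: &[&str],
    static_stats: &HashMap<&str, (f64, f64)>,
) -> Mode {
    if requested == Mode::Static && online.iter().all(|n| static_stats.contains_key(n)) {
        Mode::Static
    } else {
        Mode::Basic
    }
}

/// Basic selection: the node with the fewest active services wins,
/// with ties broken by node name so the result is deterministic
/// regardless of hash iteration order.
fn select_basic<'a>(counts: &HashMap<&'a str, usize>) -> Option<&'a str> {
    counts
        .iter()
        .min_by(|a, b| a.1.cmp(b.1).then_with(|| a.0.cmp(b.0)))
        .map(|(name, _)| *name)
}
```

Tracking the service count in the static mode as well would allow
reusing something like select_basic() when only the scoring fails.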
Dependency bumps needed:
proxmox-perl-rs depends on proxmox-resource-scheduling
pve-ha-manager (build-)depends on proxmox-perl-rs
The new feature is of course only usable with updated pve-manager and
pve-cluster, but there is no hard dependency.
proxmox-resource-scheduling:
Fiona Ebner (3):
initial commit
add pve_static module
add Debian packaging
proxmox-perl-rs:
Fiona Ebner (2):
pve-rs: add resource scheduling module
add basic test for resource scheduling
Makefile | 1 +
pve-rs/Cargo.toml | 1 +
pve-rs/src/lib.rs | 1 +
pve-rs/src/resource_scheduling/mod.rs | 1 +
pve-rs/src/resource_scheduling/static.rs | 116 +++++++++++++++++++++++
pve-rs/test/Makefile | 4 +
pve-rs/test/README | 2 +
pve-rs/test/resource_scheduling.pl | 70 ++++++++++++++
8 files changed, 196 insertions(+)
create mode 100644 pve-rs/src/resource_scheduling/mod.rs
create mode 100644 pve-rs/src/resource_scheduling/static.rs
create mode 100644 pve-rs/test/Makefile
create mode 100644 pve-rs/test/README
create mode 100755 pve-rs/test/resource_scheduling.pl
pve-manager:
Fiona Ebner (3):
pvestatd: broadcast static node information
cluster resources: add cgroup-mode to node properties
ui: lxc/qemu: cpu edit: make cpuunits depend on node's cgroup version
PVE/API2/Cluster.pm | 13 +++++++++++++
PVE/Service/pvestatd.pm | 25 ++++++++++++++++++++++++
www/manager6/lxc/CreateWizard.js | 8 ++++++++
www/manager6/lxc/ResourceEdit.js | 31 +++++++++++++++++++++++++-----
www/manager6/lxc/Resources.js | 8 +++++++-
www/manager6/qemu/CreateWizard.js | 8 ++++++++
www/manager6/qemu/HardwareView.js | 8 +++++++-
www/manager6/qemu/ProcessorEdit.js | 31 +++++++++++++++++++++++-------
8 files changed, 118 insertions(+), 14 deletions(-)
pve-cluster:
Fiona Ebner (1):
datacenter config: add cluster resource scheduling (crs) options
data/PVE/DataCenterConfig.pm | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
pve-ha-manager:
Fiona Ebner (11):
env: add get_static_node_stats() method
resources: add get_static_stats() method
add Usage base plugin and Usage::Basic plugin
manager: select service node: add $sid to parameters
manager: online node usage: switch to Usage::Basic plugin
usage: add Usage::Static plugin
env: add get_crs_settings() method
manager: set resource scheduler mode upon init
manager: use static resource scheduler when configured
manager: avoid scoring nodes if maintenance fallback node is valid
manager: avoid scoring nodes when not trying next and current node is
valid
debian/pve-ha-manager.install | 3 +
src/PVE/HA/Env.pm | 13 ++++
src/PVE/HA/Env/PVE2.pm | 29 +++++++++
src/PVE/HA/Makefile | 3 +-
src/PVE/HA/Manager.pm | 77 ++++++++++++++---------
src/PVE/HA/Resources.pm | 5 ++
src/PVE/HA/Resources/PVECT.pm | 11 ++++
src/PVE/HA/Resources/PVEVM.pm | 14 +++++
src/PVE/HA/Sim/Env.pm | 9 +++
src/PVE/HA/Sim/TestEnv.pm | 6 ++
src/PVE/HA/Usage.pm | 50 +++++++++++++++
src/PVE/HA/Usage/Basic.pm | 52 ++++++++++++++++
src/PVE/HA/Usage/Makefile | 6 ++
src/PVE/HA/Usage/Static.pm | 114 ++++++++++++++++++++++++++++++++++
src/test/test_failover1.pl | 21 ++++---
15 files changed, 374 insertions(+), 39 deletions(-)
create mode 100644 src/PVE/HA/Usage.pm
create mode 100644 src/PVE/HA/Usage/Basic.pm
create mode 100644 src/PVE/HA/Usage/Makefile
create mode 100644 src/PVE/HA/Usage/Static.pm
pve-docs:
Fiona Ebner (1):
ha: add section about scheduler modes
ha-manager.adoc | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
--
2.30.2