[pve-devel] [PATCH V2 pve-ha-manager 0/2] POC/RFC: ressource aware HA manager

Alexandre Derumier aderumier at odiso.com
Mon Dec 20 08:42:30 CET 2021


Hi,

This is a proof of concept implementing resource-aware HA.

The current HA manager implementation is really basic:
it simply balances the number of services on each node.

I have had real production cases where a node failed and the restarted VMs
impacted other nodes because of excessive CPU/RAM usage.
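
To illustrate the difference between the two strategies, here is a minimal
sketch in Python. It is not the code from this series (which is Perl); the
Node class, its fields, and both pick_* functions are hypothetical names chosen
for the example, and only memory is considered.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    maxmem: int      # total RAM in bytes (hypothetical field)
    mem_used: int    # RAM currently in use, in bytes (hypothetical field)
    services: list = field(default_factory=list)


def pick_by_count(nodes):
    """Count-based choice: the node running the fewest services wins."""
    return min(nodes, key=lambda n: (len(n.services), n.name))


def pick_by_memory(nodes, needed_mem):
    """Resource-aware choice: drop nodes that cannot fit the service,
    then prefer the lowest memory usage ratio after placement."""
    fitting = [n for n in nodes if n.mem_used + needed_mem <= n.maxmem]
    if not fitting:
        return None  # no node has enough free memory
    return min(
        fitting,
        key=lambda n: ((n.mem_used + needed_mem) / n.maxmem, n.name),
    )


GiB = 1024 ** 3
nodes = [
    Node("node1", maxmem=64 * GiB, mem_used=60 * GiB, services=["vm:101"]),
    Node("node2", maxmem=64 * GiB, mem_used=20 * GiB,
         services=["vm:102", "vm:103"]),
]

# An 8 GiB VM has to be recovered after a node failure.
by_count = pick_by_count(nodes)
by_memory = pick_by_memory(nodes, needed_mem=8 * GiB)
```

With these made-up numbers, the count-based choice picks node1, which has
fewer services but almost no free memory, while the memory-aware choice
picks node2.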


Changelog v2:

- merge main code and Sim code in the same patch for now (I'll split them later)
- cleanup following all of Thomas's review comments (thanks again)
- add more comments in code
- check storage for lxc too
- use maxmem for Windows VMs

I still need to add the missing storage availability test.



Alexandre Derumier (2):
  add ressource awareness manager
  add test-basic0

 src/PVE/HA/Env.pm                    |  33 ++++
 src/PVE/HA/Env/PVE2.pm               | 171 ++++++++++++++++++
 src/PVE/HA/Manager.pm                | 258 ++++++++++++++++++++++++++-
 src/PVE/HA/Sim/Hardware.pm           |  61 +++++++
 src/PVE/HA/Sim/TestEnv.pm            |  50 +++++-
 src/test/test-basic0/README          |   1 +
 src/test/test-basic0/cmdlist         |   4 +
 src/test/test-basic0/hardware_status |   5 +
 src/test/test-basic0/log.expect      |  52 ++++++
 src/test/test-basic0/manager_status  |   1 +
 src/test/test-basic0/node_stats      |   5 +
 src/test/test-basic0/service_config  |   5 +
 src/test/test-basic0/service_stats   |   5 +
 13 files changed, 642 insertions(+), 9 deletions(-)
 create mode 100644 src/test/test-basic0/README
 create mode 100644 src/test/test-basic0/cmdlist
 create mode 100644 src/test/test-basic0/hardware_status
 create mode 100644 src/test/test-basic0/log.expect
 create mode 100644 src/test/test-basic0/manager_status
 create mode 100644 src/test/test-basic0/node_stats
 create mode 100644 src/test/test-basic0/service_config
 create mode 100644 src/test/test-basic0/service_stats

-- 
2.30.2




