[pve-devel] [PATCH ha-manager 0/4] various HA fixes

Thomas Lamprecht t.lamprecht at proxmox.com
Fri Sep 9 16:15:33 CEST 2016


The first two patches both address bug #1100: they allow an HA-configured
service to be started if its old node failed during a cluster-locked
action, such as a backup, and add a regression test for such a situation.

The third is for a bug which has no bug tracker entry. It is an additional
fix for the synchronization of service commands and results between CRM and
LRM. We already fixed the race where the LRM could execute a single command
more than once. This patch fixes the problem that the CRM could request a new
start command even though the previously requested one was not yet finished
and processed by both LRM and CRM; see the commit messages for more details.

The fourth patch is a small enhancement to ensure that the CRM and LRM always
run, even after a failure.
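
As a rough illustration of that fourth patch: the diffstat shows a single line
added to each of debian/pve-ha-crm.service and debian/pve-ha-lrm.service. The
directive below is an assumption, not copied from the patch; it is only the
standard systemd option for restarting a unit that exits with a failure.

```ini
# Hypothetical fragment, assumed to apply to both pve-ha-crm.service and
# pve-ha-lrm.service; the actual line is in the patch itself.
[Service]
# Restart the daemon when it exits uncleanly (non-zero exit code, fatal
# signal, or timeout); a clean stop is left alone.
Restart=on-failure
```

With a setting like this, a deliberate `systemctl stop` would not trigger a
restart, while a crashed CRM or LRM would be brought back up by systemd.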

All of these patches have RFC status; maybe someone has a better solution to
the problems.

cheers,
Thomas

Thomas Lamprecht (4):
  add possibility to simulate locks from services
  cleanup service from old locks after recovery (fixes #1100)
  fix race condition on slow resource commands
  tell systemd to restart CRM and LRM on failures

 debian/pve-ha-crm.service                     |  1 +
 debian/pve-ha-lrm.service                     |  1 +
 src/PVE/HA/Env/PVE2.pm                        |  2 ++
 src/PVE/HA/Manager.pm                         | 12 +++++++-
 src/PVE/HA/Resources.pm                       |  9 ++++++
 src/PVE/HA/Resources/PVECT.pm                 | 13 ++++++++
 src/PVE/HA/Resources/PVEVM.pm                 | 13 ++++++++
 src/PVE/HA/Sim/Env.pm                         |  9 +++++-
 src/PVE/HA/Sim/Hardware.pm                    | 38 +++++++++++++++++++++++
 src/PVE/HA/Sim/Resources.pm                   | 21 +++++++++++++
 src/PVE/HA/Sim/TestHardware.pm                |  9 ++++++
 src/test/test-locked-service1/README          |  3 ++
 src/test/test-locked-service1/cmdlist         |  5 +++
 src/test/test-locked-service1/hardware_status |  5 +++
 src/test/test-locked-service1/log.expect      | 44 +++++++++++++++++++++++++++
 src/test/test-locked-service1/manager_status  |  1 +
 src/test/test-locked-service1/service_config  |  3 ++
 17 files changed, 187 insertions(+), 2 deletions(-)
 create mode 100644 src/test/test-locked-service1/README
 create mode 100644 src/test/test-locked-service1/cmdlist
 create mode 100644 src/test/test-locked-service1/hardware_status
 create mode 100644 src/test/test-locked-service1/log.expect
 create mode 100644 src/test/test-locked-service1/manager_status
 create mode 100644 src/test/test-locked-service1/service_config

-- 
2.1.4




