[pve-devel] [PATCH ha-manager v2 21/26] manager: handle negative colocations with too many services
Daniel Kral
d.kral at proxmox.com
Fri Jun 20 16:31:33 CEST 2025
select_service_node(...) in 'none' mode will usually only return no
node if a negative colocation rule specifies more services than there
are nodes available. In that case, the surplus services cannot be
separated, as there are no nodes left for them, so they are put in
error state for now.
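For illustration, a minimal sketch (data and variable names made up for
this example, not code from this patch) of why 'none' mode comes up
empty in such a scenario:

    # three nodes, five services in one strict negative colocation rule
    my @nodes = qw(node1 node2 node3);

    # nodes already occupied by services from the same rule: vm:101 and
    # vm:102 were migrated away, vm:104/vm:105 still share node1 with vm:103
    my %taken = (
        node1 => 'vm:104',
        node2 => 'vm:101',
        node3 => 'vm:102',
    );

    # from vm:103's point of view, every node hosts a service it must be
    # kept separate from, so its list of allowed nodes is empty and
    # select_service_node(...) returns no node
    my @allowed = grep { !exists $taken{$_} } @nodes;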
Signed-off-by: Daniel Kral <d.kral at proxmox.com>
---
This is not ideal and I'd rather have such rules dropped in the
check_feasibility(...) part, but then we'd need to introduce more state
to the check helpers or make a direct call to
PVE::Cluster::get_nodelist(...).
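For reference, a rough sketch of what such a check could look like; the
loop and the $rules layout are assumptions for this sketch, only
PVE::Cluster::get_nodelist(...) is an existing helper:

    # sketch: drop negative colocation rules that can never be satisfied,
    # because they define more services than the cluster has nodes
    my $nodelist = PVE::Cluster::get_nodelist();

    for my $ruleid (keys %$rules) {
        my $rule = $rules->{$ruleid};
        next if $rule->{affinity} ne 'separate';

        my $servicecount = scalar(keys %{ $rule->{services} });
        if ($servicecount > scalar(@$nodelist)) {
            # more services than nodes can never be separated, so drop
            # the rule instead of erroring out surplus services later
            delete $rules->{$ruleid};
        }
    }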
changes since v1:
- NEW!
src/PVE/HA/Manager.pm | 13 +++++
.../test-colocation-strict-separate9/README | 14 +++++
.../test-colocation-strict-separate9/cmdlist | 3 +
.../hardware_status | 5 ++
.../log.expect | 57 +++++++++++++++++++
.../manager_status | 1 +
.../rules_config | 3 +
.../service_config | 7 +++
8 files changed, 103 insertions(+)
create mode 100644 src/test/test-colocation-strict-separate9/README
create mode 100644 src/test/test-colocation-strict-separate9/cmdlist
create mode 100644 src/test/test-colocation-strict-separate9/hardware_status
create mode 100644 src/test/test-colocation-strict-separate9/log.expect
create mode 100644 src/test/test-colocation-strict-separate9/manager_status
create mode 100644 src/test/test-colocation-strict-separate9/rules_config
create mode 100644 src/test/test-colocation-strict-separate9/service_config
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 66e5710..59b2998 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -1092,6 +1092,19 @@ sub next_state_started {
);
delete $sd->{maintenance_node};
}
+ } elsif ($select_mode eq 'none' && !defined($node)) {
+ # Having no node here means that the service is started but cannot
+ # find any node it is allowed to run on, e.g. a negative colocation
+ # rule was added while its services are not separated yet.
+ # TODO Could be made impossible by a dynamic check that drops negative
+ # colocation rules defining more services than there are available nodes
+ $haenv->log(
+ 'err',
+ "service '$sid' cannot run on '$sd->{node}', but no recovery node found",
+ );
+
+ # TODO Should this really move the service to the error state?
+ $change_service_state->($self, $sid, 'error');
}
# ensure service get started again if it went unexpected down
diff --git a/src/test/test-colocation-strict-separate9/README b/src/test/test-colocation-strict-separate9/README
new file mode 100644
index 0000000..85494dd
--- /dev/null
+++ b/src/test/test-colocation-strict-separate9/README
@@ -0,0 +1,14 @@
+Test whether a strict negative colocation rule among five services on a
+three-node cluster puts the services that remain on the same node into error
+state, as there are not enough nodes to separate all of them and it is also
+not clear which of the remaining three is more important to run.
+
+The test scenario is:
+- vm:101 through vm:105 must be kept separate
+- vm:101 through vm:105 are all running on node1
+
+The expected outcome is:
+- As the cluster comes up, vm:101 and vm:102 are migrated to node2 and node3
+- vm:103, vm:104, and vm:105 will be put in error state as there are not
+  enough nodes left to separate them, and it is also not clear which service
+  is more important to be run on the only node left.
diff --git a/src/test/test-colocation-strict-separate9/cmdlist b/src/test/test-colocation-strict-separate9/cmdlist
new file mode 100644
index 0000000..3bfad44
--- /dev/null
+++ b/src/test/test-colocation-strict-separate9/cmdlist
@@ -0,0 +1,3 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on"]
+]
diff --git a/src/test/test-colocation-strict-separate9/hardware_status b/src/test/test-colocation-strict-separate9/hardware_status
new file mode 100644
index 0000000..451beb1
--- /dev/null
+++ b/src/test/test-colocation-strict-separate9/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-colocation-strict-separate9/log.expect b/src/test/test-colocation-strict-separate9/log.expect
new file mode 100644
index 0000000..efe85a2
--- /dev/null
+++ b/src/test/test-colocation-strict-separate9/log.expect
@@ -0,0 +1,57 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node1'
+info 20 node1/crm: adding new service 'vm:102' on node 'node1'
+info 20 node1/crm: adding new service 'vm:103' on node 'node1'
+info 20 node1/crm: adding new service 'vm:104' on node 'node1'
+info 20 node1/crm: adding new service 'vm:105' on node 'node1'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:103': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:104': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: service 'vm:105': state changed from 'request_start' to 'started' (node = node1)
+info 20 node1/crm: migrate service 'vm:101' to node 'node2' (running)
+info 20 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node2)
+info 20 node1/crm: migrate service 'vm:102' to node 'node3' (running)
+info 20 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node1, target = node3)
+err 20 node1/crm: service 'vm:103' cannot run on 'node1', but no recovery node found
+info 20 node1/crm: service 'vm:103': state changed from 'started' to 'error'
+err 20 node1/crm: service 'vm:104' cannot run on 'node1', but no recovery node found
+info 20 node1/crm: service 'vm:104': state changed from 'started' to 'error'
+err 20 node1/crm: service 'vm:105' cannot run on 'node1', but no recovery node found
+info 20 node1/crm: service 'vm:105': state changed from 'started' to 'error'
+info 21 node1/lrm: got lock 'ha_agent_node1_lock'
+info 21 node1/lrm: status change wait_for_agent_lock => active
+info 21 node1/lrm: service vm:101 - start migrate to node 'node2'
+info 21 node1/lrm: service vm:101 - end migrate to node 'node2'
+info 21 node1/lrm: service vm:102 - start migrate to node 'node3'
+info 21 node1/lrm: service vm:102 - end migrate to node 'node3'
+err 21 node1/lrm: service vm:103 is in an error state and needs manual intervention. Look up 'ERROR RECOVERY' in the documentation.
+err 21 node1/lrm: service vm:104 is in an error state and needs manual intervention. Look up 'ERROR RECOVERY' in the documentation.
+err 21 node1/lrm: service vm:105 is in an error state and needs manual intervention. Look up 'ERROR RECOVERY' in the documentation.
+info 22 node2/crm: status change wait_for_quorum => slave
+info 24 node3/crm: status change wait_for_quorum => slave
+info 40 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
+info 40 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node3)
+info 43 node2/lrm: got lock 'ha_agent_node2_lock'
+info 43 node2/lrm: status change wait_for_agent_lock => active
+info 43 node2/lrm: starting service vm:101
+info 43 node2/lrm: service status vm:101 started
+info 45 node3/lrm: got lock 'ha_agent_node3_lock'
+info 45 node3/lrm: status change wait_for_agent_lock => active
+info 45 node3/lrm: starting service vm:102
+info 45 node3/lrm: service status vm:102 started
+info 620 hardware: exit simulation - done
diff --git a/src/test/test-colocation-strict-separate9/manager_status b/src/test/test-colocation-strict-separate9/manager_status
new file mode 100644
index 0000000..9e26dfe
--- /dev/null
+++ b/src/test/test-colocation-strict-separate9/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-colocation-strict-separate9/rules_config b/src/test/test-colocation-strict-separate9/rules_config
new file mode 100644
index 0000000..478d70b
--- /dev/null
+++ b/src/test/test-colocation-strict-separate9/rules_config
@@ -0,0 +1,3 @@
+colocation: lonely-must-too-many-vms-be
+ services vm:101,vm:102,vm:103,vm:104,vm:105
+ affinity separate
diff --git a/src/test/test-colocation-strict-separate9/service_config b/src/test/test-colocation-strict-separate9/service_config
new file mode 100644
index 0000000..a1d61f5
--- /dev/null
+++ b/src/test/test-colocation-strict-separate9/service_config
@@ -0,0 +1,7 @@
+{
+ "vm:101": { "node": "node1", "state": "started" },
+ "vm:102": { "node": "node1", "state": "started" },
+ "vm:103": { "node": "node1", "state": "started" },
+ "vm:104": { "node": "node1", "state": "started" },
+ "vm:105": { "node": "node1", "state": "started" }
+}
--
2.39.5