[pve-devel] [PATCH ha-manager 2/2] relocate policy: do not try failed nodes again
Thomas Lamprecht
t.lamprecht at proxmox.com
Thu May 19 15:08:17 CEST 2016
If the failure policy triggers more than two times, we used an
already tried node again, even if there were other untried nodes
left. This makes little sense: if the service failed to start on a
node a short time ago, it will probably fail there again now (e.g.
because a storage is offline), whereas an untried node still has a
chance to start the service successfully.
Fix that by excluding those already tried nodes from the top
priority node list in 'select_service_node'.
We bind this to try_next, as the exclusion can only trigger when
try_next is true. Also, select_service_node gets called in two places:
* next_state_started: here we want this behaviour
* recover_fenced_service: here try_next is always false, as we just
want to select a node to recover the service to; if a relocation
policy is then needed, it is the duty of next_state_started to
apply it.
So we are safe to do so.
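To make the exclusion concrete, here is a minimal standalone Perl
sketch (illustrative only, not the Manager.pm code; the node names
and hash layout are made up) of how deleting the tried nodes from
the top-priority group leaves only untried candidates:

    use strict;
    use warnings;

    # illustrative only: drop already tried nodes from a priority group
    my %top_pri_group = (node1 => 1, node2 => 1, node3 => 1);
    my @tried_nodes   = ('node2', 'node1');

    delete $top_pri_group{$_} for @tried_nodes;

    # only the untried node remains as a candidate
    print join(', ', sort keys %top_pri_group), "\n";   # prints: node3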
If we have tried all possible nodes and still have relocation tries
left, we try to restart the service one more time on the current
node; if that attempt also fails, we place the service in the error
state.
Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
---
We could also replace $try_next with $tried_nodes, as they are
connected anyway. That would spare us a parameter, but I left it
like this for now to make the change easier to follow.
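As a rough sketch of that alternative (hypothetical, not part of
this patch), the caller could pass only the tried-nodes list and
derive the try-next behaviour from it being non-empty:

    # hypothetical variant, not the actual Manager.pm code
    sub select_service_node {
        my ($groups, $online_node_usage, $service_conf, $current_node, $tried_nodes) = @_;

        my $try_next = ($tried_nodes && scalar(@$tried_nodes)) ? 1 : 0;

        # ... proceed as before, skipping @$tried_nodes when $try_next is set
    }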
src/PVE/HA/Manager.pm | 13 +++++--
src/test/test-resource-failure6/log.expect | 54 ++++++++++++++++++++++++++++++
2 files changed, 64 insertions(+), 3 deletions(-)
create mode 100644 src/test/test-resource-failure6/log.expect
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index e13c782..5eb02f8 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -52,7 +52,7 @@ sub flush_master_status {
}
sub select_service_node {
- my ($groups, $online_node_usage, $service_conf, $current_node, $try_next) = @_;
+ my ($groups, $online_node_usage, $service_conf, $current_node, $try_next, $tried_nodes) = @_;
my $group = {};
# add all online nodes to default group to allow try_next when no group set
@@ -100,6 +100,13 @@ sub select_service_node {
my $top_pri = $pri_list[0];
+    if ($try_next) {
+        # do not try nodes where the service failed already
+        foreach my $node (@$tried_nodes) {
+            delete $pri_groups->{$top_pri}->{$node};
+        }
+    }
+
my @nodes = sort {
$online_node_usage->{$a} <=> $online_node_usage->{$b} || $a cmp $b
} keys %{$pri_groups->{$top_pri}};
@@ -632,8 +639,8 @@ sub next_state_started {
}
}
- my $node = select_service_node($self->{groups}, $self->{online_node_usage},
- $cd, $sd->{node}, $try_next);
+ my $node = select_service_node($self->{groups}, $self->{online_node_usage},
+ $cd, $sd->{node}, $try_next, $tried_nodes);
if ($node && ($sd->{node} ne $node)) {
if ($cd->{type} eq 'vm') {
diff --git a/src/test/test-resource-failure6/log.expect b/src/test/test-resource-failure6/log.expect
new file mode 100644
index 0000000..0e88c3e
--- /dev/null
+++ b/src/test/test-resource-failure6/log.expect
@@ -0,0 +1,54 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'fa:130' on node 'node2'
+info 20 node1/crm: service 'fa:130': state changed from 'started' to 'request_stop'
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 24 node3/crm: status change wait_for_quorum => slave
+info 40 node1/crm: service 'fa:130': state changed from 'request_stop' to 'stopped'
+info 120 cmdlist: execute service fa:130 enabled
+info 120 node1/crm: service 'fa:130': state changed from 'stopped' to 'started' (node = node2)
+info 123 node2/lrm: starting service fa:130
+warn 123 node2/lrm: unable to start service fa:130
+err 123 node2/lrm: unable to start service fa:130 on local node after 0 retries
+warn 140 node1/crm: starting service fa:130 on node 'node2' failed, relocating service.
+info 140 node1/crm: relocate service 'fa:130' to node 'node1'
+info 140 node1/crm: service 'fa:130': state changed from 'started' to 'relocate' (node = node2, target = node1)
+info 143 node2/lrm: service fa:130 - start relocate to node 'node1'
+info 143 node2/lrm: service fa:130 - end relocate to node 'node1'
+info 160 node1/crm: service 'fa:130': state changed from 'relocate' to 'started' (node = node1)
+info 161 node1/lrm: got lock 'ha_agent_node1_lock'
+info 161 node1/lrm: status change wait_for_agent_lock => active
+info 161 node1/lrm: starting service fa:130
+warn 161 node1/lrm: unable to start service fa:130
+err 161 node1/lrm: unable to start service fa:130 on local node after 0 retries
+warn 180 node1/crm: starting service fa:130 on node 'node1' failed, relocating service.
+info 180 node1/crm: relocate service 'fa:130' to node 'node3'
+info 180 node1/crm: service 'fa:130': state changed from 'started' to 'relocate' (node = node1, target = node3)
+info 181 node1/lrm: service fa:130 - start relocate to node 'node3'
+info 181 node1/lrm: service fa:130 - end relocate to node 'node3'
+info 200 node1/crm: service 'fa:130': state changed from 'relocate' to 'started' (node = node3)
+info 205 node3/lrm: got lock 'ha_agent_node3_lock'
+info 205 node3/lrm: status change wait_for_agent_lock => active
+info 205 node3/lrm: starting service fa:130
+warn 205 node3/lrm: unable to start service fa:130
+err 205 node3/lrm: unable to start service fa:130 on local node after 0 retries
+warn 220 node1/crm: starting service fa:130 on node 'node3' failed, relocating service.
+info 225 node3/lrm: starting service fa:130
+info 225 node3/lrm: service status fa:130 started
+info 240 node1/crm: relocation policy successful for 'fa:130', tried nodes: node2, node1, node3
+info 720 hardware: exit simulation - done
--
2.1.4