[pve-devel] [PATCH v2 3/3] Allow migrate-all button on HA enabled VMs

Thomas Lamprecht t.lamprecht at proxmox.com
Thu Mar 17 13:27:17 CET 2016


Comments inline.

----- Reply to message -----
From: "Caspar Smit" <casparsmit at supernas.eu>
To: "PVE development discussion" <pve-devel at pve.proxmox.com>
Subject: [pve-devel] [PATCH v2 3/3] Allow migrate-all button on HA enabled VMs
Date: Thu, Mar 17, 2016 11:55

Hi all,
During some more tests with this feature I (maybe) stumbled on a bug (or maybe this is by design).

When I select the migrate-all button and set the "parallel jobs" option to 1, I noticed that the HA-managed VMs were migrated at the same time (so it looks like the parallel jobs option is ignored).
But I found out why this is:

When an HA-managed VM is migrated, an "HA <vmid> - Migrate" task is spawned. This task returns an OK status way BEFORE the actual migration has taken place. The "HA <vmid> - Migrate" task spawns another task, "VM <vmid> - Migrate", which does the actual migration.

Now I remember from PVE 3.4 that the "HA <vmid> - Migrate" task did not return an OK until the actual "VM <vmid> - Migrate" returned an OK. Was this changed on purpose, or is this a bug?



This is by design. The HA stack consists of the local resource manager (LRM) and the cluster resource manager (CRM), which work in sync with each other but asynchronously from the rest of the cluster.

You can limit the number of concurrent migrations with the max_workers setting in datacenter.cfg.
Users should lower that value if their setup cannot handle that many parallel migrations.
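
For example, a minimal /etc/pve/datacenter.cfg snippet (the value 2 is only an illustration, tune it to what your storage and network can sustain):

    # limit worker tasks (per node) for bulk actions and the HA manager
    max_workers: 2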



The result here is that the migrate-all task receives an OK (from the HA task) and starts the next migration, resulting in multiple HA migrations happening at once.


This is expected.
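
For illustration, a minimal Perl sketch of the pattern the patch below applies; the loop and the worker call are placeholders, only PVE::HA::Config::vm_is_ha_managed() is taken from the actual patch:

    use PVE::HA::Config;

    foreach my $vmid (sort keys %$vmlist) {
	# HA-managed guests are queued and throttled by pve-ha-manager itself,
	# so the bulk action must not start its own worker for them
	next if PVE::HA::Config::vm_is_ha_managed($vmid);

	# ... start the per-guest migrate/start/stop worker here ...
    }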



Kind regards,
Caspar


2016-03-14 12:07 GMT+01:00 Caspar Smit <casparsmit at supernas.eu>:
Signed-off-by: Caspar Smit <casparsmit at supernas.eu>

---
 PVE/API2/Nodes.pm | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/PVE/API2/Nodes.pm b/PVE/API2/Nodes.pm
index f1fb392..b2de907 100644
--- a/PVE/API2/Nodes.pm
+++ b/PVE/API2/Nodes.pm
@@ -1208,9 +1208,6 @@ my $get_start_stop_list = sub {
 		$startup = { order => $bootorder };
 	    }
 
-	    # skip ha managed VMs (started by pve-ha-manager)
-	    return if PVE::HA::Config::vm_is_ha_managed($vmid);
-
 	    $resList->{$startup->{order}}->{$vmid} = $startup;
 	    $resList->{$startup->{order}}->{$vmid}->{type} = $d->{type};
     };
@@ -1283,6 +1280,9 @@ __PACKAGE__->register_method ({
 			die "unknown VM type '$d->{type}'\n";
 		    }
 
+		    # skip ha managed VMs (started by pve-ha-manager)
+		    next if PVE::HA::Config::vm_is_ha_managed($vmid);
+
 		    PVE::Cluster::check_cfs_quorum(); # abort when we loose quorum
 
 		    eval {
@@ -1407,6 +1407,9 @@ __PACKAGE__->register_method ({
 	};
 
 	foreach my $vmid (sort {$b <=> $a} keys %$vmlist) {
+	    # skip ha managed VMs (stopped by pve-ha-manager)
+	    next if PVE::HA::Config::vm_is_ha_managed($vmid);
+
 	    my $d = $vmlist->{$vmid};
 	    my $upid;
 	    eval { $upid = &$create_stop_worker($nodename, $d->{type}, $vmid, $d->{down}); };
--
2.1.4