[pve-devel] [PATCH ha-manager v2] fix #1073: do not count backup-suspended VMs as running

Thomas Lamprecht t.lamprecht at proxmox.com
Wed Aug 23 10:15:49 CEST 2017


when a stopped VM managed by HA got backuped the HA stack
continuously tried to shut it down as check_running returns only if a
PID for the VM exists.
As the VM was locked the shutdown tries were blocked, but still a lot
of annoying messages and task spawns happened during the backup
period.

As querying the VM status through the vm monitor is not cheap, check
if the VM is locked with the backup lock first, the config is cached
and so this is quite cheap, only then query the VMs status over qmp,
and check if the VM is in the 'prelaunch' state.
This state gets only set if KVM was started with the `-S` option and
has not yet continued guest operation.

Some performance results, I repeated each check 1000 times, first
number is the total time spent just with the check, second time is
the the time per single check:

old check (vm runs):            87.117 ms/total =>  87.117 us/loop
new check (runs, no backup):   107.744 ms/total => 107.744 us/loop
new check (runs, backup):      760.337 ms/total => 760.337 us/loop

If the VM does not run, the time is he same as previously (~6us/loop
on my host), else we normally are ~30 us per check slower.
For the exceptional case where a stopped HA VM gets backed up, we are
an order of magnitude slower.
Even in this case we can handle 1000 stopped VMs running in the worst
case of this check per node well.

Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
---

changes v1 -> v2:
* improve and shorten comment and commit message, this affects not
  only suspend mode backups but all backups on stopped HA VMs - as
  Fabian noticed me (thanks!)
* remove the ternary if and just return 0 if we match the prelaunch
  case, 1 then gets returned below already - no need for duplication

 src/PVE/HA/Resources/PVEVM.pm | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/PVE/HA/Resources/PVEVM.pm b/src/PVE/HA/Resources/PVEVM.pm
index 529d8fb..c5276dc 100644
--- a/src/PVE/HA/Resources/PVEVM.pm
+++ b/src/PVE/HA/Resources/PVEVM.pm
@@ -120,7 +120,18 @@ sub check_running {
 
     my $nodename = $haenv->nodename();
 
-    return PVE::QemuServer::check_running($vmid, 1, $nodename);
+    if (PVE::QemuServer::check_running($vmid, 1, $nodename)) {
+	# do not count VMs which are suspended for a backup job as running
+	my $conf = PVE::QemuConfig->load_config($vmid, $nodename);
+	if (defined($conf->{lock}) && $conf->{lock} eq 'backup') {
+	    my $qmpstatus = PVE::QemuServer::vm_qmp_command($vmid, {execute => 'query-status'});
+	    return 0 if $qmpstatus->{status} eq 'prelaunch';
+	}
+
+	return 1;
+    } else {
+	return 0;
+    }
 }
 
 sub remove_locks {
-- 
2.11.0




More information about the pve-devel mailing list