[pve-devel] applied+cleanup: [PATCH ha-manager v2] fix #1073: do not count backup-suspended VMs as running

Fabian Gr├╝nbichler f.gruenbichler at proxmox.com
Wed Aug 23 10:33:36 CEST 2017


On Wed, Aug 23, 2017 at 10:15:49AM +0200, Thomas Lamprecht wrote:
> when a stopped VM managed by HA got backuped the HA stack
> continuously tried to shut it down as check_running returns only if a
> PID for the VM exists.
> As the VM was locked the shutdown tries were blocked, but still a lot
> of annoying messages and task spawns happened during the backup
> period.
> 
> As querying the VM status through the vm monitor is not cheap, check
> if the VM is locked with the backup lock first, the config is cached
> and so this is quite cheap, only then query the VMs status over qmp,
> and check if the VM is in the 'prelaunch' state.
> This state gets only set if KVM was started with the `-S` option and
> has not yet continued guest operation.
> 
> Some performance results, I repeated each check 1000 times, first
> number is the total time spent just with the check, second time is
> the the time per single check:
> 
> old check (vm runs):            87.117 ms/total =>  87.117 us/loop
> new check (runs, no backup):   107.744 ms/total => 107.744 us/loop
> new check (runs, backup):      760.337 ms/total => 760.337 us/loop
> 
> If the VM does not run, the time is he same as previously (~6us/loop
> on my host), else we normally are ~30 us per check slower.
> For the exceptional case where a stopped HA VM gets backed up, we are
> an order of magnitude slower.
> Even in this case we can handle 1000 stopped VMs running in the worst
> case of this check per node well.

dropped this last part of the commit message, it's redundant and not
complete (running VMs which are backup locked also trigger the longer
check)

> 
> Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
> ---
> 
> changes v1 -> v2:
> * improve and shorten comment and commit message, this affects not
>   only suspend mode backups but all backups on stopped HA VMs - as
>   Fabian noticed me (thanks!)
> * remove the ternary if and just return 0 if we match the prelaunch
>   case, 1 then gets returned below already - no need for duplication
> 
>  src/PVE/HA/Resources/PVEVM.pm | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/src/PVE/HA/Resources/PVEVM.pm b/src/PVE/HA/Resources/PVEVM.pm
> index 529d8fb..c5276dc 100644
> --- a/src/PVE/HA/Resources/PVEVM.pm
> +++ b/src/PVE/HA/Resources/PVEVM.pm
> @@ -120,7 +120,18 @@ sub check_running {
>  
>      my $nodename = $haenv->nodename();
>  
> -    return PVE::QemuServer::check_running($vmid, 1, $nodename);
> +    if (PVE::QemuServer::check_running($vmid, 1, $nodename)) {
> +	# do not count VMs which are suspended for a backup job as running
> +	my $conf = PVE::QemuConfig->load_config($vmid, $nodename);
> +	if (defined($conf->{lock}) && $conf->{lock} eq 'backup') {
> +	    my $qmpstatus = PVE::QemuServer::vm_qmp_command($vmid, {execute => 'query-status'});
> +	    return 0 if $qmpstatus->{status} eq 'prelaunch';
> +	}
> +
> +	return 1;
> +    } else {
> +	return 0;
> +    }
>  }
>  
>  sub remove_locks {
> -- 
> 2.11.0
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



More information about the pve-devel mailing list