[pve-devel] [PATCH qemu-server 1/2] Fix #2816: increase timeout for allocation on restore

Wed Aug 12 10:38:28 CEST 2020

On 11.08.20 at 14:36, Fabian Grünbichler wrote:
> On August 4, 2020 1:32 pm, Fabian Ebner wrote:
>> qcow2 images are allocated with --preallocation=metadata,
>> which can take a while for large images.
>> Avoid using 'got timeout' as an error message by itself,
>> to make it clearer where a timeout occurred.
>>
>> Signed-off-by: Fabian Ebner <f.ebner at proxmox.com>
>> ---
>>   PVE/QemuServer.pm | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
>> index 0a09f3a..8b0b2c8 100644
>> --- a/PVE/QemuServer.pm
>> +++ b/PVE/QemuServer.pm
>> @@ -6261,7 +6261,7 @@ sub restore_vma_archive {
>>   	    local $SIG{QUIT} =
>>   	    local $SIG{HUP} =
>>   	    local $SIG{PIPE} = sub { die "interrupted by signal\n"; };
>> -	local $SIG{ALRM} = sub { die "got timeout\n"; };
>> +	local $SIG{ALRM} = sub { die "got timeout preparing device images\n"; };
>>   
>>   	$oldtimeout = alarm($timeout);

^[0]

>>   
>> @@ -6275,9 +6275,9 @@ sub restore_vma_archive {
>>   		$devinfo->{$devname} = { size => $size, dev_id => $dev_id };
>>   	    } elsif ($line =~ m/^CTIME: /) {
>>   		# we correctly received the vma config, so we can disable
>> -		# the timeout now for disk allocation (set to 10 minutes, so
>> +		# the timeout now for disk allocation (set to 1 hour, so
>>   		# that we always timeout if something goes wrong)
> 
> do we really need this timeout? we are by definition in a worker
> already, instead of moving the goal post once more could we not drop
> this and let the user hit 'Stop' if the allocation stalls altogether?
> 

I thought we'd need the timeout here because otherwise the timeout from
above[0] is still active. This also seems to be the reason the timeout
was introduced in the first place, in commit
3cf90d7a40554b4c353e389209d6ef36a89b96a7

But of course we could move the alarm($oldtimeout || 0) to before 
&$print_devmap(). If we do this, then the time spent allocating the 
disks will eat into the oldtimeout.
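
A minimal sketch of that variant, to illustrate the ordering only. It is
not the actual qemu-server code: restore_sketch, $read_config and
$allocate are hypothetical names standing in for restore_vma_archive,
the parsing of the vma config and &$print_devmap().

```perl
use strict;
use warnings;

# Sketch only: $read_config and $allocate are hypothetical callbacks standing
# in for parsing the vma config and for &$print_devmap().
sub restore_sketch {
    my ($timeout, $read_config, $allocate) = @_;

    my $oldtimeout = 0;
    eval {
        local $SIG{ALRM} = sub { die "got timeout\n"; };

        # alarm() returns the seconds left on a previously set alarm, or 0
        $oldtimeout = alarm($timeout);
        $read_config->();

        # config received: re-arm the caller's alarm (0 disables it) *before*
        # the slow step, so allocation is only bounded by the outer timeout
        alarm($oldtimeout || 0);
        $allocate->();
    };
    my $err = $@;
    if ($err) {
        # hand the caller's alarm back before propagating the error
        alarm($oldtimeout || 0);
        die $err;
    }
    return $oldtimeout;
}
```

With no outer alarm, $oldtimeout is 0 and the allocation step runs
without any time limit; with an outer alarm, the time spent in
$allocate counts against it.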

That said, oldtimeout should always be 0 anyway, because AFAICT the
only path leading here is:
API-create_vm -> *spawning of worker* -> restore_file_archive ->
restore_vma_archive
and nobody sets an alarm along the way.

I'll send a v2.

>> -		alarm(600);
>> +		alarm(60 * 60);
>>   		&$print_devmap();
>>   		print $fifofh "done\n";
>>   		my $tmp = $oldtimeout || 0;
>> -- 
>> 2.20.1
>>
>>
>>
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel at lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel