[pve-devel] [PATCH qemu-server 1/3] fix #2816: restore: remove timeout when allocating disks

Dominik Csapak d.csapak at proxmox.com
Mon Sep 25 10:57:18 CEST 2023


On 9/25/23 10:46, Fiona Ebner wrote:
> Am 20.09.23 um 13:23 schrieb Dominik Csapak:
>> comment inline:
> 
> Feel free to cut out irrelevant parts in the reply ;)
> 
>> On 9/12/23 11:16, Fiona Ebner wrote:
>>> @@ -7483,14 +7483,11 @@ sub restore_vma_archive {
>>>            $devinfo->{$devname} = { size => $size, dev_id => $dev_id };
>>>            } elsif ($line =~ m/^CTIME: /) {
>>>            # we correctly received the vma config, so we can disable
>>> -        # the timeout now for disk allocation (set to 10 minutes, so
>>> -        # that we always timeout if something goes wrong)
>>> -        alarm(600);
>>> +        # the timeout now for disk allocation
> 
> I would interpret this comment about disabling of the timeout to be
> talking about the short 5 second timeout for reading the config.

ok, i interpreted it as disabling *any* timeout, so that the disks can
be allocated properly. since there is only one global timeout (alarm)
here, selectively disabling one seems strange?

> 
>>> +        alarm($oldtimeout || 0);
>>> +        $oldtimeout = undef;
>>
>>
>> this part looks wrong to me, because AFAIU you want to disable the timeout
>> (by canceling the alarm), but what you do here is to set it to $oldtimeout
>> if that was set before?
>>
>> i guess what we want to do here is:
>>
>> ----
>> alarm(0);
>> <... do stuff ...>
>> alarm($oldtimeout || 0);
>> $oldtimeout = undef;
>> ----
>>
>> ?
> 
> Hmm, I see what you mean. But I'd argue that it's unexpected to disable
> the outer timeout for the full duration of the allocation from a
> caller's perspective.
> 
> sub in_a_hurry {
>      alarm(120); # outer/old timeout
>      restore_vma_archive(...);
> }
> 
> With the code before the patch, it could take up to 5 + 600 + 120
> seconds to hit the outer timeout, with your suggestion up to 5 +
> potentially unlimited + 120 seconds, with patched code up to 5 + 120
> seconds. Since there currently are no callers setting an outer timeout,
> the patch doesn't make the situation worse.


i get what you mean, but maybe that would warrant a comment on the function?
or maybe we should be able to clean up half allocated disks in there
in case the outer timeout triggers?

in any case, i'd find it good to improve the comment that speaks of
'disabling the timeout', so that it's clear that only the inner 5s one
is meant to be disabled.
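
for reference, the worst-case numbers from the in_a_hurry() example above
can be written down as a small sketch. worst_case_delay() and the variant
names are made up for illustration and are not part of qemu-server; 120 is
the outer timeout from the example and 5 the timeout for reading the config:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of qemu-server: worst-case number of seconds
# until the caller's outer timeout can fire, for the three variants discussed
# in this thread. Returns undef if there is no upper bound.
sub worst_case_delay {
    my ($variant, $outer, $config_timeout) = @_;

    my %alloc_window = (
        before_patch => 600,   # alarm(600) while allocating disks
        suggestion   => undef, # alarm(0) while allocating: no bound at all
        patched      => 0,     # outer timeout is re-armed once the config is read
    );
    die "unknown variant '$variant'\n" if !exists($alloc_window{$variant});

    my $window = $alloc_window{$variant};
    return undef if !defined($window);
    return $config_timeout + $window + $outer;
}

for my $variant (qw(before_patch suggestion patched)) {
    my $delay = worst_case_delay($variant, 120, 5);
    printf("%-12s %s\n", $variant, defined($delay) ? "${delay}s" : "unbounded");
}
```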

> 
> We could even make the calculation more complicated and have the timeout
> always be hit within 120 seconds in the example above, but not sure if
> worth it.

meh. imho that's not worth it for the (up to) 5 seconds that the extraction
can take.

> 
> AFAICS, we do similar "delay" of the outer timeout in e.g.
> run_with_timeout(), where it can also take up to $inner_timeout +
> $outer_timeout seconds to hit the outer timeout.


exactly, except that our "inner" timeout here is undefined/unlimited,
because disk allocation can take forever?
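
a minimal sketch of the patched flow as discussed in this thread, assuming
the only timeouts involved are the inner 5s one and an optional outer alarm
set by the caller. restore_sketch(), peek_alarm() and the two callbacks are
hypothetical stand-ins and not the actual restore_vma_archive() code:

```perl
use strict;
use warnings;

# Return the seconds left on the pending alarm without changing it
# (0 if none is set). Hypothetical helper, only used for the demonstration.
sub peek_alarm {
    my $left = alarm(0);
    alarm($left) if $left;
    return $left;
}

# Sketch of the patched flow. $read_config and $allocate_disks are
# hypothetical callbacks standing in for the real restore steps.
sub restore_sketch {
    my ($read_config, $allocate_disks) = @_;

    my $oldtimeout;
    eval {
        local $SIG{ALRM} = sub { die "got timeout\n" };
        # alarm() returns what was left of a previously set (outer) alarm
        $oldtimeout = alarm(5); # inner timeout for reading the config
        $read_config->();
        # config received: drop the inner 5s timeout, re-arm the outer one (if any)
        alarm($oldtimeout || 0);
        $oldtimeout = undef;
        $allocate_disks->(); # from here on only bounded by the outer timeout
    };
    my $err = $@;
    # still defined only if we failed before the restore above
    alarm($oldtimeout) if defined($oldtimeout);
    die $err if $err;
    return;
}

my %seen;
alarm(120); # outer/old timeout, as in in_a_hurry()
restore_sketch(
    sub { $seen{config} = peek_alarm() },
    sub { $seen{alloc} = peek_alarm() },
);
alarm(0); # cancel the outer timeout again for this demo
print "reading config: $seen{config}s left, allocating: $seen{alloc}s left\n";
```

with this shape the allocation is never interrupted by the inner timeout,
but a caller that set an outer alarm still gets it after roughly
5 + $outer seconds at most, similar to the delay in run_with_timeout().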




