[pve-devel] [PATCH storage] Use bigger timeouts for zfs operations

Fabian Ebner f.ebner at proxmox.com
Thu Oct 10 13:22:21 CEST 2019


On 10/10/19 10:29 AM, Fabian Grünbichler wrote:
> On October 10, 2019 8:55 am, Fabian Ebner wrote:
>> On 10/1/19 12:28 PM, Fabian Grünbichler wrote:
>>> On October 1, 2019 12:17 pm, Fabian Ebner wrote:
>>>> Seems like 'zfs destroy' can take longer than 5 seconds, see [0].
>>>> I changed the timeout to 15 seconds and also changed the default
>>>> timeout to 10 instead of 5 seconds, to be on the safe side
>>>> for other commands like 'zfs create'.
>>>>
>>>> [0]: https://forum.proxmox.com/threads/timeout-beim-l%C3%B6schen-des-entfernen-replikats-bei-entfernung-der-replikation.58467/
>>> NAK, we have a 30s timeout for synchronous API requests that call this,
>>> and they might do more than 2 zfs_requests. we'd rather time out and have
>>> time to do error handling than finish successfully sometimes and die
>>> without cleanup other times.
>>>
>>> the real solution for this is to convert the remaining synchronous API
>>> calls that trigger storage operations to async ones, and switch the GUI
>>> over to use the new variants. we had previous discussions about this
>>> issue already. to implement it really nicely, we'd need two things:
>>> - tasks with structured return value (to implement async content listing
>>>     and similar operations, and make conversion of sync to async API calls
>>>     easier)
>>> - light-weight / ephemeral tasks (not included in regular task
>>>     lists/log, cleaned up after final poll / 1 day after finishing / ...)
>>>
>>> in case of the mentioned report, please investigate whether this call in
>>> 'pvesr' context is wrongly treated as a sync API call (i.e., should
>>> is_worker maybe return true for pvesr calls?)
>> I looked at setting the worker flag for the (whole) pvesr environment,
>> but it seems a bit much. E.g. in prepare we need to delete stale
>> snapshots before starting the replication, and we probably don't want
>> to wait all too long for that to complete, i.e. we should not have the
>> worker flag set when calling volume_snapshot_delete.
>>
>> The situation is different with the vdisk_free call, since there we
>> don't need to sit and wait for the result, so it would make sense to
>> treat that one as a worker. We could set the worker flag locally before
>> that call and use our own timeout, but it feels like a bad workaround.
>> Also, we can't use run_with_timeout to set our own timeout, since
>> run_command is called inside vdisk_free, which resets the alarm.
> but we could use run_fork_with_timeout ;)
>
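Right, something like the following could work (untested sketch; $own_timeout,
$storecfg and $volid are just placeholders for whatever is at hand at the
call site, and I'm assuming run_fork_with_timeout($timeout, $sub) runs the
sub in a forked child while the parent enforces the timeout):

    use PVE::Tools;
    use PVE::Storage;

    # placeholder timeout, long enough for a slow 'zfs destroy'
    my $own_timeout = 60;

    # run vdisk_free in a forked child; the parent enforces the timeout,
    # so run_command resetting alarm() inside the child doesn't clobber it
    PVE::Tools::run_fork_with_timeout($own_timeout, sub {
        PVE::Storage::vdisk_free($storecfg, $volid);
    });
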
>> Since we already have a mechanism to spawn a worker in vdisk_free
>> (currently only used by LVM with saferemove), wouldn't it make sense to
>> use that for zfs as well? That is, have free_image in the zfs plugin
>> return a subroutine instead of executing zfs destroy directly. Other
>> callers of vdisk_free should also be fine with such a change, since it
>> can already happen that vdisk_free spawns a worker (but I haven't
>> looked at that in detail).
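To illustrate what I meant there - an untested, simplified sketch (the real
free_image also has to handle subvols, base volumes, etc., and the signature
is abbreviated):

    sub free_image {
        my ($class, $storeid, $scfg, $volname, $isBase) = @_;

        my (undef, $name) = $class->parse_volname($volname);

        # instead of destroying the zvol right away, hand back a sub so
        # vdisk_free() can run it in a worker task (like LVM's saferemove)
        return sub {
            $class->zfs_delete_zvol($scfg, $name);
        };
    }
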
> but that is the exception, not the default behaviour. I'd rather
> not switch all ZFS vdisk removals to forking a task if there are other
> solutions.
>
> AFAICT, this vdisk_free happens only for volumes that were previously
> replicated but no longer are? A simple solution might be to re-factor
> lines 139ff of pvesr.pm to fork a worker if any vdisks need to be
> removed, and remove them all asynchronously in a single task?
>
> OTOH, if vdisk_free runs into this problem, it's only a question of load
> until other zfs_requests (like listing volumes, listing snapshots,
> creating snapshots, removing snapshots) also run into the low timeouts...
> so the big question still remains: do we want to increase those timeouts
> in general for pvesr, and if we do, how do we do it?
We might think of it as a feature ;) since on a ZFS pool that is already
under high load, pvesr "knows" that now is not a good time to do
a replication.

Unlike most of the other zfs_requests here, pvesr doesn't actually need
to wait for the successful removal of stale disks, and isn't that also
the operation most likely to time out?
So I like your proposed simple solution.
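Something along these lines, maybe (untested sketch; $storecfg, @stale_volids
and the 'imgdel' worker type are just placeholders for whatever fits in
pvesr.pm):

    use PVE::RPCEnvironment;
    use PVE::Storage;

    if (scalar(@stale_volids)) {
        my $rpcenv = PVE::RPCEnvironment::get();
        my $authuser = $rpcenv->get_user();

        # remove all no-longer-replicated volumes in one worker task, so
        # the potentially slow 'zfs destroy' calls run with the worker
        # timeout instead of the short synchronous one
        $rpcenv->fork_worker('imgdel', undef, $authuser, sub {
            for my $volid (@stale_volids) {
                eval { PVE::Storage::vdisk_free($storecfg, $volid) };
                warn $@ if $@;
            }
        });
    }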

>> It would introduce a bit of noise, since it would create a task on
>> every vdisk_free for a zfs volume. The light-weight tasks you mentioned
>> would be ideal there, but for now we don't have those. And maybe the
>> timeout is hit so rarely that it's not worth the change - what do you
>> think?
>>
>>>> Signed-off-by: Fabian Ebner <f.ebner at proxmox.com>
>>>> ---
>>>>    PVE/Storage/ZFSPoolPlugin.pm | 4 ++--
>>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/PVE/Storage/ZFSPoolPlugin.pm b/PVE/Storage/ZFSPoolPlugin.pm
>>>> index f66b277..3ce06be 100644
>>>> --- a/PVE/Storage/ZFSPoolPlugin.pm
>>>> +++ b/PVE/Storage/ZFSPoolPlugin.pm
>>>> @@ -182,7 +182,7 @@ sub zfs_request {
>>>>        my $msg = '';
>>>>        my $output = sub { $msg .= "$_[0]\n" };
>>>>    
>>>> -    $timeout = PVE::RPCEnvironment->is_worker() ? 60*60 : 5 if !$timeout;
>>>> +    $timeout = PVE::RPCEnvironment->is_worker() ? 60*60 : 10 if !$timeout;
>>>>    
>>>>        run_command($cmd, errmsg => "zfs error", outfunc => $output, timeout => $timeout);
>>>>    
>>>> @@ -346,7 +346,7 @@ sub zfs_delete_zvol {
>>>>    
>>>>        for (my $i = 0; $i < 6; $i++) {
>>>>    
>>>> -	eval { $class->zfs_request($scfg, undef, 'destroy', '-r', "$scfg->{pool}/$zvol"); };
>>>> +	eval { $class->zfs_request($scfg, 15, 'destroy', '-r', "$scfg->{pool}/$zvol"); };
>>>>    	if ($err = $@) {
>>>>    	    if ($err =~ m/^zfs error:(.*): dataset is busy.*/) {
>>>>    		sleep(1);
>>>> -- 
>>>> 2.20.1