[pve-devel] [PATCH] increase zfs default timeout to 30sec

Tue Mar 13 10:25:47 CET 2018

On 03/13/2018 09:53 AM, Lauri Tirkkonen wrote:
> On Tue, Mar 13 2018 09:45:18 +0100, Fabian Grünbichler wrote:
>> On Mon, Mar 12, 2018 at 04:06:47PM +0200, Lauri Tirkkonen wrote:
>>> busy pools can easily take more than 5 seconds for eg. zfs create
>>> ---
>>>  PVE/Storage/ZFSPoolPlugin.pm | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/PVE/Storage/ZFSPoolPlugin.pm b/PVE/Storage/ZFSPoolPlugin.pm
>>> index e864a58..7ba035f 100644
>>> --- a/PVE/Storage/ZFSPoolPlugin.pm
>>> +++ b/PVE/Storage/ZFSPoolPlugin.pm
>>> @@ -167,7 +167,7 @@ sub path {
>>>  sub zfs_request {
>>>      my ($class, $scfg, $timeout, $method, @params) = @_;
>>>  
>>> -    my $default_timeout = PVE::RPCEnvironment->is_worker() ? 60*60 : 5;
>>> +    my $default_timeout = PVE::RPCEnvironment->is_worker() ? 60*60 : 30;
>>
>> unfortunately, this is not the way to solve this. our API has a timeout
>> per request, so things that (can) take a long time should run in a
>> worker (the API then returns the ID of the task, which can be queried
>> over the API to retrieve status and/or output).
> 
> This did solve our immediate issue: adding a disk to an existing VM from
> the web UI failed with timeout, when the pool had some load on it. (A
> side note: SIGKILLing the zfs process won't make it exit any faster if
> it's waiting in kernel, which it likely is).
> 
> Obviously you are free to solve the underlying issue(s) any way you
> wish, we just wanted to share our fix. 5 seconds just isn't enough.

What Fabian meant by:
> [...] our API has a timeout per request, [...]

is that our API already has a 30-second timeout for the response,
so using 30 seconds here can be problematic: the zfs call alone could
use up the whole request budget.

As a quick and easy improvement, we could probably increase it from
5 to, say, 20-25 seconds: still below the API timeout, but
several times higher than now.
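
A rough, untested sketch of that idea, with the margin made explicit
instead of hard-coding a number. The constant and helper names below are
made up for illustration and are not part of ZFSPoolPlugin.pm; the 30
seconds is the API timeout mentioned above, and the 5-second margin is
just an assumption:

```perl
use strict;
use warnings;

# Assumption: the API answers each request within 30 seconds (see above).
my $API_REQUEST_TIMEOUT = 30;
# Hypothetical margin so the zfs call can fail cleanly before the API gives up.
my $SAFETY_MARGIN = 5;

# Hypothetical helper: pick the default timeout for a zfs command.
sub default_zfs_timeout {
    my ($is_worker) = @_;
    # Workers are not bound by the per-request API timeout.
    return 60*60 if $is_worker;
    # Non-worker calls must stay below the API timeout.
    return $API_REQUEST_TIMEOUT - $SAFETY_MARGIN;
}

printf "worker: %d, api: %d\n", default_zfs_timeout(1), default_zfs_timeout(0);
# prints "worker: 3600, api: 25"
```

In zfs_request this would boil down to replacing the literal 5 with a
value derived this way, using PVE::RPCEnvironment->is_worker() as the
argument.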

cheers,
Thomas