[pve-devel] [PATCH] increase zfs default timeout to 30sec

Thu Apr 12 14:24:59 CEST 2018

On 04/12/2018 02:13 PM, Fabian Grünbichler wrote:
> On Thu, Apr 12, 2018 at 02:59:22PM +0300, Lauri Tirkkonen wrote:
>> On Thu, Apr 12 2018 12:59:13 +0200, Fabian Grünbichler wrote:
>>> On Tue, Apr 10, 2018 at 05:40:50PM +0300, Lauri Tirkkonen wrote:
>>>> Hi,
>>>>
>>>> On Tue, Mar 13 2018 10:25:47 +0100, Thomas Lamprecht wrote:
>>>>> What Fabian meant with:
>>>>>> [...] our API has a timeout per request, [...]
>>>>>
>>>>> is that our API already has 30 seconds as timeout for response,
>>>>> so using 30 seconds here can be problematic.
>>>>>
>>>>> As a quick easy improvement we could probably increase it from
>>>>> 5 to say 20-25 seconds, still below the API timeout, but nonetheless
>>>>> multiple times higher than now.
>>>>
>>>> Now that the CLA stuff is out of the way, here's a diff to do that then.
>>>
>>> if you bump the timeout to 20, you need to check every non-worker call
>>> path that ends up in zfs_request to verify that you don't call
>>> zfs_request twice (because then you'd have a total timeout of 40 which
>>> is over the 30s API request limit).
>>>
>>> unless you have done that (and somehow provide a convincing argument
>>> that you did), this cannot be applied. if you do that, you will quickly
>>> realize that converting API endpoints that can trigger expensive storage
>>> operations to workers makes more sense, which is why I originally
>>> proposed that approach.
>>
>> I have not done that. I submitted this diff because Thomas implied it
>> could be done: a quick and easy fix, that fixed a real problem for us.
> 
> I already stated in previous replies that we can only bump it after
> careful analysis ;)
> 
>> Knowing very close to zero about PVE code in general I'm much more
>> hesitant to do things like converting things to workers - I imagine that
>> will end up being a change so large that I'd likely give up before
>> finishing. You don't have to take my diff, of course, but I have no
>> plans to write your convert-to-worker diff either.
> 
> that's okay. in this case, we already have an async API call for what
> you want to do - we'd simply need to switch the GUI to use it.
> 
> @Dominik: could we (either in general, or when detecting disks are
> involved) use the async POST path instead of PUT for updating VM
> configs?
> 

yes, we could change the method for disks on vms/ct to post
the code also supports showing a task log if we get back a upid,etc.

in fact wolgang link already asked me about this, and i think
he works on that