[pbs-devel] [PATCH proxmox-backup 2/4] api: verify: use worker-threads to determine the number of threads to use
Christian Ebner
c.ebner at proxmox.com
Thu Nov 6 12:57:17 CET 2025
On 11/6/25 12:21 PM, Nicolas Frey wrote:
> On 11/6/25 10:32 AM, Christian Ebner wrote:
>> On 11/6/25 10:22 AM, Nicolas Frey wrote:
>>> On 11/6/25 10:08 AM, Christian Ebner wrote:
>>>> Please add a short commit message describing what the worker threads
>>>> cover, e.g. that this parameter controls the number of reader and
>>>> chunk verification threads.
>>>>
>>>> What tripped me over just now:
>>>> Is this intentionally not increasing the number of chunk verification
>>>> threads? Or was that overlooked? From the name of the parameter I
>>>> suspected this to act on both, reading and verifying. If this is not
>>>> the case, maybe the parameter should get renamed to a more telling
>>>> `parallel-chunk-readers` instead?
>>>
>>> I wasn't sure if the number of threads for verification should be
>>> controlled via this as well, as the original patch only added a new
>>> thread pool for reading, whereas the verification pool was already
>>> implemented.
>>> I pointed this out in the cover letter, though it might have been
>>> better to put this here too:
>>>
>>> The number of `worker-threads` only controls the thread pool for
>>> reading, though if it makes sense to reuse this for the verification
>>> pool, it could be adjusted to do so too.
>>>
>>> I think it makes sense to use it to control the number of threads of
>>> both. Thanks for the feedback, I'll adjust it along with the other
>>> proposed changes in a v2!
>>
>> Well, that was just an uninformed assumption from my side when reading
>> the parameter name (and I did not re-read the cover letter today after
>> having looked at this quickly yesterday, sorry for that).
>
> That makes sense, the parameter name does not accurately describe the
> function it serves here anyway, so that should have been named a bit
> better.
>
>>
>> But maybe you can also evaluate if it actually makes sense to control
>> both by the same parameter, or if it only makes sense to e.g. increase
>> the number of verification tasks (no point for that if the IO remains
>> the bottleneck), or if it would make sense to have either 2 parameters
>> or couple them by some proportionality constant.
>>
>
> I had an idea along the lines of:
>
> self.worker_threads.mul(2).clamp(4, 32),
On second thought, this will most likely not cover most cases? One
system could be severely IO bound, the other one severely CPU bound...
> though the proportionality factor should be tested to determine what
> would actually be sensible here and of course be documented accordingly.
>
> I also thought a minimum of 4 threads for verification makes sense, as
> when the default value of 1 thread is used, it has somewhat the same
> behavior as before adding the read thread pool (i.e. 1 thread for
> reading, 4 threads for verification) and would scale somewhat
> accordingly. The threads should also clamped to a max of 32 to respect
> the constraints of the schema also stating 32 as a max.
>
> What do you think?
I think it would make sense to keep both decoupled for the time being,
especially since this might depend strongly on the backend. E.g. for S3
backed datastores you might gain a lot by increasing the number of
readers, but not much by increasing the number of verify threads.
More information about the pbs-devel
mailing list