[pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu
Dominik Csapak
d.csapak at proxmox.com
Thu Jul 3 10:57:01 CEST 2025
Hi,
On 7/3/25 10:29, Adam Kalisz via pve-devel wrote:
> Hi,
>
> On Friday I have submitted the patch with a slight edit to allow
> setting the number of threads from an environment variable.
>
Yes, we saw, thanks for tackling this.
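For readers following along, here is a minimal sketch of what reading a
thread count from the environment could look like in Rust. It is not the
submitted patch: the variable name EXAMPLE_RESTORE_THREADS, the default of 4,
and both function names are hypothetical placeholders chosen for illustration.

```rust
use std::env;

/// Hypothetical variable name, used only for this illustration.
const THREADS_ENV_VAR: &str = "EXAMPLE_RESTORE_THREADS";
/// Hypothetical fallback when the variable is unset or invalid.
const DEFAULT_THREADS: usize = 4;

/// Parse a raw setting, falling back to `default` when the value is
/// missing, not a number, or zero.
fn parse_thread_count(raw: Option<&str>, default: usize) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(default)
}

/// Read the thread count from the process environment.
fn restore_thread_count() -> usize {
    let raw = env::var(THREADS_ENV_VAR).ok();
    parse_thread_count(raw.as_deref(), DEFAULT_THREADS)
}
```

Keeping the parsing separate from the environment lookup makes the fallback
behaviour easy to test without touching process-wide state.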
> On Tue, 2025-06-24 at 12:43 +0200, Fabian Grünbichler wrote:
>>> Adam Kalisz <adam.kalisz at notnullmakers.com> wrote on 24.06.2025
>>> 12:22 CEST:
>>> Hi Fabian,
>> CCing the list again, assuming it got dropped by accident.
>>
>>> the CPU usage is higher, I see about 400% for the restore process. I
>>> didn't investigate the original much because it's unbearably slow.
>>>
>>> Yes, having configurable CONCURRENT_REQUESTS and max_blocking_threads
>>> would be great. However we would need to wire it up all the way to
>>> qmrestore or similar or ensure it is read from some env vars. I didn't
>>> feel confident to introduce this kind of infrastructure as a first-time
>>> contribution.
>> we can guide you if you want, but it's also possible to follow-up on
>> our end with that as part of applying the change.
> That would be great, it shouldn't be too much work for somebody more
> familiar with the project structure where everything needs to be.
Just to clarify, it's OK (and preferred?) for you if we continue working with
this patch? In that case I'd take a swing at it.
>
>>> The writer to disk is still single-threaded, so a CPU that can ramp up a
>>> single core to a high frequency/IPC will usually do better on the
>>> benchmarks.
>> I think that limitation is no longer there on the QEMU side nowadays,
>> but it would likely require some more changes to actually make use of
>> multiple threads submitting IO.
> The storage writing seemed to be less of a bottleneck than the fetching
> of chunks. It seems to me there still is a bottleneck in the network
> part because I haven't seen an instance with substantially higher speed
> than 1.1 GBps.
I guess this largely depends on the actual storage and network config,
e.g. if the target storage IO depth is the bottleneck, multiple
writers will speed that up too.
>
> Perhaps we could have a discussion about the backup, restore and
> synchronization speeds and strategies for debugging and improving the
> situation after we have taken the intermediate step of improving the
> restore speed as proposed to gather more feedback from the field?
I'd at least like to take a very short glance at how hard it would
be to add multiple writers to the image before deciding. If
it's not trivial, then IMHO yes, we can increase the fetching threads for now.
Though I have to look into how we'd want to limit/configure that from
outside. E.g. a valid approach might be to keep the thread count
from exceeding what the VM config says + some extra?
(have to think about that)
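A rough sketch of that capping idea, purely as an illustration: the function
name, its parameters, and the notion of passing the VM's core count in
directly are assumptions, not existing proxmox-backup-qemu code.

```rust
/// Cap a requested worker-thread count at the VM's configured cores plus
/// some headroom. All names here are hypothetical.
fn capped_thread_count(requested: usize, vm_cores: usize, extra: usize) -> usize {
    // Upper bound is "what the VM config says + some extra", never below 1.
    let upper = vm_cores.saturating_add(extra).max(1);
    // Always use at least one thread, and never more than the bound.
    requested.clamp(1, upper)
}
```

With 4 configured cores and 2 extra, a request for 16 threads would be
reduced to 6, while a request for 2 would pass through unchanged.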
>
>>> What are the chances of this getting accepted more or less as is?
>> proper review and discussion of potential follow-ups (no matter who ends
>> up doing them) would require submitting a properly signed-off patch
>> and a CLA - see
>> https://pve.proxmox.com/wiki/Developer_Documentation#Software_License_and_Copyright
> I have cleared the CLA with the Proxmox office last week.
>
Thanks
>> Fabian
> Adam
>
Best Regards
Dominik