[pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu

Fabian Grünbichler f.gruenbichler at proxmox.com
Tue Jun 24 12:43:19 CEST 2025


> Adam Kalisz <adam.kalisz at notnullmakers.com> wrote on 24.06.2025 12:22 CEST:
> Hi Fabian,

CCing the list again, assuming it got dropped by accident.

> the CPU usage is higher; I see about 400% for the restore process. I
> didn't investigate the original much because it's unbearably slow.
> 
> Yes, having configurable CONCURRENT_REQUESTS and max_blocking_threads
> would be great. However, we would need to wire it up all the way to
> qmrestore or similar, or ensure it is read from some environment
> variables. I didn't feel confident introducing this kind of
> infrastructure as a first-time contribution.

we can guide you if you want, but it's also possible to follow up on
that on our end as part of applying the change.
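
For illustration, a minimal sketch of how the two limits could be read
from the environment with sane fallbacks - the variable names and
defaults below are placeholders, not existing proxmox-backup-qemu
settings:

    use std::env;

    /// Hypothetical helper: read a concurrency limit from an environment
    /// variable, falling back to a compiled-in default when it is unset,
    /// unparsable or zero.
    fn limit_from_env(var: &str, default: usize) -> usize {
        env::var(var)
            .ok()
            .and_then(|v| v.parse::<usize>().ok())
            .filter(|&n| n > 0)
            .unwrap_or(default)
    }

    fn main() {
        // Placeholder names; the real knobs would be named during review.
        let concurrent_requests = limit_from_env("PBS_RESTORE_CONCURRENCY", 12);
        let max_blocking_threads =
            limit_from_env("PBS_RESTORE_BLOCKING_THREADS", concurrent_requests);

        println!(
            "fetching with {} concurrent requests, {} blocking threads",
            concurrent_requests, max_blocking_threads
        );
    }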

> The writer to disk is still single-threaded, so a CPU that can ramp up
> a single core to a high frequency/IPC will usually do better in the
> benchmarks.

I think that limitation is no longer there on the QEMU side nowadays,
but it would likely require some more changes to actually make use of
multiple threads submitting IO.
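
To make the current shape of the restore path concrete: many fetch
tasks, one writer. The following is only a rough tokio-style sketch of
that pattern, not the actual proxmox-backup-qemu code; the chunk size,
concurrency and write callback are stand-ins.

    use tokio::sync::mpsc;
    use tokio::task::JoinSet;

    // Illustrative only: several fetch tasks run concurrently, but a single
    // consumer drains the channel, mirroring the single-threaded write path
    // described above.
    #[tokio::main]
    async fn main() {
        let (tx, mut rx) = mpsc::channel::<(u64, Vec<u8>)>(12);

        let mut fetchers = JoinSet::new();
        for idx in 0u64..12 {
            let tx = tx.clone();
            fetchers.spawn(async move {
                // stand-in for fetching and decoding one chunk from PBS
                let chunk = vec![0u8; 4 * 1024 * 1024];
                let _ = tx.send((idx * chunk.len() as u64, chunk)).await;
            });
        }
        // drop the original sender so the channel closes once all fetch
        // tasks have finished
        drop(tx);

        // single writer: all writes happen here, one at a time
        while let Some((offset, chunk)) = rx.recv().await {
            // stand-in for the write callback into QEMU
            println!("would write {} bytes at offset {}", chunk.len(), offset);
        }

        while fetchers.join_next().await.is_some() {}
    }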

> What are the chances of this getting accepted more or less as is?

proper review and discussion of potential follow-ups (no matter who ends
up doing them) would require submitting a properly signed-off patch
and a CLA - see https://pve.proxmox.com/wiki/Developer_Documentation#Software_License_and_Copyright

Fabian

> On Tue, 2025-06-24 at 09:28 +0200, Fabian Grünbichler wrote:
> > 
> > > Adam Kalisz via pve-devel <pve-devel at lists.proxmox.com> wrote on
> > > 23.06.2025 18:10 CEST:
> > > Hi list,
> > 
> > Hi!
> > 
> > > before I go through all the hoops to submit a patch, I wanted to
> > > discuss the current form of the patch, which can be found here:
> > > 
> > > https://github.com/NOT-NULL-Makers/proxmox-backup-qemu/commit/e91f09cfd1654010d6205d8330d9cca71358e030
> > > 
> > > The speedup process was discussed here:
> > > 
> > > https://forum.proxmox.com/threads/abysmally-slow-restore-from-backup.133602/
> > > 
> > > The current numbers are:
> > > 
> > > With the most recent snapshot of a VM with a 10 GiB system disk and
> > > 2x 100 GiB disks with random data:
> > > 
> > > Original as of 1.5.1:
> > > 10 GiB system:    duration=11.78s,  speed=869.34MB/s
> > > 100 GiB random 1: duration=412.85s, speed=248.03MB/s
> > > 100 GiB random 2: duration=422.42s, speed=242.41MB/s
> > > 
> > > With the 12-way concurrent fetching:
> > > 
> > > 10 GiB system:    duration=2.05s,   speed=4991.99MB/s
> > > 100 GiB random 1: duration=100.54s, speed=1018.48MB/s
> > > 100 GiB random 2: duration=100.10s, speed=1022.97MB/s
> > 
> > Those numbers do look good - do you also have CPU usage stats before
> > and after?
> > 
> > > The hardware on the PVE side:
> > > 2x Intel Xeon Gold 6244, 1 TB RAM, 2x 100 Gbps Mellanox, 14x Samsung
> > > NVMe 3.8 TB drives in RAID10 using mdadm/LVM-thin.
> > > 
> > > On the PBS side:
> > > 2x Intel Xeon Gold 6334, 1 TB RAM, 2x 100 Gbps Mellanox, 8x Samsung
> > > NVMe in RAID using 4 ZFS mirrors with recordsize 1M and lz4
> > > compression.
> > > 
> > > Similar or slightly better speeds were achieved on a Hetzner AX52
> > > with an AMD Ryzen 7 7700, 64 GB RAM and 2x 1 TB NVMe in a stripe
> > > with recordsize 16k on the PVE side, connected to another Hetzner
> > > AX52 over a 10 Gbps connection. The PBS again has a plain NVMe ZFS
> > > mirror with recordsize 1M.
> > > 
> > > On bigger servers, 16-way concurrency was even better; on smaller
> > > servers with high-frequency CPUs, 8-way concurrency performed
> > > better. The 12-way concurrency is a compromise. We seem to hit a
> > > bottleneck somewhere in the realm of the TLS connection and shallow
> > > buffers. The network on the 100 Gbps servers can support up to about
> > > 3 GB/s (almost 20 Gbps) of traffic in a single TCP connection using
> > > mbuffer, and the storage can keep up with such a speed.
> > 
> > This sounds like it might make sense to make the number of threads
> > configurable (the second, lower count can probably be derived from
> > it?) to allow high-end systems to make the most of it without
> > overloading smaller setups. Or maybe deriving it from the host CPU
> > count would also work?
> > 
> > > Before I submit the patch, I would also like to build against the
> > > most up-to-date code, but I have trouble updating my build
> > > environment to reflect the latest commits. What do I have to put in
> > > my /etc/apt/sources.list to be able to install e.g.
> > > librust-cbindgen-0.27+default-dev,
> > > librust-http-body-util-0.1+default-dev,
> > > librust-hyper-1+default-dev and all the rest?
> > 
> > We are currently in the process of rebasing all our repositories on
> > top of the upcoming Debian Trixie release. The built packages are not
> > yet available for public testing, so you'd either need to wait a bit
> > (on the order of a few weeks at most), or submit the patches against
> > the current stable Bookworm-based version and let us forward-port
> > them.
> > 
> > > This work was sponsored by ČMIS s.r.o. and developed in consultation
> > > with General Manager Václav Svátek (ČMIS), Daniel Škarda (NOT NULL
> > > Makers s.r.o.) and Linux team leader Roman Müller (ČMIS).
> > 
> > Nice! Looking forward to the "official" patch submission!
> > Fabian



