[pve-devel] Discussion of major PBS restore speedup in proxmox-backup-qemu
Adam Kalisz
adam.kalisz at notnullmakers.com
Thu Jul 3 10:29:43 CEST 2025
Hi,
On Friday I submitted the patch, with a slight edit that allows
setting the number of threads from an environment variable.
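For readers who have not looked at the patch, reading such a setting could look roughly like the sketch below. This is only an illustration, not the code from the submitted patch: the variable name PBS_RESTORE_THREADS and both function names are made up for this example, and the fallback of 12 simply mirrors the 12-way compromise quoted further down.

```rust
use std::env;

/// Hypothetical variable name, used only for this illustration.
const THREADS_ENV_VAR: &str = "PBS_RESTORE_THREADS";

/// Fallback mirroring the 12-way compromise discussed in this thread.
const DEFAULT_THREADS: usize = 12;

/// Parse a raw setting; empty, non-numeric, and zero values are rejected.
fn parse_thread_count(raw: Option<&str>) -> Option<usize> {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
}

/// Read the thread count from the environment, falling back to the
/// default when the variable is unset or invalid.
fn restore_thread_count() -> usize {
    let raw = env::var(THREADS_ENV_VAR).ok();
    parse_thread_count(raw.as_deref()).unwrap_or(DEFAULT_THREADS)
}
```

Keeping the parsing separate from the environment lookup makes the fallback behaviour easy to test without touching process-wide state.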
On Tue, 2025-06-24 at 12:43 +0200, Fabian Grünbichler wrote:
>
> > Adam Kalisz <adam.kalisz at notnullmakers.com> wrote on 24.06.2025
> > 12:22 CEST:
> > Hi Fabian,
>
> CCing the list again, assuming it got dropped by accident.
>
> > the CPU usage is higher, I see about 400% for the restore process. I
> > didn't investigate the original much because it's unbearably slow.
> >
> > Yes, having configurable CONCURRENT_REQUESTS and max_blocking_threads
> > would be great. However we would need to wire it up all the way to
> > qmrestore or similar or ensure it is read from some env vars. I didn't
> > feel confident to introduce this kind of infrastructure as a first
> > time contribution.
>
> we can guide you if you want, but it's also possible to follow-up on
> our end with that as part of applying the change.
That would be great. It shouldn't be too much work for somebody more
familiar with the project structure, who knows where everything needs
to go.
> > The writer to disk is still single-threaded, so a CPU that can ramp
> > up a single core to a high frequency/IPC will usually do better on
> > the benchmarks.
>
> I think that limitation is no longer there on the QEMU side nowadays,
> but it would likely require some more changes to actually make use of
> multiple threads submitting IO.
Writing to storage seemed to be less of a bottleneck than fetching the
chunks. It seems to me there is still a bottleneck in the network
part, because I haven't seen a restore run substantially faster
than 1.1 GBps.
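To make the shape of the problem concrete, here is a toy sketch of several fetchers feeding one writer. It is not the code from the patch (which builds on the existing async restore path). It uses only std threads, fetch_chunk is a stand-in for the network request, and the 4-byte chunk length is made up to keep the example small.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Toy chunk length; real chunks are far larger.
const CHUNK_LEN: usize = 4;

/// Stand-in for a network fetch: returns (chunk index, chunk data).
fn fetch_chunk(index: usize) -> (usize, Vec<u8>) {
    (index, vec![(index % 256) as u8; CHUNK_LEN])
}

/// Fetch `chunk_count` chunks with `workers` threads. The calling thread
/// is the single writer and places each chunk at its offset.
fn restore(chunk_count: usize, workers: usize) -> Vec<u8> {
    let workers = workers.max(1);
    let next = Arc::new(Mutex::new(0usize));
    // Bounded channel: fetchers block once the writer falls behind.
    let (tx, rx) = mpsc::sync_channel::<(usize, Vec<u8>)>(workers);
    let mut handles = Vec::new();
    for _ in 0..workers {
        let next = Arc::clone(&next);
        let tx = tx.clone();
        handles.push(thread::spawn(move || loop {
            // Claim the next unfetched chunk index.
            let index = {
                let mut guard = next.lock().unwrap();
                if *guard >= chunk_count {
                    break;
                }
                let i = *guard;
                *guard += 1;
                i
            };
            if tx.send(fetch_chunk(index)).is_err() {
                break;
            }
        }));
    }
    // Drop the original sender so the receive loop ends with the workers.
    drop(tx);
    let mut image = vec![0u8; chunk_count * CHUNK_LEN];
    for (index, data) in rx {
        let start = index * CHUNK_LEN;
        image[start..start + CHUNK_LEN].copy_from_slice(&data);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    image
}
```

Chunks may arrive in any order, which is fine here because each one carries its index and is written at its own offset.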
Perhaps, once the proposed restore speedup has been applied as an
intermediate step and we have gathered more feedback from the field,
we could discuss backup, restore and synchronization speeds, and
strategies for debugging and improving the situation?
> > What are the chances of this getting accepted more or less as is?
>
> proper review and discussion of potential follow-ups (no matter who
> ends up doing them) would require submitting a properly signed-off
> patch and a CLA - see
> https://pve.proxmox.com/wiki/Developer_Documentation#Software_License_and_Copyright
I cleared the CLA with the Proxmox office last week.
> Fabian
Adam
> > On Tue, 2025-06-24 at 09:28 +0200, Fabian Grünbichler wrote:
> > >
> > > > Adam Kalisz via pve-devel <pve-devel at lists.proxmox.com> wrote
> > > > on 23.06.2025 18:10 CEST:
> > > > Hi list,
> > >
> > > Hi!
> > >
> > > > before I go through all the hoops to submit a patch I wanted to
> > > > discuss the current form of the patch that can be found here:
> > > >
> > > > https://github.com/NOT-NULL-Makers/proxmox-backup-qemu/commit/e91f09cfd1654010d6205d8330d9cca71358e030
> > > >
> > > > The speedup process was discussed here:
> > > >
> > > > https://forum.proxmox.com/threads/abysmally-slow-restore-from-backup.133602/
> > > >
> > > > The current numbers are:
> > > >
> > > > With the most current snapshot of a VM with 10 GiB system disk
> > > > and 2x 100 GiB disks with random data:
> > > >
> > > > Original as of 1.5.1:
> > > > 10 GiB system: duration=11.78s, speed=869.34MB/s
> > > > 100 GiB random 1: duration=412.85s, speed=248.03MB/s
> > > > 100 GiB random 2: duration=422.42s, speed=242.41MB/s
> > > >
> > > > With the 12-way concurrent fetching:
> > > >
> > > > 10 GiB system: duration=2.05s, speed=4991.99MB/s
> > > > 100 GiB random 1: duration=100.54s, speed=1018.48MB/s
> > > > 100 GiB random 2: duration=100.10s, speed=1022.97MB/s
> > >
> > > Those numbers do look good - do you also have CPU usage stats
> > > before and after?
> > >
> > > > The hardware is on the PVE side:
> > > > 2x Intel Xeon Gold 6244, 1 TB RAM, 2x 100 Gbps Mellanox, 14x
> > > > Samsung NVMe 3.8 TB drives in RAID10 using mdadm/LVM-thin.
> > > >
> > > > On the PBS side:
> > > > 2x Intel Xeon Gold 6334, 1 TB RAM, 2x 100 Gbps Mellanox, 8x
> > > > Samsung NVMe in RAID using 4 ZFS mirrors with recordsize 1M,
> > > > lz4 compression.
> > > >
> > > > Similar or slightly better speeds were achieved on Hetzner AX52
> > > > with AMD Ryzen 7 7700 with 64 GB RAM and 2x 1 TB NVMe in stripe
> > > > on PVE with recordsize 16k connected to another Hetzner AX52
> > > > using a 10 Gbps connection. The PBS has normal NVMe ZFS mirror
> > > > again with recordsize 1M.
> > > >
> > > > On bigger servers a 16-way concurrency was even better; on
> > > > smaller servers with high-frequency CPUs, 8-way concurrency
> > > > performed better. The 12-way concurrency is a compromise. We
> > > > seem to hit a bottleneck somewhere in the realm of TLS
> > > > connection and shallow buffers. The network on the 100 Gbps
> > > > servers can support up to about 3 GBps (almost 20 Gbps) of
> > > > traffic in a single TCP connection using mbuffer. The storage
> > > > can keep up with such a speed.
> > >
> > > This sounds like it might make sense to make the number of
> > > threads configurable (the second lower count can probably be
> > > derived from it?) to allow high-end systems to make the most of
> > > it, without overloading smaller setups. Or maybe deriving it from
> > > the host CPU count would also work?
> > >
> > > > Before I submit the patch, I would also like to do the most up
> > > > to date build but I have trouble updating my build environment
> > > > to reflect the latest commits. What do I have to put in my
> > > > /etc/apt/sources.list to be able to install e.g.
> > > > librust-cbindgen-0.27+default-dev
> > > > librust-http-body-util-0.1+default-dev
> > > > librust-hyper-1+default-dev and all the rest?
> > >
> > > We are currently in the process of rebasing all our repositories
> > > on top of the upcoming Debian Trixie release. The built packages
> > > are not yet available for public testing, so you'd either need to
> > > wait a bit (in the order of a few weeks at most), or submit the
> > > patches for the current stable Bookworm-based version and let us
> > > forward port them.
> > >
> > > > This work was sponsored by ČMIS s.r.o. and consulted with the
> > > > General Manager Václav Svátek (ČMIS), Daniel Škarda (NOT NULL
> > > > Makers s.r.o.) and Linux team leader Roman Müller (ČMIS).
> > >
> > > Nice! Looking forward to the "official" patch submission!
> > > Fabian