[pve-devel] [PATCH storage v4 09/27] plugin: introduce new_backup_provider() method

Wolfgang Bumiller w.bumiller at proxmox.com
Thu Apr 3 09:24:41 CEST 2025


On Wed, Apr 02, 2025 at 06:16:57PM +0200, Andreas Rogge wrote:
> Am 02.04.25 um 10:30 schrieb Wolfgang Bumiller:
> > On Tue, Apr 01, 2025 at 08:21:30PM +0200, Thomas Lamprecht wrote:
> > > > This sounds pretty inefficient - especially when
> > > > comparing with qmrestore's ability to just read from stdin.
> > 
> > Reading from stdin is quite limited: it does not support sparse files
> > efficiently, and it does not support our live-restore feature.

(PS: I may have had a different case in mind when thinking of sparse
files - the restore from stdin uses the VMA format after all)

> > 
> > If we can *pull* data out-of-order from the backup provider via a better
> > protocol (like NBD which Thomas mentioned), holes in the disks don't
> > need to be transferred over the network, and we could probably support
> > live-restore, where the VMs immediately start running *during* the
> > restore process. (qemu would simply treat read-requests from the guest
> > which have not yet been restored with a higher priority while otherwise
> > continuing to copy the data in-order in the background)
> Neither pulling nor out-of-order is an option in the Bareos architecture.

Fair enough.

While this is not currently done, I'd still like to know: what about
*backing up* out of order? PVE's original "VMA" format was specifically
designed to allow what is essentially a "copy-before-write" style backup
while a VM is running.
The current provider API uses a fleecing approach to support backing
up running VMs, while the provider plugin can choose to perform the
backup in whatever order it prefers.
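
To make that concrete, here is a rough sketch of what a provider-side
backup loop can do with the current fleecing approach. Everything here
is illustrative: the device path and the `backup_chunk` helper are made
up, and the real API hands the plugin its own access mechanism - the
point is only that the plugin picks the order:

    import os

    # Made-up path - in reality the backup provider API hands the
    # plugin a mechanism to access the fleecing-backed disk.
    DEVICE = "/dev/example/fleecing-backed-export"
    CHUNK = 4 * 1024 * 1024  # 4 MiB

    def backup_chunk(offset, data):
        ...  # provider-defined: ship (offset, data) to the backup server

    fd = os.open(DEVICE, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        # The plugin decides the order - back to front here, purely to
        # illustrate that sequential order is not required.
        for offset in reversed(range(0, size, CHUNK)):
            data = os.pread(fd, CHUNK, offset)
            backup_chunk(offset, data)
    finally:
        os.close(fd)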

But I do wonder if - to reduce the space requirements for backing up
running VMs - at some point we might also add the ability for qemu to
provide some kind of queue containing the offset-length pairs of blocks
which have been stored in the temporary fleecing image. The provider
could consume this queue to keep the temporary storage at a minimum by
doing out-of-order backups.
This only makes sense if there are backup systems which can benefit
from it. (And it should be a simple enough API extension to add this in
the future, from what I can tell.)
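
Purely hypothetical sketch of what consuming such a queue could look
like - no such interface exists today, and `fleecing_queue`, the device
path and `backup_chunk` are all made up for illustration:

    import os

    def fleecing_queue():
        # Hypothetical: qemu would yield (offset, length) pairs for
        # blocks currently occupying space in the fleecing image.
        yield from ()

    def backup_chunk(offset, data):
        ...  # provider-defined transport, as in the previous sketch

    fd = os.open("/dev/example/fleecing-backed-export", os.O_RDONLY)
    backed_up = set()
    for offset, length in fleecing_queue():
        # Handle these ranges first so qemu could immediately drop them
        # from the temporary fleecing image, keeping it small.
        backup_chunk(offset, os.pread(fd, length, offset))
        backed_up.add((offset, length))
    # ...then sweep the remaining parts of the disk in whatever order
    # is convenient.
    os.close(fd)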

> 
> > Some alternatives btw. are providing a fuse or ublk device on the PVE
> > side which pulls data from Bareos (or to which Bareos can push data
> > which, like qemu's "fleecing" mode, could store the not-yet-restored
> > portions in temporary files, discarding them as they are read by/moved
> > to qemu).
> That's basically like staging the image, except that you can start reading
> before we finish writing. I'll keep it in mind, even though I don't think
> it is really feasible.
> 
> > *Technically* we could have a mode where we allocate the disks and "map"
> > them onto the system (potentially via nbd, or `rbd map` for ceph etc.)
> Yes, please do.

I guess it makes sense for us to not expect/require random access, as
any feature like that already imposes limitations on how the data can be
stored. I'd expect different backup solutions to have different
limitations in that regard.

The API design in this series (or at least its current version)
wouldn't prevent purely sequential restores, it just doesn't make them
"easy" right now - but that's something we should be able to
accommodate with another mechanism.

I *believe* `qemu-nbd` should be able to bind all the storage types we
want to restore to onto /dev/nbdXY devices, which would give the
provider a bunch of block devices to write to in whichever way it
wants, so the provider would then only need to figure out how to
receive the data and forward it to the devices.
We'll need to try.
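
As a sketch of that direction (the image path and the `restore_into`
helper are made up; the qemu-nbd options themselves are the standard
--connect/--format/--disconnect ones):

    import subprocess

    disk = "/var/lib/vz/images/100/vm-100-disk-0.qcow2"  # example target
    dev = "/dev/nbd0"

    def restore_into(blk):
        ...  # provider-defined: receive data and seek()/write() at will

    subprocess.run(["modprobe", "nbd"], check=True)
    # Attach the target image to a kernel NBD device - qemu-nbd does the
    # format handling (qcow2 here), the provider only sees a block device.
    subprocess.run(
        ["qemu-nbd", "--connect", dev, "--format", "qcow2", disk],
        check=True,
    )
    try:
        with open(dev, "r+b") as blk:
            restore_into(blk)
    finally:
        subprocess.run(["qemu-nbd", "--disconnect", dev], check=True)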

> > 
> > *But* it would make live-restore impossible with that plugin.
> There will be no live-restore with Bareos.
> If we were ever to consider implementing this, on the Bareos side it would
> be pretty complicated and full of limitations. In that case we would
> probably just implement yet another plugin for PVE.

Fair enough.

> 
> > Which is why the most flexible thing to do is to use a `qemu-img` call
> > and give it the paths, or more precisely, the URLs to the disks.
> 
> I understand how this makes sense. However, if you don't have the data in a
> format that qemu-img can consume, things become complicated.

It can also just read data from a stream (though without any smarter
protocol, this would make holes/unallocated blocks extremely
inefficient), where the provider would create that stream in whichever
way it wants to.
(Although I'm not actually sure the current invocation can deal with a
stream - we do have patches on qemu-img's `dd` subcommand to
specifically allow this, though...)
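
For illustration, the naive stream variant could look like the sketch
below (made-up setup: a raw stream on stdin, the target already mapped
to /dev/nbd0 as above). It skips writing all-zero blocks, assuming the
freshly allocated target reads back as zeros - but note the zeros still
had to travel through the stream, which is exactly the inefficiency a
smarter protocol would avoid:

    import sys

    BS = 1024 * 1024
    ZERO = bytes(BS)

    with open("/dev/nbd0", "r+b") as blk:
        while True:
            chunk = sys.stdin.buffer.read(BS)
            if not chunk:
                break
            if chunk == ZERO[:len(chunk)]:
                # Skip writing zeros - they were still received over
                # the stream, though.
                blk.seek(len(chunk), 1)
            else:
                blk.write(chunk)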



