[pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API
Fiona Ebner
f.ebner at proxmox.com
Thu Nov 7 17:51:12 CET 2024
Changes in v3:
* Add storage_has_feature() helper and use it to decide on whether the
storage uses a backup provider, instead of having this be implicit
with whether a backup provider is returned by new_backup_provider().
* Fix querying block-node size for fleecing in stop mode, by issuing
the QMP command only after the VM is enforced running.
* Run backup_container() in user namespace associated to the
container.
* Introduce a 'prepare' phase for backup_hook(), to be used to
prepare for running in that user namespace context.
* Pass in guest and firewall config as raw data instead of by file
name (so files don't have to be accessible in user namespace context
for containers).
* Run restore of containers with the 'directory' mechanism in the user
namespace, switching from 'rsync' to 'tar', which is easier to "split"
into a privileged and an unprivileged half.
* Check potentially untrusted tar archives.
* Borg plugin: make SSH work and use that.
Changes in v2:
* Add 'block-device' backup mechanism for VMs. The NBD export is
mounted by Proxmox VE and only the block device path (as well as a
callback to get the next dirty range for bitmaps) is passed to the
backup provider.
* Add POC example for Borg - note that I tested with borg 1.2.4 in
Debian and only tested with a local repository, not SSH yet.
* Merge hook API into a single function for backup and for jobs.
* Add restore_vm_init() and restore_vm_cleanup() for better
flexibility to allow preparing the whole restore. Question is
if restore_vm_volume_init() and restore_vm_volume_cleanup() should
be dropped (but certain providers might prefer using only those)?
Having both is more flexible, but makes the API longer of course.
* Switch to backup_vm() (was per-volume backup_vm_volume() before) and
backup_container(), passing along the configuration files, rather
than having dedicated methods for the configuration files, for
giving the backup provider more flexibility.
* Some renames in API methods/params to improve clarity.
* Pass backup time to backup 'start' hook and use that in the
directory example rather than the job start time.
* Use POD for base plugin documentation and flesh out documentation.
* Use 'BackupProvider::Plugin::' namespace.
* Various smaller improvements in the directory provider example.
======
A backup provider needs to implement a storage plugin as well as a
backup provider plugin. The storage plugin is for integration in
Proxmox VE's front-end, so users can manage the backups via
UI/API/CLI. The backup provider plugin is for interfacing with the
backup provider's backend to integrate backup and restore with that
backend into Proxmox VE.
This is an initial draft of an API and required changes to the backup
stack in Proxmox VE to make it work. Depending on feedback from other
developers and interested parties, it can still substantially change.
======
The backup provider API is split into two parts, both of which again
need different implementations for VM and LXC guests:
1. Backup API
There are two hook callback functions, namely:
1. job_hook() is called during the start/end/abort phases of the whole
backup job.
2. backup_hook() is called during the start/end/abort phases of the
backup of an individual guest. There also is a 'prepare' phase
useful for container backups, because the backup method for
containers itself is executed in the user namespace context
associated to the container.
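To illustrate the call order of the two hooks, here is a rough sketch in Python. The real plugins are Perl modules, so this only models the sequence of phases. The function name run_job(), the 'qemu'/'lxc' type strings and the placement of 'prepare' right after 'start' are assumptions for illustration. When exactly the job-level 'abort' phase fires is not modelled.

```python
def run_job(guests, do_backup):
    """Return the hook calls made for a job, in order.

    guests: list of (vmid, vmtype) tuples, vmtype being 'qemu' or 'lxc'
    (assumed type names). do_backup: callable that raises on failure.
    """
    calls = [("job_hook", "start")]
    for vmid, vmtype in guests:
        calls.append(("backup_hook", "start", vmid))
        try:
            if vmtype == "lxc":
                # Assumed ordering: 'prepare' runs before the container
                # backup method enters the container's user namespace.
                calls.append(("backup_hook", "prepare", vmid))
            do_backup(vmid, vmtype)
            calls.append(("backup_hook", "end", vmid))
        except Exception:
            # Assumption: a failed guest gets 'abort' and the job continues.
            calls.append(("backup_hook", "abort", vmid))
    calls.append(("job_hook", "end"))
    return calls
```

In this model, a job with one VM and one failing container would produce start/end for the VM and start/prepare/abort for the container, bracketed by the job-level start and end.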
The backup_get_mechanism() method is used to decide on the backup
mechanism. Currently, 'block-device' or 'nbd' for VMs, and 'directory'
for containers are possible. The method also lets the plugin indicate
whether to use a bitmap for incremental VM backup or not. It is enough
to implement one mechanism for VMs and one mechanism for containers.
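As a small illustration of the mechanism selection, the following Python sketch validates what a plugin might answer for a given guest type. It is not the Perl API: the helper name check_backup_mechanism(), the 'qemu'/'lxc' type strings and the (mechanism, bitmap_id) return shape are assumptions. Only the mechanism names themselves come from the description above.

```python
# Mechanisms named in this cover letter, keyed by an assumed guest type string.
ALLOWED_BACKUP_MECHANISMS = {
    "qemu": {"block-device", "nbd"},
    "lxc": {"directory"},
}


def check_backup_mechanism(vmtype, mechanism, bitmap_id=None):
    """Validate a plugin's (hypothetical) answer and return it unchanged."""
    allowed = ALLOWED_BACKUP_MECHANISMS.get(vmtype)
    if allowed is None:
        raise ValueError("unknown guest type '%s'" % vmtype)
    if mechanism not in allowed:
        raise ValueError(
            "mechanism '%s' is not available for '%s'" % (mechanism, vmtype)
        )
    if bitmap_id is not None and vmtype != "qemu":
        # Bitmaps are only described for incremental VM backup.
        raise ValueError("bitmaps are only meaningful for VM backup")
    return mechanism, bitmap_id
```

A plugin that only supports 'nbd' for VMs and 'directory' for containers would pass this check for both guest types, which matches the "one mechanism each is enough" rule.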
Next, there are methods for backing up the guest's configuration and
data, backup_vm() for VM backup and backup_container() for container
backup, with the latter running in the user namespace associated to
the container.
Finally, there are some helpers, e.g. for getting the provider name or
the volume ID for the backup target, as well as for handling the
backup log.
1.1 Backup Mechanisms
VM:
Access to the data on the VM's disk from the time the backup started
is made available via a so-called "snapshot access". This is either
the full image, or in case a bitmap is used, the dirty parts of the
image since the last time the bitmap was used for a successful backup.
Reading outside of the dirty parts will result in an error. After
backing up each part of the disk, it should be discarded in the export
to avoid unnecessary space usage on the Proxmox VE side (there is an
associated fleecing image).
VM mechanism 'block-device':
The snapshot access is exposed as a block device. If used, a bitmap is
passed along.
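A rough Python sketch of how a provider might consume the block device together with the dirty-range callback mentioned in the v2 changelog: read each dirty range, store it, then discard it. All names here (backup_dirty_ranges, next_dirty_range, store, discard) and the (offset, length) tuple shape are assumptions for illustration. The actual discard on a real block device is left to the injected callable.

```python
def backup_dirty_ranges(device, next_dirty_range, store, discard,
                        chunk_size=4 * 1024 * 1024):
    """Copy every dirty range from 'device' and discard it afterwards.

    device: seekable binary file object for the snapshot access.
    next_dirty_range: callable returning (offset, length) or None when done.
    store: callable(offset, data) writing to the backup target.
    discard: callable(offset, length) discarding the range in the export.
    Returns the number of bytes backed up.
    """
    total = 0
    while True:
        dirty = next_dirty_range()
        if dirty is None:
            break
        offset, length = dirty
        pos, end = offset, offset + length
        while pos < end:
            size = min(chunk_size, end - pos)
            device.seek(pos)
            data = device.read(size)
            if len(data) != size:
                raise IOError("short read at offset %d" % pos)
            store(pos, data)
            pos += size
        # Only discard once the whole range is safely stored.
        discard(offset, length)
        total += length
    return total
```

Only ranges handed out by the callback are read, since reading outside the dirty parts results in an error.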
VM mechanism 'nbd':
The snapshot access and, if used, bitmap are exported via NBD.
Container mechanism 'directory':
A copy or snapshot of the container's filesystem state is made
available as a directory. The method is executed inside the user
namespace associated to the container.
2. Restore API
The restore_get_mechanism() method is used to decide on the restore
mechanism. Currently, 'qemu-img' for VMs, and 'directory' or 'tar' for
containers are possible. It is enough to implement one mechanism for
VMs and one mechanism for containers.
Next, there are methods for extracting the guest and firewall
configuration. The restore mechanism itself is implemented via a pair
of methods: an init method, for making the data available to Proxmox
VE, and a cleanup method that is called after restore.
For VMs, there also is a restore_vm_get_device_info() helper required,
to get the disks included in the backup and their sizes.
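The init/cleanup pairing can be pictured as a scope around the actual restore. The Python sketch below is not the Perl API: FakeProvider, the method parameters and the returned path are hypothetical, and it assumes that cleanup is also wanted when the restore itself fails.

```python
from contextlib import contextmanager


class FakeProvider:
    """Stand-in for a backup provider plugin; it only records calls."""

    def __init__(self):
        self.calls = []

    def restore_vm_volume_init(self, volname, device_name):
        self.calls.append(("init", device_name))
        # Hypothetical return value: a path that 'qemu-img' could read.
        return "/hypothetical/restore/%s.raw" % device_name

    def restore_vm_volume_cleanup(self, volname, device_name):
        self.calls.append(("cleanup", device_name))


@contextmanager
def restored_volume(provider, volname, device_name):
    """Make one volume available for restore and always clean up after."""
    # If init itself raises, there is nothing to clean up yet.
    path = provider.restore_vm_volume_init(volname, device_name)
    try:
        yield path
    finally:
        provider.restore_vm_volume_cleanup(volname, device_name)
```

Usage would look like "with restored_volume(provider, volname, 'scsi0') as path:", with the restore of that disk happening inside the block.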
2.1 Restore Mechanisms
VM mechanism 'qemu-img':
The backup provider gives a path to the disk image that will be
restored. The path needs to be something 'qemu-img' can deal with,
e.g. it can also be an NBD URI or similar.
Container mechanism 'directory':
The backup provider gives the path to a directory with the full
filesystem structure of the container.
Container mechanism 'tar':
The backup provider gives the path to a (potentially compressed) tar
archive with the full filesystem structure of the container.
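Since the archive comes from an external source, it is treated as potentially untrusted (see the v3 changelog). Which checks the series actually performs is not described in this letter. The following Python sketch only shows the general idea with two generic checks: member names and hard-link targets that would leave the target directory. The helper names are hypothetical.

```python
import io
import tarfile
from pathlib import PurePosixPath


def _escapes(name):
    """True if 'name' is absolute or contains a '..' component."""
    path = PurePosixPath(name)
    return path.is_absolute() or ".." in path.parts


def suspicious_members(archive):
    """Return (name, reason) for members that should not be extracted."""
    problems = []
    for member in archive:
        if _escapes(member.name):
            problems.append((member.name, "path leaves the target directory"))
        elif member.islnk() and _escapes(member.linkname):
            problems.append(
                (member.name, "hard link target leaves the target directory")
            )
    return problems


def make_demo_archive(names):
    """Build an in-memory tar archive with empty files, for demonstration."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            tar.addfile(tarfile.TarInfo(name))
    buf.seek(0)
    return buf
```

Symlinks with absolute targets are deliberately not flagged here, because they are normal inside a container root filesystem.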
See the PVE::BackupProvider::Plugin module for the full API
documentation.
======
This series adapts the backup stack in Proxmox VE to allow using the
above API. For QEMU, backup access setup and teardown QMP commands are
implemented to be able to provide access to a consistent disk state to
the backup provider.
The series also provides an example implementation for a backup
provider as a proof-of-concept, exposing the different features.
======
Open questions:
Should the backup provider plugin system also follow the same API
age+version schema with a Custom/ directory for external plugins
derived from the base plugin?
Should the bitmap action be passed directly to the backup provider?
I.e. have 'not-used', 'not-used-removed', 'new', 'used', 'invalid',
instead of only 'none', 'new' and 'reuse'. It makes the API slightly
more complicated. Is there any situation where the backup provider
would care whether the bitmap is new because it is the first one, or
new because the previous one was invalid? Both cases require the
backup provider to do a full backup.
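To make the question concrete, here is how the five detailed actions could collapse into the three modes. The mapping is my reading of the names and an assumption, not something taken from the patches. The helper names are hypothetical.

```python
# Assumed mapping from the detailed bitmap action to the simplified mode.
BITMAP_ACTION_TO_MODE = {
    "not-used": "none",
    "not-used-removed": "none",
    "new": "new",
    "invalid": "new",  # previous bitmap unusable, so a fresh one is started
    "used": "reuse",
}


def bitmap_mode(action):
    """Translate a detailed bitmap action into 'none', 'new' or 'reuse'."""
    try:
        return BITMAP_ACTION_TO_MODE[action]
    except KeyError:
        raise ValueError("unknown bitmap action '%s'" % action) from None


def needs_full_backup(action):
    """Only a reused bitmap allows backing up just the dirty parts."""
    return bitmap_mode(action) != "reuse"
```

Under this mapping, 'new' and 'invalid' are indistinguishable to the provider, which is exactly the information that would be lost by passing only the simplified mode.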
======
The patches marked as PATCH rather than RFC can make sense
independently, with QEMU patches 02 and 03 having been sent already
before (touching same code, so included here):
https://lore.proxmox.com/pve-devel/20240625133551.210636-1-f.ebner@proxmox.com/#r
======
Feedback is very welcome, especially from people wishing to implement
such a backup provider plugin! Please tell me what issues you see with
the proposed API, what would and wouldn't work from your perspective?
======
Dependencies: pve-manager, pve-container and qemu-server all depend on
new libpve-storage-perl. pve-manager also build-depends on the new
libpve-storage-perl for its tests. pve-container depends on new
pve-common. To keep things clean, pve-manager should also depend on
new pve-container and qemu-server.
In qemu-server, there is no version guard added yet, as that depends
on the QEMU version the feature will land in.
======
qemu:
Fiona Ebner (9):
block/reqlist: allow adding overlapping requests
PVE backup: fixup error handling for fleecing
PVE backup: factor out setting up snapshot access for fleecing
PVE backup: save device name in device info structure
PVE backup: include device name in error when setting up snapshot
access fails
PVE backup: add target ID in backup state
PVE backup: get device info: allow caller to specify filter for which
devices use fleecing
PVE backup: implement backup access setup and teardown API for
external providers
PVE backup: implement bitmap support for external backup access
block/copy-before-write.c | 3 +-
block/reqlist.c | 2 -
pve-backup.c | 620 +++++++++++++++++++++++++++++++++-----
pve-backup.h | 16 +
qapi/block-core.json | 61 ++++
system/runstate.c | 6 +
6 files changed, 637 insertions(+), 71 deletions(-)
create mode 100644 pve-backup.h
common:
Fiona Ebner (1):
env: add module with helpers to run a Perl subroutine in a user
namespace
src/Makefile | 1 +
src/PVE/Env.pm | 136 +++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 137 insertions(+)
create mode 100644 src/PVE/Env.pm
storage:
Fiona Ebner (5):
add storage_has_feature() helper function
plugin: introduce new_backup_provider() method
extract backup config: delegate to backup provider for storages that
support it
add backup provider example
WIP Borg plugin
src/PVE/API2/Storage/Config.pm | 2 +-
src/PVE/BackupProvider/Makefile | 3 +
src/PVE/BackupProvider/Plugin/Base.pm | 1158 +++++++++++++++++
src/PVE/BackupProvider/Plugin/Borg.pm | 439 +++++++
.../BackupProvider/Plugin/DirectoryExample.pm | 697 ++++++++++
src/PVE/BackupProvider/Plugin/Makefile | 5 +
src/PVE/Makefile | 1 +
src/PVE/Storage.pm | 33 +-
src/PVE/Storage/BorgBackupPlugin.pm | 595 +++++++++
.../Custom/BackupProviderDirExamplePlugin.pm | 307 +++++
src/PVE/Storage/Custom/Makefile | 5 +
src/PVE/Storage/Makefile | 2 +
src/PVE/Storage/Plugin.pm | 25 +
13 files changed, 3269 insertions(+), 3 deletions(-)
create mode 100644 src/PVE/BackupProvider/Makefile
create mode 100644 src/PVE/BackupProvider/Plugin/Base.pm
create mode 100644 src/PVE/BackupProvider/Plugin/Borg.pm
create mode 100644 src/PVE/BackupProvider/Plugin/DirectoryExample.pm
create mode 100644 src/PVE/BackupProvider/Plugin/Makefile
create mode 100644 src/PVE/Storage/BorgBackupPlugin.pm
create mode 100644 src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm
create mode 100644 src/PVE/Storage/Custom/Makefile
qemu-server:
Fiona Ebner (9):
move nbd_stop helper to QMPHelpers module
backup: move cleanup of fleecing images to cleanup method
backup: cleanup: check if VM is running before issuing QMP commands
backup: keep track of block-node size for fleecing
backup: allow adding fleecing images also for EFI and TPM
backup: implement backup for external providers
restore: die early when there is no size for a device
backup: implement restore for external providers
backup restore: external: hardening check for untrusted source image
PVE/API2/Qemu.pm | 33 ++-
PVE/CLI/qm.pm | 3 +-
PVE/QemuServer.pm | 152 +++++++++++++-
PVE/QemuServer/QMPHelpers.pm | 6 +
PVE/VZDump/QemuServer.pm | 382 ++++++++++++++++++++++++++++++++---
5 files changed, 539 insertions(+), 37 deletions(-)
container:
Fiona Ebner (8):
create: add missing include of PVE::Storage::Plugin
backup: implement backup for external providers
create: factor out tar restore command helper
backup: implement restore for external providers
external restore: don't use 'one-file-system' tar flag when restoring
from a directory
create: factor out compression option helper
restore tar archive: check potentially untrusted archive
api: add early check against restoring privileged container from
external source
src/PVE/API2/LXC.pm | 14 +++
src/PVE/LXC/Create.pm | 284 +++++++++++++++++++++++++++++++++++++-----
src/PVE/VZDump/LXC.pm | 38 +++++-
3 files changed, 304 insertions(+), 32 deletions(-)
manager:
Fiona Ebner (2):
ui: backup: also check for backup subtype to classify archive
backup: implement backup for external providers
PVE/VZDump.pm | 57 ++++++++++++++++++++++++++----
test/vzdump_new_test.pl | 3 ++
www/manager6/Utils.js | 10 +++---
www/manager6/grid/BackupView.js | 4 +--
www/manager6/storage/BackupView.js | 4 +--
5 files changed, 63 insertions(+), 15 deletions(-)
Summary over all repositories:
34 files changed, 4949 insertions(+), 158 deletions(-)
--
Generated by git-murpp 0.5.0