[pve-devel] [PATCH qemu/storage/qemu-server/container/manager v9 00/29] backup provider API

Wolfgang Bumiller w.bumiller at proxmox.com
Fri Apr 4 16:08:50 CEST 2025


Changes to v8 look good, did some basic testing of the affected code
with a my custom provider.

Tested-by: Wolfgang Bumiller <w.bumiller at proxmox.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller at proxmox.com>

On Fri, Apr 04, 2025 at 03:31:35PM +0200, Fiona Ebner wrote:
> v8: https://lore.proxmox.com/pve-devel/20250403123118.264974-1-w.bumiller@proxmox.com/
> v7: https://lore.proxmox.com/pve-devel/20250401173435.221892-1-f.ebner@proxmox.com/
> v6: https://lore.proxmox.com/pve-devel/20250331132020.105324-1-f.ebner@proxmox.com/
> v5: https://lore.proxmox.com/pve-devel/20250321134852.103871-1-f.ebner@proxmox.com/
> v4: https://lore.proxmox.com/pve-devel/20241114150754.374376-1-f.ebner@proxmox.com/
> v3: https://lore.proxmox.com/pve-devel/20241107165146.125935-1-f.ebner@proxmox.com/
> 
> Changes in v9:
> * Drop already applied patches.
> * Drop bitmap-name parameter from backup-access-setup QMP call, always
>   use target ID.
> * Fix file_size_info() call in directory example plugin (thanks
>   Friedrich/Wolfgang!)
> * Move construction of $devices parameter for querying bitmap mode to
>   after where device size is actually available.
> 
> Changes in v8 (thanks Wolfgang!):
> * Use per-device bitmap mode.
> * Improve doc style.
> 
> Changes in v7:
> * Verify result from backup_init(). Document allowed characters, i.e.
>   $PVE::Storage::SAFE_CHAR_CLASS_RE + slash.
> * Move backup_container_perpare() to directly above backup_container()
>   in the base backup provider plugin.
> * Document limitation of backup_container() method not being able to
>   persistently modify $self, because it runs in a fork().
> * Support per-device bitmap names. Changed the QMP interface for
>   setup-backup-access to take an actual list of device information
>   rather than just a single string (device list). This also makes it
>   more extensible. Added a new backup_vm_available_bitmaps() method
>   to the backup provider API. The backup provider can check on the
>   server which bitmaps can be used in principle and tell us to try.
>   Acutal bitmap mode still will be what's passed to backup_vm(),
>   because the requested bitmap might not exist (anymore). This also
>   means the get_backup_mechanism() method will return only the
>   mechanism and not the bitmap name.
> * Improve documentation relating to snapshot-access in context of
>   dirty bitmap and reference NBD dirty-bitmap metacontext
>   documentation.
> * Dropped already applied pve-common patch for fallocate syscall.
> 
> Changes in v6:
> * Factor out some helpers in QEMU backup code.
> * Import constants in LXC::Namespace module from PVE::Tools instead of
>   re-declaring.
> * Require that source for 'qemu-img' restore mechanisms is in raw
>   format.
> * Document that methods for extracting configurations are called
>   without restore_*_init() first.
> * Rename restore_get_*_config() methods to archive_get_*_config().
> * Replace job/backup hooks with dedicated methods,
>   {job,backup}_{init,cleanup} and backup_container_prepare() for the
>   privileged part right before backup_container() is called.
> * Remove backup_get_archive_name() method. The archive name is now
>   returned as part of the backup_init() method. That was also a bit of
>   a misnomer, since it was necessary for the backup provider to
>   remember that archive name for later.
> * Supersede backup_get_task_size() method and return stats as part of
>   the backup_cleanup() method (in the success case). Currently, this
>   only includes the archive size as well, but could be extended later
>   to allow custom stats to include in the task notification summary.
> * Drop reference to vm_backup_get_device_info() method in docs/cover
>   letter. This was dropped in an earlier version.
> * Squash implementation of external backup in qemu-server together
>   with patch with untrusted check.
> 
> Changes in v5:
> * Drop already applied patches.
> * Rebase on latest master.
> * Set finishing state and end time in backup state in QEMU to avoid
>   wrongly detecting previous backup as still active leading to a
>   warning.
> * Unbreak container restore with a bandwidth limit.
> * Switch from 'block-device' mechanism to 'file-handle', providing
>   access to the image contents as a (virtual) file created with
>   'nbdfuse'. The reason is that block devices created via qemu-nbd
>   will lead to a lot of ugly error messages if the device is
>   disconnected before partition probing is finished (which can be
>   delayed even until the end of the backup, apparently there is a
>   lock). This also avoids the need to load the 'nbd' kernel module
>   and not exposing the devices as host block devices is just nicer,
>   potentially also security-wise. This requires using the fallocate
>   syscall instead of the previous BLKDISCARD ioctl. A helper for that
>   is provided via the Storage/Common.pm module.
> * Indicate support for QEMU feature via 'backup-access-api' flag
>   returned by 'query-proxmox-support' QMP command rather than
>   requiring a version guard, so it can land whenever.
> * Update storage's ApiChangelog, split out API age+version bump into
>   its own patch.
> * Let plugins define their own list of sensitive properties.
> * Add postinst snippet to load NBD module during qemu-server upgrade.
> * Check nbd/parameters directory instead of nbd/coresize to determine
>   whether the module is loaded. coresize was just picked on a whim,
>   parameters seems cleaner.
> * Handle edge case with size not being a single number for character
>   devices when parsing/checking tar contents.
> * Borg plugin:
>   * improve SSH options/handling
>   * improve handling and permissions of run and ssh directories
>   * mount borg archives to subdirectory of run dir
>   * add helper for common command environment
>   * move to Custom subfolder
> * Directory example plugin:
>   * use NBD device node reservation mechanism.
>   * die if NBD module is not loaded.
>   * Unbreak logging warnings by using the correct function.
> 
> Changes in v4:
> * Drop already applied patches.
> * Rework run_in_userns() and related helpers.
> * Improve child PID handling for VM 'block-device' method backup
> * Improve child PID and volume cleanup for VM 'block-device' method
>   backup
> * Improve /dev/nbdX handling by adding reservation and checks
> * Add modules-load and modprobe configs for 'nbd' module
> * Add some more logging
> 
> Changes in v3:
> * Add storage_has_feature() helper and use it to decide on whether the
>   storage uses a backup provider, instead of having this be implicit
>   with whether a backup provider is returned by new_backup_provider().
> * Fix querying block-node size for fleecing in stop mode, by issuing
>   the QMP command only after the VM is enforced running.
> * Run backup_container() in user namespace associated to the
>   container.
> * And introduce 'prepare' phase for backup_hook() to be used to
>   prepare for running in that user namespace context.
> * Pass in guest and firewall config as raw data instead of by file
>   name (so files don't have to be accessible in user namespace context
>   for containers).
> * Run restore of containers with 'directory' mechanism in user
>   namespace switching from 'rsync' to 'tar' which is easier to "split"
>   into a privileged and unprivileged half.
> * Check potentially untrusted tar archives.
> * Borg plugin: make SSH work and use that.
> 
> Changes in v2:
> * Add 'block-device' backup mechansim for VMs. The NBD export is
>   mounted by Proxmox VE and only the block device path (as well as a
>   callback to get the next dirty range for bitmaps) is passed to the
>   backup provider.
> * Add POC example for Borg - note that I tested with borg 1.2.4 in
>   Debian and only tested with a local repository, not SSH yet.
> * Merge hook API into a single function for backup and for jobs.
> * Add restore_vm_init() and restore_vm_cleanup() for better
>   flexibility to allow preparing the whole restore. Question is
>   if restore_vm_volume_init() and restore_vm_volume_cleanup() should
>   be dropped (but certain providers might prefer using only those)?
>   Having both is more flexible, but makes the API longer of course.
> * Switch to backup_vm() (was per-volume backup_vm_volume() before) and
>   backup_container(), passing along the configuration files, rather
>   than having dedicated methods for the configuration files, for
>   giving the backup provider more flexibility.
> * Some renames in API methods/params to improve clarity.
> * Pass backup time to backup 'start' hook and use that in the
>   directory example rather than the job start time.
> * Use POD for base plugin documentation and flesh out documentation.
> * Use 'BackupProvider::Plugin::' namespace.
> * Various smaller improvements in the directory provider example.
> 
> ======
> 
> A backup provider needs to implement a storage plugin as well as a
> backup provider plugin. The storage plugin is for integration in
> Proxmox VE's front-end, so users can manage the backups via
> UI/API/CLI. The backup provider plugin is for interfacing with the
> backup provider's backend to integrate backup and restore with that
> backend into Proxmox VE.
> 
> This is an initial draft of an API and required changes to the backup
> stack in Proxmox VE to make it work. Depending on feedback from other
> developers and interested parties, it can still substantially change.
> 
> ======
> 
> The backup provider API is split into two parts, both of which again
> need different implementations for VM and LXC guests:
> 
> 1. Backup API
> 
> In Proxmox VE, a backup job consists of backup tasks for individual guests.
> There are methods for initialization and cleanup of the job, i.e. job_init() and
> job_cleanup() and for each guest backup, i.e. backup_init() and
> backup_cleanup().
> 
> The backup_get_mechanism() method is used to decide on the backup mechanism.
> Currently, 'file-handle' or 'nbd' for VMs, and 'directory' for containers is
> possible. The method also let's the plugin indicate whether to use a bitmap for
> incremental VM backup or not. It is enough to implement one mechanism for VMs
> and one mechanism for containers.
> 
> Next, there are methods for backing up the guest's configuration and data,
> ackup_vm() for VM backup and backup_container() for container backup.
> 
> Finally, some helpers like provider_name() for getting the name of the backup
> provider and backup_handle_log_file() for handling the backup task log.
> 
> The backup transaction looks as follows:
> 
> First, job_init() is called that can be used to check backup server availability
> nd prepare the connection. Then for each guest backup_init() followed by
> backup_vm() or backup_container() and finally backup_cleanup(). Afterwards
> job_cleanup() is called. For containers, there is an additional
> backup_container_prepare() call while still privileged. The actual
> backup_container() call happens as the (unprivileged) container root user, so
> that the file owner and group IDs match the container's perspective.
> 
> 1.1 Backup Mechanisms
> 
> VM:
> 
> Access to the data on the VM's disk from the time the backup started
> is made available via a so-called "snapshot access". This is either
> the full image, or in case a bitmap is used, the dirty parts of the
> image since the last time the bitmap was used for a successful backup.
> Reading outside of the dirty parts will result in an error. After
> backing up each part of the disk, it should be discarded in the export
> to avoid unnecessary space usage on the Proxmox VE side (there is an
> associated fleecing image).
> 
> VM mechanism 'file-handle':
> 
> The snapshot access is exposed via a file descriptor. A subroutine to
> read the dirty regions for incremental backup is provided as well.
> 
> VM mechanism 'nbd':
> 
> The snapshot access and, if used, bitmap are exported via NBD.
> 
> Container mechanism 'directory':
> 
> A copy or snapshot of the container's filesystem state is made
> available as a directory. The method is executed inside the user
> namespace associated to the container.
> 
> 2. Restore API
> 
> The restore_get_mechanism() method is used to decide on the restore
> mechanism. Currently, 'qemu-img' for VMs, and 'directory' or 'tar' for
> containers are possible. It is enough to implement one mechanism for
> VMs and one mechanism for containers.
> 
> Next, methods for extracting the guest and firewall configuration and
> the implementations of the restore mechanism via a pair of methods: an
> init method, for making the data available to Proxmox VE and a cleanup
> method that is called after restore.
> 
> 2.1. Restore Mechanisms
> 
> VM mechanism 'qemu-img':
> 
> The backup provider gives a path to the disk image that will be
> restored. The path needs to be something 'qemu-img' can deal with,
> e.g. can also be an NBD URI or similar.
> 
> Container mechanism 'directory':
> 
> The backup provider gives the path to a directory with the full
> filesystem structure of the container.
> 
> Container mechanism 'tar':
> 
> The backup provider gives the path to a (potentially compressed) tar
> archive with the full filesystem structure of the container.
> 
> See the PVE::BackupProvider::Plugin module for the full API
> documentation.
> 
> ======
> 
> This series adapts the backup stack in Proxmox VE to allow using the
> above API. For QEMU, backup access setup and teardown QMP commands are
> implemented to be able to provide access to a consistent disk state to
> the backup provider.
> 
> The series also provides an example implementation for a backup
> provider as a proof-of-concept, exposing the different features.
> 
> ======
> 
> Open questions:
> 
> Should the backup provider plugin system also follow the same API
> age+version schema with a Custom/ directory for external plugins
> derived from the base plugin?
> 
> Should the bitmap action be passed directly to the backup provider?
> I.e. have 'not-used', 'not-used-removed', 'new', 'used', 'invalid',
> instead of only 'none', 'new' and 'reuse'. It makes API slightly more
> complicated. Is there any situation where backup provider could care
> if bitmap is new, because it was the first or bitmap is new because
> previous was invalid? Both cases require the backup provider to do a
> full backup.
> 
> ======
> 
> Feedback is very welcome, especially from people wishing to implement
> such a backup provider plugin! Please tell me what issues you see with
> the proposed API, what would and wouldn't work from your perspective?
> 
> ======
> 
> Dependencies: libpve-storage-perl depends on pve-common. pve-manager,
> pve-container and qemu-server all depend on new libpve-storage-perl.
> pve-manager also build-depends on the new libpve-storage-perl for its
> tests. pve-container depends on new pve-common, i.e. 8.2.9 for the
> "tools: run fork: allow running code in parent after fork"
> functionality. To keep things clean, pve-manager should also depend on
> new pve-container and qemu-server.
> 
> ======
> 
> qemu:
> 
> Fiona Ebner (1):
>   PVE backup: backup-access api: simplify bitmap logic
> 
>  pve-backup.c         | 72 ++++++++++++--------------------------------
>  qapi/block-core.json | 15 ++++-----
>  2 files changed, 26 insertions(+), 61 deletions(-)
> 
> 
> storage:
> 
> Fiona Ebner (8):
>   add storage_has_feature() helper function
>   common: add deallocate helper function
>   plugin: introduce new_backup_provider() method
>   config api/plugins: let plugins define sensitive properties themselves
>   plugin api: bump api version and age
>   extract backup config: delegate to backup provider for storages that
>     support it
>   add backup provider example
>   Borg example plugin
> 
>  ApiChangeLog                                  |   32 +
>  src/PVE/API2/Storage/Config.pm                |    4 +-
>  src/PVE/BackupProvider/Makefile               |    3 +
>  src/PVE/BackupProvider/Plugin/Base.pm         | 1126 +++++++++++++++++
>  src/PVE/BackupProvider/Plugin/Borg.pm         |  466 +++++++
>  .../BackupProvider/Plugin/DirectoryExample.pm |  809 ++++++++++++
>  src/PVE/BackupProvider/Plugin/Makefile        |    5 +
>  src/PVE/Makefile                              |    1 +
>  src/PVE/Storage.pm                            |   31 +-
>  src/PVE/Storage/BTRFSPlugin.pm                |    1 +
>  src/PVE/Storage/CIFSPlugin.pm                 |    1 +
>  src/PVE/Storage/CephFSPlugin.pm               |    1 +
>  src/PVE/Storage/Common.pm                     |   29 +
>  .../Custom/BackupProviderDirExamplePlugin.pm  |  308 +++++
>  src/PVE/Storage/Custom/BorgBackupPlugin.pm    |  689 ++++++++++
>  src/PVE/Storage/Custom/Makefile               |    6 +
>  src/PVE/Storage/DirPlugin.pm                  |    1 +
>  src/PVE/Storage/ESXiPlugin.pm                 |    1 +
>  src/PVE/Storage/GlusterfsPlugin.pm            |    1 +
>  src/PVE/Storage/ISCSIDirectPlugin.pm          |    1 +
>  src/PVE/Storage/ISCSIPlugin.pm                |    1 +
>  src/PVE/Storage/LVMPlugin.pm                  |    1 +
>  src/PVE/Storage/LvmThinPlugin.pm              |    1 +
>  src/PVE/Storage/Makefile                      |    1 +
>  src/PVE/Storage/NFSPlugin.pm                  |    1 +
>  src/PVE/Storage/PBSPlugin.pm                  |    5 +
>  src/PVE/Storage/Plugin.pm                     |   37 +
>  src/PVE/Storage/RBDPlugin.pm                  |    1 +
>  src/PVE/Storage/ZFSPlugin.pm                  |    1 +
>  src/PVE/Storage/ZFSPoolPlugin.pm              |    1 +
>  30 files changed, 3562 insertions(+), 4 deletions(-)
>  create mode 100644 src/PVE/BackupProvider/Makefile
>  create mode 100644 src/PVE/BackupProvider/Plugin/Base.pm
>  create mode 100644 src/PVE/BackupProvider/Plugin/Borg.pm
>  create mode 100644 src/PVE/BackupProvider/Plugin/DirectoryExample.pm
>  create mode 100644 src/PVE/BackupProvider/Plugin/Makefile
>  create mode 100644 src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm
>  create mode 100644 src/PVE/Storage/Custom/BorgBackupPlugin.pm
>  create mode 100644 src/PVE/Storage/Custom/Makefile
> 
> 
> qemu-server:
> 
> Fiona Ebner (11):
>   backup: keep track of block-node size for fleecing
>   backup: fleecing: use exact size when allocating non-raw fleecing
>     images
>   backup: allow adding fleecing images also for EFI and TPM
>   backup: implement backup for external providers
>   test: qemu img convert: add test cases for snapshots
>   image convert: collect options in hash argument
>   image convert: allow caller to specify the format of the source path
>   backup: implement restore for external providers
>   backup: future-proof checks for QEMU feature support
>   backup: support 'missing-recreated' bitmap action
>   backup: bitmap action to human: lie about TPM state
> 
>  PVE/API2/Qemu.pm                   |  30 +-
>  PVE/QemuServer.pm                  | 174 ++++++++++-
>  PVE/QemuServer/ImportDisk.pm       |   3 +-
>  PVE/VZDump/QemuServer.pm           | 475 ++++++++++++++++++++++++++++-
>  test/run_qemu_img_convert_tests.pl |  72 +++--
>  5 files changed, 716 insertions(+), 38 deletions(-)
> 
> 
> container:
> 
> Fiona Ebner (7):
>   add LXC::Namespaces module
>   backup: implement backup for external providers
>   backup: implement restore for external providers
>   external restore: don't use 'one-file-system' tar flag when restoring
>     from a directory
>   create: factor out compression option helper
>   restore tar archive: check potentially untrusted archive
>   api: add early check against restoring privileged container from
>     external source
> 
>  src/PVE/API2/LXC.pm       |  14 ++
>  src/PVE/LXC/Create.pm     | 273 +++++++++++++++++++++++++++++++++++---
>  src/PVE/LXC/Makefile      |   1 +
>  src/PVE/LXC/Namespaces.pm |  57 ++++++++
>  src/PVE/VZDump/LXC.pm     |  40 +++++-
>  5 files changed, 362 insertions(+), 23 deletions(-)
>  create mode 100644 src/PVE/LXC/Namespaces.pm
> 
> 
> manager:
> 
> Fiona Ebner (2):
>   ui: backup: also check for backup subtype to classify archive
>   backup: implement backup for external providers
> 
>  PVE/VZDump.pm                      | 61 ++++++++++++++++++++++++++----
>  test/vzdump_new_test.pl            |  3 ++
>  www/manager6/Utils.js              | 10 +++--
>  www/manager6/grid/BackupView.js    |  4 +-
>  www/manager6/storage/BackupView.js |  4 +-
>  5 files changed, 67 insertions(+), 15 deletions(-)
> 
> 
> Summary over all repositories:
>   47 files changed, 4733 insertions(+), 141 deletions(-)
> 
> -- 
> Generated by git-murpp 0.5.0
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel at lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 




More information about the pve-devel mailing list