[pve-devel] [PATCH-SERIES v3] fix #4136: implement backup fleecing

Fiona Ebner f.ebner at proxmox.com
Thu Apr 11 11:29:21 CEST 2024


Changes in v3 (thanks to Wolfgang for feedback!):
    * Fix brittle code for permission check that only worked by
      chance.

Changes in v2 (thanks to Fabian and Alexandre, among others, for
feedback!):
    * Use v3 of "discard-source" upstream series (v4 was posted in the
      meantime but without any semantic change)
    * Add patches to specify minimum cluster size during backup, to
      allow discard to work even if fleecing image has larger cluster
      size than backup target.
    * Add permission check for fleecing storage.
    * Record fleecing image in config to be able to clean up after
      hard failure.
    * Do not use "same storage as image" as default fleecing storage.
    * Use qcow2 for fleecing image if storage supports it.
    * Flesh out recommendations for fleecing storage in docs.

When a backup for a VM is started, QEMU will install a
"copy-before-write" filter in its block layer. This filter ensures
that upon new guest writes, old data still needed for the backup is
sent to the backup target first. The guest write blocks until this
operation is finished so guest IO to not-yet-backed-up sectors will be
limited by the speed of the backup target.

With backup fleecing, such old data is cached in a fleecing image
rather than sent directly to the backup target. This can help guest IO
performance and even prevent hangs in certain scenarios, at the cost
of requiring more storage space.
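
As a rough illustration of the two paragraphs above, here is a toy
in-memory model of copy-before-write with a fleecing image. It is only a
sketch: the names `ToyDisk`, `guest_write` and `backup_cluster` are made
up for this example and do not correspond to QEMU functions, and the
fleecing image is modelled as a plain dictionary.

```python
class ToyDisk:
    """In-memory stand-in for a block device, one value per cluster."""

    def __init__(self, clusters):
        self.clusters = list(clusters)


def guest_write(source, fleecing, pending, index, data):
    """Copy-before-write: preserve old data still needed, then write."""
    if index in pending and index not in fleecing:
        # With fleecing, the old data goes to the fleecing image
        # instead of being pushed to the (possibly slow) backup target.
        fleecing[index] = source.clusters[index]
    source.clusters[index] = data


def backup_cluster(source, fleecing, pending, index):
    """Return the point-in-time data for one cluster, mark it done."""
    # Prefer the preserved copy; pop() stands in for discarding it.
    data = fleecing.pop(index, source.clusters[index])
    pending.discard(index)
    return data


source = ToyDisk(["a0", "b0", "c0"])
fleecing = {}            # hypothetical fleecing image
pending = {0, 1, 2}      # clusters not yet backed up

guest_write(source, fleecing, pending, 1, "b1")  # guest overwrites cluster 1
backup = [backup_cluster(source, fleecing, pending, i) for i in range(3)]
```

In this model the backup still sees the old contents of cluster 1 while
the guest already has the new data, and the fleecing copy is dropped
once it has been read.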

With this series, it will be possible to enable backup fleecing via
e.g. `vzdump 123 --fleecing enabled=1,storage=local-lvm` with fleecing
images created on the storage `local-lvm`. The fleecing storage should
be a fast local storage which supports thin-provisioning and discard.
If the storage supports qcow2, that is used as the fleecing image
format. If the underlying file system does not support discard, with
qcow2 and preallocation=off, at least already allocated parts of the
image can be re-used later.
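
To make the shape of the option value concrete, the following sketch
parses a string like `enabled=1,storage=local-lvm` into a dictionary.
It is not the actual schema-based parser from the series; the function
names are hypothetical, and the rule that a storage must be given when
fleecing is enabled is an assumption made for this example.

```python
def parse_property_string(value):
    """Split 'key=value,key=value' into a dict (simplified sketch)."""
    result = {}
    for part in value.split(","):
        key, sep, val = part.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {part!r}")
        result[key.strip()] = val.strip()
    return result


def parse_fleecing(value):
    """Interpret the fleecing option; names and rules are illustrative."""
    opts = parse_property_string(value)
    enabled = opts.get("enabled", "0") == "1"
    storage = opts.get("storage")
    # Assumption for this sketch: no implicit default storage.
    if enabled and storage is None:
        raise ValueError("fleecing enabled but no storage given")
    return {"enabled": enabled, "storage": storage}


example = parse_fleecing("enabled=1,storage=local-lvm")
```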

Fleecing images are created by qemu-server via pve-storage and
attached to QEMU before the backup starts, and cleaned up after the
backup has finished or failed. The naming schema for fleecing images is
'vm-ID-fleece-N(.FORMAT)'. The allocated images are recorded in the
guest configuration, so that even after a hard failure, clean-up can
be re-attempted. While not too bad, it's a non-trivial amount of code
and I'm not 100% sure about the cost-benefit, so I'm sending those
patches as RFC.
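
The naming schema above can be sketched as a pair of helpers that build
and recognize such names. The helper names and the regular expression
are assumptions for illustration only; the real code lives in
qemu-server and pve-storage and may differ.

```python
import re

# Matches 'vm-ID-fleece-N' with an optional '.FORMAT' suffix.
FLEECE_RE = re.compile(r"^vm-(\d+)-fleece-(\d+)(?:\.(\w+))?$")


def fleecing_image_name(vmid, index, fmt=None):
    """Build a fleecing image name, e.g. for raw (no suffix) or qcow2."""
    name = f"vm-{vmid}-fleece-{index}"
    return f"{name}.{fmt}" if fmt else name


def parse_fleecing_image_name(name):
    """Return (vmid, index, format) or None if the name does not match."""
    match = FLEECE_RE.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)


qcow2_name = fleecing_image_name(123, 0, "qcow2")
raw_name = fleecing_image_name(123, 1)
```

Being able to recognize names like this is what makes a later clean-up
pass possible, in addition to the entries recorded in the guest
configuration.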

The fleecing image needs to be the exact same size as the source, but
luckily, an explicit size can be specified when attaching a raw image
to QEMU so there are no size issues when using storages that have
coarser allocation/round up. For qcow2, it seems that virtual size can
be nearly arbitrary (i.e. modulo 512 byte granularity) during
allocation.

While tests seem fine so far, the most important part to review is the
setup of the backup job and bitmap handling inside QEMU.

QEMU patches are for the submodule for better reviewability. There are
two prerequisites (that are expected to be picked up by upstream at
some point):

1. To be able to discard the fleecing image, a discard-source parameter
is added [0].

2. In combination with discard, there is a cluster size issue when the
fleecing image has a larger cluster size than the backup target. The
proposed workaround is to allow specifying the minimum granularity for
the backup job [1].
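
The cluster size issue from point 2 can be illustrated with a little
arithmetic. This is a simplified model, not the QEMU implementation: it
assumes the copy granularity is the larger of the backup target's
cluster size and the requested minimum, and that a discard only takes
effect when it covers whole clusters of the fleecing image. Both
function names are hypothetical.

```python
KIB = 1024


def effective_cluster_size(target_cluster_size, min_cluster_size=0):
    """Granularity used for copying, under the assumption stated above."""
    return max(target_cluster_size, min_cluster_size)


def discard_is_aligned(offset, length, cluster_size):
    """True if the range covers only whole clusters of the given size."""
    return offset % cluster_size == 0 and length % cluster_size == 0


target_cluster = 64 * KIB     # example value for the backup target
fleecing_cluster = 128 * KIB  # example value for the fleecing image

# Without a minimum, work happens in 64 KiB steps, so a discard of the
# second step does not cover a whole 128 KiB fleecing cluster.
step = effective_cluster_size(target_cluster)
without_min = discard_is_aligned(step, step, fleecing_cluster)

# With the minimum set to the fleecing cluster size, every step does.
step = effective_cluster_size(target_cluster, fleecing_cluster)
with_min = discard_is_aligned(step, step, fleecing_cluster)
```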


Dependencies:
pve-manager -> pve-guest-common -> pve-common
            \-> qemu-server

Plus new pve-qemu-kvm to actually be able to use the feature.

[0]: https://lore.kernel.org/qemu-devel/20240228141501.455989-1-vsementsov@yandex-team.ru/
[1]: https://lore.kernel.org/qemu-devel/20240308155158.830258-1-f.ebner@proxmox.com/


qemu:

Fiona Ebner (3):
  copy-before-write: allow specifying minimum cluster size
  backup: add minimum cluster size to performance options
  PVE backup: add fleecing option

Vladimir Sementsov-Ogievskiy (4):
  block/copy-before-write: fix permission
  block/copy-before-write: support unligned snapshot-discard
  block/copy-before-write: create block_copy bitmap in filter node
  qapi: blockdev-backup: add discard-source parameter

 block/backup.c                         |   5 +-
 block/block-copy.c                     |  29 ++++-
 block/copy-before-write.c              |  42 ++++++--
 block/copy-before-write.h              |   2 +
 block/monitor/block-hmp-cmds.c         |   1 +
 block/replication.c                    |   4 +-
 blockdev.c                             |   5 +-
 include/block/block-common.h           |   2 +
 include/block/block-copy.h             |   3 +
 include/block/block_int-global-state.h |   2 +-
 pve-backup.c                           | 143 ++++++++++++++++++++++++-
 qapi/block-core.json                   |  29 ++++-
 tests/qemu-iotests/257.out             | 112 +++++++++----------
 13 files changed, 298 insertions(+), 81 deletions(-)


common:

Fiona Ebner (1):
  json schema: add format description for pve-storage-id standard option

 src/PVE/JSONSchema.pm | 1 +
 1 file changed, 1 insertion(+)


guest-common:

Fiona Ebner (3):
  vzdump: schema: add fleecing property string
  vzdump: schema: make storage for fleecing semi-optional
  abstract config: do not copy fleecing images entry for snapshot

 src/PVE/AbstractConfig.pm |  1 +
 src/PVE/VZDump/Common.pm  | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)


manager:

Fiona Ebner (3):
  vzdump: have property string helpers always return the result
  vzdump: handle new 'fleecing' property string
  api: backup/vzdump: add permission check for fleecing storage

 PVE/API2/Backup.pm | 10 ++++++++--
 PVE/API2/VZDump.pm |  9 +++++----
 PVE/VZDump.pm      | 22 ++++++++++++++++++++--
 3 files changed, 33 insertions(+), 8 deletions(-)


qemu-server:

Fiona Ebner (7):
  backup: disk info: also keep track of size
  backup: implement fleecing option
  parse config: allow config keys with minus sign
  schema: add fleecing-images config property
  vzdump: better cleanup fleecing images after hard errors
  migration: attempt to clean up potential left-over fleecing images
  destroy vm: clean up potential left-over fleecing images

 PVE/API2/Qemu.pm         |   9 +++
 PVE/QemuConfig.pm        |  40 ++++++++++
 PVE/QemuMigrate.pm       |   3 +
 PVE/QemuServer.pm        |  12 ++-
 PVE/VZDump/QemuServer.pm | 163 ++++++++++++++++++++++++++++++++++++++-
 5 files changed, 224 insertions(+), 3 deletions(-)


docs:

Fiona Ebner (1):
  vzdump: add section about backup fleecing

 vzdump.adoc | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)


Summary over all repositories:
  25 files changed, 632 insertions(+), 92 deletions(-)

-- 
Generated by git-murpp 0.5.0



