[pve-devel] [RFC qemu/guest-common/manager/qemu-server/docs 00/13] fix #4136: implement backup fleecing

DERUMIER, Alexandre alexandre.derumier at groupe-cyllene.com
Thu Jan 25 17:02:03 CET 2024

Oh!!! Thank you very much, Fiona!!!

This is really the blocking feature for me; I'm still not using PBS because
of this.

I'll try to build a lab for testing as soon as possible
(I'm a bit busy with FOSDEM preparation).

I'll also try to test VM crash/host crash while a backup is running, to see
how it's handled.

-------- Original message --------
From: Fiona Ebner <f.ebner at proxmox.com>
Reply-To: Proxmox VE development discussion <pve-devel at lists.proxmox.com>
To: pve-devel at lists.proxmox.com
Subject: [pve-devel] [RFC qemu/guest-common/manager/qemu-server/docs 00/13] fix #4136: implement backup fleecing
Date: 25/01/2024 15:41:36

When a backup for a VM is started, QEMU will install a
"copy-before-write" filter in its block layer. This filter ensures
that upon new guest writes, old data still needed for the backup is
sent to the backup target first. The guest write blocks until this
operation is finished so guest IO to not-yet-backed-up sectors will be
limited by the speed of the backup target.

With backup fleecing, such old data is cached in a fleecing image
rather than sent directly to the backup target. This can help guest IO
performance and even prevent hangs in certain scenarios, at the cost
of requiring more storage space.
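
To make the difference concrete, here is a toy Python model of the two
behaviours described above. It is only a sketch: the class, method and
attribute names are made up for illustration, and blocks are plain list
entries rather than anything resembling QEMU's real block layer.

```python
class ToyBackup:
    """Toy model of copy-before-write, optionally with a fleecing image."""

    def __init__(self, source, fleecing=False):
        self.source = source                    # live guest disk (list of blocks)
        self.pending = set(range(len(source)))  # blocks not yet in the backup
        self.target = {}                        # backup target (assumed slow)
        self.fleece = {} if fleecing else None  # fleecing image (assumed fast)
        self.guest_waits_on_target = 0          # guest writes delayed by the target

    def guest_write(self, idx, data):
        if idx in self.pending:
            old = self.source[idx]
            if self.fleece is not None:
                # Fleecing: keep the first old version in the local image.
                self.fleece.setdefault(idx, old)
            else:
                # No fleecing: old data goes to the target before the
                # guest write may proceed.
                self.target[idx] = old
                self.pending.discard(idx)
                self.guest_waits_on_target += 1
        self.source[idx] = data

    def backup_step(self):
        """Copy one pending block to the target; return False when done."""
        if not self.pending:
            return False
        idx = min(self.pending)
        self.pending.discard(idx)
        if self.fleece is not None and idx in self.fleece:
            # Old data was cached; it can be dropped from the fleecing
            # image once it has reached the target.
            self.target[idx] = self.fleece.pop(idx)
        else:
            self.target[idx] = self.source[idx]
        return True


def run(fleecing):
    backup = ToyBackup(["a", "b", "c"], fleecing=fleecing)
    backup.guest_write(2, "X")  # guest overwrites a not-yet-backed-up block
    backup.guest_write(2, "Y")  # and overwrites it again
    while backup.backup_step():
        pass
    return backup.target, backup.source, backup.guest_waits_on_target
```

In both modes the target ends up with the point-in-time contents of the
disk. Only without fleecing does the guest write have to wait for the
target.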

With this series it will be possible to enable backup-fleecing via
e.g. `vzdump 123 --fleecing enabled=1,storage=local-zfs` with fleecing
images created on the storage `local-zfs`. If no storage is specified,
the fleecing image will be created on the same storage as the original
image.
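
As a rough illustration of how such a property string could be
interpreted, including the fallback to the disk's own storage, here is a
small Python sketch. It is not the parser used by vzdump; the function
name and the simplified boolean handling (only `1` counts as enabled) are
assumptions made for the example.

```python
def parse_fleecing(value, disk_storage):
    """Parse e.g. 'enabled=1,storage=local-zfs' (illustrative only)."""
    opts = {}
    for part in value.split(","):
        key, sep, val = part.partition("=")
        if not sep or key not in ("enabled", "storage"):
            raise ValueError(f"invalid fleecing option: {part!r}")
        opts[key] = val
    return {
        # Assumption: only the literal "1" enables fleecing here.
        "enabled": opts.get("enabled", "0") == "1",
        # No storage given: fall back to the storage of the original image.
        "storage": opts.get("storage", disk_storage),
    }


print(parse_fleecing("enabled=1,storage=local-zfs", "local-lvm"))
print(parse_fleecing("enabled=1", "local-lvm"))
```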

Fleecing images are created by qemu-server via pve-storage and
attached to QEMU before the backup starts, and cleaned up after the
backup has finished or failed. Currently, just a "-fleecing(.raw)" suffix
is added and there is no special handling yet for e.g. qm rescan etc.
And previous left-overs are not automatically cleaned up, because
while unlikely, images with this name might've been created by a user
too. Happy to discuss alternatives!

The fleecing image needs to be the exact same size as the source, but
luckily, an explicit size can be specified when attaching a raw image
to QEMU so there are no size issues when using storages that have
coarser allocation/round up.
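
The size point can be sketched in Python as follows. The 4 MiB
granularity, the example disk size, the path and the exact node layout
are assumptions for illustration. The sketch relies only on the raw
driver accepting an explicit `size` option and is not the code from
this series.

```python
def allocated_size(requested, granularity):
    """Round a requested size up to the storage's allocation granularity."""
    return -(-requested // granularity) * granularity


def raw_blockdev_options(path, exact_size):
    """Options for a raw node limited to exact_size bytes (layout is illustrative)."""
    return {
        "driver": "raw",
        "size": exact_size,  # expose only this many bytes to QEMU
        "file": {"driver": "file", "filename": path},
    }


source_size = 4 * 1024**3 + 512   # hypothetical source disk size in bytes
granularity = 4 * 1024**2         # hypothetical storage granularity (4 MiB)

on_storage = allocated_size(source_size, granularity)
options = raw_blockdev_options("/path/to/vm-123-disk-0-fleecing.raw", source_size)

print(on_storage - source_size)   # extra bytes allocated by the storage
print(options["size"] == source_size)
```

The image on the storage may be larger than requested, but the node QEMU
sees is as large as the source.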

While initial tests seem fine, bitmap handling needs to be carefully
checked for correctness. More eyeballs can't hurt there.

QEMU patches are for the submodule for better reviewability. There are
unfortunately a few prerequisites which are also still being worked on
upstream. These are:

Fix for qcow2 block status querying when used as a source image [0].
Already reviewed and being pulled.

For being able to discard the fleecing image, addition of a
discard-source parameter[1]. This series was adapted for downstream
and I tried to address the two remaining issues:

1. Permission issue when backup source node is read-only (e.g. TPM
state): Made permissions conditional for when discard-source is set
with a new option for the copy-before-write block driver. Currently,
it's part of QAPI, nicer would be to make it internal-only.

2. Cluster size issue when fleecing image has a larger cluster size
than backup target: Made a workaround by also considering source image
when calculating cluster size for block copy and had to hack
.bdrv_co_get_info implementations for snapshot-access and
copy-before-write. Not super confident and better to wait for an
answer from upstream.

Upstream reports/discussions for these can also be found at [1].
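
For the cluster size workaround in point 2, the idea of "also
considering the source image" can be pictured like this. This is a
simplified Python sketch, not the actual block-copy code: the function
name and the 64 KiB lower bound are assumptions for the example.

```python
DEFAULT_CLUSTER_SIZE = 64 * 1024  # assumed lower bound, for illustration only


def copy_cluster_size(target_cluster, source_cluster=None):
    """Pick the largest of the default, target and (optionally) source cluster size."""
    sizes = [DEFAULT_CLUSTER_SIZE, target_cluster]
    if source_cluster is not None:
        sizes.append(source_cluster)
    return max(sizes)


# Hypothetical case: fleecing image with 1 MiB clusters, target with 64 KiB.
print(copy_cluster_size(64 * 1024))                 # target only
print(copy_cluster_size(64 * 1024, 1024 * 1024))    # source considered too
```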

No hard dependencies AFAICS, but of course pve-manager should depend
on both new pve-guest-common and qemu-server to actually be able to
use the option.



Fiona Ebner (6):
  backup: factor out gathering device info into helper
  backup: get device info: code cleanup
  block/io: clear BDRV_BLOCK_RECURSE flag after recursing in
  block/{copy-before-write,snapshot-access}: implement bdrv_co_get_info
    driver callback
  block/block-copy: always consider source cluster size too
  PVE backup: add fleecing option

Vladimir Sementsov-Ogievskiy (2):
  block/copy-before-write: create block_copy bitmap in filter node
  qapi: blockdev-backup: add discard-source parameter

 block/backup.c                         |  15 +-
 block/block-copy.c                     |  36 ++--
 block/copy-before-write.c              |  46 ++++-
 block/copy-before-write.h              |   1 +
 block/io.c                             |  10 ++
 block/monitor/block-hmp-cmds.c         |   1 +
 block/replication.c                    |   4 +-
 block/snapshot-access.c                |   7 +
 blockdev.c                             |   2 +-
 include/block/block-copy.h             |   3 +-
 include/block/block_int-global-state.h |   2 +-
 pve-backup.c                           | 234 +++++++++++++++++++------
 qapi/block-core.json                   |  18 +-
 13 files changed, 300 insertions(+), 79 deletions(-)


Fiona Ebner (1):
  vzdump: schema: add fleecing property string

 src/PVE/VZDump/Common.pm | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)


Fiona Ebner (1):
  vzdump: handle new 'fleecing' property string

 PVE/VZDump.pm | 12 ++++++++++++
 1 file changed, 12 insertions(+)


Fiona Ebner (2):
  backup: disk info: also keep track of size
  backup: implement fleecing option

 PVE/VZDump/QemuServer.pm | 141 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 139 insertions(+), 2 deletions(-)


Fiona Ebner (1):
  vzdump: add section about backup fleecing

 vzdump.adoc | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

Summary over all repositories:
  17 files changed, 504 insertions(+), 0 deletions(-)
