[pve-devel] [RFC qemu 1/1] add patch for incremental drive-mirror

Fabian Grünbichler f.gruenbichler at proxmox.com
Thu Jul 18 14:43:45 CEST 2019


Signed-off-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
---

Notes:
    this is a WIP patch from qemu-devel, see
    
    <20181219065020.81256-1-mahaocong_work at 163.com>
    <20181220083839.85523-1-mahaocong_work at 163.com>
    <20190102100415.24680-1-mahaocong_work at 163.com>
    <20190131040154.3770-1-mahaocong_work at 163.com>
    <20190131121715.22954-1-mahaocong_work at 163.com>
    <20190214052809.44336-1-mahaocong_work at 163.com>
    <20190214064312.44794-1-mahaocong_work at 163.com>
    <20190221155521.77094-1-mahaocong_work at 163.com>
    
    for discussions and
    
    <20170504105444.8940-1-daniel.kucera at gmail.com>
    
    for an older, similar patch from a different author but with the
    same motivation (ZFS replication) ;)
    
    if we want to include this, we should probably first get it into an
    upstreamable state and push for inclusion in qemu proper, so that we can backport it.

 ...04-drive-mirror-add-incremental-mode.patch | 329 ++++++++++++++++++
 debian/patches/series                         |   1 +
 2 files changed, 330 insertions(+)
 create mode 100644 debian/patches/extra/0004-drive-mirror-add-incremental-mode.patch

diff --git a/debian/patches/extra/0004-drive-mirror-add-incremental-mode.patch b/debian/patches/extra/0004-drive-mirror-add-incremental-mode.patch
new file mode 100644
index 0000000..8bbeaa6
--- /dev/null
+++ b/debian/patches/extra/0004-drive-mirror-add-incremental-mode.patch
@@ -0,0 +1,329 @@
+From fbe68abe71867ffac74aa82a1e56f2a697827cf8 Mon Sep 17 00:00:00 2001
+From: mahaocong <mahaocong at didichuxing.com>
+Date: Thu, 14 Feb 2019 14:43:12 +0800
+Subject: [PATCH qemu] drive-mirror: add incremental mode
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+This patch adds the possibility to start mirroring with a user-created bitmap.
+In full mode, mirror creates an unnamed bitmap by scanning the whole block chain,
+and in top mode, mirror creates a bitmap by scanning the top block layer. So I
+think I can copy a user-created bitmap and use it as the initial state of the
+mirror, the same as incremental mode drive-backup, and I call this new mode
+incremental mode drive-mirror.
+
+A possible use case for incremental mode mirror is live migration. To preserve
+the block data and recover after a malfunction, someone may back up data to ceph
+or other distributed storage. With qemu incremental backup, we need to create a new
+bitmap and attach it to the block device before the first backup job. Then the bitmap
+records the changes made after the backup job. If we want to migrate this vm, we can
+migrate block data between source and destination by using drive-mirror directly,
+or use the backup data and backup bitmap to reduce the data that must be synchronized.
+To do this, we should first create a new image on the destination and set its backing
+file to the backup image, then set the backup bitmap as the initial state of incremental
+mode drive-mirror, and synchronize dirty blocks starting with the state set by the last
+incremental backup job. When the mirror completes, we get an active layer on the destination,
+and its backing file is the backup image on ceph. Then we can live-copy data from the
+backing files into the overlay images by using block-stream, or continue doing backups.
+
+In this scenario, if the guest OS doesn't write much data after the
+last backup, incremental mode may transmit less data than full mode or top
+mode. However, if a lot of data has been written, incremental mode has no
+advantage over the other modes.
+
+This scenario can be described by the following steps:
+1. create an rbd image in ceph, and map an nbd device to the rbd image.
+2. create a new bitmap and attach it to the block device.
+3. do a full mode backup to the nbd device and thus sync it to the rbd image.
+4. do several incremental mode backups.
+5. create a new image on the destination and set its backing file to the backup image.
+6. do the live migration, and migrate block data by incremental mode drive-mirror
+  with the bitmap created in step 2.
+
+Signed-off-by: Ma Haocong <mahaocong at didichuxing.com>
+Signed-off-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
+---
+ include/block/block_int.h |  3 ++-
+ block/mirror.c            | 47 +++++++++++++++++++++++++++++----------
+ blockdev.c                | 36 ++++++++++++++++++++++++++++--
+ qapi/block-core.json      | 14 ++++++++++--
+ 4 files changed, 83 insertions(+), 17 deletions(-)
+
+diff --git a/include/block/block_int.h b/include/block/block_int.h
+index 01e855a066..537ad23476 100644
+--- a/include/block/block_int.h
++++ b/include/block/block_int.h
+@@ -1122,7 +1122,8 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
+                   BlockDriverState *target, const char *replaces,
+                   int creation_flags, int64_t speed,
+                   uint32_t granularity, int64_t buf_size,
+-                  MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
++                  MirrorSyncMode mode, BdrvDirtyBitmap *src_bitmap,
++                  BlockMirrorBackingMode backing_mode,
+                   BlockdevOnError on_source_error,
+                   BlockdevOnError on_target_error,
+                   bool unmap, const char *filter_node_name,
+diff --git a/block/mirror.c b/block/mirror.c
+index ff15cfb197..eb8ff34d5b 100644
+--- a/block/mirror.c
++++ b/block/mirror.c
+@@ -50,6 +50,7 @@ typedef struct MirrorBlockJob {
+     /* Used to block operations on the drive-mirror-replace target */
+     Error *replace_blocker;
+     bool is_none_mode;
++    BdrvDirtyBitmap *src_bitmap;
+     BlockMirrorBackingMode backing_mode;
+     MirrorCopyMode copy_mode;
+     BlockdevOnError on_source_error, on_target_error;
+@@ -821,6 +822,16 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
+     return 0;
+ }
+ 
++/*
+ * Initialize the dirty bitmap from the user bitmap. The user bitmap's contents
+ * are merged into the mirror's dirty bitmap instead of the user bitmap being reused.
++ */
++static void coroutine_fn mirror_dirty_init_incremental(MirrorBlockJob *s,
++                                                       Error **errp)
++{
++    bdrv_merge_dirty_bitmap(s->dirty_bitmap, s->src_bitmap, NULL, errp);
++}
++
+ /* Called when going out of the streaming phase to flush the bulk of the
+  * data to the medium, or just before completing.
+  */
+@@ -846,6 +857,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
+     char backing_filename[2]; /* we only need 2 characters because we are only
+                                  checking for a NULL string */
+     int ret = 0;
++    Error *local_err = NULL;
+ 
+     if (job_is_cancelled(&s->common.job)) {
+         goto immediate_exit;
+@@ -920,9 +932,19 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
+ 
+     s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
+     if (!s->is_none_mode) {
+-        ret = mirror_dirty_init(s);
+-        if (ret < 0 || job_is_cancelled(&s->common.job)) {
+-            goto immediate_exit;
++        /* incremental mode */
++        if (s->src_bitmap) {
++            mirror_dirty_init_incremental(s, &local_err);
++            if (local_err) {
++                error_propagate(errp, local_err);
++                ret = -1;
++                goto immediate_exit;
++            }
++        } else {
++            ret = mirror_dirty_init(s);
++            if (ret < 0 || job_is_cancelled(&s->common.job)) {
++                goto immediate_exit;
++            }
+         }
+     }
+ 
+@@ -1502,7 +1524,8 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
+                              BlockCompletionFunc *cb,
+                              void *opaque,
+                              const BlockJobDriver *driver,
+-                             bool is_none_mode, BlockDriverState *base,
++                             bool is_none_mode, BdrvDirtyBitmap *src_bitmap,
++                             BlockDriverState *base,
+                              bool auto_complete, const char *filter_node_name,
+                              bool is_mirror, MirrorCopyMode copy_mode,
+                              Error **errp)
+@@ -1617,6 +1640,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
+     s->on_source_error = on_source_error;
+     s->on_target_error = on_target_error;
+     s->is_none_mode = is_none_mode;
++    s->src_bitmap = src_bitmap;
+     s->backing_mode = backing_mode;
+     s->copy_mode = copy_mode;
+     s->base = base;
+@@ -1698,7 +1722,8 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
+                   BlockDriverState *target, const char *replaces,
+                   int creation_flags, int64_t speed,
+                   uint32_t granularity, int64_t buf_size,
+-                  MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
++                  MirrorSyncMode mode, BdrvDirtyBitmap *src_bitmap,
++                  BlockMirrorBackingMode backing_mode,
+                   BlockdevOnError on_source_error,
+                   BlockdevOnError on_target_error,
+                   bool unmap, const char *filter_node_name,
+@@ -1707,17 +1732,14 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
+     bool is_none_mode;
+     BlockDriverState *base;
+ 
+-    if (mode == MIRROR_SYNC_MODE_INCREMENTAL) {
+-        error_setg(errp, "Sync mode 'incremental' not supported");
+-        return;
+-    }
+     is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
+     base = mode == MIRROR_SYNC_MODE_TOP ? backing_bs(bs) : NULL;
+     mirror_start_job(job_id, bs, creation_flags, target, replaces,
+                      speed, granularity, buf_size, backing_mode,
+                      on_source_error, on_target_error, unmap, NULL, NULL,
+-                     &mirror_job_driver, is_none_mode, base, false,
+-                     filter_node_name, true, copy_mode, errp);
++                     &mirror_job_driver, is_none_mode,
++                     src_bitmap, base, false, filter_node_name, true,
++                     copy_mode, errp);
+ }
+ 
+ void commit_active_start(const char *job_id, BlockDriverState *bs,
+@@ -1741,7 +1763,8 @@ void commit_active_start(const char *job_id, BlockDriverState *bs,
+     mirror_start_job(job_id, bs, creation_flags, base, NULL, speed, 0, 0,
+                      MIRROR_LEAVE_BACKING_CHAIN,
+                      on_error, on_error, true, cb, opaque,
+-                     &commit_active_job_driver, false, base, auto_complete,
++                     &commit_active_job_driver, false,
++                     NULL, base, auto_complete,
+                      filter_node_name, false, MIRROR_COPY_MODE_BACKGROUND,
+                      &local_err);
+     if (local_err) {
+diff --git a/blockdev.c b/blockdev.c
+index 4775a07d93..195620851e 100644
+--- a/blockdev.c
++++ b/blockdev.c
+@@ -3685,6 +3685,8 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
+                                    BlockDriverState *target,
+                                    bool has_replaces, const char *replaces,
+                                    enum MirrorSyncMode sync,
++                                   bool has_bitmap,
++                                   const char *bitmap_name,
+                                    BlockMirrorBackingMode backing_mode,
+                                    bool has_speed, int64_t speed,
+                                    bool has_granularity, uint32_t granularity,
+@@ -3702,6 +3704,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
+                                    Error **errp)
+ {
+     int job_flags = JOB_DEFAULT;
++    BdrvDirtyBitmap *src_bitmap = NULL;
+ 
+     if (!has_speed) {
+         speed = 0;
+@@ -3724,6 +3727,10 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
+     if (!has_filter_node_name) {
+         filter_node_name = NULL;
+     }
++    if (!has_bitmap) {
++        bitmap_name = NULL;
++    }
++
+     if (!has_copy_mode) {
+         copy_mode = MIRROR_COPY_MODE_BACKGROUND;
+     }
+@@ -3788,13 +3795,35 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
+             return;
+         }
+     }
++    if (!bitmap_name && (sync == MIRROR_SYNC_MODE_INCREMENTAL)) {
++        error_setg(errp, "incremental mode must specify the bitmap name");
++        return;
++    }
++    /*
+     * In incremental mode, the mirror's unnamed dirty bitmap must be created
+     * with the user bitmap's granularity.
++     */
++    if (sync == MIRROR_SYNC_MODE_INCREMENTAL) {
++        assert(bitmap_name);
++        src_bitmap = bdrv_find_dirty_bitmap(bs, bitmap_name);
++        if (!src_bitmap) {
++            error_setg(errp, "Error: can't find dirty bitmap "
++                       "before start incremental drive-mirror");
++            return;
++        }
++        if (granularity) {
++            warn_report("On incremental mode, granularity is unused, "
++                        "the bitmap's granularity is used instead");
++        }
++        granularity = bdrv_dirty_bitmap_granularity(src_bitmap);
++    }
+ 
+     /* pass the node name to replace to mirror start since it's loose coupling
+      * and will allow to check whether the node still exist at mirror completion
+      */
+     mirror_start(job_id, bs, target,
+                  has_replaces ? replaces : NULL, job_flags,
+-                 speed, granularity, buf_size, sync, backing_mode,
++                 speed, granularity, buf_size, sync, src_bitmap, backing_mode,
+                  on_source_error, on_target_error, unmap, filter_node_name,
+                  copy_mode, errp);
+ }
+@@ -3914,6 +3943,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
+ 
+     blockdev_mirror_common(arg->has_job_id ? arg->job_id : NULL, bs, target_bs,
+                            arg->has_replaces, arg->replaces, arg->sync,
++                           arg->has_bitmap, arg->bitmap,
+                            backing_mode, arg->has_speed, arg->speed,
+                            arg->has_granularity, arg->granularity,
+                            arg->has_buf_size, arg->buf_size,
+@@ -3935,6 +3965,7 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
+                          const char *device, const char *target,
+                          bool has_replaces, const char *replaces,
+                          MirrorSyncMode sync,
++                         bool has_bitmap, const char *bitmap,
+                          bool has_speed, int64_t speed,
+                          bool has_granularity, uint32_t granularity,
+                          bool has_buf_size, int64_t buf_size,
+@@ -3971,7 +4002,8 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
+     bdrv_set_aio_context(target_bs, aio_context);
+ 
+     blockdev_mirror_common(has_job_id ? job_id : NULL, bs, target_bs,
+-                           has_replaces, replaces, sync, backing_mode,
++                           has_replaces, replaces, sync, has_bitmap,
++                           bitmap, backing_mode,
+                            has_speed, speed,
+                            has_granularity, granularity,
+                            has_buf_size, buf_size,
+diff --git a/qapi/block-core.json b/qapi/block-core.json
+index 7ccbfff9d0..7074d73df9 100644
+--- a/qapi/block-core.json
++++ b/qapi/block-core.json
+@@ -1914,6 +1914,11 @@
+ #        (all the disk, only the sectors allocated in the topmost image, or
+ #        only new I/O).
+ #
++# @bitmap: The name of a bitmap to use in incremental mode. This argument must
++#          be present for incremental mode and absent otherwise. In incremental
+#          mode, @granularity is ignored and the bitmap's granularity is used
+#          instead (since 4.0).
++#
+ # @granularity: granularity of the dirty bitmap, default is 64K
+ #               if the image format doesn't have clusters, 4K if the clusters
+ #               are smaller than that, else the cluster size.  Must be a
+@@ -1955,7 +1960,7 @@
+ { 'struct': 'DriveMirror',
+   'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
+             '*format': 'str', '*node-name': 'str', '*replaces': 'str',
+-            'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
++            'sync': 'MirrorSyncMode', '*bitmap': 'str', '*mode': 'NewImageMode',
+             '*speed': 'int', '*granularity': 'uint32',
+             '*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
+             '*on-target-error': 'BlockdevOnError',
+@@ -2210,6 +2215,11 @@
+ #        (all the disk, only the sectors allocated in the topmost image, or
+ #        only new I/O).
+ #
++# @bitmap: The name of a bitmap to use in incremental mode. This argument must
++#          be present for incremental mode and absent otherwise. In incremental
+#          mode, @granularity is ignored and the bitmap's granularity is used
+#          instead (since 4.0).
++#
+ # @granularity: granularity of the dirty bitmap, default is 64K
+ #               if the image format doesn't have clusters, 4K if the clusters
+ #               are smaller than that, else the cluster size.  Must be a
+@@ -2262,7 +2272,7 @@
+ { 'command': 'blockdev-mirror',
+   'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
+             '*replaces': 'str',
+-            'sync': 'MirrorSyncMode',
++            'sync': 'MirrorSyncMode', '*bitmap': 'str',
+             '*speed': 'int', '*granularity': 'uint32',
+             '*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
+             '*on-target-error': 'BlockdevOnError',
+-- 
+2.20.1
+
diff --git a/debian/patches/series b/debian/patches/series
index 549bc52..2b8f645 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -1,6 +1,7 @@
 extra/0001-target-i386-add-MDS-NO-feature.patch
 extra/0002-target-i386-define-md-clear-bit.patch
 extra/0003-virtio-balloon-fix-QEMU-4.0-config-size-migration-in.patch
+extra/0004-drive-mirror-add-incremental-mode.patch
 pve/0001-PVE-Config-block-file-change-locking-default-to-off.patch
 pve/0002-PVE-Config-Adjust-network-script-path-to-etc-kvm.patch
 pve/0003-PVE-Config-set-the-CPU-model-to-kvm64-32-instead-of-.patch
-- 
2.20.1
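
For readers skimming the blockdev.c hunks: the argument handling this patch
adds to blockdev_mirror_common() can be modelled in isolation roughly as
below. This is only an illustrative sketch, not QEMU code. SyncMode, Bitmap,
CheckResult, find_bitmap() and resolve_mirror_bitmap() are hypothetical
stand-ins for MirrorSyncMode, BdrvDirtyBitmap, the Error reporting,
bdrv_find_dirty_bitmap() and the inline checks, respectively.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the QEMU types; not the real definitions. */
typedef enum {
    SYNC_TOP,
    SYNC_FULL,
    SYNC_NONE,
    SYNC_INCREMENTAL
} SyncMode;

typedef struct {
    const char *name;
    uint32_t granularity;      /* bytes covered by one bitmap bit */
} Bitmap;

typedef enum {
    CHECK_OK = 0,
    CHECK_MISSING_NAME,        /* incremental requested without a bitmap name */
    CHECK_NOT_FOUND            /* no bitmap with that name on the source node */
} CheckResult;

/* Linear lookup by name, standing in for bdrv_find_dirty_bitmap(). */
const Bitmap *find_bitmap(const Bitmap *bitmaps, size_t count,
                          const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(bitmaps[i].name, name) == 0) {
            return &bitmaps[i];
        }
    }
    return NULL;
}

/*
 * Decide which source bitmap (if any) seeds the mirror and which
 * granularity the job ends up using.
 *
 * On CHECK_OK in incremental mode, *src points at the user bitmap and
 * *granularity is overwritten with that bitmap's granularity. In every
 * other mode *src is NULL and *granularity is left untouched.
 */
CheckResult resolve_mirror_bitmap(SyncMode sync, const char *bitmap_name,
                                  const Bitmap *bitmaps, size_t count,
                                  uint32_t *granularity, const Bitmap **src)
{
    const Bitmap *found;

    *src = NULL;

    if (sync != SYNC_INCREMENTAL) {
        /* As in the posted patch, a stray bitmap name is not rejected here. */
        return CHECK_OK;
    }
    if (bitmap_name == NULL) {
        return CHECK_MISSING_NAME;
    }

    found = find_bitmap(bitmaps, count, bitmap_name);
    if (found == NULL) {
        return CHECK_NOT_FOUND;
    }

    /* A caller-supplied granularity is overridden by the bitmap's own. */
    *granularity = found->granularity;
    *src = found;
    return CHECK_OK;
}
```

Note that, like the patch as posted, the sketch does not reject a bitmap name
passed together with a non-incremental sync mode, although the QAPI
documentation added by the patch says the argument must be absent in that case.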




