[pve-devel] [RFC qemu] fix #1420: fix stop mode backup with virtio-blk
Thomas Lamprecht
t.lamprecht at proxmox.com
Tue Aug 29 09:11:18 CEST 2017
your patch looks good, considering that we do not allow to backup disk
with iothread enabled.
As the method in crime was introduced quite recently (commited on june 20
as 5ab4b69ce29908b327a91966dc78ea0fd7424075 ) this was, probably, simply
overseen with the rebasing.
The change is, AFAIS, ok because we do not have a target here, our target
get handled in vma-writer.c - currently, at last.
And the VMA writer runs in the aio conetxt of the qemu main loop, not in
an iothread one. (qemu_get_aio_context() returns the main-loop aio context,
iothread_get_aio_context() would be the one for iothread (duh))
I tested successfully with a sligthly modified change:
diff --git a/block/backup.c b/block/backup.c
index 51b5ba6eda..2eb48c76b6 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -269,6 +269,11 @@ static void backup_attached_aio_context(BlockJob *job, AioContext *aio_context)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
+ if (!s->target) {
+ assert(aio_context == qemu_get_aio_context());
+ return;
+ }
+
blk_set_aio_context(s->target, aio_context);
}
The assert does not trigger, so we are in fact in the qemu main loop aio context.
On 08/28/2017 04:12 PM, Fabian Grünbichler wrote:
> Signed-off-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
> ---
> this seems in line with our other VMA modifications, and I get the following
> stacktrace without:
>
> Thread 1 (Thread 0x7fa27cffd700 (LWP 29427)):
> #0 blk_bs (blk=0x0) at block/block-backend.c:1655
> #1 blk_set_aio_context (blk=0x0, new_context=0x7fa27e16a8c0) at block/block-backend.c:1656
> #2 0x000055d7d03c68c8 in block_job_attached_aio_context (new_context=<optimized out>, opaque=0x7fa1768f6d80) at blockjob.c:115
> job = 0x7fa1768f6d80
> #3 0x000055d7d03c2a8b in bdrv_attach_aio_context (bs=bs at entry=0x7fa27e1c6000, new_context=new_context at entry=0x7fa27e16a8c0) at block.c:4400
> ban = <optimized out>
> ban_tmp = 0x0
> child = <optimized out>
> __PRETTY_FUNCTION__ = "bdrv_attach_aio_context"
> #4 0x000055d7d03c2b29 in bdrv_set_aio_context (bs=bs at entry=0x7fa27e1c6000, new_context=new_context at entry=0x7fa27e16a8c0) at block.c:4424
> #5 0x000055d7d03fe6b6 in blk_set_aio_context (blk=0x7fa27e031000, new_context=0x7fa27e16a8c0) at block/block-backend.c:1662
> #6 0x000055d7d014571c in virtio_blk_data_plane_start (vdev=<optimized out>) at hw/block/dataplane/virtio-blk.c:202
> vblk = 0x7fa177039510
> __func__ = "virtio_blk_data_plane_start"
> s = 0x7fa177091f00
> qbus = <optimized out>
> k = <optimized out>
> i = <optimized out>
> nvqs = <optimized out>
> r = <optimized out>
>
> with the inner-most two having a blk parameter of 0x0 leading to a segfault.
>
> blockjob.c:115 does the following:
> job->driver->attached_aio_context(job, new_context);
>
> and job / job->driver in frame #2 are:
> $1 = {driver = 0x55d7d0a261a0 <backup_job_driver>, blk = 0x7fa27e033800, id =
> 0x7fa17703c4e0 "drive-virtio0", co = 0x7fa27e16c440, cancelled = false,
> pause_count = 2, user_paused = false, busy = false, paused = true, ready =
> false, deferred_to_main_loop = false, job_list = {le_next = 0x0, le_prev =
> 0x55d7d0adb198 <block_jobs>}, iostatus = BLOCK_DEVICE_IO_STATUS_OK, offset =
> 285147136, len = 34359738368, speed = 0, cb = 0x55d7d01f8800
> <pvebackup_complete_cb>, blocker = 0x7fa176910f40, nodes = 0x7fa27e02b750,
> opaque = 0x7fa27e1cc000, refcnt = 1, completed = false, ret = 0, txn = 0x0,
> txn_list = {le_next = 0x0, le_prev = 0x0}}
>
> $3 = {common = {driver = 0x55d7d0a261a0 <backup_job_driver>, blk =
> 0x7fa27e033800, id = 0x7fa17703c4e0 "drive-virtio0", co = 0x7fa27e16c440,
> cancelled = false, pause_count = 2, user_paused = false, busy = false, paused =
> true, ready = false, deferred_to_main_loop = false, job_list = {le_next = 0x0,
> le_prev = 0x55d7d0adb198 <block_jobs>}, iostatus = BLOCK_DEVICE_IO_STATUS_OK,
> offset = 285147136, len = 34359738368, speed = 0, cb = 0x55d7d01f8800
> <pvebackup_complete_cb>, blocker = 0x7fa176910f40, nodes = 0x7fa27e02b750,
> opaque = 0x7fa27e1cc000, refcnt = 1, completed = false, ret = 0, txn = 0x0,
> txn_list = { le_next = 0x0, le_prev = 0x0}}, target = 0x0, sync_bitmap = 0x0,
> sync_mode = MIRROR_SYNC_MODE_FULL, limit = {slice_start_time = 0,
> slice_end_time = 0, slice_quota = 0, slice_ns = 0, dispatched = 0}, dump_cb =
> 0x55d7d01f83c0 <pvebackup_dump_cb>, on_source_error = BLOCKDEV_ON_ERROR_REPORT,
> on_target_error = BLOCKDEV_ON_ERROR_REPORT, flush_rwlock = {pending_writer = 0,
> reader = 0, mutex = { locked = 0, ctx = 0x0, from_push = {slh_first = 0x0},
> to_pop = {slh_first = 0x0}, handoff = 0, sequence = 0, holder = 0x0}, queue =
> {entries = {sqh_first = 0x0, sqh_last = 0x7fa1768f6eb0}}}, sectors_read =
> 556928, done_bitmap = 0x7fa179c85000, cluster_size = 65536, compress = false,
> before_write = {notify = 0x55d7d0416ed0 <backup_before_write_notify>, node =
> {le_next = 0x0, le_prev = 0x7fa27e1c91c8}}, inflight_reqs = {lh_first = 0x0}}
>
> this is IMHO the place to fix it (without a target, there is nothing further to
> set the context of). review of someone more familiar with the qemu and VMA code
> bases very welcome ;)
>
Just for the record, we have a target, but its not tracked by the BackupBlockJob
container, but rather by ourself in vma-writer.c ...
I'm quite sure that your change is OK, but again I'd wait on Wolfgang B. for
review, not only is he far more familiar, but he overlooked also the new
BackuBlockJob->target access :-P
> debian/patches/pve/0028-adding-old-vma-files.patch | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/debian/patches/pve/0028-adding-old-vma-files.patch b/debian/patches/pve/0028-adding-old-vma-files.patch
> index 5e87ed7..85927a3 100644
> --- a/debian/patches/pve/0028-adding-old-vma-files.patch
> +++ b/debian/patches/pve/0028-adding-old-vma-files.patch
> @@ -105,7 +105,17 @@ index 1ede70c061..51b5ba6eda 100644
> assert(s->target);
> blk_unref(s->target);
> s->target = NULL;
> -@@ -330,9 +344,11 @@ static BlockErrorAction backup_error_action(BackupBlockJob *job,
> +@@ -255,7 +269,8 @@ static void backup_attached_aio_context(BlockJob *job, AioContext *aio_context)
> + {
> + BackupBlockJob *s = container_of(job, BackupBlockJob, common);
> +
> +- blk_set_aio_context(s->target, aio_context);
> ++ if (s->target)
> ++ blk_set_aio_context(s->target, aio_context);
> + }
> +
> + void backup_do_checkpoint(BlockJob *job, Error **errp)
> +@@ -330,9 +345,11 @@ static BlockErrorAction backup_error_action(BackupBlockJob *job,
> if (read) {
> return block_job_error_action(&job->common, job->on_source_error,
> true, error);
>
More information about the pve-devel
mailing list