[pve-devel] [PATCH v2 qemu 4/6] PVE: Don't call job_cancel in coroutines
Stefan Reiter
s.reiter at proxmox.com
Thu Oct 29 14:10:34 CET 2020
...because it hangs on cancelling other jobs in the txn if you do.
Signed-off-by: Stefan Reiter <s.reiter at proxmox.com>
---
v2:
* use new CoCtxData
* use aio_co_enter vs aio_co_schedule for BH return
* cache job_ctx, since job_cancel_sync might switch the job to a different
  context (when iothreads are in use), which would make us release the wrong
  AioContext if we read job->aio_context again afterwards. This is incidentally
  the same bug I once fixed for upstream, almost made it in again...
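
To illustrate the last point, here is a minimal standalone sketch of why the
context pointer has to be cached before the cancel call. It does not use the
QEMU API: MockCtx, MockJob and all mock_* helpers are hypothetical stand-ins
invented for this example, and the assumption that the cancel moves the job
back to the main context is only a simplified model of the behaviour
described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for AioContext and Job; these are NOT QEMU types. */
typedef struct MockCtx {
    const char *name;
    int lock_depth;      /* incremented on acquire, decremented on release */
} MockCtx;

typedef struct MockJob {
    MockCtx *ctx;        /* may be reassigned while the job is cancelled */
} MockJob;

static MockCtx main_ctx = { "main", 0 };
static MockCtx iothread_ctx = { "iothread", 0 };

static void mock_acquire(MockCtx *c) { assert(c != NULL); c->lock_depth++; }
static void mock_release(MockCtx *c) { assert(c != NULL); c->lock_depth--; }

/* Assumption: cancelling moves the job from its iothread to the main context. */
static void mock_cancel_sync(MockJob *job) { job->ctx = &main_ctx; }

/* Put both contexts and the job back into a known starting state. */
static void reset_mock(MockJob *job)
{
    main_ctx.lock_depth = 0;
    iothread_ctx.lock_depth = 0;
    job->ctx = &iothread_ctx;
}

/* Buggy variant: re-reads job->ctx after the cancel, so the release hits
 * a different context than the one that was acquired. */
static void cancel_rereading_ctx(MockJob *job)
{
    mock_acquire(job->ctx);
    mock_cancel_sync(job);
    mock_release(job->ctx);
}

/* Fixed variant: cache the context once, so acquire and release pair up
 * even if the job changes context during the cancel. */
static void cancel_with_cached_ctx(MockJob *job)
{
    MockCtx *ctx = job->ctx;
    mock_acquire(ctx);
    mock_cancel_sync(job);
    mock_release(ctx);
}
```

In this model, cancel_rereading_ctx() leaves the iothread context locked and
releases the main context without having acquired it, while
cancel_with_cached_ctx() leaves both lock depths balanced.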
pve-backup.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/pve-backup.c b/pve-backup.c
index 92eaada0bc..0466145bec 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -332,6 +332,20 @@ static void pvebackup_complete_cb(void *opaque, int ret)
aio_co_enter(qemu_get_aio_context(), co);
}
+/*
+ * job_cancel(_sync) does not like to be called from coroutines, so defer to
+ * main loop processing via a bottom half.
+ */
+static void job_cancel_bh(void *opaque) {
+ CoCtxData *data = (CoCtxData*)opaque;
+ Job *job = (Job*)data->data;
+ AioContext *job_ctx = job->aio_context;
+ aio_context_acquire(job_ctx);
+ job_cancel_sync(job);
+ aio_context_release(job_ctx);
+ aio_co_enter(data->ctx, data->co);
+}
+
static void coroutine_fn pvebackup_co_cancel(void *opaque)
{
Error *cancel_err = NULL;
@@ -357,7 +371,13 @@ static void coroutine_fn pvebackup_co_cancel(void *opaque)
NULL;
if (cancel_job) {
- job_cancel(&cancel_job->job, false);
+ CoCtxData data = {
+ .ctx = qemu_get_current_aio_context(),
+ .co = qemu_coroutine_self(),
+ .data = &cancel_job->job,
+ };
+ aio_bh_schedule_oneshot(data.ctx, job_cancel_bh, &data);
+ qemu_coroutine_yield();
}
qemu_co_mutex_unlock(&backup_state.backup_mutex);
--
2.20.1