<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div>Might this be related to the QEMU hang you see when saving VM memory?<br><br>Stefan<div><br></div><div>Excuse my typo s<span style="font-size: 13pt;">ent from my mobile phone.</span></div></div><div><br>Begin forwarded message:<br><br></div><blockquote type="cite"><div><b>From:</b> Liu Yu &lt;<a href="mailto:allanliuyu@gmail.com">allanliuyu@gmail.com</a>&gt;<br><b>Date:</b> 23 August 2014 05:43:00 CEST<br><b>To:</b> <a href="mailto:qemu-stable@nongnu.org">qemu-stable@nongnu.org</a><br><b>Subject:</b> <b>[Qemu-stable] [PATCH] stream: fix the deadlock bug when stream finish</b><br><br></div></blockquote><blockquote type="cite"><div><span>From: Liu Yu &lt;<a href="mailto:allanyuliu@tencent.com">allanyuliu@tencent.com</a>&gt;</span><br><span></span><br><span>This patch is against branch stable-2.0.</span><br><span></span><br><span>Consider the case where the VM does IO while we run a stream job.</span><br><span>When the stream finishes, the stream coroutine drains all IOs before</span><br><span>closing the unused image. In bdrv_drain_all() it may find</span><br><span>a pending request that was submitted by the guest IO coroutine.</span><br><span>To wait for the pending req to finish, the subsequent aio_poll()</span><br><span>calls poll() to wait for the req.
However, if the req is already done by the</span><br><span>threadpool and is waiting for the callback, there is no chance to switch</span><br><span>back to the guest IO coroutine to call the callback, so the stream</span><br><span>coroutine waits in poll() forever.</span><br><span></span><br><span>The patch detects the deadlock case above and switches back to the iothread</span><br><span>coroutine to handle the callback, then resumes work on the stream coroutine</span><br><span>after the pending req has finished.</span><br><span></span><br><span>Signed-off-by: Liu Yu &lt;<a href="mailto:allanyuliu@tencent.com">allanyuliu@tencent.com</a>&gt;</span><br><span>---</span><br><span>The issue can be reproduced as follows:</span><br><span>1. The guest runs an fio test</span><br><span>2. Meanwhile, the host runs virsh blockpull repeatedly</span><br><span></span><br><span></span><br><span> block.c |   27 ++++++++++++++++++++++++++-</span><br><span> 1 files changed, 26 insertions(+), 1 deletions(-)</span><br><span></span><br><span>diff --git a/block.c b/block.c</span><br><span>index 990a754..f8c1a8d 100644</span><br><span>--- a/block.c</span><br><span>+++ b/block.c</span><br><span>@@ -1778,6 +1778,29 @@ static bool bdrv_requests_pending_all(void)</span><br><span>     return false;</span><br><span> }</span><br><span></span><br><span>+static bool bdrv_request_coroutine_wait(void)</span><br><span>+{</span><br><span>+    BlockDriverState *bs;</span><br><span>+    Coroutine *co;</span><br><span>+</span><br><span>+    if (!qemu_in_coroutine())</span><br><span>+        return false;</span><br><span>+</span><br><span>+    co = qemu_coroutine_self();</span><br><span>+    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {</span><br><span>+        if (!QLIST_EMPTY(&bs->tracked_requests)) {</span><br><span>+            BdrvTrackedRequest *req = QLIST_FIRST(&bs->tracked_requests);</span><br><span>+</span><br><span>+            if(req->co == co)</span><br><span>+                continue;</span><br><span>+</span><br><span>+           
 qemu_co_queue_wait(&req->wait_queue);</span><br><span>+            return true;</span><br><span>+        }</span><br><span>+    }</span><br><span>+    return false;</span><br><span>+}</span><br><span>+</span><br><span> /*</span><br><span>  * Wait for pending requests to complete across all BlockDriverStates</span><br><span>  *</span><br><span>@@ -1800,8 +1823,10 @@ void bdrv_drain_all(void)</span><br><span>         QTAILQ_FOREACH(bs, &bdrv_states, device_list) {</span><br><span>             bdrv_start_throttled_reqs(bs);</span><br><span>         }</span><br><span>-</span><br><span>+recheck:</span><br><span>         busy = bdrv_requests_pending_all();</span><br><span>+        if (busy && bdrv_request_coroutine_wait())</span><br><span>+            goto recheck;</span><br><span>         busy |= aio_poll(qemu_get_aio_context(), busy);</span><br><span>     }</span><br><span> }</span><br><span>-- </span><br><span>1.7.1</span><br><span></span><br><span></span><br></div></blockquote></body></html>