[pve-devel] applied: [PATCH v2 qemu] add patches to work around stuck guest IO with iothread and VirtIO block/SCSI

Thomas Lamprecht t.lamprecht at proxmox.com
Fri Feb 2 19:42:56 CET 2024

On 25/01/2024 at 10:40, Fiona Ebner wrote:
> This essentially repeats commit 6b7c181 ("add patch to work around
> stuck guest IO with iothread and VirtIO block/SCSI") with an added
> fix for the SCSI event virtqueue, which requires special handling.
> This is to avoid the issue [3] that made the revert 2a49e66 ("Revert
> "add patch to work around stuck guest IO with iothread and VirtIO
> block/SCSI"") necessary the first time around.
> When using iothread, after commits
> 1665d9326f ("virtio-blk: implement BlockDevOps->drained_begin()")
> 766aa2de0f ("virtio-scsi: implement BlockDevOps->drained_begin()")
> it can happen that polling gets stuck when draining. This would cause
> IO in the guest to get completely stuck.
> A workaround for users is stopping and resuming the vCPUs because that
> would also stop and resume the dataplanes which would kick the host
> notifiers.
> This can happen with block jobs like backup and drive mirror as well
> as with hotplug [2].
> There are reports in the community forum that might be about this
> issue [0][1], and there is also one in the enterprise support channel.
> As a workaround in the code, just re-enable notifications and kick the
> virt queue after draining. Draining is already costly and rare, so no
> need to worry about a performance penalty here.
> Take special care to attach the SCSI event virtqueue host notifier
> with the _no_poll() variant like in virtio_scsi_dataplane_start().
> This avoids the issue from the first attempted fix where the iothread
> would suddenly loop with 100% CPU usage whenever some guest IO came in
> [3]. This is necessary because of commit 38738f7dbb ("virtio-scsi:
> don't waste CPU polling the event virtqueue"). See [4] for the
> relevant discussion.
> [0]: https://forum.proxmox.com/threads/137286/
> [1]: https://forum.proxmox.com/threads/137536/
> [2]: https://issues.redhat.com/browse/RHEL-3934
> [3]: https://forum.proxmox.com/threads/138140/
> [4]: https://lore.kernel.org/qemu-devel/bfc7b20c-2144-46e9-acbc-e726276c5a31@proxmox.com/
> Signed-off-by: Fiona Ebner <f.ebner at proxmox.com>
> ---
> Changes in v2:
>     * Pick (functionally equivalent) upstream patches to reduce diff.
>  ...ttach-event-vq-notifier-with-no_poll.patch |  62 +++++++++
>  ...-notifications-disabled-during-drain.patch | 126 ++++++++++++++++++
>  debian/patches/series                         |   2 +
>  3 files changed, 190 insertions(+)
>  create mode 100644 debian/patches/extra/0012-virtio-scsi-Attach-event-vq-notifier-with-no_poll.patch
>  create mode 100644 debian/patches/extra/0013-virtio-Keep-notifications-disabled-during-drain.patch

Basically applied this, thanks.

"Basically" because I went for the v2 [0], and since the series file needed
some adaptation due to the recent v8.1.5 patch, I just made a new commit but
kept your message and added an Originally-by trailer. Hope that's all right
with you.

[0]: https://lore.kernel.org/qemu-devel/20240202153158.788922-1-hreitz@redhat.com/
     But still only the first two patches; the clean-up did not apply on 8.1.5,
     and I did not bother checking that out closely.
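
For readers following along, the workaround described in the quoted commit
message (re-enable notifications and kick the virtqueue after draining) can be
illustrated with a toy model. This is a deliberately simplified sketch in
Python, not QEMU code: the class ToyVirtQueue, its methods, and the run()
helper are all hypothetical names, and the real issue involves iothread
polling and host notifiers, which are not modelled here.

```python
"""Toy model of the drain workaround; illustrative only, not QEMU code."""
from collections import deque


class ToyVirtQueue:
    """Hypothetical stand-in for a virtqueue serviced via notifications."""

    def __init__(self):
        self.pending = deque()   # requests submitted by the guest
        self.completed = []      # requests processed by the host side
        self.notifications_enabled = True

    def guest_submit(self, request):
        """Guest queues a request; the host reacts only if notified."""
        self.pending.append(request)
        if self.notifications_enabled:
            self.kick()

    def kick(self):
        """Process everything that is currently queued."""
        while self.pending:
            self.completed.append(self.pending.popleft())

    def drained_begin(self):
        """Stop reacting to the guest while the device is drained."""
        self.notifications_enabled = False

    def drained_end(self, apply_workaround):
        """Leave the drained section, optionally applying the workaround."""
        if apply_workaround:
            # Re-enable notifications and kick once, so requests that
            # arrived during the drain are not left behind.
            self.notifications_enabled = True
            self.kick()


def run(apply_workaround):
    """Return (completed, still_pending) for one drain cycle."""
    vq = ToyVirtQueue()
    vq.guest_submit("read-1")
    vq.drained_begin()
    vq.guest_submit("write-2")   # arrives during the drain, stays queued
    vq.drained_end(apply_workaround)
    vq.guest_submit("read-3")    # arrives after the drain
    return vq.completed, list(vq.pending)


if __name__ == "__main__":
    print("without workaround:", run(False))
    print("with workaround:   ", run(True))
```

In this model, skipping the workaround leaves every request submitted after
drained_begin() queued forever, which loosely mirrors the stuck guest IO.
Because the extra kick happens only once per drain, its cost is small, in line
with the remark that draining is already costly and rare.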
