[pve-devel] [PATCH qemu] fix #5054: backport fix for software reset with SATA
Friedrich Weber
f.weber at proxmox.com
Mon Nov 20 10:29:17 CET 2023
Tested with an OPNsense VM: with pve-qemu-kvm 8.1.2-2, it did not boot
from SATA ("Root mount waiting for: CAM"), while booting from virtio worked.
With the patched pve-qemu-kvm package I got from Fiona, the VM boots
from SATA again, and virtio still works.
Tested-by: Friedrich Weber <f.weber at proxmox.com>
On 20/11/2023 10:16, Fiona Ebner wrote:
> The issue prevented FreeBSD 14 VMs with SATA disk from booting.
>
> The commit it fixes, e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document
> PxCI handling"), is part of stable 8.1.2.
>
> The patch was already applied to the block branch upstream:
> https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg02711.html
>
> Signed-off-by: Fiona Ebner <f.ebner at proxmox.com>
> ---
>  ...w-ide-ahci-fix-legacy-software-reset.patch | 107 ++++++++++++++++++
>  debian/patches/series                         |   1 +
>  2 files changed, 108 insertions(+)
>  create mode 100644 debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
>
> diff --git a/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch b/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
> new file mode 100644
> index 0000000..f070818
> --- /dev/null
> +++ b/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
> @@ -0,0 +1,107 @@
> +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
> +From: Niklas Cassel <niklas.cassel at wdc.com>
> +Date: Wed, 8 Nov 2023 23:26:57 +0100
> +Subject: [PATCH] hw/ide/ahci: fix legacy software reset
> +
> +Legacy software contains a standard mechanism for generating a reset to a
> +Serial ATA device - setting the SRST (software reset) bit in the Device
> +Control register.
> +
> +Serial ATA has a more robust mechanism called COMRESET, also referred to
> +as port reset. A port reset is the preferred mechanism for error
> +recovery and should be used in place of software reset.
> +
> +Commit e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
> +improved the handling of PxCI, such that PxCI gets cleared after handling
> +a non-NCQ, or NCQ command (instead of incorrectly clearing PxCI after
> +receiving anything - even a FIS that failed to parse, which should NOT
> +clear PxCI, so that you can see which command slot that caused an error).
> +
> +However, simply clearing PxCI after a non-NCQ, or NCQ command, is not
> +enough, we also need to clear PxCI when receiving a SRST in the Device
> +Control register.
> +
> +A legacy software reset is performed by the host sending two H2D FISes,
> +the first H2D FIS asserts SRST, and the second H2D FIS deasserts SRST.
> +
> +The first H2D FIS will not get a D2H reply, and requires the FIS to have
> +the C bit set to one, such that the HBA itself will clear the bit in PxCI.
> +
> +The second H2D FIS will get a D2H reply once the diagnostic is completed.
> +The clearing of the bit in PxCI for this command should ideally be done
> +in ahci_init_d2h() (if it was a legacy software reset that caused the
> +reset (a COMRESET does not use a command slot)). However, since the reset
> +value for PxCI is 0, modify ahci_reset_port() to actually clear PxCI to 0,
> +that way we can avoid complex logic in ahci_init_d2h().
> +
> +This fixes an issue for FreeBSD where the device would fail to reset.
> +The problem was not noticed in Linux, because Linux uses a COMRESET
> +instead of a legacy software reset by default.
> +
> +Fixes: e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
> +Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz at linaro.org>
> +Signed-off-by: Niklas Cassel <niklas.cassel at wdc.com>
> +(picked from https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg02277.html)
> +Signed-off-by: Fiona Ebner <f.ebner at proxmox.com>
> +---
> + hw/ide/ahci.c | 27 ++++++++++++++++++++++++++-
> + 1 file changed, 26 insertions(+), 1 deletion(-)
> +
> +diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> +index d0a774bc17..1718b7e902 100644
> +--- a/hw/ide/ahci.c
> ++++ b/hw/ide/ahci.c
> +@@ -623,9 +623,13 @@ static void ahci_init_d2h(AHCIDevice *ad)
> +         return;
> +     }
> +
> ++    /*
> ++     * For simplicity, do not call ahci_clear_cmd_issue() for this
> ++     * ahci_write_fis_d2h(). (The reset value for PxCI is 0.)
> ++     */
> +     if (ahci_write_fis_d2h(ad, true)) {
> +         ad->init_d2h_sent = true;
> +-        /* We're emulating receiving the first Reg H2D Fis from the device;
> ++        /* We're emulating receiving the first Reg D2H FIS from the device;
> +          * Update the SIG register, but otherwise proceed as normal. */
> +         pr->sig = ((uint32_t)ide_state->hcyl << 24) |
> +             (ide_state->lcyl << 16) |
> +@@ -663,6 +667,7 @@ static void ahci_reset_port(AHCIState *s, int port)
> +     pr->scr_act = 0;
> +     pr->tfdata = 0x7F;
> +     pr->sig = 0xFFFFFFFF;
> ++    pr->cmd_issue = 0;
> +     d->busy_slot = -1;
> +     d->init_d2h_sent = false;
> +
> +@@ -1243,10 +1248,30 @@ static void handle_reg_h2d_fis(AHCIState *s, int port,
> +         case STATE_RUN:
> +             if (cmd_fis[15] & ATA_SRST) {
> +                 s->dev[port].port_state = STATE_RESET;
> ++                /*
> ++                 * When setting SRST in the first H2D FIS in the reset sequence,
> ++                 * the device does not send a D2H FIS. Host software thus has to
> ++                 * set the "Clear Busy upon R_OK" bit such that PxCI (and BUSY)
> ++                 * gets cleared. See AHCI 1.3.1, section 10.4.1 Software Reset.
> ++                 */
> ++                if (opts & AHCI_CMD_CLR_BUSY) {
> ++                    ahci_clear_cmd_issue(ad, slot);
> ++                }
> +             }
> +             break;
> +         case STATE_RESET:
> +             if (!(cmd_fis[15] & ATA_SRST)) {
> ++                /*
> ++                 * When clearing SRST in the second H2D FIS in the reset
> ++                 * sequence, the device will execute diagnostics. When this is
> ++                 * done, the device will send a D2H FIS with the good status.
> ++                 * See SATA 3.5a Gold, section 11.4 Software reset protocol.
> ++                 *
> ++                 * This D2H FIS is the first D2H FIS received from the device,
> ++                 * and is received regardless if the reset was performed by a
> ++                 * COMRESET or by setting and clearing the SRST bit. Therefore,
> ++                 * the logic for this is found in ahci_init_d2h() and not here.
> ++                 */
> +                 ahci_reset_port(s, port);
> +             }
> +             break;
> diff --git a/debian/patches/series b/debian/patches/series
> index ad84088..992299c 100644
> --- a/debian/patches/series
> +++ b/debian/patches/series
> @@ -6,6 +6,7 @@ extra/0005-hw-ide-reset-cancel-async-DMA-operation-before-reset.patch
> extra/0006-Revert-Revert-graph-lock-Disable-locking-for-now.patch
> extra/0007-migration-states-workaround-snapshot-performance-reg.patch
> extra/0008-Revert-x86-acpi-workaround-Windows-not-handling-name.patch
> +extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
> bitmap-mirror/0001-drive-mirror-add-support-for-sync-bitmap-mode-never.patch
> bitmap-mirror/0002-drive-mirror-add-support-for-conditional-and-always-.patch
> bitmap-mirror/0003-mirror-add-check-for-bitmap-mode-without-bitmap.patch
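
To illustrate the two-FIS legacy software reset sequence described in the
commit message above, here is a minimal standalone sketch in C. It is not
QEMU code: model_port, model_issue(), model_handle_control_fis() and
model_reset_port() are hypothetical names made up for this example, and the
model only tracks a port state and a PxCI-like bitmask. The clr_busy
parameter stands in for the "Clear Busy upon R_OK" bit that the patch checks
via AHCI_CMD_CLR_BUSY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SRST is bit 2 of the ATA Device Control register. */
#define MODEL_ATA_SRST 0x04

/* Hypothetical, simplified port model; not the QEMU data structures. */
enum model_port_state { MODEL_STATE_RUN, MODEL_STATE_RESET };

struct model_port {
    enum model_port_state state;
    uint32_t cmd_issue; /* models PxCI: one bit per command slot */
};

/* Host side: mark a command slot as issued (sets the bit in PxCI). */
static void model_issue(struct model_port *p, unsigned slot)
{
    p->cmd_issue |= UINT32_C(1) << slot;
}

/* Port reset: PxCI goes back to its reset value of 0. */
static void model_reset_port(struct model_port *p)
{
    p->state = MODEL_STATE_RUN;
    p->cmd_issue = 0;
}

/*
 * Handle a Register H2D FIS that only updates the Device Control register.
 * dev_ctl is the Device Control value carried by the FIS, clr_busy models
 * the "Clear Busy upon R_OK" bit of the command header for this slot.
 */
static void model_handle_control_fis(struct model_port *p, unsigned slot,
                                     uint8_t dev_ctl, bool clr_busy)
{
    switch (p->state) {
    case MODEL_STATE_RUN:
        if (dev_ctl & MODEL_ATA_SRST) {
            /* First FIS: SRST asserted, no D2H reply will follow. */
            p->state = MODEL_STATE_RESET;
            if (clr_busy) {
                p->cmd_issue &= ~(UINT32_C(1) << slot);
            }
        }
        break;
    case MODEL_STATE_RESET:
        if (!(dev_ctl & MODEL_ATA_SRST)) {
            /* Second FIS: SRST deasserted, the reset clears PxCI. */
            model_reset_port(p);
        }
        break;
    }
}
```

In this model, issuing slot 0 with SRST set and clr_busy true, then slot 1
with SRST cleared, leaves cmd_issue at 0 and the port back in the RUN state.
Dropping either clearing step leaves a bit set in cmd_issue, which
corresponds to the stale PxCI bit the commit message describes.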