[pve-devel] [PATCH qemu] fix #5054: backport fix for software reset with SATA

Friedrich Weber f.weber at proxmox.com
Mon Nov 20 10:29:17 CET 2023


Tested with an OPNsense VM: with pve-qemu-kvm 8.1.2-2, it did not boot
from SATA ("Root mount waiting for: CAM"), though it did boot from virtio.

With the patched pve-qemu-kvm package I got from Fiona, the VM boots
from SATA again, and virtio still works.

Tested-by: Friedrich Weber <f.weber at proxmox.com>

On 20/11/2023 10:16, Fiona Ebner wrote:
> The issue prevented FreeBSD 14 VMs with a SATA disk from booting.
> 
> The commit it fixes, e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document
> PxCI handling"), is part of stable 8.1.2.
> 
> The patch was already applied to the block branch upstream:
> https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg02711.html
> 
> Signed-off-by: Fiona Ebner <f.ebner at proxmox.com>
> ---
>  ...w-ide-ahci-fix-legacy-software-reset.patch | 107 ++++++++++++++++++
>  debian/patches/series                         |   1 +
>  2 files changed, 108 insertions(+)
>  create mode 100644 debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
> 
> diff --git a/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch b/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
> new file mode 100644
> index 0000000..f070818
> --- /dev/null
> +++ b/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
> @@ -0,0 +1,107 @@
> +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
> +From: Niklas Cassel <niklas.cassel at wdc.com>
> +Date: Wed, 8 Nov 2023 23:26:57 +0100
> +Subject: [PATCH] hw/ide/ahci: fix legacy software reset
> +
> +Legacy software contains a standard mechanism for generating a reset to a
> +Serial ATA device - setting the SRST (software reset) bit in the Device
> +Control register.
> +
> +Serial ATA has a more robust mechanism called COMRESET, also referred to
> +as port reset. A port reset is the preferred mechanism for error
> +recovery and should be used in place of software reset.
> +
> +Commit e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
> +improved the handling of PxCI, such that PxCI gets cleared after handling
> +a non-NCQ, or NCQ command (instead of incorrectly clearing PxCI after
> +receiving anything - even a FIS that failed to parse, which should NOT
> +clear PxCI, so that you can see which command slot that caused an error).
> +
> +However, simply clearing PxCI after a non-NCQ, or NCQ command, is not
> +enough, we also need to clear PxCI when receiving a SRST in the Device
> +Control register.
> +
> +A legacy software reset is performed by the host sending two H2D FISes,
> +the first H2D FIS asserts SRST, and the second H2D FIS deasserts SRST.
> +
> +The first H2D FIS will not get a D2H reply, and requires the FIS to have
> +the C bit set to one, such that the HBA itself will clear the bit in PxCI.
> +
> +The second H2D FIS will get a D2H reply once the diagnostic is completed.
> +The clearing of the bit in PxCI for this command should ideally be done
> +in ahci_init_d2h() (if it was a legacy software reset that caused the
> +reset (a COMRESET does not use a command slot)). However, since the reset
> +value for PxCI is 0, modify ahci_reset_port() to actually clear PxCI to 0,
> +that way we can avoid complex logic in ahci_init_d2h().
> +
> +This fixes an issue for FreeBSD where the device would fail to reset.
> +The problem was not noticed in Linux, because Linux uses a COMRESET
> +instead of a legacy software reset by default.
> +
> +Fixes: e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
> +Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz at linaro.org>
> +Signed-off-by: Niklas Cassel <niklas.cassel at wdc.com>
> +(picked from https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg02277.html)
> +Signed-off-by: Fiona Ebner <f.ebner at proxmox.com>
> +---
> + hw/ide/ahci.c | 27 ++++++++++++++++++++++++++-
> + 1 file changed, 26 insertions(+), 1 deletion(-)
> +
> +diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> +index d0a774bc17..1718b7e902 100644
> +--- a/hw/ide/ahci.c
> ++++ b/hw/ide/ahci.c
> +@@ -623,9 +623,13 @@ static void ahci_init_d2h(AHCIDevice *ad)
> +         return;
> +     }
> + 
> ++    /*
> ++     * For simplicity, do not call ahci_clear_cmd_issue() for this
> ++     * ahci_write_fis_d2h(). (The reset value for PxCI is 0.)
> ++     */
> +     if (ahci_write_fis_d2h(ad, true)) {
> +         ad->init_d2h_sent = true;
> +-        /* We're emulating receiving the first Reg H2D Fis from the device;
> ++        /* We're emulating receiving the first Reg D2H FIS from the device;
> +          * Update the SIG register, but otherwise proceed as normal. */
> +         pr->sig = ((uint32_t)ide_state->hcyl << 24) |
> +             (ide_state->lcyl << 16) |
> +@@ -663,6 +667,7 @@ static void ahci_reset_port(AHCIState *s, int port)
> +     pr->scr_act = 0;
> +     pr->tfdata = 0x7F;
> +     pr->sig = 0xFFFFFFFF;
> ++    pr->cmd_issue = 0;
> +     d->busy_slot = -1;
> +     d->init_d2h_sent = false;
> + 
> +@@ -1243,10 +1248,30 @@ static void handle_reg_h2d_fis(AHCIState *s, int port,
> +         case STATE_RUN:
> +             if (cmd_fis[15] & ATA_SRST) {
> +                 s->dev[port].port_state = STATE_RESET;
> ++                /*
> ++                 * When setting SRST in the first H2D FIS in the reset sequence,
> ++                 * the device does not send a D2H FIS. Host software thus has to
> ++                 * set the "Clear Busy upon R_OK" bit such that PxCI (and BUSY)
> ++                 * gets cleared. See AHCI 1.3.1, section 10.4.1 Software Reset.
> ++                 */
> ++                if (opts & AHCI_CMD_CLR_BUSY) {
> ++                    ahci_clear_cmd_issue(ad, slot);
> ++                }
> +             }
> +             break;
> +         case STATE_RESET:
> +             if (!(cmd_fis[15] & ATA_SRST)) {
> ++                /*
> ++                 * When clearing SRST in the second H2D FIS in the reset
> ++                 * sequence, the device will execute diagnostics. When this is
> ++                 * done, the device will send a D2H FIS with the good status.
> ++                 * See SATA 3.5a Gold, section 11.4 Software reset protocol.
> ++                 *
> ++                 * This D2H FIS is the first D2H FIS received from the device,
> ++                 * and is received regardless if the reset was performed by a
> ++                 * COMRESET or by setting and clearing the SRST bit. Therefore,
> ++                 * the logic for this is found in ahci_init_d2h() and not here.
> ++                 */
> +                 ahci_reset_port(s, port);
> +             }
> +             break;
> diff --git a/debian/patches/series b/debian/patches/series
> index ad84088..992299c 100644
> --- a/debian/patches/series
> +++ b/debian/patches/series
> @@ -6,6 +6,7 @@ extra/0005-hw-ide-reset-cancel-async-DMA-operation-before-reset.patch
>  extra/0006-Revert-Revert-graph-lock-Disable-locking-for-now.patch
>  extra/0007-migration-states-workaround-snapshot-performance-reg.patch
>  extra/0008-Revert-x86-acpi-workaround-Windows-not-handling-name.patch
> +extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
>  bitmap-mirror/0001-drive-mirror-add-support-for-sync-bitmap-mode-never.patch
>  bitmap-mirror/0002-drive-mirror-add-support-for-conditional-and-always-.patch
>  bitmap-mirror/0003-mirror-add-check-for-bitmap-mode-without-bitmap.patch
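
For readers following the reset sequence described in the quoted commit
message, here is a minimal toy model of it, not the QEMU code. It only
tracks a PxCI-like bitmask and a RUN/RESET port state. All toy_* names are
made up for illustration, and normal (non-reset) commands and the D2H FIS
itself are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these names are illustrative, not QEMU's definitions. */
#define TOY_SRST 0x04u /* SRST bit in the ATA Device Control register */

enum toy_state { TOY_RUN, TOY_RESET };

struct toy_port {
    uint32_t cmd_issue;   /* models PxCI, one bit per command slot */
    enum toy_state state;
};

/* Host side: mark a command slot as issued. */
static void toy_issue(struct toy_port *p, unsigned slot)
{
    assert(slot < 32);
    p->cmd_issue |= UINT32_C(1) << slot;
}

/* Port reset: PxCI returns to its reset value of 0. */
static void toy_reset_port(struct toy_port *p)
{
    p->cmd_issue = 0;
    p->state = TOY_RUN;
}

/*
 * HBA side: process a Register H2D FIS sitting in `slot`.
 * `control` is the Device Control byte, `clr_busy` models the C bit.
 */
static void toy_handle_h2d(struct toy_port *p, unsigned slot,
                           uint8_t control, bool clr_busy)
{
    assert(slot < 32);
    switch (p->state) {
    case TOY_RUN:
        if (control & TOY_SRST) {
            p->state = TOY_RESET;
            /* No D2H FIS follows, so only the C bit clears the slot. */
            if (clr_busy) {
                p->cmd_issue &= ~(UINT32_C(1) << slot);
            }
        }
        break;
    case TOY_RESET:
        if (!(control & TOY_SRST)) {
            /* Deasserting SRST resets the port, which zeroes PxCI. */
            toy_reset_port(p);
        }
        break;
    }
}

/* Run the two-FIS sequence and return the resulting PxCI value. */
static uint32_t toy_legacy_software_reset(void)
{
    struct toy_port p = { .cmd_issue = 0, .state = TOY_RUN };

    toy_issue(&p, 0);
    toy_handle_h2d(&p, 0, TOY_SRST, true);  /* first FIS: assert SRST */
    toy_issue(&p, 1);
    toy_handle_h2d(&p, 1, 0, false);        /* second FIS: deassert SRST */
    return p.cmd_issue;
}
```

In this model, toy_legacy_software_reset() ends with no slot left pending.
If neither clearing step were present, both slots would stay set, which
roughly corresponds to a guest waiting forever for the reset to complete.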




