2023-07-14 12:56:58

by Jakub Vaněk

[permalink] [raw]
Subject: [PATCH v3] Revert "usb: dwc3: core: Enable AutoRetry feature in the controller"

This reverts commit b138e23d3dff90c0494925b4c1874227b81bddf7.

AutoRetry has been found to sometimes cause controller freezes when
communicating with buggy USB devices.

This controller feature allows the controller in host mode to send
non-terminating/burst retry ACKs instead of terminating retry ACKs
to devices when a transaction error (CRC error or overflow) occurs.

Unfortunately, if the USB device continues to respond with a CRC error,
the controller will not complete endpoint-related commands while it
keeps trying to auto-retry. [3] The xHCI driver will notice this once
it tries to abort the transfer using a Stop Endpoint command and
does not receive a completion in time. [1]
This situation is reported to dmesg:

[sda] tag#29 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
[sda] tag#29 CDB: opcode=0x28 28 00 00 69 42 80 00 00 48 00
xhci-hcd: xHCI host not responding to stop endpoint command
xhci-hcd: xHCI host controller not responding, assume dead
xhci-hcd: HC died; cleaning up

Some users observed this problem on an Odroid HC2 with the JMS578
USB3-to-SATA bridge. The issue can be triggered by starting
a read-heavy workload on an attached SSD. After a while, the host
controller would die and the SSD would disappear from the system. [1]

Further analysis by Synopsys determined that controller revisions
other than the one in Odroid HC2 are also affected by this.
The recommended solution was to disable AutoRetry altogether.
This change does not have a noticeable performance impact. [2]

Revert the enablement commit. This will keep the AutoRetry bit in
the default state configured during SoC design [2].

Fixes: b138e23d3dff ("usb: dwc3: core: Enable AutoRetry feature in the controller")
Link: https://lore.kernel.org/r/[email protected]/ [1]
Link: https://lore.kernel.org/r/[email protected]/ [2]
Link: https://lore.kernel.org/r/[email protected]/ [3]
Cc: [email protected]
Cc: Mauro Ribeiro <[email protected]>
Cc: Krzysztof Kozlowski <[email protected]>
Suggested-by: Thinh Nguyen <[email protected]>
Signed-off-by: Jakub Vanek <[email protected]>
---
V2 -> V3: Include more findings in changelog
V1 -> V2: Updated to disable AutoRetry everywhere based on Synopsys feedback
Reworded the changelog a bit to make it clearer

drivers/usb/dwc3/core.c | 16 ----------------
drivers/usb/dwc3/core.h | 3 ---
2 files changed, 19 deletions(-)

diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index f6689b731718..a4e079d37566 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -1209,22 +1209,6 @@ static int dwc3_core_init(struct dwc3 *dwc)
dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
}

- if (dwc->dr_mode == USB_DR_MODE_HOST ||
- dwc->dr_mode == USB_DR_MODE_OTG) {
- reg = dwc3_readl(dwc->regs, DWC3_GUCTL);
-
- /*
- * Enable Auto retry Feature to make the controller operating in
- * Host mode on seeing transaction errors(CRC errors or internal
- * overrun scenerios) on IN transfers to reply to the device
- * with a non-terminating retry ACK (i.e, an ACK transcation
- * packet with Retry=1 & Nump != 0)
- */
- reg |= DWC3_GUCTL_HSTINAUTORETRY;
-
- dwc3_writel(dwc->regs, DWC3_GUCTL, reg);
- }
-
/*
* Must config both number of packets and max burst settings to enable
* RX and/or TX threshold.
diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
index 8b1295e4dcdd..a69ac67d89fe 100644
--- a/drivers/usb/dwc3/core.h
+++ b/drivers/usb/dwc3/core.h
@@ -256,9 +256,6 @@
#define DWC3_GCTL_GBLHIBERNATIONEN BIT(1)
#define DWC3_GCTL_DSBLCLKGTNG BIT(0)

-/* Global User Control Register */
-#define DWC3_GUCTL_HSTINAUTORETRY BIT(14)
-
/* Global User Control 1 Register */
#define DWC3_GUCTL1_DEV_DECOUPLE_L1L2_EVT BIT(31)
#define DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS BIT(28)
--
2.25.1



2023-07-14 21:44:02

by Thinh Nguyen

[permalink] [raw]
Subject: Re: [PATCH v3] Revert "usb: dwc3: core: Enable AutoRetry feature in the controller"

On Fri, Jul 14, 2023, Jakub Vanek wrote:
> This reverts commit b138e23d3dff90c0494925b4c1874227b81bddf7.
>
> AutoRetry has been found to sometimes cause controller freezes when
> communicating with buggy USB devices.
>
> This controller feature allows the controller in host mode to send
> non-terminating/burst retry ACKs instead of terminating retry ACKs
> to devices when a transaction error (CRC error or overflow) occurs.
>
> Unfortunately, if the USB device continues to respond with a CRC error,
> the controller will not complete endpoint-related commands while it
> keeps trying to auto-retry. [3] The xHCI driver will notice this once
> it tries to abort the transfer using a Stop Endpoint command and
> does not receive a completion in time. [1]
> This situation is reported to dmesg:
>
> [sda] tag#29 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
> [sda] tag#29 CDB: opcode=0x28 28 00 00 69 42 80 00 00 48 00
> xhci-hcd: xHCI host not responding to stop endpoint command
> xhci-hcd: xHCI host controller not responding, assume dead
> xhci-hcd: HC died; cleaning up
>
> Some users observed this problem on an Odroid HC2 with the JMS578
> USB3-to-SATA bridge. The issue can be triggered by starting
> a read-heavy workload on an attached SSD. After a while, the host
> controller would die and the SSD would disappear from the system. [1]
>
> Further analysis by Synopsys determined that controller revisions
> other than the one in Odroid HC2 are also affected by this.
> The recommended solution was to disable AutoRetry altogether.
> This change does not have a noticeable performance impact. [2]
>
> Revert the enablement commit. This will keep the AutoRetry bit in
> the default state configured during SoC design [2].
>
> Fixes: b138e23d3dff ("usb: dwc3: core: Enable AutoRetry feature in the controller")
> Link: https://urldefense.com/v3/__https://lore.kernel.org/r/[email protected]/__;!!A4F2R9G_pg!cQZmUozSCVry9wEn5TTVGljEo6B_nKIel0kWXsUQbaleJZwkOyrvc4FzWTlbfcUJ7hGz9V50rn0A2WMEceULIS9y9A$ [1]
> Link: https://urldefense.com/v3/__https://lore.kernel.org/r/[email protected]/__;!!A4F2R9G_pg!cQZmUozSCVry9wEn5TTVGljEo6B_nKIel0kWXsUQbaleJZwkOyrvc4FzWTlbfcUJ7hGz9V50rn0A2WMEceXrbrG9iA$ [2]
> Link: https://urldefense.com/v3/__https://lore.kernel.org/r/[email protected]/__;!!A4F2R9G_pg!cQZmUozSCVry9wEn5TTVGljEo6B_nKIel0kWXsUQbaleJZwkOyrvc4FzWTlbfcUJ7hGz9V50rn0A2WMEceWVJz8nuQ$ [3]
> Cc: [email protected]
> Cc: Mauro Ribeiro <[email protected]>
> Cc: Krzysztof Kozlowski <[email protected]>
> Suggested-by: Thinh Nguyen <[email protected]>
> Signed-off-by: Jakub Vanek <[email protected]>
> ---
> V2 -> V3: Include more findings in changelog
> V1 -> V2: Updated to disable AutoRetry everywhere based on Synopsys feedback
> Reworded the changelog a bit to make it clearer
>
> drivers/usb/dwc3/core.c | 16 ----------------
> drivers/usb/dwc3/core.h | 3 ---
> 2 files changed, 19 deletions(-)
>
> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> index f6689b731718..a4e079d37566 100644
> --- a/drivers/usb/dwc3/core.c
> +++ b/drivers/usb/dwc3/core.c
> @@ -1209,22 +1209,6 @@ static int dwc3_core_init(struct dwc3 *dwc)
> dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
> }
>
> - if (dwc->dr_mode == USB_DR_MODE_HOST ||
> - dwc->dr_mode == USB_DR_MODE_OTG) {
> - reg = dwc3_readl(dwc->regs, DWC3_GUCTL);
> -
> - /*
> - * Enable Auto retry Feature to make the controller operating in
> - * Host mode on seeing transaction errors(CRC errors or internal
> - * overrun scenerios) on IN transfers to reply to the device
> - * with a non-terminating retry ACK (i.e, an ACK transcation
> - * packet with Retry=1 & Nump != 0)
> - */
> - reg |= DWC3_GUCTL_HSTINAUTORETRY;
> -
> - dwc3_writel(dwc->regs, DWC3_GUCTL, reg);
> - }
> -
> /*
> * Must config both number of packets and max burst settings to enable
> * RX and/or TX threshold.
> diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
> index 8b1295e4dcdd..a69ac67d89fe 100644
> --- a/drivers/usb/dwc3/core.h
> +++ b/drivers/usb/dwc3/core.h
> @@ -256,9 +256,6 @@
> #define DWC3_GCTL_GBLHIBERNATIONEN BIT(1)
> #define DWC3_GCTL_DSBLCLKGTNG BIT(0)
>
> -/* Global User Control Register */
> -#define DWC3_GUCTL_HSTINAUTORETRY BIT(14)
> -
> /* Global User Control 1 Register */
> #define DWC3_GUCTL1_DEV_DECOUPLE_L1L2_EVT BIT(31)
> #define DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS BIT(28)
> --
> 2.25.1
>

Acked-by: Thinh Nguyen <[email protected]>

Thanks,
Thinh

2023-07-17 15:30:46

by Jakub Vaněk

[permalink] [raw]
Subject: Re: [PATCH v3] Revert "usb: dwc3: core: Enable AutoRetry feature in the controller"

Hi Thinh,

On Fri, 2023-07-14 at 21:38 +0000, Thinh Nguyen wrote:
> On Fri, Jul 14, 2023, Jakub Vanek wrote:
> > This reverts commit b138e23d3dff90c0494925b4c1874227b81bddf7.
> >
> > AutoRetry has been found to sometimes cause controller freezes when
> > communicating with buggy USB devices.
> >
> > This controller feature allows the controller in host mode to send
> > non-terminating/burst retry ACKs instead of terminating retry ACKs
> > to devices when a transaction error (CRC error or overflow) occurs.
> >
> > Unfortunately, if the USB device continues to respond with a CRC
> > error,
> > the controller will not complete endpoint-related commands while it
> > keeps trying to auto-retry. [3] The xHCI driver will notice this
> > once
> > it tries to abort the transfer using a Stop Endpoint command and
> > does not receive a completion in time. [1]
> > This situation is reported to dmesg:
> >
> > [sda] tag#29 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
> > [sda] tag#29 CDB: opcode=0x28 28 00 00 69 42 80 00 00 48 00
> > xhci-hcd: xHCI host not responding to stop endpoint command
> > xhci-hcd: xHCI host controller not responding, assume dead
> > xhci-hcd: HC died; cleaning up
> >
> > Some users observed this problem on an Odroid HC2 with the JMS578
> > USB3-to-SATA bridge. The issue can be triggered by starting
> > a read-heavy workload on an attached SSD. After a while, the host
> > controller would die and the SSD would disappear from the system.
> > [1]
> >
> > Further analysis by Synopsys determined that controller revisions
> > other than the one in Odroid HC2 are also affected by this.
> > The recommended solution was to disable AutoRetry altogether.
> > This change does not have a noticeable performance impact. [2]
> >
> > Revert the enablement commit. This will keep the AutoRetry bit in
> > the default state configured during SoC design [2].
> >
> > Fixes: b138e23d3dff ("usb: dwc3: core: Enable AutoRetry feature in
> > the controller")
> > Link:
> > https://urldefense.com/v3/__https://lore.kernel.org/r/[email protected]/__;!!A4F2R9G_pg!cQZmUozSCVry9wEn5TTVGljEo6B_nKIel0kWXsUQbaleJZwkOyrvc4FzWTlbfcUJ7hGz9V50rn0A2WMEceULIS9y9A$
> >   [1]
> > Link:
> > https://urldefense.com/v3/__https://lore.kernel.org/r/[email protected]/__;!!A4F2R9G_pg!cQZmUozSCVry9wEn5TTVGljEo6B_nKIel0kWXsUQbaleJZwkOyrvc4FzWTlbfcUJ7hGz9V50rn0A2WMEceXrbrG9iA$
> >   [2]
> > Link:
> > https://urldefense.com/v3/__https://lore.kernel.org/r/[email protected]/__;!!A4F2R9G_pg!cQZmUozSCVry9wEn5TTVGljEo6B_nKIel0kWXsUQbaleJZwkOyrvc4FzWTlbfcUJ7hGz9V50rn0A2WMEceWVJz8nuQ$
> >   [3]
> > Cc: [email protected]
> > Cc: Mauro Ribeiro <[email protected]>
> > Cc: Krzysztof Kozlowski <[email protected]>
> > Suggested-by: Thinh Nguyen <[email protected]>
> > Signed-off-by: Jakub Vanek <[email protected]>
> > ---
> > V2 -> V3: Include more findings in changelog
> > V1 -> V2: Updated to disable AutoRetry everywhere based on Synopsys
> > feedback
> >           Reworded the changelog a bit to make it clearer
> >
> >  drivers/usb/dwc3/core.c | 16 ----------------
> >  drivers/usb/dwc3/core.h |  3 ---
> >  2 files changed, 19 deletions(-)
> >
> > diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> > index f6689b731718..a4e079d37566 100644
> > --- a/drivers/usb/dwc3/core.c
> > +++ b/drivers/usb/dwc3/core.c
> > @@ -1209,22 +1209,6 @@ static int dwc3_core_init(struct dwc3 *dwc)
> >                 dwc3_writel(dwc->regs, DWC3_GUCTL1, reg);
> >         }
> >  
> > -       if (dwc->dr_mode == USB_DR_MODE_HOST ||
> > -           dwc->dr_mode == USB_DR_MODE_OTG) {
> > -               reg = dwc3_readl(dwc->regs, DWC3_GUCTL);
> > -
> > -               /*
> > -                * Enable Auto retry Feature to make the controller
> > operating in
> > -                * Host mode on seeing transaction errors(CRC
> > errors or internal
> > -                * overrun scenerios) on IN transfers to reply to
> > the device
> > -                * with a non-terminating retry ACK (i.e, an ACK
> > transcation
> > -                * packet with Retry=1 & Nump != 0)
> > -                */
> > -               reg |= DWC3_GUCTL_HSTINAUTORETRY;
> > -
> > -               dwc3_writel(dwc->regs, DWC3_GUCTL, reg);
> > -       }
> > -
> >         /*
> >          * Must config both number of packets and max burst
> > settings to enable
> >          * RX and/or TX threshold.
> > diff --git a/drivers/usb/dwc3/core.h b/drivers/usb/dwc3/core.h
> > index 8b1295e4dcdd..a69ac67d89fe 100644
> > --- a/drivers/usb/dwc3/core.h
> > +++ b/drivers/usb/dwc3/core.h
> > @@ -256,9 +256,6 @@
> >  #define DWC3_GCTL_GBLHIBERNATIONEN     BIT(1)
> >  #define DWC3_GCTL_DSBLCLKGTNG          BIT(0)
> >  
> > -/* Global User Control Register */
> > -#define DWC3_GUCTL_HSTINAUTORETRY      BIT(14)
> > -
> >  /* Global User Control 1 Register */
> >  #define DWC3_GUCTL1_DEV_DECOUPLE_L1L2_EVT      BIT(31)
> >  #define DWC3_GUCTL1_TX_IPGAP_LINECHECK_DIS     BIT(28)
> > --
> > 2.25.1
> >
>
> Acked-by: Thinh Nguyen <[email protected]>

Thank you!

> Thanks,
> Thinh

Best regards,
Jakub