2018-11-22 17:53:37

by Sandeep Singh

Subject: [PATCH v2] xhci: workaround CSS timeout on AMD SNPS 3.0 xHC

From: Sandeep Singh <[email protected]>

Occasionally the AMD SNPS 3.0 xHC does not respond to CSS when it is
set, and it also does not flag SRE or HCE in the USBSTS register to
indicate an internal xHC error. This stalls the entire system-wide
suspend, and there is no point in stalling the suspend just because
the xHC is not responding to CSS.

To work around this problem, if the xHC does not flag SRE or HCE,
we can skip the CSS timeout and allow the system to continue the
suspend. Once the system resumes, we can internally reset the
controller via the same path as the XHCI_RESET_ON_RESUME quirk.

Signed-off-by: Shyam Sundar S K <[email protected]>
Signed-off-by: Sandeep Singh <[email protected]>
cc: Nehal Shah <[email protected]>
---
Changes since v1:

-> The decision is now based on a new variable that is set when the
SNPS issue occurs, so the quirk interdependency is removed.
-> Removed the STS conditional check in the suspend function.

drivers/usb/host/xhci-pci.c | 4 ++++
drivers/usb/host/xhci.c | 26 ++++++++++++++++++++++----
drivers/usb/host/xhci.h | 3 +++
3 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 01c5705..72493c4 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -139,6 +139,10 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
pdev->device == 0x43bb))
xhci->quirks |= XHCI_SUSPEND_DELAY;

+ if (pdev->vendor == PCI_VENDOR_ID_AMD &&
+ (pdev->device == 0x15e0 || pdev->device == 0x15e1))
+ xhci->quirks |= XHCI_SNPS_BROKEN_SUSPEND;
+
if (pdev->vendor == PCI_VENDOR_ID_AMD)
xhci->quirks |= XHCI_TRUST_TX_LENGTH;

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 0420eef..808677d 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -970,6 +970,7 @@ int xhci_suspend(struct xhci_hcd *xhci, bool do_wakeup)
unsigned int delay = XHCI_MAX_HALT_USEC;
struct usb_hcd *hcd = xhci_to_hcd(xhci);
u32 command;
+ u32 res;

if (!hcd->state)
return 0;
@@ -1023,11 +1024,28 @@ int xhci_suspend(struct xhci_hcd *xhci, bool do_wakeup)
command = readl(&xhci->op_regs->command);
command |= CMD_CSS;
writel(command, &xhci->op_regs->command);
+ xhci->broken_suspend = 0;
if (xhci_handshake(&xhci->op_regs->status,
STS_SAVE, 0, 10 * 1000)) {
- xhci_warn(xhci, "WARN: xHC save state timeout\n");
- spin_unlock_irq(&xhci->lock);
- return -ETIMEDOUT;
+ /*
+ * The AMD SNPS 3.0 xHC occasionally does not clear the
+ * SSS bit (BIT(8)) of USBSTS. The driver polls for the
+ * bit to clear, which never happens, so it assumes the
+ * controller is not responding and times out. To work
+ * around this, check that the SRE and HCE bits are not
+ * set (see xHCI spec section 5.4.2) and bypass the timeout.
+ */
+ res = readl(&xhci->op_regs->status);
+ if ((xhci->quirks & XHCI_SNPS_BROKEN_SUSPEND) &&
+ (((res & STS_SRE) == 0) &&
+ ((res & STS_HCE) == 0))) {
+ xhci->broken_suspend = 1;
+ } else {
+ xhci_warn(xhci, "WARN: xHC save state timeout\n");
+ spin_unlock_irq(&xhci->lock);
+ return -ETIMEDOUT;
+ }
}
spin_unlock_irq(&xhci->lock);

@@ -1080,7 +1098,7 @@ int xhci_resume(struct xhci_hcd *xhci, bool hibernated)
set_bit(HCD_FLAG_HW_ACCESSIBLE, &xhci->shared_hcd->flags);

spin_lock_irq(&xhci->lock);
- if (xhci->quirks & XHCI_RESET_ON_RESUME)
+ if ((xhci->quirks & XHCI_RESET_ON_RESUME) || xhci->broken_suspend)
hibernated = true;

if (!hibernated) {
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index bf0b369..d5d19b2 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1849,6 +1849,7 @@ struct xhci_hcd {
#define XHCI_INTEL_USB_ROLE_SW BIT_ULL(31)
#define XHCI_ZERO_64B_REGS BIT_ULL(32)
#define XHCI_DEFAULT_PM_RUNTIME_ALLOW BIT_ULL(33)
+#define XHCI_SNPS_BROKEN_SUSPEND BIT_ULL(34)

unsigned int num_active_eps;
unsigned int limit_active_eps;
@@ -1878,6 +1879,8 @@ struct xhci_hcd {
void *dbc;
/* platform-specific data -- must come last */
unsigned long priv[0] __aligned(sizeof(s64));
+ /* Broken Suspend flag for SNPS Suspend resume issue */
+ u8 broken_suspend;
};

/* Platform specific overrides to generic XHCI hc_driver ops */
--
2.7.4
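
The control flow this patch describes can be modelled outside the kernel. The following is a minimal userspace sketch, not the driver code: the USBSTS bit positions follow xHCI specification section 5.4.2, while `QUIRK_SNPS_BROKEN_SUSPEND`, `classify_save()`, `must_reset_on_resume()` and `suspend_model_demo()` are hypothetical names introduced only for illustration. It assumes the poll on the SSS bit has already finished and only classifies the outcome.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* USBSTS bits, per xHCI specification section 5.4.2. */
#define STS_SAVE (1u << 8)	/* SSS: Save State Status */
#define STS_SRE  (1u << 10)	/* Save/Restore Error */
#define STS_HCE  (1u << 12)	/* Host Controller Error */

/* Hypothetical stand-in for the quirk flag; the bit value is arbitrary. */
#define QUIRK_SNPS_BROKEN_SUSPEND (1ULL << 0)

enum save_result { SAVE_OK, SAVE_BYPASSED, SAVE_TIMED_OUT };

/*
 * Hypothetical model of the suspend-side decision: a timed-out poll is
 * tolerated only when the quirk is set and neither SRE nor HCE is flagged.
 */
static enum save_result classify_save(uint64_t quirks, uint32_t usbsts,
				      bool poll_timed_out)
{
	if (!poll_timed_out)
		return SAVE_OK;
	if ((quirks & QUIRK_SNPS_BROKEN_SUSPEND) &&
	    !(usbsts & (STS_SRE | STS_HCE)))
		return SAVE_BYPASSED;	/* models broken_suspend = 1 */
	return SAVE_TIMED_OUT;		/* models returning -ETIMEDOUT */
}

/* Hypothetical model of the resume-side check that forces the reset path. */
static bool must_reset_on_resume(bool reset_on_resume_quirk,
				 bool broken_suspend, bool hibernated)
{
	return hibernated || reset_on_resume_quirk || broken_suspend;
}

/* Usage example: walk through the cases the commit message describes. */
void suspend_model_demo(void)
{
	/* SSS stuck, no error flagged, quirk set: suspend continues. */
	assert(classify_save(QUIRK_SNPS_BROKEN_SUSPEND, STS_SAVE, true) ==
	       SAVE_BYPASSED);
	/* A flagged controller error is still reported as a timeout. */
	assert(classify_save(QUIRK_SNPS_BROKEN_SUSPEND, STS_SAVE | STS_HCE,
			     true) == SAVE_TIMED_OUT);
	/* A bypassed save forces a controller reset on resume. */
	assert(must_reset_on_resume(false, true, false));
}
```

Keeping the bypass conditional on both the quirk and clean SRE/HCE bits means controllers without the quirk, and controllers that do report an error, keep the original timeout behaviour.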



2018-11-24 08:18:36

by Kai-Heng Feng

Subject: Re: [PATCH v2] xhci: workaround CSS timeout on AMD SNPS 3.0 xHC

Hi Sandeep,

> On Nov 22, 2018, at 12:23 PM, Singh, Sandeep <[email protected]> wrote:
>
> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
> index bf0b369..d5d19b2 100644
> --- a/drivers/usb/host/xhci.h
> +++ b/drivers/usb/host/xhci.h
> @@ -1849,6 +1849,7 @@ struct xhci_hcd {
> #define XHCI_INTEL_USB_ROLE_SW BIT_ULL(31)
> #define XHCI_ZERO_64B_REGS BIT_ULL(32)
> #define XHCI_DEFAULT_PM_RUNTIME_ALLOW BIT_ULL(33)
> +#define XHCI_SNPS_BROKEN_SUSPEND BIT_ULL(34)

This bit is already in use by another patch, so please update its value.

Kai-Heng

>
> unsigned int num_active_eps;
> unsigned int limit_active_eps;
> @@ -1878,6 +1879,8 @@ struct xhci_hcd {
> void *dbc;
> /* platform-specific data -- must come last */
> unsigned long priv[0] __aligned(sizeof(s64));
> + /* Broken Suspend flag for SNPS Suspend resume issue */
> + u8 broken_suspend;
> };
>
> /* Platform specific overrides to generic XHCI hc_driver ops */
> --
> 2.7.4
>
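
Why a reused quirk bit matters can be shown with a small userspace sketch. Everything here is an assumption made for illustration: `BIT_ULL()` is a local stand-in for the kernel helper, and the three `QUIRK_*` names, `has_quirk()` and `quirk_collision_demo()` are hypothetical. Bit 35 is picked arbitrarily as a distinct bit and says nothing about which bits are free in the real driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT_ULL() helper. */
#define BIT_ULL(n) (1ULL << (n))

/* Hypothetical quirk flags, named for illustration only. */
#define QUIRK_EXISTING      BIT_ULL(34)
#define QUIRK_NEW_SAME_BIT  BIT_ULL(34)	/* collides with QUIRK_EXISTING */
#define QUIRK_NEW_OTHER_BIT BIT_ULL(35)	/* arbitrary distinct bit */

static bool has_quirk(uint64_t quirks, uint64_t flag)
{
	return (quirks & flag) != 0;
}

void quirk_collision_demo(void)
{
	uint64_t quirks = 0;

	/* Only the existing quirk is enabled for this device. */
	quirks |= QUIRK_EXISTING;

	/* The colliding flag now tests true although it was never set. */
	assert(has_quirk(quirks, QUIRK_NEW_SAME_BIT));
	/* A flag on its own bit is unaffected. */
	assert(!has_quirk(quirks, QUIRK_NEW_OTHER_BIT));
}
```

Two flags that share a bit cannot be told apart in the quirks mask, so a device that sets one of them would silently take the code paths of both.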


2018-11-24 08:38:34

by Singh, Sandeep

Subject: Re: [PATCH v2] xhci: workaround CSS timeout on AMD SNPS 3.0 xHC

Hi Kai-Heng,

On 11/23/2018 2:59 PM, Kai Heng Feng wrote:
> Hi Sandeep,
>
>> On Nov 22, 2018, at 12:23 PM, Singh, Sandeep <[email protected]> wrote:
>>
>> diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
>> index bf0b369..d5d19b2 100644
>> --- a/drivers/usb/host/xhci.h
>> +++ b/drivers/usb/host/xhci.h
>> @@ -1849,6 +1849,7 @@ struct xhci_hcd {
>> #define XHCI_INTEL_USB_ROLE_SW BIT_ULL(31)
>> #define XHCI_ZERO_64B_REGS BIT_ULL(32)
>> #define XHCI_DEFAULT_PM_RUNTIME_ALLOW BIT_ULL(33)
>> +#define XHCI_SNPS_BROKEN_SUSPEND BIT_ULL(34)
>
> This bit is already in use by another patch, so please update its value.
Thanks for pointing this out; we will send a v3 patch.
>
> Kai-Heng
>
>>
>> unsigned int num_active_eps;
>> unsigned int limit_active_eps;
>> @@ -1878,6 +1879,8 @@ struct xhci_hcd {
>> void *dbc;
>> /* platform-specific data -- must come last */
>> unsigned long priv[0] __aligned(sizeof(s64));
>> + /* Broken Suspend flag for SNPS Suspend resume issue */
>> + u8 broken_suspend;
>> };
>>
>> /* Platform specific overrides to generic XHCI hc_driver ops */
>> --
>> 2.7.4
>>
>