2020-01-08 11:39:38

by Pawel Laszczak

[permalink] [raw]
Subject: [PATCH] usb: cdns3: Fix: ARM core hang after connect/disconnect operation.

The ARM core hang when access USB register after tens of thousands
connect/disconnect operation.

The issue was observed on platform with android system and is not easy
to reproduce. During test controller works at HS device mode with host
connected.

The test is based on continuous disabling/enabling USB device function
what cause continuous setting DEVDS/DEVEN bit in USB_CONF register.

For testing was used composite device consisting from ADP and RNDIS
function.

Presumably the problem was caused by DMA transfer made after setting
DEVDS bit. To resolve this issue fix stops all DMA transfer before
setting DEVDS bit.

Signed-off-by: Pawel Laszczak <[email protected]>
Signed-off-by: Peter Chan <[email protected]>
Reported-by: Peter Chan <[email protected]>
Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver")
---
drivers/usb/cdns3/gadget.c | 84 ++++++++++++++++++++++++++------------
drivers/usb/cdns3/gadget.h | 1 +
2 files changed, 58 insertions(+), 27 deletions(-)

diff --git a/drivers/usb/cdns3/gadget.c b/drivers/usb/cdns3/gadget.c
index 4c1e75509303..277ed8484032 100644
--- a/drivers/usb/cdns3/gadget.c
+++ b/drivers/usb/cdns3/gadget.c
@@ -1516,6 +1516,49 @@ static int cdns3_ep_onchip_buffer_reserve(struct cdns3_device *priv_dev,
return 0;
}

+static int cdns3_disable_reset_ep(struct cdns3_device *priv_dev,
+ struct cdns3_endpoint *priv_ep)
+{
+ unsigned long flags;
+ u32 val;
+ int ret;
+
+ spin_lock_irqsave(&priv_dev->lock, flags);
+
+ if (priv_ep->flags & EP_HW_RESETED) {
+ spin_unlock_irqrestore(&priv_dev->lock, flags);
+ return 0;
+ }
+
+ cdns3_select_ep(priv_dev, priv_ep->endpoint.desc->bEndpointAddress);
+
+ val = readl(&priv_dev->regs->ep_cfg);
+ val &= ~EP_CFG_ENABLE;
+ writel(val, &priv_dev->regs->ep_cfg);
+
+ /**
+ * Driver needs some time before resetting endpoint.
+ * It need waits for clearing DBUSY bit or for timeout expired.
+ * 10us is enough time for controller to stop transfer.
+ */
+ readl_poll_timeout_atomic(&priv_dev->regs->ep_sts, val,
+ !(val & EP_STS_DBUSY), 1, 10);
+ writel(EP_CMD_EPRST, &priv_dev->regs->ep_cmd);
+
+ ret = readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
+ !(val & (EP_CMD_CSTALL | EP_CMD_EPRST)),
+ 1, 1000);
+
+ if (unlikely(ret))
+ dev_err(priv_dev->dev, "Timeout: %s resetting failed.\n",
+ priv_ep->name);
+
+ priv_ep->flags |= EP_HW_RESETED;
+ spin_unlock_irqrestore(&priv_dev->lock, flags);
+
+ return ret;
+}
+
void cdns3_configure_dmult(struct cdns3_device *priv_dev,
struct cdns3_endpoint *priv_ep)
{
@@ -1893,8 +1936,6 @@ static int cdns3_gadget_ep_disable(struct usb_ep *ep)
struct usb_request *request;
unsigned long flags;
int ret = 0;
- u32 ep_cfg;
- int val;

if (!ep) {
pr_err("usbss: invalid parameters\n");
@@ -1908,32 +1949,11 @@ static int cdns3_gadget_ep_disable(struct usb_ep *ep)
"%s is already disabled\n", priv_ep->name))
return 0;

- spin_lock_irqsave(&priv_dev->lock, flags);
-
trace_cdns3_gadget_ep_disable(priv_ep);

- cdns3_select_ep(priv_dev, ep->desc->bEndpointAddress);
-
- ep_cfg = readl(&priv_dev->regs->ep_cfg);
- ep_cfg &= ~EP_CFG_ENABLE;
- writel(ep_cfg, &priv_dev->regs->ep_cfg);
-
- /**
- * Driver needs some time before resetting endpoint.
- * It need waits for clearing DBUSY bit or for timeout expired.
- * 10us is enough time for controller to stop transfer.
- */
- readl_poll_timeout_atomic(&priv_dev->regs->ep_sts, val,
- !(val & EP_STS_DBUSY), 1, 10);
- writel(EP_CMD_EPRST, &priv_dev->regs->ep_cmd);
-
- readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
- !(val & (EP_CMD_CSTALL | EP_CMD_EPRST)),
- 1, 1000);
- if (unlikely(ret))
- dev_err(priv_dev->dev, "Timeout: %s resetting failed.\n",
- priv_ep->name);
+ cdns3_disable_reset_ep(priv_dev, priv_ep);

+ spin_lock_irqsave(&priv_dev->lock, flags);
while (!list_empty(&priv_ep->pending_req_list)) {
request = cdns3_next_request(&priv_ep->pending_req_list);

@@ -1962,6 +1982,7 @@ static int cdns3_gadget_ep_disable(struct usb_ep *ep)

ep->desc = NULL;
priv_ep->flags &= ~EP_ENABLED;
+ priv_ep->flags |= EP_CLAIMED | EP_HW_RESETED;

spin_unlock_irqrestore(&priv_dev->lock, flags);

@@ -2282,11 +2303,20 @@ static int cdns3_gadget_set_selfpowered(struct usb_gadget *gadget,
static int cdns3_gadget_pullup(struct usb_gadget *gadget, int is_on)
{
struct cdns3_device *priv_dev = gadget_to_cdns3_device(gadget);
+ int i;

- if (is_on)
+ if (is_on) {
writel(USB_CONF_DEVEN, &priv_dev->regs->usb_conf);
- else
+ } else {
+ for (i = 1; i < CDNS3_ENDPOINTS_MAX_COUNT; i++) {
+ if (priv_dev->eps[i] &&
+ priv_dev->eps[i]->flags & EP_ENABLED)
+ cdns3_disable_reset_ep(priv_dev,
+ priv_dev->eps[i]);
+ }
+
writel(USB_CONF_DEVDS, &priv_dev->regs->usb_conf);
+ }

return 0;
}
diff --git a/drivers/usb/cdns3/gadget.h b/drivers/usb/cdns3/gadget.h
index bc4024041ef2..b6cc222b9f58 100644
--- a/drivers/usb/cdns3/gadget.h
+++ b/drivers/usb/cdns3/gadget.h
@@ -1142,6 +1142,7 @@ struct cdns3_endpoint {
#define EP_QUIRK_END_TRANSFER BIT(11)
#define EP_QUIRK_EXTRA_BUF_DET BIT(12)
#define EP_QUIRK_EXTRA_BUF_EN BIT(13)
+#define EP_HW_RESETED BIT(14)
u32 flags;

struct cdns3_request *descmis_req;
--
2.17.1


2020-01-08 14:50:20

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] usb: cdns3: Fix: ARM core hang after connect/disconnect operation.

On Wed, Jan 08, 2020 at 12:37:18PM +0100, Pawel Laszczak wrote:
> The ARM core hang when access USB register after tens of thousands
> connect/disconnect operation.
>
> The issue was observed on platform with android system and is not easy
> to reproduce. During test controller works at HS device mode with host
> connected.
>
> The test is based on continuous disabling/enabling USB device function
> what cause continuous setting DEVDS/DEVEN bit in USB_CONF register.
>
> For testing was used composite device consisting from ADP and RNDIS
> function.
>
> Presumably the problem was caused by DMA transfer made after setting
> DEVDS bit. To resolve this issue fix stops all DMA transfer before
> setting DEVDS bit.
>
> Signed-off-by: Pawel Laszczak <[email protected]>
> Signed-off-by: Peter Chan <[email protected]>
> Reported-by: Peter Chan <[email protected]>
> Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver")
> ---
> drivers/usb/cdns3/gadget.c | 84 ++++++++++++++++++++++++++------------
> drivers/usb/cdns3/gadget.h | 1 +
> 2 files changed, 58 insertions(+), 27 deletions(-)

Any reason to forget [email protected] for usb patches?

thanks,

greg k-h

2020-01-08 15:46:31

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] usb: cdns3: Fix: ARM core hang after connect/disconnect operation.

On Wed, Jan 08, 2020 at 12:37:18PM +0100, Pawel Laszczak wrote:
> The ARM core hang when access USB register after tens of thousands
> connect/disconnect operation.
>
> The issue was observed on platform with android system and is not easy
> to reproduce. During test controller works at HS device mode with host
> connected.
>
> The test is based on continuous disabling/enabling USB device function
> what cause continuous setting DEVDS/DEVEN bit in USB_CONF register.
>
> For testing was used composite device consisting from ADP and RNDIS
> function.
>
> Presumably the problem was caused by DMA transfer made after setting
> DEVDS bit. To resolve this issue fix stops all DMA transfer before
> setting DEVDS bit.
>
> Signed-off-by: Pawel Laszczak <[email protected]>
> Signed-off-by: Peter Chan <[email protected]>
> Reported-by: Peter Chan <[email protected]>
> Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver")

As this is in the 5.4 kernel release, you should have a "Cc: stable..."
line in here, right?

> ---
> drivers/usb/cdns3/gadget.c | 84 ++++++++++++++++++++++++++------------
> drivers/usb/cdns3/gadget.h | 1 +
> 2 files changed, 58 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/usb/cdns3/gadget.c b/drivers/usb/cdns3/gadget.c
> index 4c1e75509303..277ed8484032 100644
> --- a/drivers/usb/cdns3/gadget.c
> +++ b/drivers/usb/cdns3/gadget.c
> @@ -1516,6 +1516,49 @@ static int cdns3_ep_onchip_buffer_reserve(struct cdns3_device *priv_dev,
> return 0;
> }
>
> +static int cdns3_disable_reset_ep(struct cdns3_device *priv_dev,
> + struct cdns3_endpoint *priv_ep)
> +{
> + unsigned long flags;
> + u32 val;
> + int ret;
> +
> + spin_lock_irqsave(&priv_dev->lock, flags);
> +
> + if (priv_ep->flags & EP_HW_RESETED) {

Is "reseted" a word? :)

> + spin_unlock_irqrestore(&priv_dev->lock, flags);
> + return 0;
> + }
> +
> + cdns3_select_ep(priv_dev, priv_ep->endpoint.desc->bEndpointAddress);
> +
> + val = readl(&priv_dev->regs->ep_cfg);
> + val &= ~EP_CFG_ENABLE;
> + writel(val, &priv_dev->regs->ep_cfg);
> +
> + /**

No need for kernel-doc comment style please.

> + * Driver needs some time before resetting endpoint.
> + * It need waits for clearing DBUSY bit or for timeout expired.
> + * 10us is enough time for controller to stop transfer.
> + */
> + readl_poll_timeout_atomic(&priv_dev->regs->ep_sts, val,
> + !(val & EP_STS_DBUSY), 1, 10);

You don't care if you time out?

> + writel(EP_CMD_EPRST, &priv_dev->regs->ep_cmd);
> +
> + ret = readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
> + !(val & (EP_CMD_CSTALL | EP_CMD_EPRST)),
> + 1, 1000);
> +
> + if (unlikely(ret))

Unless you can measure the difference of using/not using a
unlikely/likely mark, NEVER use it. The compiler and cpu can almost
always do better than you can, we have the tests to prove it.

> + dev_err(priv_dev->dev, "Timeout: %s resetting failed.\n",
> + priv_ep->name);
> +
> + priv_ep->flags |= EP_HW_RESETED;

So an error happens, but you still claim the device is reset? What can
a user do about this error?

Yes, I know you copied this from other code in this driver, but it
doesn't look right to me.

> + spin_unlock_irqrestore(&priv_dev->lock, flags);

Why print while a spinlock is held? That's mean, you could have printed
here, right?

> +
> + return ret;
> +}
> +
> void cdns3_configure_dmult(struct cdns3_device *priv_dev,
> struct cdns3_endpoint *priv_ep)
> {
> @@ -1893,8 +1936,6 @@ static int cdns3_gadget_ep_disable(struct usb_ep *ep)
> struct usb_request *request;
> unsigned long flags;
> int ret = 0;
> - u32 ep_cfg;
> - int val;
>
> if (!ep) {
> pr_err("usbss: invalid parameters\n");
> @@ -1908,32 +1949,11 @@ static int cdns3_gadget_ep_disable(struct usb_ep *ep)
> "%s is already disabled\n", priv_ep->name))
> return 0;
>
> - spin_lock_irqsave(&priv_dev->lock, flags);
> -
> trace_cdns3_gadget_ep_disable(priv_ep);
>
> - cdns3_select_ep(priv_dev, ep->desc->bEndpointAddress);
> -
> - ep_cfg = readl(&priv_dev->regs->ep_cfg);
> - ep_cfg &= ~EP_CFG_ENABLE;
> - writel(ep_cfg, &priv_dev->regs->ep_cfg);
> -
> - /**
> - * Driver needs some time before resetting endpoint.
> - * It need waits for clearing DBUSY bit or for timeout expired.
> - * 10us is enough time for controller to stop transfer.
> - */
> - readl_poll_timeout_atomic(&priv_dev->regs->ep_sts, val,
> - !(val & EP_STS_DBUSY), 1, 10);
> - writel(EP_CMD_EPRST, &priv_dev->regs->ep_cmd);
> -
> - readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
> - !(val & (EP_CMD_CSTALL | EP_CMD_EPRST)),
> - 1, 1000);
> - if (unlikely(ret))
> - dev_err(priv_dev->dev, "Timeout: %s resetting failed.\n",
> - priv_ep->name);
> + cdns3_disable_reset_ep(priv_dev, priv_ep);
>
> + spin_lock_irqsave(&priv_dev->lock, flags);
> while (!list_empty(&priv_ep->pending_req_list)) {
> request = cdns3_next_request(&priv_ep->pending_req_list);
>
> @@ -1962,6 +1982,7 @@ static int cdns3_gadget_ep_disable(struct usb_ep *ep)
>
> ep->desc = NULL;
> priv_ep->flags &= ~EP_ENABLED;
> + priv_ep->flags |= EP_CLAIMED | EP_HW_RESETED;

Why do you now set EP_CLAIMED here when you never set that before? Is
this a different type of change?

thanks,

greg k-h

2020-01-09 04:19:11

by Pawel Laszczak

[permalink] [raw]
Subject: RE: [PATCH] usb: cdns3: Fix: ARM core hang after connect/disconnect operation.

Hi,

>
>
>On Wed, Jan 08, 2020 at 12:37:18PM +0100, Pawel Laszczak wrote:
>> The ARM core hang when access USB register after tens of thousands
>> connect/disconnect operation.
>>
>> The issue was observed on platform with android system and is not easy
>> to reproduce. During test controller works at HS device mode with host
>> connected.
>>
>> The test is based on continuous disabling/enabling USB device function
>> what cause continuous setting DEVDS/DEVEN bit in USB_CONF register.
>>
>> For testing was used composite device consisting from ADP and RNDIS
>> function.
>>
>> Presumably the problem was caused by DMA transfer made after setting
>> DEVDS bit. To resolve this issue fix stops all DMA transfer before
>> setting DEVDS bit.
>>
>> Signed-off-by: Pawel Laszczak <[email protected]>
>> Signed-off-by: Peter Chan <[email protected]>
>> Reported-by: Peter Chan <[email protected]>
>> Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver")
>> ---
>> drivers/usb/cdns3/gadget.c | 84 ++++++++++++++++++++++++++------------
>> drivers/usb/cdns3/gadget.h | 1 +
>> 2 files changed, 58 insertions(+), 27 deletions(-)
>
>Any reason to forget [email protected] for usb patches?

No reason. I missed it.
+ [email protected]

Thanks,
Pawell

>
>thanks,
>
>greg k-h

2020-01-09 06:28:13

by Pawel Laszczak

[permalink] [raw]
Subject: RE: [PATCH] usb: cdns3: Fix: ARM core hang after connect/disconnect operation.

Hi Greg,

+ [email protected]

>
>On Wed, Jan 08, 2020 at 12:37:18PM +0100, Pawel Laszczak wrote:
>> The ARM core hang when access USB register after tens of thousands
>> connect/disconnect operation.
>>
>> The issue was observed on platform with android system and is not easy
>> to reproduce. During test controller works at HS device mode with host
>> connected.
>>
>> The test is based on continuous disabling/enabling USB device function
>> what cause continuous setting DEVDS/DEVEN bit in USB_CONF register.
>>
>> For testing was used composite device consisting from ADP and RNDIS
>> function.
>>
>> Presumably the problem was caused by DMA transfer made after setting
>> DEVDS bit. To resolve this issue fix stops all DMA transfer before
>> setting DEVDS bit.
>>
>> Signed-off-by: Pawel Laszczak <[email protected]>
>> Signed-off-by: Peter Chan <[email protected]>
>> Reported-by: Peter Chan <[email protected]>
>> Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver")
>
>As this is in the 5.4 kernel release, you should have a "Cc: stable..."
>line in here, right?

Ok,

>
>> ---
>> drivers/usb/cdns3/gadget.c | 84 ++++++++++++++++++++++++++------------
>> drivers/usb/cdns3/gadget.h | 1 +
>> 2 files changed, 58 insertions(+), 27 deletions(-)
>>
>> diff --git a/drivers/usb/cdns3/gadget.c b/drivers/usb/cdns3/gadget.c
>> index 4c1e75509303..277ed8484032 100644
>> --- a/drivers/usb/cdns3/gadget.c
>> +++ b/drivers/usb/cdns3/gadget.c
>> @@ -1516,6 +1516,49 @@ static int cdns3_ep_onchip_buffer_reserve(struct cdns3_device *priv_dev,
>> return 0;
>> }
>>
>> +static int cdns3_disable_reset_ep(struct cdns3_device *priv_dev,
>> + struct cdns3_endpoint *priv_ep)
>> +{
>> + unsigned long flags;
>> + u32 val;
>> + int ret;
>> +
>> + spin_lock_irqsave(&priv_dev->lock, flags);
>> +
>> + if (priv_ep->flags & EP_HW_RESETED) {
>
>Is "reseted" a word? :)
>
>> + spin_unlock_irqrestore(&priv_dev->lock, flags);
>> + return 0;
>> + }
>> +
>> + cdns3_select_ep(priv_dev, priv_ep->endpoint.desc->bEndpointAddress);
>> +
>> + val = readl(&priv_dev->regs->ep_cfg);
>> + val &= ~EP_CFG_ENABLE;
>> + writel(val, &priv_dev->regs->ep_cfg);
>> +
>> + /**
>
>No need for kernel-doc comment style please.
>
>> + * Driver needs some time before resetting endpoint.
>> + * It need waits for clearing DBUSY bit or for timeout expired.
>> + * 10us is enough time for controller to stop transfer.
>> + */
>> + readl_poll_timeout_atomic(&priv_dev->regs->ep_sts, val,
>> + !(val & EP_STS_DBUSY), 1, 10);
>
>You don't care if you time out?
I don't need care of it. The next timeout is critical.
>
>> + writel(EP_CMD_EPRST, &priv_dev->regs->ep_cmd);
>> +
>> + ret = readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
>> + !(val & (EP_CMD_CSTALL | EP_CMD_EPRST)),
>> + 1, 1000);
>> +
>> + if (unlikely(ret))
>
>Unless you can measure the difference of using/not using a
>unlikely/likely mark, NEVER use it. The compiler and cpu can almost
>always do better than you can, we have the tests to prove it.
>

The both of the above timeout should never occur. If they occurred it would be a
critical controller bug. In this case driver can only inform about this event.
For timeouts used in driver I've never see an errors. Because debugging these
kind of errors is very hard I decided to leave dev_err in such case to inform that
probably something is wrong in HW project.

I will remove unlikely.

>> + dev_err(priv_dev->dev, "Timeout: %s resetting failed.\n",
>> + priv_ep->name);
>> +
>> + priv_ep->flags |= EP_HW_RESETED;
>
>So an error happens, but you still claim the device is reset? What can
>a user do about this error?

The error should never occur.

>
>Yes, I know you copied this from other code in this driver, but it
>doesn't look right to me.

>
>> + spin_unlock_irqrestore(&priv_dev->lock, flags);
>
>Why print while a spinlock is held? That's mean, you could have printed
>here, right?
>

Yes, it's true.

>> +
>> + return ret;
>> +}
>> +
>> void cdns3_configure_dmult(struct cdns3_device *priv_dev,
>> struct cdns3_endpoint *priv_ep)
>> {
>> @@ -1893,8 +1936,6 @@ static int cdns3_gadget_ep_disable(struct usb_ep *ep)
>> struct usb_request *request;
>> unsigned long flags;
>> int ret = 0;
>> - u32 ep_cfg;
>> - int val;
>>
>> if (!ep) {
>> pr_err("usbss: invalid parameters\n");
>> @@ -1908,32 +1949,11 @@ static int cdns3_gadget_ep_disable(struct usb_ep *ep)
>> "%s is already disabled\n", priv_ep->name))
>> return 0;
>>
>> - spin_lock_irqsave(&priv_dev->lock, flags);
>> -
>> trace_cdns3_gadget_ep_disable(priv_ep);
>>
>> - cdns3_select_ep(priv_dev, ep->desc->bEndpointAddress);
>> -
>> - ep_cfg = readl(&priv_dev->regs->ep_cfg);
>> - ep_cfg &= ~EP_CFG_ENABLE;
>> - writel(ep_cfg, &priv_dev->regs->ep_cfg);
>> -
>> - /**
>> - * Driver needs some time before resetting endpoint.
>> - * It need waits for clearing DBUSY bit or for timeout expired.
>> - * 10us is enough time for controller to stop transfer.
>> - */
>> - readl_poll_timeout_atomic(&priv_dev->regs->ep_sts, val,
>> - !(val & EP_STS_DBUSY), 1, 10);
>> - writel(EP_CMD_EPRST, &priv_dev->regs->ep_cmd);
>> -
>> - readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
>> - !(val & (EP_CMD_CSTALL | EP_CMD_EPRST)),
>> - 1, 1000);
>> - if (unlikely(ret))
>> - dev_err(priv_dev->dev, "Timeout: %s resetting failed.\n",
>> - priv_ep->name);
>> + cdns3_disable_reset_ep(priv_dev, priv_ep);
>>
>> + spin_lock_irqsave(&priv_dev->lock, flags);
>> while (!list_empty(&priv_ep->pending_req_list)) {
>> request = cdns3_next_request(&priv_ep->pending_req_list);
>>
>> @@ -1962,6 +1982,7 @@ static int cdns3_gadget_ep_disable(struct usb_ep *ep)
>>
>> ep->desc = NULL;
>> priv_ep->flags &= ~EP_ENABLED;
>> + priv_ep->flags |= EP_CLAIMED | EP_HW_RESETED;
>
>Why do you now set EP_CLAIMED here when you never set that before? Is
>this a different type of change?

My mistake. Thanks for letting me know. I merged two different patches fixing the same issue and
I missed it.

>
>thanks,
>
>greg k-h

Thanks,
Pawell

2020-01-09 06:39:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] usb: cdns3: Fix: ARM core hang after connect/disconnect operation.

On Thu, Jan 09, 2020 at 06:27:02AM +0000, Pawel Laszczak wrote:
> >> + writel(EP_CMD_EPRST, &priv_dev->regs->ep_cmd);
> >> +
> >> + ret = readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
> >> + !(val & (EP_CMD_CSTALL | EP_CMD_EPRST)),
> >> + 1, 1000);
> >> +
> >> + if (unlikely(ret))
> >
> >Unless you can measure the difference of using/not using a
> >unlikely/likely mark, NEVER use it. The compiler and cpu can almost
> >always do better than you can, we have the tests to prove it.
> >
>
> The both of the above timeout should never occur. If they occurred it would be a
> critical controller bug. In this case driver can only inform about this event.

"Should never occur" is a fun thing to say :)

If it can never occur, then don't even check for it.

If it can, then check for it and handle it properly.

What about this controller in systems with removable busses (like PCI?)
What happens then (hint, I bet this could occur...)

> For timeouts used in driver I've never see an errors. Because debugging these
> kind of errors is very hard I decided to leave dev_err in such case to inform that
> probably something is wrong in HW project.
>
> I will remove unlikely.
>
> >> + dev_err(priv_dev->dev, "Timeout: %s resetting failed.\n",
> >> + priv_ep->name);
> >> +
> >> + priv_ep->flags |= EP_HW_RESETED;
> >
> >So an error happens, but you still claim the device is reset? What can
> >a user do about this error?
>
> The error should never occur.

Then no need to warn anyone, just wait and move on.

Or properly handle it.

thanks,

greg k-h

2020-01-09 08:36:34

by Pawel Laszczak

[permalink] [raw]
Subject: RE: [PATCH] usb: cdns3: Fix: ARM core hang after connect/disconnect operation.

>
>On Thu, Jan 09, 2020 at 06:27:02AM +0000, Pawel Laszczak wrote:
>> >> + writel(EP_CMD_EPRST, &priv_dev->regs->ep_cmd);
>> >> +
>> >> + ret = readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
>> >> + !(val & (EP_CMD_CSTALL | EP_CMD_EPRST)),
>> >> + 1, 1000);
>> >> +
>> >> + if (unlikely(ret))
>> >
>> >Unless you can measure the difference of using/not using a
>> >unlikely/likely mark, NEVER use it. The compiler and cpu can almost
>> >always do better than you can, we have the tests to prove it.
>> >
>>
>> The both of the above timeout should never occur. If they occurred it would be a
>> critical controller bug. In this case driver can only inform about this event.
>
>"Should never occur" is a fun thing to say :)
>
>If it can never occur, then don't even check for it.

Yes, on existing platforms it can never occur.

>
>If it can, then check for it and handle it properly.
>
>What about this controller in systems with removable busses (like PCI?)
>What happens then (hint, I bet this could occur...)

It's good question. Nobody from our customer currently use such system.
The only platform with PCI is used by me for testing purpose.
I will talk with HW team about such cases. Maybe in the feature such improvement
will be necessary.
It's important fix for our customer so I will delete this checking for now.

>
>> For timeouts used in driver I've never see an errors. Because debugging these
>> kind of errors is very hard I decided to leave dev_err in such case to inform that
>> probably something is wrong in HW project.
>>
>> I will remove unlikely.
>>
>> >> + dev_err(priv_dev->dev, "Timeout: %s resetting failed.\n",
>> >> + priv_ep->name);
>> >> +
>> >> + priv_ep->flags |= EP_HW_RESETED;
>> >
>> >So an error happens, but you still claim the device is reset? What can
>> >a user do about this error?
>>
>> The error should never occur.
>
>Then no need to warn anyone, just wait and move on.
>
>Or properly handle it.
>
>thanks,
>
>greg k-h

Thanks,

Pawell

2020-01-09 10:54:02

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] usb: cdns3: Fix: ARM core hang after connect/disconnect operation.

On Thu, Jan 09, 2020 at 08:34:12AM +0000, Pawel Laszczak wrote:
> >
> >On Thu, Jan 09, 2020 at 06:27:02AM +0000, Pawel Laszczak wrote:
> >> >> + writel(EP_CMD_EPRST, &priv_dev->regs->ep_cmd);
> >> >> +
> >> >> + ret = readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
> >> >> + !(val & (EP_CMD_CSTALL | EP_CMD_EPRST)),
> >> >> + 1, 1000);
> >> >> +
> >> >> + if (unlikely(ret))
> >> >
> >> >Unless you can measure the difference of using/not using a
> >> >unlikely/likely mark, NEVER use it. The compiler and cpu can almost
> >> >always do better than you can, we have the tests to prove it.
> >> >
> >>
> >> The both of the above timeout should never occur. If they occurred it would be a
> >> critical controller bug. In this case driver can only inform about this event.
> >
> >"Should never occur" is a fun thing to say :)
> >
> >If it can never occur, then don't even check for it.
>
> Yes, on existing platforms it can never occur.
>
> >
> >If it can, then check for it and handle it properly.
> >
> >What about this controller in systems with removable busses (like PCI?)
> >What happens then (hint, I bet this could occur...)
>
> It's good question. Nobody from our customer currently use such system.
> The only platform with PCI is used by me for testing purpose.

So if you do have a PCI device, then you need to handle PCI reads
failing and returning all 1s. Hopefully you can gracefully handle this :)

Adding timeout handling here, where it is totally obvious to do so,
would be a good thing.

thanks,

greg k-h

2020-01-14 08:58:39

by Peter Chen

[permalink] [raw]
Subject: Re: [PATCH] usb: cdns3: Fix: ARM core hang after connect/disconnect operation.

On Thu, Jan 9, 2020 at 5:39 PM Greg KH <[email protected]> wrote:
>
> On Thu, Jan 09, 2020 at 08:34:12AM +0000, Pawel Laszczak wrote:
> > >
> > >On Thu, Jan 09, 2020 at 06:27:02AM +0000, Pawel Laszczak wrote:
> > >> >> + writel(EP_CMD_EPRST, &priv_dev->regs->ep_cmd);
> > >> >> +
> > >> >> + ret = readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
> > >> >> + !(val & (EP_CMD_CSTALL | EP_CMD_EPRST)),
> > >> >> + 1, 1000);
> > >> >> +
> > >> >> + if (unlikely(ret))
> > >> >
> > >> >Unless you can measure the difference of using/not using a
> > >> >unlikely/likely mark, NEVER use it. The compiler and cpu can almost
> > >> >always do better than you can, we have the tests to prove it.
> > >> >
> > >>
> > >> The both of the above timeout should never occur. If they occurred it would be a
> > >> critical controller bug. In this case driver can only inform about this event.
> > >
> > >"Should never occur" is a fun thing to say :)
> > >
> > >If it can never occur, then don't even check for it.
> >
> > Yes, on existing platforms it can never occur.
> >
> > >
> > >If it can, then check for it and handle it properly.
> > >
> > >What about this controller in systems with removable busses (like PCI?)
> > >What happens then (hint, I bet this could occur...)
> >
> > It's good question. Nobody from our customer currently use such system.
> > The only platform with PCI is used by me for testing purpose.
>
> So if you do have a PCI device, then you need to handle PCI reads
> failing and returning all 1s. Hopefully you can gracefully handle this :)
>
> Adding timeout handling here, where it is totally obvious to do so,
> would be a good thing.
>
> thanks,
>
> greg k-h

Hi Pawel,

My email is: [email protected], please change it when your send next version.

Peter

2020-01-14 09:07:46

by Pawel Laszczak

[permalink] [raw]
Subject: RE: [PATCH] usb: cdns3: Fix: ARM core hang after connect/disconnect operation.

>
>On Thu, Jan 9, 2020 at 5:39 PM Greg KH <[email protected]> wrote:
>>
>> On Thu, Jan 09, 2020 at 08:34:12AM +0000, Pawel Laszczak wrote:
>> > >
>> > >On Thu, Jan 09, 2020 at 06:27:02AM +0000, Pawel Laszczak wrote:
>> > >> >> + writel(EP_CMD_EPRST, &priv_dev->regs->ep_cmd);
>> > >> >> +
>> > >> >> + ret = readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
>> > >> >> + !(val & (EP_CMD_CSTALL | EP_CMD_EPRST)),
>> > >> >> + 1, 1000);
>> > >> >> +
>> > >> >> + if (unlikely(ret))
>> > >> >
>> > >> >Unless you can measure the difference of using/not using a
>> > >> >unlikely/likely mark, NEVER use it. The compiler and cpu can almost
>> > >> >always do better than you can, we have the tests to prove it.
>> > >> >
>> > >>
>> > >> The both of the above timeout should never occur. If they occurred it would be a
>> > >> critical controller bug. In this case driver can only inform about this event.
>> > >
>> > >"Should never occur" is a fun thing to say :)
>> > >
>> > >If it can never occur, then don't even check for it.
>> >
>> > Yes, on existing platforms it can never occur.
>> >
>> > >
>> > >If it can, then check for it and handle it properly.
>> > >
>> > >What about this controller in systems with removable busses (like PCI?)
>> > >What happens then (hint, I bet this could occur...)
>> >
>> > It's good question. Nobody from our customer currently use such system.
>> > The only platform with PCI is used by me for testing purpose.
>>
>> So if you do have a PCI device, then you need to handle PCI reads
>> failing and returning all 1s. Hopefully you can gracefully handle this :)
>>
>> Adding timeout handling here, where it is totally obvious to do so,
>> would be a good thing.
>>
>> thanks,
>>
>> greg k-h
>
>Hi Pawel,
>
>My email is: [email protected], please change it when your send next version.

Hi Peter,

I'm sorry for that.

BTW. I need postpone sending the next version of this patch because there is some issue with synchronization of repositories regarding the patch "usb: cdns3: Add streams support to cadence USB3 DRD driver"
I don't want generate new conflicts.

Regards,
Pawell
>
>Peter

2020-01-14 09:11:02

by Pawel Laszczak

[permalink] [raw]
Subject: RE: [PATCH] usb: cdns3: Fix: ARM core hang after connect/disconnect operation.

>
>On Thu, Jan 9, 2020 at 5:39 PM Greg KH <[email protected]> wrote:
>>
>> On Thu, Jan 09, 2020 at 08:34:12AM +0000, Pawel Laszczak wrote:
>> > >
>> > >On Thu, Jan 09, 2020 at 06:27:02AM +0000, Pawel Laszczak wrote:
>> > >> >> + writel(EP_CMD_EPRST, &priv_dev->regs->ep_cmd);
>> > >> >> +
>> > >> >> + ret = readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
>> > >> >> + !(val & (EP_CMD_CSTALL | EP_CMD_EPRST)),
>> > >> >> + 1, 1000);
>> > >> >> +
>> > >> >> + if (unlikely(ret))
>> > >> >
>> > >> >Unless you can measure the difference of using/not using a
>> > >> >unlikely/likely mark, NEVER use it. The compiler and cpu can almost
>> > >> >always do better than you can, we have the tests to prove it.
>> > >> >
>> > >>
>> > >> The both of the above timeout should never occur. If they occurred it would be a
>> > >> critical controller bug. In this case driver can only inform about this event.
>> > >
>> > >"Should never occur" is a fun thing to say :)
>> > >
>> > >If it can never occur, then don't even check for it.
>> >
>> > Yes, on existing platforms it can never occur.
>> >
>> > >
>> > >If it can, then check for it and handle it properly.
>> > >
>> > >What about this controller in systems with removable busses (like PCI?)
>> > >What happens then (hint, I bet this could occur...)
>> >
>> > It's good question. Nobody from our customer currently use such system.
>> > The only platform with PCI is used by me for testing purpose.
>>
>> So if you do have a PCI device, then you need to handle PCI reads
>> failing and returning all 1s. Hopefully you can gracefully handle this :)
>>
>> Adding timeout handling here, where it is totally obvious to do so,
>> would be a good thing.
>>
>> thanks,
>>
>> greg k-h
>
>Hi Pawel,
>
>My email is: [email protected], please change it when your send next version.

Hi Peter,

I'm sorry for that.

BTW. I need postpone sending the next version of this patch because there is some issue with synchronization of repositories regarding the patch "usb: cdns3: Add streams support to cadence USB3 DRD driver"
I don't want generate new conflicts.

Regards,
Pawell

>
>Peter