2018-02-27 11:24:43

by Roger Quadros

Subject: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

In the following test we get stuck by sleeping forever in _dwc3_set_mode()
after which dual-role switching doesn't work.

On dra7-evm's dual-role port,
- Load the g_zero gadget driver and enumerate to the host
- suspend to mem
- disconnect the USB cable to the host and connect an OTG cable with a pen drive in it
- resume the system
- we sleep indefinitely in _dwc3_set_mode() due to:
dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
dwc3_gadget_stop()->wait_event_lock_irq()

Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
so we don't wait in dwc3_gadget_stop().

Signed-off-by: Roger Quadros <[email protected]>
---
drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 2bda4eb..0a360da 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)

void dwc3_gadget_exit(struct dwc3 *dwc)
{
+ int epnum;
+ unsigned long flags;
+
+ spin_lock_irqsave(&dwc->lock, flags);
+ for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
+ struct dwc3_ep *dep = dwc->eps[epnum];
+
+ if (!dep)
+ continue;
+
+ dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
+ }
+ spin_unlock_irqrestore(&dwc->lock, flags);
+
usb_del_gadget_udc(&dwc->gadget);
dwc3_gadget_free_endpoints(dwc);
dma_free_coherent(dwc->sysdev, DWC3_BOUNCE_SIZE, dwc->bounce,
--
cheers,
-roger

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki



2018-02-28 03:05:42

by Baolin Wang

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

Hi Roger,

On 27 February 2018 at 19:22, Roger Quadros <[email protected]> wrote:
> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
> after which dual-role switching doesn't work.
>
> On dra7-evm's dual-role port,
> - Load g_zero gadget driver and enumerate to host
> - suspend to mem
> - disconnect USB cable to host and connect otg cable with Pen drive in it.
> - resume system
> - we sleep indefinitely in _dwc3_set_mode due to.
> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
> dwc3_gadget_stop()->wait_event_lock_irq()
>
> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
> so we don't wait in dwc3_gadget_stop().

I am curious why the DWC3_DEPEVT_EPCMDCMPLT event was not triggered
any more when you executed the DWC3_DEPCMD_ENDTRANSFER command?

>
> Signed-off-by: Roger Quadros <[email protected]>
> ---
> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 2bda4eb..0a360da 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)
>
> void dwc3_gadget_exit(struct dwc3 *dwc)
> {
> + int epnum;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&dwc->lock, flags);
> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
> + struct dwc3_ep *dep = dwc->eps[epnum];
> +
> + if (!dep)
> + continue;
> +
> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
> + }
> + spin_unlock_irqrestore(&dwc->lock, flags);
> +
> usb_del_gadget_udc(&dwc->gadget);
> dwc3_gadget_free_endpoints(dwc);
> dma_free_coherent(dwc->sysdev, DWC3_BOUNCE_SIZE, dwc->bounce,
> --

--
Baolin.wang
Best Regards

2018-02-28 07:55:46

by Felipe Balbi

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume


Hi,

Roger Quadros <[email protected]> writes:
> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
> after which dual-role switching doesn't work.
>
> On dra7-evm's dual-role port,
> - Load g_zero gadget driver and enumerate to host
> - suspend to mem
> - disconnect USB cable to host and connect otg cable with Pen drive in it.
> - resume system
> - we sleep indefinitely in _dwc3_set_mode due to.
> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
> dwc3_gadget_stop()->wait_event_lock_irq()
>
> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
> so we don't wait in dwc3_gadget_stop().
>
> Signed-off-by: Roger Quadros <[email protected]>
> ---
> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 2bda4eb..0a360da 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)
>
> void dwc3_gadget_exit(struct dwc3 *dwc)
> {
> + int epnum;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&dwc->lock, flags);
> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
> + struct dwc3_ep *dep = dwc->eps[epnum];
> +
> + if (!dep)
> + continue;
> +
> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
> + }
> + spin_unlock_irqrestore(&dwc->lock, flags);
> +
> usb_del_gadget_udc(&dwc->gadget);
> dwc3_gadget_free_endpoints(dwc);

free endpoints is a better place for this. It's already going to free
the memory anyway. Might as well clear all flags to 0 there.

--
balbi



2018-02-28 09:56:32

by Roger Quadros

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

Hi Baolin,

On 28/02/18 05:04, Baolin Wang wrote:
> Hi Roger,
>
> On 27 February 2018 at 19:22, Roger Quadros <[email protected]> wrote:
>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>> after which dual-role switching doesn't work.
>>
>> On dra7-evm's dual-role port,
>> - Load g_zero gadget driver and enumerate to host
>> - suspend to mem
>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>> - resume system
>> - we sleep indefinitely in _dwc3_set_mode due to.
>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>> dwc3_gadget_stop()->wait_event_lock_irq()
>>
>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
>> so we don't wait in dwc3_gadget_stop().
>
> I am curious why the DWC3_DEPEVT_EPCMDCMPLT event was not triggered
> any more when you executed the DWC3_DEPCMD_ENDTRANSFER command?

In this particular case the USB gadget has been disconnected from the host so
we shouldn't be expecting any command completion events.

>
>>
>> Signed-off-by: Roger Quadros <[email protected]>
>> ---
>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
>> 1 file changed, 14 insertions(+)
>>
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index 2bda4eb..0a360da 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)
>>
>> void dwc3_gadget_exit(struct dwc3 *dwc)
>> {
>> + int epnum;
>> + unsigned long flags;
>> +
>> + spin_lock_irqsave(&dwc->lock, flags);
>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>> + struct dwc3_ep *dep = dwc->eps[epnum];
>> +
>> + if (!dep)
>> + continue;
>> +
>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
>> + }
>> + spin_unlock_irqrestore(&dwc->lock, flags);
>> +
>> usb_del_gadget_udc(&dwc->gadget);
>> dwc3_gadget_free_endpoints(dwc);
>> dma_free_coherent(dwc->sysdev, DWC3_BOUNCE_SIZE, dwc->bounce,
>> --
>

--
cheers,
-roger


2018-02-28 10:00:45

by Roger Quadros

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

Felipe,

On 28/02/18 09:53, Felipe Balbi wrote:
>
> Hi,
>
> Roger Quadros <[email protected]> writes:
>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>> after which dual-role switching doesn't work.
>>
>> On dra7-evm's dual-role port,
>> - Load g_zero gadget driver and enumerate to host
>> - suspend to mem
>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>> - resume system
>> - we sleep indefinitely in _dwc3_set_mode due to.
>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>> dwc3_gadget_stop()->wait_event_lock_irq()
>>
>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
>> so we don't wait in dwc3_gadget_stop().
>>
>> Signed-off-by: Roger Quadros <[email protected]>
>> ---
>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
>> 1 file changed, 14 insertions(+)
>>
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index 2bda4eb..0a360da 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)
>>
>> void dwc3_gadget_exit(struct dwc3 *dwc)
>> {
>> + int epnum;
>> + unsigned long flags;
>> +
>> + spin_lock_irqsave(&dwc->lock, flags);
>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>> + struct dwc3_ep *dep = dwc->eps[epnum];
>> +
>> + if (!dep)
>> + continue;
>> +
>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
>> + }
>> + spin_unlock_irqrestore(&dwc->lock, flags);
>> +
>> usb_del_gadget_udc(&dwc->gadget);
>> dwc3_gadget_free_endpoints(dwc);
>
> free endpoints is a better place for this. It's already going to free
> the memory anyway. Might as well clear all flags to 0 there.
>

But that won't solve the deadlock, because dwc3_gadget_free_endpoints()
is called after usb_del_gadget_udc(), and the deadlock happens in

usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq()

when the DWC3_EP_END_TRANSFER_PENDING flag is set.
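
The ordering argument above can be illustrated with a minimal user-space
sketch. It is not driver code: the model_* names, the endpoint count and the
flag bit value are placeholders invented for illustration, and the real wait
in dwc3_gadget_stop() is only represented by a predicate that says whether
the stop path would have to wait.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_ENDPOINTS_NUM            32         /* placeholder endpoint count */
#define MODEL_EP_END_TRANSFER_PENDING  (1u << 0)  /* placeholder flag bit */

struct model_ep {
	unsigned int flags;
};

struct model_dwc {
	struct model_ep *eps[MODEL_ENDPOINTS_NUM];
};

/*
 * Stand-in for the check made on the stop path: returns true if any
 * non-control endpoint (index 2 and up) still has the pending flag set,
 * i.e. the real code would go to sleep waiting for it to clear.
 */
bool model_stop_would_wait(const struct model_dwc *dwc)
{
	assert(dwc != NULL);

	for (size_t epnum = 2; epnum < MODEL_ENDPOINTS_NUM; epnum++) {
		const struct model_ep *dep = dwc->eps[epnum];

		if (dep && (dep->flags & MODEL_EP_END_TRANSFER_PENDING))
			return true;
	}

	return false;
}

/* Mirrors the loop in the patch: clear the flag on every allocated endpoint. */
void model_clear_pending(struct model_dwc *dwc)
{
	assert(dwc != NULL);

	for (size_t epnum = 2; epnum < MODEL_ENDPOINTS_NUM; epnum++) {
		struct model_ep *dep = dwc->eps[epnum];

		if (!dep)
			continue;

		dep->flags &= ~MODEL_EP_END_TRANSFER_PENDING;
	}
}
```

In this model the clear only helps if it runs before the stop-path check,
which is why doing it in the free-endpoints step would come too late.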

--
cheers,
-roger


2018-03-05 09:02:02

by Felipe Balbi

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume


Hi,

Roger Quadros <[email protected]> writes:
>> Roger Quadros <[email protected]> writes:
>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>>> after which dual-role switching doesn't work.
>>>
>>> On dra7-evm's dual-role port,
>>> - Load g_zero gadget driver and enumerate to host
>>> - suspend to mem
>>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>>> - resume system
>>> - we sleep indefinitely in _dwc3_set_mode due to.
>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>>> dwc3_gadget_stop()->wait_event_lock_irq()
>>>
>>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
>>> so we don't wait in dwc3_gadget_stop().
>>>
>>> Signed-off-by: Roger Quadros <[email protected]>
>>> ---
>>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
>>> 1 file changed, 14 insertions(+)
>>>
>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>> index 2bda4eb..0a360da 100644
>>> --- a/drivers/usb/dwc3/gadget.c
>>> +++ b/drivers/usb/dwc3/gadget.c
>>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)
>>>
>>> void dwc3_gadget_exit(struct dwc3 *dwc)
>>> {
>>> + int epnum;
>>> + unsigned long flags;
>>> +
>>> + spin_lock_irqsave(&dwc->lock, flags);
>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>>> + struct dwc3_ep *dep = dwc->eps[epnum];
>>> +
>>> + if (!dep)
>>> + continue;
>>> +
>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
>>> + }
>>> + spin_unlock_irqrestore(&dwc->lock, flags);
>>> +
>>> usb_del_gadget_udc(&dwc->gadget);
>>> dwc3_gadget_free_endpoints(dwc);
>>
>> free endpoints is a better place for this. It's already going to free
>> the memory anyway. Might as well clear all flags to 0 there.
>>
>
> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints()
> is called after usb_del_gadget_udc() and the deadlock happens when
>
> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq()
>
> and DWC3_EP_END_TRANSFER_PENDING flag is set.

indeed. Iterating twice over the entire endpoint list seems
wasteful. Perhaps we just shouldn't wait when removing the UDC since
that's essentially what this patch will do, right? If you clear the flag
before calling ->udc_stop(), this means the loop in dwc3_gadget_stop()
will do nothing. Might as well remove it.

--
balbi



2018-03-05 09:58:44

by Roger Quadros

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

Felipe,

On 05/03/18 10:49, Felipe Balbi wrote:
>
> Hi,
>
> Roger Quadros <[email protected]> writes:
>>> Roger Quadros <[email protected]> writes:
>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>>>> after which dual-role switching doesn't work.
>>>>
>>>> On dra7-evm's dual-role port,
>>>> - Load g_zero gadget driver and enumerate to host
>>>> - suspend to mem
>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>>>> - resume system
>>>> - we sleep indefinitely in _dwc3_set_mode due to.
>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>>>> dwc3_gadget_stop()->wait_event_lock_irq()
>>>>
>>>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
>>>> so we don't wait in dwc3_gadget_stop().
>>>>
>>>> Signed-off-by: Roger Quadros <[email protected]>
>>>> ---
>>>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
>>>> 1 file changed, 14 insertions(+)
>>>>
>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>> index 2bda4eb..0a360da 100644
>>>> --- a/drivers/usb/dwc3/gadget.c
>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)
>>>>
>>>> void dwc3_gadget_exit(struct dwc3 *dwc)
>>>> {
>>>> + int epnum;
>>>> + unsigned long flags;
>>>> +
>>>> + spin_lock_irqsave(&dwc->lock, flags);
>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>>>> + struct dwc3_ep *dep = dwc->eps[epnum];
>>>> +
>>>> + if (!dep)
>>>> + continue;
>>>> +
>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
>>>> + }
>>>> + spin_unlock_irqrestore(&dwc->lock, flags);
>>>> +
>>>> usb_del_gadget_udc(&dwc->gadget);
>>>> dwc3_gadget_free_endpoints(dwc);
>>>
>>> free endpoints is a better place for this. It's already going to free
>>> the memory anyway. Might as well clear all flags to 0 there.
>>>
>>
>> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints()
>> is called after usb_del_gadget_udc() and the deadlock happens when
>>
>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq()
>>
>> and DWC3_EP_END_TRANSFER_PENDING flag is set.
>
> indeed. Iterating twice over the entire endpoint list seems
> wasteful. Perhaps we just shouldn't wait when removing the UDC since
> that's essentially what this patch will do, right? If you clear the flag
> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop()
> will do nothing. Might as well remove it.
>

This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear
in dwc3_gadget_stop() like we used to. This is perfectly fine, right?

It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which
masks all interrupts and nobody will ever clear that flag if it was set.

And there is no point in clearing the DWC3_EP_END_TRANSFER_PENDING flag
in dwc3_gadget_free_endpoints() since we're freeing the dwc3_ep memory there.

dwc3_gadget_init_endpoints() will start with a clean slate.

--
cheers,
-roger


2018-03-05 10:44:38

by Baolin Wang

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

Hi Roger,

On 5 March 2018 at 17:45, Roger Quadros <[email protected]> wrote:
> Felipe,
>
> On 05/03/18 10:49, Felipe Balbi wrote:
>>
>> Hi,
>>
>> Roger Quadros <[email protected]> writes:
>>>> Roger Quadros <[email protected]> writes:
>>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>>>>> after which dual-role switching doesn't work.
>>>>>
>>>>> On dra7-evm's dual-role port,
>>>>> - Load g_zero gadget driver and enumerate to host
>>>>> - suspend to mem
>>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>>>>> - resume system
>>>>> - we sleep indefinitely in _dwc3_set_mode due to.
>>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>>>>> dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>
>>>>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
>>>>> so we don't wait in dwc3_gadget_stop().
>>>>>
>>>>> Signed-off-by: Roger Quadros <[email protected]>
>>>>> ---
>>>>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
>>>>> 1 file changed, 14 insertions(+)
>>>>>
>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>> index 2bda4eb..0a360da 100644
>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)
>>>>>
>>>>> void dwc3_gadget_exit(struct dwc3 *dwc)
>>>>> {
>>>>> + int epnum;
>>>>> + unsigned long flags;
>>>>> +
>>>>> + spin_lock_irqsave(&dwc->lock, flags);
>>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>>>>> + struct dwc3_ep *dep = dwc->eps[epnum];
>>>>> +
>>>>> + if (!dep)
>>>>> + continue;
>>>>> +
>>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
>>>>> + }
>>>>> + spin_unlock_irqrestore(&dwc->lock, flags);
>>>>> +
>>>>> usb_del_gadget_udc(&dwc->gadget);
>>>>> dwc3_gadget_free_endpoints(dwc);
>>>>
>>>> free endpoints is a better place for this. It's already going to free
>>>> the memory anyway. Might as well clear all flags to 0 there.
>>>>
>>>
>>> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints()
>>> is called after usb_del_gadget_udc() and the deadlock happens when
>>>
>>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq()
>>>
>>> and DWC3_EP_END_TRANSFER_PENDING flag is set.
>>
>> indeed. Iterating twice over the entire endpoint list seems
>> wasteful. Perhaps we just shouldn't wait when removing the UDC since
>> that's essentially what this patch will do, right? If you clear the flag
>> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop()
>> will do nothing. Might as well remove it.
>>
>
> This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear
> in dwc3_gadget_stop() like we used to. This is perfectly fine, right?
>
> It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which
> masks all interrupts and nobody will ever clear that flag if it was set.

I don't think so. It cannot mask the endpoint events; please check
which events are masked by the DEVTEN register. The reason we
should wait for DWC3_EP_END_TRANSFER_PENDING to clear is that
the DWC3_DEPEVT_EPCMDCMPLT event is sometimes triggered more than
100us later, by which time we may have freed the gadget IRQ, which
would cause a crash.

--
Baolin.wang
Best Regards

2018-03-05 11:54:29

by Felipe Balbi

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume


Hi,

Roger Quadros <[email protected]> writes:
> On 05/03/18 13:06, Felipe Balbi wrote:
>>
>> Hi,
>>
>> Baolin Wang <[email protected]> writes:
>>>>> Roger Quadros <[email protected]> writes:
>>>>>>> Roger Quadros <[email protected]> writes:
>>>>>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>>>>>>>> after which dual-role switching doesn't work.
>>>>>>>>
>>>>>>>> On dra7-evm's dual-role port,
>>>>>>>> - Load g_zero gadget driver and enumerate to host
>>>>>>>> - suspend to mem
>>>>>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>>>>>>>> - resume system
>>>>>>>> - we sleep indefinitely in _dwc3_set_mode due to.
>>>>>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>>>>>>>> dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>>>>
>>>>>>>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
>>>>>>>> so we don't wait in dwc3_gadget_stop().
>>>>>>>>
>>>>>>>> Signed-off-by: Roger Quadros <[email protected]>
>>>>>>>> ---
>>>>>>>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
>>>>>>>> 1 file changed, 14 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>>>>> index 2bda4eb..0a360da 100644
>>>>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>>>>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)
>>>>>>>>
>>>>>>>> void dwc3_gadget_exit(struct dwc3 *dwc)
>>>>>>>> {
>>>>>>>> + int epnum;
>>>>>>>> + unsigned long flags;
>>>>>>>> +
>>>>>>>> + spin_lock_irqsave(&dwc->lock, flags);
>>>>>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>>>>>>>> + struct dwc3_ep *dep = dwc->eps[epnum];
>>>>>>>> +
>>>>>>>> + if (!dep)
>>>>>>>> + continue;
>>>>>>>> +
>>>>>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
>>>>>>>> + }
>>>>>>>> + spin_unlock_irqrestore(&dwc->lock, flags);
>>>>>>>> +
>>>>>>>> usb_del_gadget_udc(&dwc->gadget);
>>>>>>>> dwc3_gadget_free_endpoints(dwc);
>>>>>>>
>>>>>>> free endpoints is a better place for this. It's already going to free
>>>>>>> the memory anyway. Might as well clear all flags to 0 there.
>>>>>>>
>>>>>>
>>>>>> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints()
>>>>>> is called after usb_del_gadget_udc() and the deadlock happens when
>>>>>>
>>>>>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>>
>>>>>> and DWC3_EP_END_TRANSFER_PENDING flag is set.
>>>>>
>>>>> indeed. Iterating twice over the entire endpoint list seems
>>>>> wasteful. Perhaps we just shouldn't wait when removing the UDC since
>>>>> that's essentially what this patch will do, right? If you clear the flag
>>>>> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop()
>>>>> will do nothing. Might as well remove it.
>>>>>
>>>>
>>>> This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear
>>>> in dwc3_gadget_stop() like we used to. This is perfectly fine, right?
>>>>
>>>> It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which
>>>> masks all interrupts and nobody will ever clear that flag if it was set.
>>>
>>> I don't think so. It can not mask the endpoint events, please check
>>> the events which will be masked in DEVTEN register. The reason why we
>>> should wait for DWC3_EP_END_TRANSFER_PENDING to clear is that,
>>> sometimes the DWC3_DEPEVT_EPCMDCMPLT event will be triggered later
>>> than 100us, but now we may have freed the gadget irq which will cause
>>> crash.
>>
>> We could mask command complete events as soon as ->udc_stop() is called,
>> right? Hmm, actually, __dwc3_gadget_stop() already clears DEVTEN
>> completely.
>
> But which bit in DEVTEN says Endpoint events are disabled?
>
>>
>> /me goes check databook
>>
>> At least on revision 2.60a of the databook, bit 10 is reserved. I wonder
>> if that's the start of all the problems. Anybody has access to older and
>> newer databook revisions so we can cross-check?
>>
>
> I can access v2.40 and v3.10 books.
>
> bit 10 is reserved on both
>
> Differences in v2.4 vs v3.10 are:
>
> bit 8 reserved vs L1SUSPEN
> bit 13 reserved vs StopOnDisconnectEn
> bit 14 reserved vs L1WKUPEVTEN

odd, at some point we lost command complete interrupt :-(

That line exists since first commit (see below), so that would mean it
existed in 1.73a (the revision the original was written against), but
vanished on 2.40a. Perhaps 2.00a still had it.

Hey John, do you know, off the top of your head, when we lost DEVTEN[10]
as mask/unmask bit for EP Command Completion IRQs?


commit 72246da40f3719af3bfd104a2365b32537c27d83
Author: Felipe Balbi <[email protected]>
Date: Fri Aug 19 18:10:58 2011 +0300

usb: Introduce DesignWare USB3 DRD Driver
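
The databook comparison quoted above can be summarised as a small lookup.
This is only a sketch of what the thread reports for DEVTEN bits 8, 10, 13
and 14 in the v2.40 and v3.10 databooks; the enum and function names are
hypothetical, and no other bits or revisions are covered.

```c
#include <assert.h>
#include <stdbool.h>

/* Databook revisions compared in the thread; the enum names are hypothetical. */
enum databook_rev {
	DATABOOK_V2_40,
	DATABOOK_V3_10,
};

/*
 * Returns true if the thread reports DEVTEN bit `bit` as reserved in `rev`.
 * Only bits 8, 10, 13 and 14 are covered; every other bit returns false
 * simply because the thread does not discuss it.
 */
bool devten_bit_reported_reserved(enum databook_rev rev, unsigned int bit)
{
	assert(bit < 32);	/* DEVTEN is treated as a 32-bit register here */

	switch (bit) {
	case 10:
		return true;			/* reserved in both revisions */
	case 8:					/* L1SUSPEN in v3.10 */
	case 13:				/* StopOnDisconnectEn in v3.10 */
	case 14:				/* L1WKUPEVTEN in v3.10 */
		return rev == DATABOOK_V2_40;
	default:
		return false;
	}
}
```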

--
balbi



2018-03-05 11:57:35

by Felipe Balbi

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume


Hi,

Baolin Wang <[email protected]> writes:
>>>>>>>>> void dwc3_gadget_exit(struct dwc3 *dwc)
>>>>>>>>> {
>>>>>>>>> + int epnum;
>>>>>>>>> + unsigned long flags;
>>>>>>>>> +
>>>>>>>>> + spin_lock_irqsave(&dwc->lock, flags);
>>>>>>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>>>>>>>>> + struct dwc3_ep *dep = dwc->eps[epnum];
>>>>>>>>> +
>>>>>>>>> + if (!dep)
>>>>>>>>> + continue;
>>>>>>>>> +
>>>>>>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
>>>>>>>>> + }
>>>>>>>>> + spin_unlock_irqrestore(&dwc->lock, flags);
>>>>>>>>> +
>>>>>>>>> usb_del_gadget_udc(&dwc->gadget);
>>>>>>>>> dwc3_gadget_free_endpoints(dwc);
>>>>>>>>
>>>>>>>> free endpoints is a better place for this. It's already going to free
>>>>>>>> the memory anyway. Might as well clear all flags to 0 there.
>>>>>>>>
>>>>>>>
>>>>>>> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints()
>>>>>>> is called after usb_del_gadget_udc() and the deadlock happens when
>>>>>>>
>>>>>>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>>>
>>>>>>> and DWC3_EP_END_TRANSFER_PENDING flag is set.
>>>>>>
>>>>>> indeed. Iterating twice over the entire endpoint list seems
>>>>>> wasteful. Perhaps we just shouldn't wait when removing the UDC since
>>>>>> that's essentially what this patch will do, right? If you clear the flag
>>>>>> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop()
>>>>>> will do nothing. Might as well remove it.
>>>>>>
>>>>>
>>>>> This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear
>>>>> in dwc3_gadget_stop() like we used to. This is perfectly fine, right?
>>>>>
>>>>> It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which
>>>>> masks all interrupts and nobody will ever clear that flag if it was set.
>>>>
>>>> I don't think so. It can not mask the endpoint events, please check
>>>> the events which will be masked in DEVTEN register. The reason why we
>>>> should wait for DWC3_EP_END_TRANSFER_PENDING to clear is that,
>>>> sometimes the DWC3_DEPEVT_EPCMDCMPLT event will be triggered later
>>>> than 100us, but now we may have freed the gadget irq which will cause
>>>> crash.
>>>
>>> We could mask command complete events as soon as ->udc_stop() is called,
>>> right? Hmm, actually, __dwc3_gadget_stop() already clears DEVTEN
>>> completely.
>>
>> But which bit in DEVTEN says Endpoint events are disabled?
>
> When we set up the DWC3_DEPCMD_ENDTRANSFER command in
> dwc3_stop_active_transfer(), we can do not set DWC3_DEPCMD_CMDIOC,
> then there will no endpoint command complete interrupts I think.
>
> cmd |= DWC3_DEPCMD_CMDIOC;

I remember some part of the databook mandating CMDIOC to be set. We
could test it out without and see if anything blows up. I would,
however, require a lengthy comment explaining that we're deviating from
databook revision x.yya, section foobar because $reasons. :-)
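
To make the CMDIOC experiment concrete, here is a minimal user-space sketch
of how the End Transfer command word could be composed with and without the
interrupt-on-completion bit. The macro values are assumed to mirror
drivers/usb/dwc3/core.h and should be checked against the actual tree;
build_end_transfer_cmd() and its want_cmdioc parameter are hypothetical and
exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror drivers/usb/dwc3/core.h; verify against your tree. */
#define DWC3_DEPCMD_ENDTRANSFER		(0x08u << 0)
#define DWC3_DEPCMD_CMDIOC		(1u << 8)
#define DWC3_DEPCMD_HIPRI_FORCERM	(1u << 11)
#define DWC3_DEPCMD_PARAM(x)		((uint32_t)(x) << 16)

/*
 * Hypothetical helper: compose the End Transfer command word.
 * With want_cmdioc == false the CMDIOC bit is left clear, which is the
 * variant suggested in the thread for testing.
 */
uint32_t build_end_transfer_cmd(bool force, bool want_cmdioc,
				uint8_t resource_index)
{
	uint32_t cmd = DWC3_DEPCMD_ENDTRANSFER;

	/* Assumption: the transfer resource index fits in 7 bits. */
	assert(resource_index <= 0x7f);

	if (force)
		cmd |= DWC3_DEPCMD_HIPRI_FORCERM;
	if (want_cmdioc)
		cmd |= DWC3_DEPCMD_CMDIOC;
	cmd |= DWC3_DEPCMD_PARAM(resource_index);

	return cmd;
}
```

Whether the controller tolerates End Transfer without CMDIOC is exactly the
open question here; the sketch only shows which bit would change.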

--
balbi



2018-03-05 13:41:47

by Felipe Balbi

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume


Hi,

Baolin Wang <[email protected]> writes:
>>> Roger Quadros <[email protected]> writes:
>>>>> Roger Quadros <[email protected]> writes:
>>>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>>>>>> after which dual-role switching doesn't work.
>>>>>>
>>>>>> On dra7-evm's dual-role port,
>>>>>> - Load g_zero gadget driver and enumerate to host
>>>>>> - suspend to mem
>>>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>>>>>> - resume system
>>>>>> - we sleep indefinitely in _dwc3_set_mode due to.
>>>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>>>>>> dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>>
>>>>>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
>>>>>> so we don't wait in dwc3_gadget_stop().
>>>>>>
>>>>>> Signed-off-by: Roger Quadros <[email protected]>
>>>>>> ---
>>>>>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
>>>>>> 1 file changed, 14 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>>> index 2bda4eb..0a360da 100644
>>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)
>>>>>>
>>>>>> void dwc3_gadget_exit(struct dwc3 *dwc)
>>>>>> {
>>>>>> + int epnum;
>>>>>> + unsigned long flags;
>>>>>> +
>>>>>> + spin_lock_irqsave(&dwc->lock, flags);
>>>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>>>>>> + struct dwc3_ep *dep = dwc->eps[epnum];
>>>>>> +
>>>>>> + if (!dep)
>>>>>> + continue;
>>>>>> +
>>>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
>>>>>> + }
>>>>>> + spin_unlock_irqrestore(&dwc->lock, flags);
>>>>>> +
>>>>>> usb_del_gadget_udc(&dwc->gadget);
>>>>>> dwc3_gadget_free_endpoints(dwc);
>>>>>
>>>>> free endpoints is a better place for this. It's already going to free
>>>>> the memory anyway. Might as well clear all flags to 0 there.
>>>>>
>>>>
>>>> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints()
>>>> is called after usb_del_gadget_udc() and the deadlock happens when
>>>>
>>>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq()
>>>>
>>>> and DWC3_EP_END_TRANSFER_PENDING flag is set.
>>>
>>> indeed. Iterating twice over the entire endpoint list seems
>>> wasteful. Perhaps we just shouldn't wait when removing the UDC since
>>> that's essentially what this patch will do, right? If you clear the flag
>>> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop()
>>> will do nothing. Might as well remove it.
>>>
>>
>> This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear
>> in dwc3_gadget_stop() like we used to. This is perfectly fine, right?
>>
>> It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which
>> masks all interrupts and nobody will ever clear that flag if it was set.
>
> I don't think so. It can not mask the endpoint events, please check
> the events which will be masked in DEVTEN register. The reason why we
> should wait for DWC3_EP_END_TRANSFER_PENDING to clear is that,
> sometimes the DWC3_DEPEVT_EPCMDCMPLT event will be triggered later
> than 100us, but now we may have freed the gadget irq which will cause
> crash.

We could mask command complete events as soon as ->udc_stop() is called,
right? Hmm, actually, __dwc3_gadget_stop() already clears DEVTEN
completely.

/me goes check databook

At least on revision 2.60a of the databook, bit 10 is reserved. I wonder
if that's the start of all the problems. Does anybody have access to older
and newer databook revisions so we can cross-check?

best

--
balbi



2018-03-05 13:43:58

by Roger Quadros

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

On 05/03/18 13:06, Felipe Balbi wrote:
>
> Hi,
>
> Baolin Wang <[email protected]> writes:
>>>> Roger Quadros <[email protected]> writes:
>>>>>> Roger Quadros <[email protected]> writes:
>>>>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>>>>>>> after which dual-role switching doesn't work.
>>>>>>>
>>>>>>> On dra7-evm's dual-role port,
>>>>>>> - Load g_zero gadget driver and enumerate to host
>>>>>>> - suspend to mem
>>>>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>>>>>>> - resume system
>>>>>>> - we sleep indefinitely in _dwc3_set_mode due to.
>>>>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>>>>>>> dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>>>
>>>>>>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
>>>>>>> so we don't wait in dwc3_gadget_stop().
>>>>>>>
>>>>>>> Signed-off-by: Roger Quadros <[email protected]>
>>>>>>> ---
>>>>>>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
>>>>>>> 1 file changed, 14 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>>>> index 2bda4eb..0a360da 100644
>>>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>>>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)
>>>>>>>
>>>>>>> void dwc3_gadget_exit(struct dwc3 *dwc)
>>>>>>> {
>>>>>>> + int epnum;
>>>>>>> + unsigned long flags;
>>>>>>> +
>>>>>>> + spin_lock_irqsave(&dwc->lock, flags);
>>>>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>>>>>>> + struct dwc3_ep *dep = dwc->eps[epnum];
>>>>>>> +
>>>>>>> + if (!dep)
>>>>>>> + continue;
>>>>>>> +
>>>>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
>>>>>>> + }
>>>>>>> + spin_unlock_irqrestore(&dwc->lock, flags);
>>>>>>> +
>>>>>>> usb_del_gadget_udc(&dwc->gadget);
>>>>>>> dwc3_gadget_free_endpoints(dwc);
>>>>>>
>>>>>> free endpoints is a better place for this. It's already going to free
>>>>>> the memory anyway. Might as well clear all flags to 0 there.
>>>>>>
>>>>>
>>>>> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints()
>>>>> is called after usb_del_gadget_udc() and the deadlock happens when
>>>>>
>>>>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>
>>>>> and DWC3_EP_END_TRANSFER_PENDING flag is set.
>>>>
>>>> indeed. Iterating twice over the entire endpoint list seems
>>>> wasteful. Perhaps we just shouldn't wait when removing the UDC since
>>>> that's essentially what this patch will do, right? If you clear the flag
>>>> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop()
>>>> will do nothing. Might as well remove it.
>>>>
>>>
>>> This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear
>>> in dwc3_gadget_stop() like we used to. This is perfectly fine, right?
>>>
>>> It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which
>>> masks all interrupts and nobody will ever clear that flag if it was set.
>>
>> I don't think so. It can not mask the endpoint events, please check
>> the events which will be masked in DEVTEN register. The reason why we
>> should wait for DWC3_EP_END_TRANSFER_PENDING to clear is that,
>> sometimes the DWC3_DEPEVT_EPCMDCMPLT event will be triggered later
>> than 100us, but now we may have freed the gadget irq which will cause
>> crash.
>
> We could mask command complete events as soon as ->udc_stop() is called,
> right? Hmm, actually, __dwc3_gadget_stop() already clears DEVTEN
> completely.

But which bit in DEVTEN says Endpoint events are disabled?

>
> /me goes check databook
>
> At least on revision 2.60a of the databook, bit 10 is reserved. I wonder
> if that's the start of all the problems. Anybody has access to older and
> newer databook revisions so we can cross-check?
>

I have access to the v2.40 and v3.10 databooks.

Bit 10 is reserved in both.

The differences between v2.40 and v3.10 are:

bit 8: reserved vs L1SUSPEN
bit 13: reserved vs StopOnDisconnectEn
bit 14: reserved vs L1WKUPEVTEN
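
The comparison above can be written down as bitmasks. This is only a
minimal sketch: it models just the DEVTEN bits named in this thread
(8, 10, 13 and 14), and the mask and function names are hypothetical,
not taken from the driver or the databook.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Hypothetical masks: only the DEVTEN bits discussed in this thread. */
#define DEVTEN_RSVD_V240 (BIT(8) | BIT(10) | BIT(13) | BIT(14))
#define DEVTEN_RSVD_V310 (BIT(10))

/* Bits whose reserved status differs between the two revisions. */
static uint32_t devten_rsvd_changed(void)
{
	return DEVTEN_RSVD_V240 ^ DEVTEN_RSVD_V310;
}

/* Bits that are reserved in both revisions. */
static uint32_t devten_rsvd_common(void)
{
	return DEVTEN_RSVD_V240 & DEVTEN_RSVD_V310;
}
```

Under these assumptions the changed set is bits 8, 13 and 14, and the
common set is bit 10 alone.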

--
cheers,
-roger

2018-03-05 13:54:43

by Roger Quadros

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

On 05/03/18 12:41, Baolin Wang wrote:
> Hi Roger,
>
> On 5 March 2018 at 17:45, Roger Quadros <[email protected]> wrote:
>> Felipe,
>>
>> On 05/03/18 10:49, Felipe Balbi wrote:
>>>
>>> Hi,
>>>
>>> Roger Quadros <[email protected]> writes:
>>>>> Roger Quadros <[email protected]> writes:
>>>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>>>>>> after which dual-role switching doesn't work.
>>>>>>
>>>>>> On dra7-evm's dual-role port,
>>>>>> - Load g_zero gadget driver and enumerate to host
>>>>>> - suspend to mem
>>>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>>>>>> - resume system
>>>>>> - we sleep indefinitely in _dwc3_set_mode due to.
>>>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>>>>>> dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>>
>>>>>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
>>>>>> so we don't wait in dwc3_gadget_stop().
>>>>>>
>>>>>> Signed-off-by: Roger Quadros <[email protected]>
>>>>>> ---
>>>>>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
>>>>>> 1 file changed, 14 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>>> index 2bda4eb..0a360da 100644
>>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)
>>>>>>
>>>>>> void dwc3_gadget_exit(struct dwc3 *dwc)
>>>>>> {
>>>>>> + int epnum;
>>>>>> + unsigned long flags;
>>>>>> +
>>>>>> + spin_lock_irqsave(&dwc->lock, flags);
>>>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>>>>>> + struct dwc3_ep *dep = dwc->eps[epnum];
>>>>>> +
>>>>>> + if (!dep)
>>>>>> + continue;
>>>>>> +
>>>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
>>>>>> + }
>>>>>> + spin_unlock_irqrestore(&dwc->lock, flags);
>>>>>> +
>>>>>> usb_del_gadget_udc(&dwc->gadget);
>>>>>> dwc3_gadget_free_endpoints(dwc);
>>>>>
>>>>> free endpoints is a better place for this. It's already going to free
>>>>> the memory anyway. Might as well clear all flags to 0 there.
>>>>>
>>>>
>>>> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints()
>>>> is called after usb_del_gadget_udc() and the deadlock happens when
>>>>
>>>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq()
>>>>
>>>> and DWC3_EP_END_TRANSFER_PENDING flag is set.
>>>
>>> indeed. Iterating twice over the entire endpoint list seems
>>> wasteful. Perhaps we just shouldn't wait when removing the UDC since
>>> that's essentially what this patch will do, right? If you clear the flag
>>> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop()
>>> will do nothing. Might as well remove it.
>>>
>>
>> This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear
>> in dwc3_gadget_stop() like we used to. This is perfectly fine, right?
>>
>> It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which
>> masks all interrupts and nobody will ever clear that flag if it was set.
>
> I don't think so. It can not mask the endpoint events, please check
> the events which will be masked in DEVTEN register. The reason why we

Correct, endpoint events are not managed by DEVTEN.

> should wait for DWC3_EP_END_TRANSFER_PENDING to clear is that,
> sometimes the DWC3_DEPEVT_EPCMDCMPLT event will be triggered later
> than 100us, but now we may have freed the gadget irq which will cause
> crash.
>

OK. So what is the right approach here?
In the test case I mentioned in the commit log the endpoint interrupt never
happens and it waits forever in dwc3_gadget_stop().

Since we know we're winding up, can we explicitly disable the endpoint events
here?

--
cheers,
-roger

2018-03-05 13:56:26

by Baolin Wang

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

On 5 March 2018 at 19:14, Roger Quadros <[email protected]> wrote:
> On 05/03/18 13:06, Felipe Balbi wrote:
>>
>> Hi,
>>
>> Baolin Wang <[email protected]> writes:
>>>>> Roger Quadros <[email protected]> writes:
>>>>>>> Roger Quadros <[email protected]> writes:
>>>>>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>>>>>>>> after which dual-role switching doesn't work.
>>>>>>>>
>>>>>>>> On dra7-evm's dual-role port,
>>>>>>>> - Load g_zero gadget driver and enumerate to host
>>>>>>>> - suspend to mem
>>>>>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>>>>>>>> - resume system
>>>>>>>> - we sleep indefinitely in _dwc3_set_mode due to.
>>>>>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>>>>>>>> dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>>>>
>>>>>>>> Let's clear the DWC3_EP_END_TRANSFER_PENDING flag on all endpoints
>>>>>>>> so we don't wait in dwc3_gadget_stop().
>>>>>>>>
>>>>>>>> Signed-off-by: Roger Quadros <[email protected]>
>>>>>>>> ---
>>>>>>>> drivers/usb/dwc3/gadget.c | 14 ++++++++++++++
>>>>>>>> 1 file changed, 14 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>>>>> index 2bda4eb..0a360da 100644
>>>>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>>>>> @@ -3273,6 +3273,20 @@ int dwc3_gadget_init(struct dwc3 *dwc)
>>>>>>>>
>>>>>>>> void dwc3_gadget_exit(struct dwc3 *dwc)
>>>>>>>> {
>>>>>>>> + int epnum;
>>>>>>>> + unsigned long flags;
>>>>>>>> +
>>>>>>>> + spin_lock_irqsave(&dwc->lock, flags);
>>>>>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>>>>>>>> + struct dwc3_ep *dep = dwc->eps[epnum];
>>>>>>>> +
>>>>>>>> + if (!dep)
>>>>>>>> + continue;
>>>>>>>> +
>>>>>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
>>>>>>>> + }
>>>>>>>> + spin_unlock_irqrestore(&dwc->lock, flags);
>>>>>>>> +
>>>>>>>> usb_del_gadget_udc(&dwc->gadget);
>>>>>>>> dwc3_gadget_free_endpoints(dwc);
>>>>>>>
>>>>>>> free endpoints is a better place for this. It's already going to free
>>>>>>> the memory anyway. Might as well clear all flags to 0 there.
>>>>>>>
>>>>>>
>>>>>> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints()
>>>>>> is called after usb_del_gadget_udc() and the deadlock happens when
>>>>>>
>>>>>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>>
>>>>>> and DWC3_EP_END_TRANSFER_PENDING flag is set.
>>>>>
>>>>> indeed. Iterating twice over the entire endpoint list seems
>>>>> wasteful. Perhaps we just shouldn't wait when removing the UDC since
>>>>> that's essentially what this patch will do, right? If you clear the flag
>>>>> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop()
>>>>> will do nothing. Might as well remove it.
>>>>>
>>>>
>>>> This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear
>>>> in dwc3_gadget_stop() like we used to. This is perfectly fine, right?
>>>>
>>>> It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which
>>>> masks all interrupts and nobody will ever clear that flag if it was set.
>>>
>>> I don't think so. It can not mask the endpoint events, please check
>>> the events which will be masked in DEVTEN register. The reason why we
>>> should wait for DWC3_EP_END_TRANSFER_PENDING to clear is that,
>>> sometimes the DWC3_DEPEVT_EPCMDCMPLT event will be triggered later
>>> than 100us, but now we may have freed the gadget irq which will cause
>>> crash.
>>
>> We could mask command complete events as soon as ->udc_stop() is called,
>> right? Hmm, actually, __dwc3_gadget_stop() already clears DEVTEN
>> completely.
>
> But which bit in DEVTEN says Endpoint events are disabled?

When we set up the DWC3_DEPCMD_ENDTRANSFER command in
dwc3_stop_active_transfer(), we can avoid setting DWC3_DEPCMD_CMDIOC;
then there will be no endpoint command complete interrupts, I think.

cmd |= DWC3_DEPCMD_CMDIOC;
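
To make the suggestion concrete, here is a minimal userspace sketch of
how the End Transfer command word could be assembled with and without
the interrupt-on-completion bit. The macro values mirror my reading of
drivers/usb/dwc3/core.h and should be treated as assumptions;
build_end_transfer_cmd() and its want_event parameter are hypothetical
and do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as I read them in drivers/usb/dwc3/core.h; treat as assumptions. */
#define DWC3_DEPCMD_ENDTRANSFER   (0x08u << 0)
#define DWC3_DEPCMD_CMDIOC        (1u << 8)
#define DWC3_DEPCMD_HIPRI_FORCERM (1u << 11)
#define DWC3_DEPCMD_PARAM(x)      ((uint32_t)(x) << 16)

/* Hypothetical helper: assemble the End Transfer command word. */
static uint32_t build_end_transfer_cmd(uint8_t resource_index, bool force,
				       bool want_event)
{
	uint32_t cmd = DWC3_DEPCMD_ENDTRANSFER;

	if (force)
		cmd |= DWC3_DEPCMD_HIPRI_FORCERM;
	if (want_event)
		cmd |= DWC3_DEPCMD_CMDIOC;	/* request Command Complete event */
	cmd |= DWC3_DEPCMD_PARAM(resource_index);

	return cmd;
}
```

With want_event false, bit 8 stays clear and no Endpoint Command
Complete event is requested; whether the hardware tolerates that is
exactly the open question here.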

>
>>
>> /me goes check databook
>>
>> At least on revision 2.60a of the databook, bit 10 is reserved. I wonder
>> if that's the start of all the problems. Anybody has access to older and
>> newer databook revisions so we can cross-check?
>>
>
> I can access v2.40 and v3.10 books.
>
> bit 10 is reserved on both
>
> Differences in v2.4 vs v3.10 are:
>
> bit 8 reserved vs L1SUSPEN
> bit 13 reserved vs StopOnDisconnectEn
> bit 14 reserved vs L1WKUPEVTEN
>
> --
> cheers,
> -roger



--
Baolin.wang
Best Regards

2018-03-09 09:20:50

by Roger Quadros

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

On 05/03/18 13:27, Felipe Balbi wrote:
>
> Hi,
>
> Baolin Wang <[email protected]> writes:
>>>>>>>>>> void dwc3_gadget_exit(struct dwc3 *dwc)
>>>>>>>>>> {
>>>>>>>>>> + int epnum;
>>>>>>>>>> + unsigned long flags;
>>>>>>>>>> +
>>>>>>>>>> + spin_lock_irqsave(&dwc->lock, flags);
>>>>>>>>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>>>>>>>>>> + struct dwc3_ep *dep = dwc->eps[epnum];
>>>>>>>>>> +
>>>>>>>>>> + if (!dep)
>>>>>>>>>> + continue;
>>>>>>>>>> +
>>>>>>>>>> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
>>>>>>>>>> + }
>>>>>>>>>> + spin_unlock_irqrestore(&dwc->lock, flags);
>>>>>>>>>> +
>>>>>>>>>> usb_del_gadget_udc(&dwc->gadget);
>>>>>>>>>> dwc3_gadget_free_endpoints(dwc);
>>>>>>>>>
>>>>>>>>> free endpoints is a better place for this. It's already going to free
>>>>>>>>> the memory anyway. Might as well clear all flags to 0 there.
>>>>>>>>>
>>>>>>>>
>>>>>>>> But it won't solve the deadlock issue. Since dwc3_gadget_free_endpoints()
>>>>>>>> is called after usb_del_gadget_udc() and the deadlock happens when
>>>>>>>>
>>>>>>>> usb_del_gadget_udc()->udc_stop()->dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>>>>
>>>>>>>> and DWC3_EP_END_TRANSFER_PENDING flag is set.
>>>>>>>
>>>>>>> indeed. Iterating twice over the entire endpoint list seems
>>>>>>> wasteful. Perhaps we just shouldn't wait when removing the UDC since
>>>>>>> that's essentially what this patch will do, right? If you clear the flag
>>>>>>> before calling ->udc_stop(), this means the loop in dwc3_gadget_stop()
>>>>>>> will do nothing. Might as well remove it.
>>>>>>>
>>>>>>
>>>>>> This means that we will never wait for DWC3_EP_END_TRANSFER_PENDING to clear
>>>>>> in dwc3_gadget_stop() like we used to. This is perfectly fine, right?
>>>>>>
>>>>>> It makes sense to me as dwc3_gadget_stop() calls __dwc3_gadget_stop() which
>>>>>> masks all interrupts and nobody will ever clear that flag if it was set.
>>>>>
>>>>> I don't think so. It can not mask the endpoint events, please check
>>>>> the events which will be masked in DEVTEN register. The reason why we
>>>>> should wait for DWC3_EP_END_TRANSFER_PENDING to clear is that,
>>>>> sometimes the DWC3_DEPEVT_EPCMDCMPLT event will be triggered later
>>>>> than 100us, but now we may have freed the gadget irq which will cause
>>>>> crash.
>>>>
>>>> We could mask command complete events as soon as ->udc_stop() is called,
>>>> right? Hmm, actually, __dwc3_gadget_stop() already clears DEVTEN
>>>> completely.
>>>
>>> But which bit in DEVTEN says Endpoint events are disabled?
>>
>> When we set up the DWC3_DEPCMD_ENDTRANSFER command in
>> dwc3_stop_active_transfer(), we can avoid setting DWC3_DEPCMD_CMDIOC;
>> then there will be no endpoint command complete interrupts, I think.
>>
>> cmd |= DWC3_DEPCMD_CMDIOC;
>
> I remember some part of the databook mandating CMDIOC to be set. We
> could test it out without and see if anything blows up. I would,
> however, require a lengthy comment explaining that we're deviating from
> databook revision x.yya, section foobar because $reasons. :-)
>

This is what the v3.10 databook says

"When issuing an End Transfer command, software must set the CmdIOC bit (field 8) so that an Endpoint
Command Complete event is generated after the transfer ends. This is necessary to synchronize the
conclusion of system bus traffic before the End Transfer command is completed."

with a note
"If GUCTL2[Rst_actbitlater] is set, Software can poll the completion of the End Transfer
command by polling the command active bit to be cleared to 0."

fyi.
Rst_actbitlater - "Enable clearing of the command active bit for the ENDXFER
command after the command execution is completed.
This bit is valid in device mode only."

So I'd prefer not to clear CMDIOC for all cases.

Could we somehow just tackle the dwc3_gadget_exit case like I did in this patch?
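
The polling alternative described in the databook note can be sketched
as a bounded loop on the command active bit. This is a userspace
simulation only: struct fake_depcmd and fake_readl() are hypothetical
stand-ins for the real MMIO register, the bit position is assumed to be
10, and per the note the approach is only valid when
GUCTL2[Rst_actbitlater] is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEPCMD_CMDACT (1u << 10)	/* command active bit; position assumed */

/* Hypothetical stand-in for the DEPCMD register. */
struct fake_depcmd {
	uint32_t value;
	unsigned int reads_until_done;	/* reads that still see CMDACT set */
};

/* Simulated register read: CMDACT drops once the countdown reaches zero. */
static uint32_t fake_readl(struct fake_depcmd *reg)
{
	if (reg->reads_until_done == 0)
		reg->value &= ~DEPCMD_CMDACT;
	else
		reg->reads_until_done--;

	return reg->value;
}

/* Returns true if CMDACT was seen cleared within max_polls reads. */
static bool poll_cmdact_cleared(struct fake_depcmd *reg, unsigned int max_polls)
{
	assert(reg);

	while (max_polls--) {
		if (!(fake_readl(reg) & DEPCMD_CMDACT))
			return true;
	}

	return false;	/* gave up: command still active */
}
```

The bound matters: without it, the loop would have the same
wait-forever problem as the current code.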

--
cheers,
-roger

2018-03-09 09:24:58

by Felipe Balbi

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume


Hi,

Roger Quadros <[email protected]> writes:

<snip>

>>> When we set up the DWC3_DEPCMD_ENDTRANSFER command in
>>> dwc3_stop_active_transfer(), we can avoid setting DWC3_DEPCMD_CMDIOC;
>>> then there will be no endpoint command complete interrupts, I think.
>>>
>>> cmd |= DWC3_DEPCMD_CMDIOC;
>>
>> I remember some part of the databook mandating CMDIOC to be set. We
>> could test it out without and see if anything blows up. I would,
>> however, require a lengthy comment explaining that we're deviating from
>> databook revision x.yya, section foobar because $reasons. :-)
>>
>
> This is what the v3.10 databook says
>
> "When issuing an End Transfer command, software must set the CmdIOC
> bit (field 8) so that an Endpoint Command Complete event is generated
> after the transfer ends. This is necessary to synchronize the
> conclusion of system bus traffic before the End Transfer command is
> completed."
>
> with a note
>
> "If GUCTL2[Rst_actbitlater] is set, Software can poll the completion
> of the End Transfer command by polling the command active bit to be
> cleared to 0."
>
> fyi.
>
> Rst_actbitlater - "Enable clearing of the command active bit for the
> ENDXFER command after the command execution is completed. This bit is
> valid in device mode only."
>
> So I'd prefer not to clear CMDIOC for all cases.
>
> Could we somehow just tackle the dwc3_gadget_exit case like I did in
> this patch?

if you can send a version that doesn't iterate over all endpoints twice,
sure. We still need a comment somewhere, and I fear we may get
interrupts later in some cases. How would we deal with that?

--
balbi


2018-03-09 09:27:54

by Roger Quadros

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

On 09/03/18 11:23, Felipe Balbi wrote:
>
> Hi,
>
> Roger Quadros <[email protected]> writes:
>
> <snip>
>
>>>> When we set up the DWC3_DEPCMD_ENDTRANSFER command in
>>>> dwc3_stop_active_transfer(), we can avoid setting DWC3_DEPCMD_CMDIOC;
>>>> then there will be no endpoint command complete interrupts, I think.
>>>>
>>>> cmd |= DWC3_DEPCMD_CMDIOC;
>>>
>>> I remember some part of the databook mandating CMDIOC to be set. We
>>> could test it out without and see if anything blows up. I would,
>>> however, require a lengthy comment explaining that we're deviating from
>>> databook revision x.yya, section foobar because $reasons. :-)
>>>
>>
>> This is what the v3.10 databook says
>>
>> "When issuing an End Transfer command, software must set the CmdIOC
>> bit (field 8) so that an Endpoint Command Complete event is generated
>> after the transfer ends. This is necessary to synchronize the
>> conclusion of system bus traffic before the End Transfer command is
>> completed."
>>
>> with a note
>>
>> "If GUCTL2[Rst_actbitlater] is set, Software can poll the completion
>> of the End Transfer command by polling the command active bit to be
>> cleared to 0."
>>
>> fyi.
>>
>> Rst_actbitlater - "Enable clearing of the command active bit for the
>> ENDXFER command after the command execution is completed. This bit is
>> valid in device mode only."
>>
>> So I'd prefer not to clear CMDIOC for all cases.
>>
>> Could we somehow just tackle the dwc3_gadget_exit case like I did in
>> this patch?
>
> if you can send a version that doesn't iterate over all endpoints twice,
> sure. We still need a comment somewhere, and I fear we may get
> interrupts later in some cases. How would we deal with that?
>

how about explicitly masking that interrupt? Is it possible?

--
cheers,
-roger

2018-03-09 09:51:33

by Roger Quadros

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

On 09/03/18 11:26, Roger Quadros wrote:
> On 09/03/18 11:23, Felipe Balbi wrote:
>>
>> Hi,
>>
>> Roger Quadros <[email protected]> writes:
>>
>> <snip>
>>
>>>>> When we set up the DWC3_DEPCMD_ENDTRANSFER command in
>>>>> dwc3_stop_active_transfer(), we can avoid setting DWC3_DEPCMD_CMDIOC;
>>>>> then there will be no endpoint command complete interrupts, I think.
>>>>>
>>>>> cmd |= DWC3_DEPCMD_CMDIOC;
>>>>
>>>> I remember some part of the databook mandating CMDIOC to be set. We
>>>> could test it out without and see if anything blows up. I would,
>>>> however, require a lengthy comment explaining that we're deviating from
>>>> databook revision x.yya, section foobar because $reasons. :-)
>>>>
>>>
>>> This is what the v3.10 databook says
>>>
>>> "When issuing an End Transfer command, software must set the CmdIOC
>>> bit (field 8) so that an Endpoint Command Complete event is generated
>>> after the transfer ends. This is necessary to synchronize the
>>> conclusion of system bus traffic before the End Transfer command is
>>> completed."
>>>
>>> with a note
>>>
>>> "If GUCTL2[Rst_actbitlater] is set, Software can poll the completion
>>> of the End Transfer command by polling the command active bit to be
>>> cleared to 0."
>>>
>>> fyi.
>>>
>>> Rst_actbitlater - "Enable clearing of the command active bit for the
>>> ENDXFER command after the command execution is completed. This bit is
>>> valid in device mode only."
>>>
>>> So I'd prefer not to clear CMDIOC for all cases.
>>>
>>> Could we somehow just tackle the dwc3_gadget_exit case like I did in
>>> this patch?
>>
>> if you can send a version that doesn't iterate over all endpoints twice,
>> sure. We still need a comment somewhere, and I fear we may get
>> interrupts later in some cases. How would we deal with that?
>>
>
> how about explicitly masking that interrupt? Is it possible?
>

Another easy option is to use wait_event_interruptible_lock_irq_timeout()
instead of wait_event_lock_irq() in dwc3_gadget_stop().

Is a 200ms timeout sufficient? After the first timeout we can assume the
rest will time out too, so there is no point in waiting 200ms for each endpoint.
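
The "wait once, then stop waiting" idea can be sketched as below. This
is a userspace model under stated assumptions: the ep_wait_fn callback
and the two stand-in wait functions are hypothetical replacements for
the real wait primitive, and EP_END_TRANSFER_PENDING uses a made-up bit
value rather than the driver's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_EPS 32
#define EP_END_TRANSFER_PENDING (1u << 0)	/* hypothetical flag value */

/* Hypothetical wait primitive: true if the flag cleared before the timeout. */
typedef bool (*ep_wait_fn)(unsigned int epnum, unsigned int timeout_ms);

/* Stand-in for hardware that never delivers the completion event. */
static bool wait_never_completes(unsigned int epnum, unsigned int timeout_ms)
{
	(void)epnum;
	(void)timeout_ms;
	return false;
}

/* Stand-in for hardware that always completes in time. */
static bool wait_always_completes(unsigned int epnum, unsigned int timeout_ms)
{
	(void)epnum;
	(void)timeout_ms;
	return true;
}

/* Returns a bitmap of endpoints that were given up on. */
static uint32_t stop_wait_all(uint32_t flags[NUM_EPS], ep_wait_fn wait,
			      unsigned int timeout_ms, unsigned int *waits)
{
	uint32_t timed_out = 0;
	bool give_up = false;
	unsigned int epnum;

	assert(wait && waits);
	*waits = 0;

	for (epnum = 2; epnum < NUM_EPS; epnum++) {
		if (!(flags[epnum] & EP_END_TRANSFER_PENDING))
			continue;

		if (!give_up) {
			(*waits)++;
			if (wait(epnum, timeout_ms)) {
				/* In the driver the IRQ handler clears this. */
				flags[epnum] &= ~EP_END_TRANSFER_PENDING;
				continue;
			}
			give_up = true;	/* first timeout: skip later waits */
		}

		timed_out |= 1u << epnum;
		flags[epnum] &= ~EP_END_TRANSFER_PENDING;
	}

	return timed_out;
}
```

With three pending endpoints and a wait that never completes, only one
timeout is paid instead of three.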

--
cheers,
-roger

2018-03-09 10:38:54

by Felipe Balbi

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume


Hi,

Roger Quadros <[email protected]> writes:
>>> This is what the v3.10 databook says
>>>
>>> "When issuing an End Transfer command, software must set the CmdIOC
>>> bit (field 8) so that an Endpoint Command Complete event is generated
>>> after the transfer ends. This is necessary to synchronize the
>>> conclusion of system bus traffic before the End Transfer command is
>>> completed."
>>>
>>> with a note
>>>
>>> "If GUCTL2[Rst_actbitlater] is set, Software can poll the completion
>>> of the End Transfer command by polling the command active bit to be
>>> cleared to 0."
>>>
>>> fyi.
>>>
>>> Rst_actbitlater - "Enable clearing of the command active bit for the
>>> ENDXFER command after the command execution is completed. This bit is
>>> valid in device mode only."
>>>
>>> So I'd prefer not to clear CMDIOC for all cases.
>>>
>>> Could we some how just tackle the dwc3_gadget_exit case like I did in
>>> this patch?
>>
>> if you can send a version that doesn't iterate over all endpoints twice,
>> sure. We still need a comment somewhere, and I fear we may get
>> interrupts later in some cases. How would we deal with that?
>>
>
> how about explicitly masking that interrupt? Is it possible?

I think I showed that the bit is reserved on recent dwc3 core releases
(anything 2.40a+, at least).

--
balbi


2018-03-09 10:40:36

by Felipe Balbi

Subject: Re: [PATCH] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume


Hi,

Roger Quadros <[email protected]> writes:
>>>>>> When we set up the DWC3_DEPCMD_ENDTRANSFER command in
>>>>>> dwc3_stop_active_transfer(), we can avoid setting DWC3_DEPCMD_CMDIOC;
>>>>>> then there will be no endpoint command complete interrupts, I think.
>>>>>>
>>>>>> cmd |= DWC3_DEPCMD_CMDIOC;
>>>>>
>>>>> I remember some part of the databook mandating CMDIOC to be set. We
>>>>> could test it out without and see if anything blows up. I would,
>>>>> however, require a lengthy comment explaining that we're deviating from
>>>>> databook revision x.yya, section foobar because $reasons. :-)
>>>>>
>>>>
>>>> This is what the v3.10 databook says
>>>>
>>>> "When issuing an End Transfer command, software must set the CmdIOC
>>>> bit (field 8) so that an Endpoint Command Complete event is generated
>>>> after the transfer ends. This is necessary to synchronize the
>>>> conclusion of system bus traffic before the End Transfer command is
>>>> completed."
>>>>
>>>> with a note
>>>>
>>>> "If GUCTL2[Rst_actbitlater] is set, Software can poll the completion
>>>> of the End Transfer command by polling the command active bit to be
>>>> cleared to 0."
>>>>
>>>> fyi.
>>>>
>>>> Rst_actbitlater - "Enable clearing of the command active bit for the
>>>> ENDXFER command after the command execution is completed. This bit is
>>>> valid in device mode only."
>>>>
>>>> So I'd prefer not to clear CMDIOC for all cases.
>>>>
>>>> Could we somehow just tackle the dwc3_gadget_exit case like I did in
>>>> this patch?
>>>
>>> if you can send a version that doesn't iterate over all endpoints twice,
>>> sure. We still need a comment somewhere, and I fear we may get
>>> interrupts later in some cases. How would we deal with that?
>>>
>>
>> how about explicitly masking that interrupt? Is it possible?
>>
>
> Another easy option is to use wait_event_interruptible_lock_irq_timeout()
> instead of wait_event_lock_irq() in dwc3_gadget_stop().
>
> Is a 200ms timeout sufficient? After the first timeout we can assume the
> rest will time out too, so there is no point in waiting 200ms for each endpoint.

We can do that. And I think some 5ms is more than enough :-) I'd be
surprised if it takes anything over some 200us for the EndTransfer
command to complete.
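
One thing worth keeping in mind with a 5ms value: the timeout is
expressed in jiffies, so its real granularity depends on HZ. Below is a
simplified sketch of the round-up conversion; ms_to_ticks() is a
hypothetical helper, not the kernel's msecs_to_jiffies(), which also
rounds up but additionally has per-HZ fast paths and overflow handling.

```c
#include <assert.h>

/* Simplified round-up conversion from milliseconds to timer ticks. */
static unsigned long ms_to_ticks(unsigned int ms, unsigned int hz)
{
	assert(hz > 0);

	return ((unsigned long)ms * hz + 999) / 1000;
}
```

With this model, 5ms becomes 1 tick at HZ=100 (a tick is 10ms there),
2 ticks at HZ=250 and 5 ticks at HZ=1000. All of these are far above
the roughly 200us expected for the command to complete.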

--
balbi


2018-03-09 12:48:20

by Roger Quadros

Subject: [PATCH v2] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

In the following test we get stuck by sleeping forever in _dwc3_set_mode()
after which dual-role switching doesn't work.

On dra7-evm's dual-role port,
- Load g_zero gadget driver and enumerate to host
- suspend to mem
- disconnect USB cable to host and connect otg cable with Pen drive in it.
- resume system
- we sleep indefinitely in _dwc3_set_mode due to.
dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
dwc3_gadget_stop()->wait_event_lock_irq()

To fix this, instead of waiting indefinitely with wait_event_lock_irq()
we use wait_event_interruptible_lock_irq_timeout() and print
an error message if there was a timeout.

Signed-off-by: Roger Quadros <[email protected]>
---

Changelog:

v2:
- use wait_event_interruptible_lock_irq_timeout() instead of wait_event_lock_irq()

drivers/usb/dwc3/gadget.c | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 2bda4eb..7c3a6e4 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1950,6 +1950,7 @@ static int dwc3_gadget_stop(struct usb_gadget *g)
struct dwc3 *dwc = gadget_to_dwc(g);
unsigned long flags;
int epnum;
+ u32 tmo_eps = 0;

spin_lock_irqsave(&dwc->lock, flags);

@@ -1960,6 +1961,7 @@ static int dwc3_gadget_stop(struct usb_gadget *g)

for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
struct dwc3_ep *dep = dwc->eps[epnum];
+ int ret;

if (!dep)
continue;
@@ -1967,9 +1969,24 @@ static int dwc3_gadget_stop(struct usb_gadget *g)
if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING))
continue;

- wait_event_lock_irq(dep->wait_end_transfer,
- !(dep->flags & DWC3_EP_END_TRANSFER_PENDING),
- dwc->lock);
+ ret = wait_event_interruptible_lock_irq_timeout(dep->wait_end_transfer,
+ !(dep->flags & DWC3_EP_END_TRANSFER_PENDING),
+ dwc->lock, msecs_to_jiffies(5));
+
+ if (ret <= 0) {
+ /* Timed out or interrupted! There's nothing much
+ * we can do so we just log here and print which
+ * endpoints timed out at the end.
+ */
+ tmo_eps |= 1 << epnum;
+ dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
+ }
+ }
+
+ if (tmo_eps) {
+ dev_err(dwc->dev,
+ "end transfer timed out on endpoints 0x%x [bitmap]\n",
+ tmo_eps);
}

out:
--
cheers,
-roger
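
For anyone reading the "end transfer timed out on endpoints 0x%x
[bitmap]" message, each set bit N stands for endpoint number N. The
sketch below decodes such a bitmap and shows the usual way a single
flag is cleared from a flags word. It is a userspace illustration only:
the flag values and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EP_ENABLED              (1u << 0)	/* hypothetical values */
#define EP_END_TRANSFER_PENDING (1u << 7)

/* Clear only the pending flag, leaving every other flag untouched. */
static uint32_t clear_pending(uint32_t flags)
{
	return flags & ~EP_END_TRANSFER_PENDING;
}

/* Store the endpoint numbers set in bitmap into out[]; returns the count. */
static int decode_ep_bitmap(uint32_t bitmap, int out[32])
{
	int n = 0;

	assert(out);

	for (int ep = 0; ep < 32; ep++) {
		if (bitmap & (1u << ep))
			out[n++] = ep;
	}

	return n;
}
```

Note the complement in clear_pending(): a plain `flags & FLAG` (without
the `~`) would do the opposite, keeping only that flag and dropping all
the others.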

2018-03-16 10:35:35

by Roger Quadros

Subject: Re: [PATCH v2] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

Hi Felipe,

On 09/03/18 14:47, Roger Quadros wrote:
> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
> after which dual-role switching doesn't work.
>
> On dra7-evm's dual-role port,
> - Load g_zero gadget driver and enumerate to host
> - suspend to mem
> - disconnect USB cable to host and connect otg cable with Pen drive in it.
> - resume system
> - we sleep indefinitely in _dwc3_set_mode due to.
> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
> dwc3_gadget_stop()->wait_event_lock_irq()
>
> To fix this, instead of waiting indefinitely with wait_event_lock_irq()
> we use wait_event_interruptible_lock_irq_timeout() and print
> an error message if there was a timeout.
>
> Signed-off-by: Roger Quadros <[email protected]>

Thanks for picking this for -next.
Would it be better to have this in v4.16-rc fixes,
and also in stable (v4.12+)?

> ---
>
> Changelog:
>
> v2:
> - use wait_event_interruptible_lock_irq_timeout() instead of wait_event_lock_irq()
>
> drivers/usb/dwc3/gadget.c | 23 ++++++++++++++++++++---
> 1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 2bda4eb..7c3a6e4 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -1950,6 +1950,7 @@ static int dwc3_gadget_stop(struct usb_gadget *g)
> struct dwc3 *dwc = gadget_to_dwc(g);
> unsigned long flags;
> int epnum;
> + u32 tmo_eps = 0;
>
> spin_lock_irqsave(&dwc->lock, flags);
>
> @@ -1960,6 +1961,7 @@ static int dwc3_gadget_stop(struct usb_gadget *g)
>
> for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
> struct dwc3_ep *dep = dwc->eps[epnum];
> + int ret;
>
> if (!dep)
> continue;
> @@ -1967,9 +1969,24 @@ static int dwc3_gadget_stop(struct usb_gadget *g)
> if (!(dep->flags & DWC3_EP_END_TRANSFER_PENDING))
> continue;
>
> - wait_event_lock_irq(dep->wait_end_transfer,
> - !(dep->flags & DWC3_EP_END_TRANSFER_PENDING),
> - dwc->lock);
> + ret = wait_event_interruptible_lock_irq_timeout(dep->wait_end_transfer,
> + !(dep->flags & DWC3_EP_END_TRANSFER_PENDING),
> + dwc->lock, msecs_to_jiffies(5));
> +
> + if (ret <= 0) {
> + /* Timed out or interrupted! There's nothing much
> + * we can do so we just log here and print which
> + * endpoints timed out at the end.
> + */
> + tmo_eps |= 1 << epnum;
> + dep->flags &= ~DWC3_EP_END_TRANSFER_PENDING;
> + }
> + }
> +
> + if (tmo_eps) {
> + dev_err(dwc->dev,
> + "end transfer timed out on endpoints 0x%x [bitmap]\n",
> + tmo_eps);
> }
>
> out:
>

--
cheers,
-roger

2018-03-16 11:02:25

by Felipe Balbi

Subject: Re: [PATCH v2] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume


Hi,

Roger Quadros <[email protected]> writes:

> Hi Felipe,
>
> On 09/03/18 14:47, Roger Quadros wrote:
>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>> after which dual-role switching doesn't work.
>>
>> On dra7-evm's dual-role port,
>> - Load g_zero gadget driver and enumerate to host
>> - suspend to mem
>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>> - resume system
>> - we sleep indefinitely in _dwc3_set_mode due to.
>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>> dwc3_gadget_stop()->wait_event_lock_irq()
>>
>> To fix this instead of waiting indefinitely with wait_event_lock_irq()
>> we use wait_event_interruptible_lock_irq_timeout() and print
>> and error message if there was a timeout.
>>
>> Signed-off-by: Roger Quadros <[email protected]>
>
> Thanks for picking this for -next.
> Is it better to have this in v4.16-rc fixes?
> and also stable? v4.12+

Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit
log ;-)

The best we can do now, is wait for -rc1 and manually send the commit to
stable.

--
balbi

2018-03-16 11:05:02

by Roger Quadros

Subject: Re: [PATCH v2] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

On 16/03/18 13:00, Felipe Balbi wrote:
>
> Hi,
>
> Roger Quadros <[email protected]> writes:
>
>> Hi Felipe,
>>
>> On 09/03/18 14:47, Roger Quadros wrote:
>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>>> after which dual-role switching doesn't work.
>>>
>>> On dra7-evm's dual-role port,
>>> - Load g_zero gadget driver and enumerate to host
>>> - suspend to mem
>>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>>> - resume system
>>> - we sleep indefinitely in _dwc3_set_mode due to.
>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>>> dwc3_gadget_stop()->wait_event_lock_irq()
>>>
>>> To fix this instead of waiting indefinitely with wait_event_lock_irq()
>>> we use wait_event_interruptible_lock_irq_timeout() and print
>>> and error message if there was a timeout.
>>>
>>> Signed-off-by: Roger Quadros <[email protected]>
>>
>> Thanks for picking this for -next.
>> Is it better to have this in v4.16-rc fixes?
>> and also stable? v4.12+
>
> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit
> log ;-)
>
> The best we can do now, is wait for -rc1 and manually send the commit to
> stable.
>

That's fine. Thanks.

--
cheers,
-roger


2018-03-16 11:46:54

by Minas Harutyunyan

Subject: Re: [PATCH v2] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

Hi,

On 3/16/2018 3:03 PM, Roger Quadros wrote:
> On 16/03/18 13:00, Felipe Balbi wrote:
>>
>> Hi,
>>
>> Roger Quadros <[email protected]> writes:
>>
>>> Hi Felipe,
>>>
>>> On 09/03/18 14:47, Roger Quadros wrote:
>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>>>> after which dual-role switching doesn't work.
>>>>
>>>> On dra7-evm's dual-role port,
>>>> - Load g_zero gadget driver and enumerate to host
>>>> - suspend to mem
>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>>>> - resume system
>>>> - we sleep indefinitely in _dwc3_set_mode due to.
>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>>>> dwc3_gadget_stop()->wait_event_lock_irq()
>>>>
>>>> To fix this instead of waiting indefinitely with wait_event_lock_irq()
>>>> we use wait_event_interruptible_lock_irq_timeout() and print
>>>> and error message if there was a timeout.
>>>>
>>>> Signed-off-by: Roger Quadros <[email protected]>
>>>
>>> Thanks for picking this for -next.
>>> Is it better to have this in v4.16-rc fixes?
>>> and also stable? v4.12+
>>
>> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit
>> log ;-)
>>
>> The best we can do now, is wait for -rc1 and manually send the commit to
>> stable.
>>
>
> That's fine. Thanks.
>

The same issue is seen in dwc3_gadget_ep_dequeue(), which also uses
wait_event_lock_irq(); the result is an infinite loop.
To work around it I changed the condition of the wait
from:
!(dep->flags & DWC3_EP_END_TRANSFER_PENDING)
to:
!(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED)
I'm not sure that this fix is fully correct because I'm not familiar with
dwc3, but it allows us to go forward with the request dequeue. I think the
infinite loop needs deeper investigation to find its root cause before
any of the fixes is accepted.

Thanks,
Minas

2018-03-16 12:26:36

by Felipe Balbi

Subject: Re: [PATCH v2] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume


Hi,

Minas Harutyunyan <[email protected]> writes:
>>>> On 09/03/18 14:47, Roger Quadros wrote:
>>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>>>>> after which dual-role switching doesn't work.
>>>>>
>>>>> On dra7-evm's dual-role port,
>>>>> - Load g_zero gadget driver and enumerate to host
>>>>> - suspend to mem
>>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>>>>> - resume system
>>>>> - we sleep indefinitely in _dwc3_set_mode due to.
>>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>>>>> dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>
>>>>> To fix this instead of waiting indefinitely with wait_event_lock_irq()
>>>>> we use wait_event_interruptible_lock_irq_timeout() and print
>>>>> and error message if there was a timeout.
>>>>>
>>>>> Signed-off-by: Roger Quadros <[email protected]>
>>>>
>>>> Thanks for picking this for -next.
>>>> Is it better to have this in v4.16-rc fixes?
>>>> and also stable? v4.12+
>>>
>>> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit
>>> log ;-)
>>>
>>> The best we can do now, is wait for -rc1 and manually send the commit to
>>> stable.
>>>
>>
>> That's fine. Thanks.
>>
>
> Same issue seen in dwc3_gadget_ep_dequeue() function where also used
> wait_event_lock_irq() - as result infinite loop.

how did this happen? During rmmod dwc3? Or, perhaps, after you unloaded
a gadget driver?

> Actually to fix this issue I updated condition of wait function
> from:
> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING)
> to:
> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED)

you're not fixing anything. You're, essentially, removing the entire
end transfer pending logic. The whole idea of this is that we can
disable the endpoint and wait for the End Transfer interrupt. When you
add a check for the endpoint being enabled, then that code will never
run and, thus, never wait for the End Transfer IRQ.
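
The effect of the changed condition can also be seen at the bit level with a small standalone C sketch. The bit positions below are assumptions for illustration rather than values copied from the driver headers, and `old_cond()` and `new_cond()` are hypothetical helper names. When the two masks share no bits, masking with both at once always yields zero, so the negated condition is true for every value of the flags and the wait returns immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define EP_ENABLED              (1u << 0)
#define EP_END_TRANSFER_PENDING (1u << 7)

/* The illustration relies on the two masks not overlapping. */
static_assert((EP_ENABLED & EP_END_TRANSFER_PENDING) == 0,
	      "masks must be distinct bits");

/* Original wait condition: true once End Transfer is no longer pending. */
static bool old_cond(uint32_t flags)
{
	return !(flags & EP_END_TRANSFER_PENDING);
}

/*
 * Proposed condition. PENDING & ENABLED is 0 because the masks share no
 * bits, so the expression is true for every value of flags.
 */
static bool new_cond(uint32_t flags)
{
	return !(flags & EP_END_TRANSFER_PENDING & EP_ENABLED);
}
```

With these definitions, old_cond(EP_END_TRANSFER_PENDING) is false while new_cond() is true for any argument, including EP_END_TRANSFER_PENDING | EP_ENABLED.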

If you manage to find a more reliable way of reproducing this, then make
sure to capture dwc3 tracepoints (see the documentation for details) and
let's start trying to figure out what's going on.

cheers

--
balbi

2018-03-17 06:34:25

by Minas Harutyunyan

Subject: Re: [PATCH v2] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

Hi,

On 3/16/2018 4:25 PM, Felipe Balbi wrote:
>
> Hi,
>
> Minas Harutyunyan <[email protected]> writes:
>>>>> On 09/03/18 14:47, Roger Quadros wrote:
>>>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>>>>>> after which dual-role switching doesn't work.
>>>>>>
>>>>>> On dra7-evm's dual-role port,
>>>>>> - Load g_zero gadget driver and enumerate to host
>>>>>> - suspend to mem
>>>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>>>>>> - resume system
>>>>>> - we sleep indefinitely in _dwc3_set_mode due to.
>>>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>>>>>> dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>>
>>>>>> To fix this instead of waiting indefinitely with wait_event_lock_irq()
>>>>>> we use wait_event_interruptible_lock_irq_timeout() and print
>>>>>> and error message if there was a timeout.
>>>>>>
>>>>>> Signed-off-by: Roger Quadros <[email protected]>
>>>>>
>>>>> Thanks for picking this for -next.
>>>>> Is it better to have this in v4.16-rc fixes?
>>>>> and also stable? v4.12+
>>>>
>>>> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit
>>>> log ;-)
>>>>
>>>> The best we can do now, is wait for -rc1 and manually send the commit to
>>>> stable.
>>>>
>>>
>>> That's fine. Thanks.
>>>
>>
>> Same issue seen in dwc3_gadget_ep_dequeue() function where also used
>> wait_event_lock_irq() - as result infinite loop.
>
> how did this happen? During rmmod dwc3? Or, perhaps, after you unloaded
> a gadget driver?
>
No, not during rmmod's.
We are using our internal USB testing tool. Test case: ISOC OUT, transfer
size N frames. When the host starts ISOC OUT traffic, dwc3, based on the
"Transfer not ready" event in frame F, starts transfers from frame F+4
(for bInterval=1). As a result, 4 requests that were already queued on the
device side remain incomplete. After some timeout the function driver
tries to dequeue these 4 requests (without disabling the EP) to complete
the test. For ISOC IN these requests are completed on the MISSED ISOC
event, but for ISOC OUT we have to call dequeue after some timeout.

>> Actually to fix this issue I updated condition of wait function
>> from:
>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING)
>> to:
>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED)
>
> you're not fixing anything. You're, essentially, removing the entire
> end transfer pending logic.

Yes, you are right, but how do we overcome this infinite loop? Replace
wait_event_lock_irq() with wait_event_interruptible_lock_irq_timeout()?

> The whole idea of this is that we can
> disable the endpoint and wait for the End Transfer interrupt. When you
> add a check for the endpoint being enabled, then that code will never
> run and, thus, never wait for the End Transfer IRQ.
>
> If you manage to find a more reliable way of reproducing this, then make
> sure to capture dwc3 tracepoints (see the documentation for details) and
> let's start trying to figure out what's going on.
>
> cheers
>


2018-03-19 08:56:27

by Felipe Balbi

Subject: Re: [PATCH v2] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume


Hi,

Minas Harutyunyan <[email protected]> writes:
>>>>>> Thanks for picking this for -next.
>>>>>> Is it better to have this in v4.16-rc fixes?
>>>>>> and also stable? v4.12+
>>>>>
>>>>> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit
>>>>> log ;-)
>>>>>
>>>>> The best we can do now, is wait for -rc1 and manually send the commit to
>>>>> stable.
>>>>>
>>>>
>>>> That's fine. Thanks.
>>>>
>>>
>>> Same issue seen in dwc3_gadget_ep_dequeue() function where also used
>>> wait_event_lock_irq() - as result infinite loop.
>>
>> how did this happen? During rmmod dwc3? Or, perhaps, after you unloaded
>> a gadget driver?
>>
> No, not during rmmod's.
> We using our internal USB testing tool. Test case; ISOC OUT, transfer
> size N frames. When host starts ISOC OUT traffic then the dwc3 based on
> "Transfer not ready" event in frame F starts transfers staring from
> frame F+4 (for bInterval=1) as result 4 requests, which already queued
> on device side, remain incomplete. Function driver on some timeout
> trying dequeue these 4 requests (without disabling EP) to complete test.
> For IN ISOC's these requests completed on MISSED ISOC event, but for
> ISOC OUT required call dequeue on some timeout.

okay

>>> Actually to fix this issue I updated condition of wait function
>>> from:
>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING)
>>> to:
>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED)
>>
>> you're not fixing anything. You're, essentially, removing the entire
>> end transfer pending logic.
> yes, you are right, but how to overcome this infinite loop? Replace
> wait_event_lock_irq() by wait_event_interruptible_lock_irq_timeout()?

The best way here would be to figure out why we're missing command complete
IRQ in those cases. According to documentation, we *should* receive that
interrupt, so why is it missing?

--
balbi



2018-03-19 11:38:45

by Minas Harutyunyan

Subject: Re: [PATCH v2] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

Hi,

On 3/19/2018 12:55 PM, Felipe Balbi wrote:
>
> Hi,
>
> Minas Harutyunyan <[email protected]> writes:
>>>>>>> Thanks for picking this for -next.
>>>>>>> Is it better to have this in v4.16-rc fixes?
>>>>>>> and also stable? v4.12+
>>>>>>
>>>>>> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit
>>>>>> log ;-)
>>>>>>
>>>>>> The best we can do now, is wait for -rc1 and manually send the commit to
>>>>>> stable.
>>>>>>
>>>>>
>>>>> That's fine. Thanks.
>>>>>
>>>>
>>>> Same issue seen in dwc3_gadget_ep_dequeue() function where also used
>>>> wait_event_lock_irq() - as result infinite loop.
>>>
>>> how did this happen? During rmmod dwc3? Or, perhaps, after you unloaded
>>> a gadget driver?
>>>
>> No, not during rmmod's.
>> We using our internal USB testing tool. Test case; ISOC OUT, transfer
>> size N frames. When host starts ISOC OUT traffic then the dwc3 based on
>> "Transfer not ready" event in frame F starts transfers staring from
>> frame F+4 (for bInterval=1) as result 4 requests, which already queued
>> on device side, remain incomplete. Function driver on some timeout
>> trying dequeue these 4 requests (without disabling EP) to complete test.
>> For IN ISOC's these requests completed on MISSED ISOC event, but for
>> ISOC OUT required call dequeue on some timeout.
>
> okay
>
>>>> Actually to fix this issue I updated condition of wait function
>>>> from:
>>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING)
>>>> to:
>>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED)
>>>
>>> you're not fixing anything. You're, essentially, removing the entire
>>> end transfer pending logic.
>> yes, you are right, but how to overcome this infinite loop? Replace
>> wait_event_lock_irq() by wait_event_interruptible_lock_irq_timeout()?
>
> The best way here would be to figure why we're missing command complete
> IRQ in those cases. According to documentation, we *should* receive that
> interrupt, so why is it missing?
>

Additional info on the test: the core is configured in HS-only mode, the
test speed is HS, and the core version is v2.90a. Maybe this will help to
understand the cause of the issue.
BTW, to pass the ISOC OUT test described above we currently just comment
out wait_event_lock_irq() in dwc3_gadget_ep_dequeue(); with that we
successfully receive the request completion in the function driver.
Thanks,
Minas

2018-03-19 13:56:16

by Minas Harutyunyan

Subject: Re: [PATCH v2] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

Hi,

On 3/19/2018 3:36 PM, Minas Harutyunyan wrote:
> Hi,
>
> On 3/19/2018 12:55 PM, Felipe Balbi wrote:
>>
>> Hi,
>>
>> Minas Harutyunyan <[email protected]> writes:
>>>>>>>> Thanks for picking this for -next.
>>>>>>>> Is it better to have this in v4.16-rc fixes?
>>>>>>>> and also stable? v4.12+
>>>>>>>
>>>>>>> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit
>>>>>>> log ;-)
>>>>>>>
>>>>>>> The best we can do now, is wait for -rc1 and manually send the commit to
>>>>>>> stable.
>>>>>>>
>>>>>>
>>>>>> That's fine. Thanks.
>>>>>>
>>>>>
>>>>> Same issue seen in dwc3_gadget_ep_dequeue() function where also used
>>>>> wait_event_lock_irq() - as result infinite loop.
>>>>
>>>> how did this happen? During rmmod dwc3? Or, perhaps, after you unloaded
>>>> a gadget driver?
>>>>
>>> No, not during rmmod's.
>>> We using our internal USB testing tool. Test case; ISOC OUT, transfer
>>> size N frames. When host starts ISOC OUT traffic then the dwc3 based on
>>> "Transfer not ready" event in frame F starts transfers staring from
>>> frame F+4 (for bInterval=1) as result 4 requests, which already queued
>>> on device side, remain incomplete. Function driver on some timeout
>>> trying dequeue these 4 requests (without disabling EP) to complete test.
>>> For IN ISOC's these requests completed on MISSED ISOC event, but for
>>> ISOC OUT required call dequeue on some timeout.
>>
>> okay
>>
>>>>> Actually to fix this issue I updated condition of wait function
>>>>> from:
>>>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING)
>>>>> to:
>>>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED)
>>>>
>>>> you're not fixing anything. You're, essentially, removing the entire
>>>> end transfer pending logic.
>>> yes, you are right, but how to overcome this infinite loop? Replace
>>> wait_event_lock_irq() by wait_event_interruptible_lock_irq_timeout()?
>>
>> The best way here would be to figure why we're missing command complete
>> IRQ in those cases. According to documentation, we *should* receive that
>> interrupt, so why is it missing?
>>
>
> Additional info on test. Core configuration is HS only mode, test speed
> HS, core version v2.90a. Maybe it will help to understand cause of issue.
> BTW, currently to pass above describe ISOC OUT test we just commented
> wait_event_lock_irq() in dwc3_gadget_ep_dequeue() function and
> successfully received request completion in function driver.
> Thanks,
> Minas
>

One more piece of info: while the function driver calls dequeue, the host
periodically sends a control read command to get the status of the test
from the function (test In Progress or Finished).
Thanks,
Minas

2018-04-10 06:32:49

by Minas Harutyunyan

Subject: Re: [PATCH v2] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

Hi Felipe,

On 3/19/2018 5:53 PM, Minas Harutyunyan wrote:
> Hi,
>
> On 3/19/2018 3:36 PM, Minas Harutyunyan wrote:
>> Hi,
>>
>> On 3/19/2018 12:55 PM, Felipe Balbi wrote:
>>>
>>> Hi,
>>>
>>> Minas Harutyunyan <[email protected]> writes:
>>>>>>>>> Thanks for picking this for -next.
>>>>>>>>> Is it better to have this in v4.16-rc fixes?
>>>>>>>>> and also stable? v4.12+
>>>>>>>>
>>>>>>>> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit
>>>>>>>> log ;-)
>>>>>>>>
>>>>>>>> The best we can do now, is wait for -rc1 and manually send the commit to
>>>>>>>> stable.
>>>>>>>>
>>>>>>>
>>>>>>> That's fine. Thanks.
>>>>>>>
>>>>>>
>>>>>> Same issue seen in dwc3_gadget_ep_dequeue() function where also used
>>>>>> wait_event_lock_irq() - as result infinite loop.
>>>>>
>>>>> how did this happen? During rmmod dwc3? Or, perhaps, after you unloaded
>>>>> a gadget driver?
>>>>>
>>>> No, not during rmmod's.
>>>> We using our internal USB testing tool. Test case; ISOC OUT, transfer
>>>> size N frames. When host starts ISOC OUT traffic then the dwc3 based on
>>>> "Transfer not ready" event in frame F starts transfers staring from
>>>> frame F+4 (for bInterval=1) as result 4 requests, which already queued
>>>> on device side, remain incomplete. Function driver on some timeout
>>>> trying dequeue these 4 requests (without disabling EP) to complete test.
>>>> For IN ISOC's these requests completed on MISSED ISOC event, but for
>>>> ISOC OUT required call dequeue on some timeout.
>>>
>>> okay
>>>
>>>>>> Actually to fix this issue I updated condition of wait function
>>>>>> from:
>>>>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING)
>>>>>> to:
>>>>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED)
>>>>>
>>>>> you're not fixing anything. You're, essentially, removing the entire
>>>>> end transfer pending logic.
>>>> yes, you are right, but how to overcome this infinite loop? Replace
>>>> wait_event_lock_irq() by wait_event_interruptible_lock_irq_timeout()?
>>>
>>> The best way here would be to figure why we're missing command complete
>>> IRQ in those cases. According to documentation, we *should* receive that
>>> interrupt, so why is it missing?
>>>
>>
>> Additional info on test. Core configuration is HS only mode, test speed
>> HS, core version v2.90a. Maybe it will help to understand cause of issue.
>> BTW, currently to pass above describe ISOC OUT test we just commented
>> wait_event_lock_irq() in dwc3_gadget_ep_dequeue() function and
>> successfully received request completion in function driver.
>> Thanks,
>> Minas
>>
>
> One more info: while function driver call dequeue, host periodically
> send control read command to get status of test from function - test In
> Progress or Finished.
> Thanks,
> Minas
>

Your last dwc3 patch series allows us to successfully dequeue the remaining
requests without falling into an infinite loop.

Thank you,
Minas

2018-04-10 07:36:31

by Felipe Balbi

Subject: Re: [PATCH v2] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume


Hi,

Minas Harutyunyan <[email protected]> writes:
>>>>>>> Actually to fix this issue I updated condition of wait function
>>>>>>> from:
>>>>>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING)
>>>>>>> to:
>>>>>>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED)
>>>>>>
>>>>>> you're not fixing anything. You're, essentially, removing the entire
>>>>>> end transfer pending logic.
>>>>> yes, you are right, but how to overcome this infinite loop? Replace
>>>>> wait_event_lock_irq() by wait_event_interruptible_lock_irq_timeout()?
>>>>
>>>> The best way here would be to figure why we're missing command complete
>>>> IRQ in those cases. According to documentation, we *should* receive that
>>>> interrupt, so why is it missing?
>>>>
>>>
>>> Additional info on test. Core configuration is HS only mode, test speed
>>> HS, core version v2.90a. Maybe it will help to understand cause of issue.
>>> BTW, currently to pass above describe ISOC OUT test we just commented
>>> wait_event_lock_irq() in dwc3_gadget_ep_dequeue() function and
>>> successfully received request completion in function driver.
>>> Thanks,
>>> Minas
>>>
>>
>> One more info: while function driver call dequeue, host periodically
>> send control read command to get status of test from function - test In
>> Progress or Finished.
>> Thanks,
>> Minas
>>
>
> Your last dwc3 patch series allow us to successfully dequeuing remaining
> requests without falling in to infinite loop.

that's cool, thanks :-) I'll just fix the documentation bug I introduced
heh :-)

--
balbi

