2016-12-26 08:01:39

by Baolin Wang

[permalink] [raw]
Subject: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler

On some platfroms(like x86 platform), when one core is running the USB gadget
irq thread handler by dwc3_thread_interrupt(), meanwhile another core also can
respond other interrupts from dwc3 controller and modify the event buffer by
dwc3_interrupt() function, that will cause getting the wrong event count in
irq thread handler to make the USB function abnormal.

We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid this race.

Signed-off-by: Baolin Wang <[email protected]>
---
drivers/usb/dwc3/gadget.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 6785595..1a1e1f4 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2894,10 +2894,13 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
return IRQ_HANDLED;
}

+ spin_lock(&dwc->lock);
count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
count &= DWC3_GEVNTCOUNT_MASK;
- if (!count)
+ if (!count) {
+ spin_unlock(&dwc->lock);
return IRQ_NONE;
+ }

evt->count = count;
evt->flags |= DWC3_EVENT_PENDING;
@@ -2914,6 +2917,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
memcpy(evt->cache, evt->buf, count - amount);

dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count);
+ spin_unlock(&dwc->lock);

return IRQ_WAKE_THREAD;
}
--
1.7.9.5


2016-12-27 02:39:55

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler

Hi,

On 12/26/2016 04:01 PM, Baolin Wang wrote:
> On some platfroms(like x86 platform), when one core is running the USB gadget
> irq thread handler by dwc3_thread_interrupt(), meanwhile another core also can
> respond other interrupts from dwc3 controller and modify the event buffer by
> dwc3_interrupt() function, that will cause getting the wrong event count in
> irq thread handler to make the USB function abnormal.
>
> We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid this race.

Why not spin_lock_irq ones? This lock seems to be used in both
normal and interrupt threads. Or, I missed anything?

Best regards,
Lu Baolu

>
> Signed-off-by: Baolin Wang <[email protected]>
> ---
> drivers/usb/dwc3/gadget.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 6785595..1a1e1f4 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -2894,10 +2894,13 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> return IRQ_HANDLED;
> }
>
> + spin_lock(&dwc->lock);
> count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
> count &= DWC3_GEVNTCOUNT_MASK;
> - if (!count)
> + if (!count) {
> + spin_unlock(&dwc->lock);
> return IRQ_NONE;
> + }
>
> evt->count = count;
> evt->flags |= DWC3_EVENT_PENDING;
> @@ -2914,6 +2917,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> memcpy(evt->cache, evt->buf, count - amount);
>
> dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count);
> + spin_unlock(&dwc->lock);
>
> return IRQ_WAKE_THREAD;
> }

2016-12-27 02:58:26

by Baolin Wang

[permalink] [raw]
Subject: Re: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler

Hi,

On 27 December 2016 at 10:39, Lu Baolu <[email protected]> wrote:
> Hi,
>
> On 12/26/2016 04:01 PM, Baolin Wang wrote:
>> On some platfroms(like x86 platform), when one core is running the USB gadget
>> irq thread handler by dwc3_thread_interrupt(), meanwhile another core also can
>> respond other interrupts from dwc3 controller and modify the event buffer by
>> dwc3_interrupt() function, that will cause getting the wrong event count in
>> irq thread handler to make the USB function abnormal.
>>
>> We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid this race.
>
> Why not spin_lock_irq ones? This lock seems to be used in both
> normal and interrupt threads. Or, I missed anything?

I assumed there are no nested interrupts, when one core is running at
interrupt context, then it can not respond any other interrupts, which
means we don't need to disable local IRQ now, right?

--
Baolin.wang
Best Regards

2016-12-27 04:45:15

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler

Hi,

On 12/27/2016 10:58 AM, Baolin Wang wrote:
> Hi,
>
> On 27 December 2016 at 10:39, Lu Baolu <[email protected]> wrote:
>> Hi,
>>
>> On 12/26/2016 04:01 PM, Baolin Wang wrote:
>>> On some platfroms(like x86 platform), when one core is running the USB gadget
>>> irq thread handler by dwc3_thread_interrupt(), meanwhile another core also can
>>> respond other interrupts from dwc3 controller and modify the event buffer by
>>> dwc3_interrupt() function, that will cause getting the wrong event count in
>>> irq thread handler to make the USB function abnormal.
>>>
>>> We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid this race.
>> Why not spin_lock_irq ones? This lock seems to be used in both
>> normal and interrupt threads. Or, I missed anything?
> I assumed there are no nested interrupts, when one core is running at
> interrupt context, then it can not respond any other interrupts, which
> means we don't need to disable local IRQ now, right?
>

Fair enough. Thanks.

Best regards,
Lu Baolu

2016-12-27 10:52:24

by Janusz Dziedzic

[permalink] [raw]
Subject: Re: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler

2016-12-26 9:01 GMT+01:00 Baolin Wang <[email protected]>:
> On some platfroms(like x86 platform), when one core is running the USB gadget
> irq thread handler by dwc3_thread_interrupt(), meanwhile another core also can
> respond other interrupts from dwc3 controller and modify the event buffer by
> dwc3_interrupt() function, that will cause getting the wrong event count in
> irq thread handler to make the USB function abnormal.
>
> We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid this race.
>
Interesting, I always think we mask interrupt in dwc3_interrupt() by setting
DWC3_GEVNTSIZ_INTMASK
And unmask interrupt when we end dwc3_thread_interrupt().

So, we shouldn't get any IRQ from HW during dwc3_thread_interrupt(),
or I miss something?
Do you have some traces that indicate this masking will not work correctly?

BTW, what value you get when problem occured, 0xFFFC?

BR
Janusz

> Signed-off-by: Baolin Wang <[email protected]>
> ---
> drivers/usb/dwc3/gadget.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 6785595..1a1e1f4 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -2894,10 +2894,13 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> return IRQ_HANDLED;
> }
>
> + spin_lock(&dwc->lock);
> count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
> count &= DWC3_GEVNTCOUNT_MASK;
> - if (!count)
> + if (!count) {
> + spin_unlock(&dwc->lock);
> return IRQ_NONE;
> + }
>
> evt->count = count;
> evt->flags |= DWC3_EVENT_PENDING;
> @@ -2914,6 +2917,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> memcpy(evt->cache, evt->buf, count - amount);
>
> dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count);
> + spin_unlock(&dwc->lock);
>
> return IRQ_WAKE_THREAD;
> }
> --
> 1.7.9.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html



--
Janusz Dziedzic

2016-12-27 11:06:51

by Baolin Wang

[permalink] [raw]
Subject: Re: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler

Hi,

On 27 December 2016 at 18:52, Janusz Dziedzic <[email protected]> wrote:
> 2016-12-26 9:01 GMT+01:00 Baolin Wang <[email protected]>:
>> On some platfroms(like x86 platform), when one core is running the USB gadget
>> irq thread handler by dwc3_thread_interrupt(), meanwhile another core also can
>> respond other interrupts from dwc3 controller and modify the event buffer by
>> dwc3_interrupt() function, that will cause getting the wrong event count in
>> irq thread handler to make the USB function abnormal.
>>
>> We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid this race.
>>
> Interesting, I always think we mask interrupt in dwc3_interrupt() by setting
> DWC3_GEVNTSIZ_INTMASK
> And unmask interrupt when we end dwc3_thread_interrupt().
>
> So, we shouldn't get any IRQ from HW during dwc3_thread_interrupt(),
> or I miss something?
> Do you have some traces that indicate this masking will not work correctly?

Yes, but we just masked the interrupts described in DEVTEN register,
and we did not mask all the interrupts, like the endpoint command
complete event, transfer complete event and so on, so we can still get
interrupts.

>
> BTW, what value you get when problem occured, 0xFFFC?

Yes, something like this, the event count become huge.

>
> BR
> Janusz
>
>> Signed-off-by: Baolin Wang <[email protected]>
>> ---
>> drivers/usb/dwc3/gadget.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index 6785595..1a1e1f4 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -2894,10 +2894,13 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
>> return IRQ_HANDLED;
>> }
>>
>> + spin_lock(&dwc->lock);
>> count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
>> count &= DWC3_GEVNTCOUNT_MASK;
>> - if (!count)
>> + if (!count) {
>> + spin_unlock(&dwc->lock);
>> return IRQ_NONE;
>> + }
>>
>> evt->count = count;
>> evt->flags |= DWC3_EVENT_PENDING;
>> @@ -2914,6 +2917,7 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
>> memcpy(evt->cache, evt->buf, count - amount);
>>
>> dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count);
>> + spin_unlock(&dwc->lock);
>>
>> return IRQ_WAKE_THREAD;
>> }
>> --
>> 1.7.9.5
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
>
> --
> Janusz Dziedzic



--
Baolin.wang
Best Regards

2016-12-27 11:07:46

by Felipe Balbi

[permalink] [raw]
Subject: Re: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler

Hi,

Lu Baolu <[email protected]> writes:
> On 12/26/2016 04:01 PM, Baolin Wang wrote:
>> On some platfroms(like x86 platform), when one core is running the USB gadget
>> irq thread handler by dwc3_thread_interrupt(), meanwhile another core also can
>> respond other interrupts from dwc3 controller and modify the event buffer by
>> dwc3_interrupt() function, that will cause getting the wrong event count in
>> irq thread handler to make the USB function abnormal.
>>
>> We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid this race.
>
> Why not spin_lock_irq ones? This lock seems to be used in both
> normal and interrupt threads. Or, I missed anything?

this is top half handler. Interrupts are already disabled.

--
balbi


Attachments:
signature.asc (832.00 B)

2016-12-27 11:09:15

by Felipe Balbi

[permalink] [raw]
Subject: Re: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler


Hi,

Janusz Dziedzic <[email protected]> writes:
> 2016-12-26 9:01 GMT+01:00 Baolin Wang <[email protected]>:
>> On some platfroms(like x86 platform), when one core is running the USB gadget
>> irq thread handler by dwc3_thread_interrupt(), meanwhile another core also can
>> respond other interrupts from dwc3 controller and modify the event buffer by
>> dwc3_interrupt() function, that will cause getting the wrong event count in
>> irq thread handler to make the USB function abnormal.
>>
>> We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid this race.
>>
> Interesting, I always think we mask interrupt in dwc3_interrupt() by setting
> DWC3_GEVNTSIZ_INTMASK
> And unmask interrupt when we end dwc3_thread_interrupt().
>
> So, we shouldn't get any IRQ from HW during dwc3_thread_interrupt(),
> or I miss something?
> Do you have some traces that indicate this masking will not work correctly?

that's the very question I have. We are already masking interrupts from
this controller. The only thing this could race with is usb_ep_queue(),
but that gets nowhere close to anything we're doing in the top half
handler, so there's really no danger of anything bad happening.

I'd like to see traces as well.

--
balbi


Attachments:
signature.asc (832.00 B)

2016-12-27 11:13:07

by Felipe Balbi

[permalink] [raw]
Subject: Re: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler


Hi,

Baolin Wang <[email protected]> writes:
> Hi,
>
> On 27 December 2016 at 18:52, Janusz Dziedzic <[email protected]> wrote:
>> 2016-12-26 9:01 GMT+01:00 Baolin Wang <[email protected]>:
>>> On some platfroms(like x86 platform), when one core is running the USB gadget
>>> irq thread handler by dwc3_thread_interrupt(), meanwhile another core also can
>>> respond other interrupts from dwc3 controller and modify the event buffer by
>>> dwc3_interrupt() function, that will cause getting the wrong event count in
>>> irq thread handler to make the USB function abnormal.
>>>
>>> We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid this race.
>>>
>> Interesting, I always think we mask interrupt in dwc3_interrupt() by setting
>> DWC3_GEVNTSIZ_INTMASK
>> And unmask interrupt when we end dwc3_thread_interrupt().
>>
>> So, we shouldn't get any IRQ from HW during dwc3_thread_interrupt(),
>> or I miss something?
>> Do you have some traces that indicate this masking will not work correctly?
>
> Yes, but we just masked the interrupts described in DEVTEN register,
> and we did not mask all the interrupts, like the endpoint command
> complete event, transfer complete event and so on, so we can still get
> interrupts.

not true, we masked interrupts for the entire event buffer:

> static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> {
> struct dwc3 *dwc = evt->dwc;
> u32 count;
> u32 reg;
>
> if (pm_runtime_suspended(dwc->dev)) {
> pm_runtime_get(dwc->dev);
> disable_irq_nosync(dwc->irq_gadget);
> dwc->pending_events = true;
> return IRQ_HANDLED;
> }
>
> count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
> count &= DWC3_GEVNTCOUNT_MASK;
> if (!count)
> return IRQ_NONE;
>
> evt->count = count;
> evt->flags |= DWC3_EVENT_PENDING;
>
> /* Mask interrupt */
> reg = dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0));
> reg |= DWC3_GEVNTSIZ_INTMASK;

See here ?!?

> dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), reg);
>
> return IRQ_WAKE_THREAD;
> }

>> BTW, what value you get when problem occured, 0xFFFC?
>
> Yes, something like this, the event count become huge.

please send us tracepoint data. You probably need to compress
it. Something like 256k of trace data is probably enough, so:

# mkdir -p /t
# mount -t tracefs none /t
# cd /t
# echo 256 > buffer_size_kb
# echo 1 > events/dwc3/enable
# echo 0 > events/dwc3/dwc3_readl/enable
# echo 0 > events/dwc3/dwc3_writel/enable

(reproduce)

# cp /t/trace /path/to/non-volatile/media/trace.txt

--
balbi


Attachments:
signature.asc (832.00 B)

2016-12-27 12:16:58

by Baolin Wang

[permalink] [raw]
Subject: Re: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler

Hi,

On 27 December 2016 at 19:11, Felipe Balbi <[email protected]> wrote:
>
> Hi,
>
> Baolin Wang <[email protected]> writes:
>> Hi,
>>
>> On 27 December 2016 at 18:52, Janusz Dziedzic <[email protected]> wrote:
>>> 2016-12-26 9:01 GMT+01:00 Baolin Wang <[email protected]>:
>>>> On some platfroms(like x86 platform), when one core is running the USB gadget
>>>> irq thread handler by dwc3_thread_interrupt(), meanwhile another core also can
>>>> respond other interrupts from dwc3 controller and modify the event buffer by
>>>> dwc3_interrupt() function, that will cause getting the wrong event count in
>>>> irq thread handler to make the USB function abnormal.
>>>>
>>>> We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid this race.
>>>>
>>> Interesting, I always think we mask interrupt in dwc3_interrupt() by setting
>>> DWC3_GEVNTSIZ_INTMASK
>>> And unmask interrupt when we end dwc3_thread_interrupt().
>>>
>>> So, we shouldn't get any IRQ from HW during dwc3_thread_interrupt(),
>>> or I miss something?
>>> Do you have some traces that indicate this masking will not work correctly?
>>
>> Yes, but we just masked the interrupts described in DEVTEN register,
>> and we did not mask all the interrupts, like the endpoint command
>> complete event, transfer complete event and so on, so we can still get
>> interrupts.
>
> not true, we masked interrupts for the entire event buffer:

Yes, you are right and I missed that. I should reproduce this problem
and analyse the real reason.

>
>> static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
>> {
>> struct dwc3 *dwc = evt->dwc;
>> u32 count;
>> u32 reg;
>>
>> if (pm_runtime_suspended(dwc->dev)) {
>> pm_runtime_get(dwc->dev);
>> disable_irq_nosync(dwc->irq_gadget);
>> dwc->pending_events = true;
>> return IRQ_HANDLED;
>> }
>>
>> count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
>> count &= DWC3_GEVNTCOUNT_MASK;
>> if (!count)
>> return IRQ_NONE;
>>
>> evt->count = count;
>> evt->flags |= DWC3_EVENT_PENDING;
>>
>> /* Mask interrupt */
>> reg = dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0));
>> reg |= DWC3_GEVNTSIZ_INTMASK;
>
> See here ?!?
>
>> dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), reg);
>>
>> return IRQ_WAKE_THREAD;
>> }
>
>>> BTW, what value you get when problem occured, 0xFFFC?
>>
>> Yes, something like this, the event count become huge.
>
> please send us tracepoint data. You probably need to compress
> it. Something like 256k of trace data is probably enough, so:
>
> # mkdir -p /t
> # mount -t tracefs none /t
> # cd /t
> # echo 256 > buffer_size_kb
> # echo 1 > events/dwc3/enable
> # echo 0 > events/dwc3/dwc3_readl/enable
> # echo 0 > events/dwc3/dwc3_writel/enable
>
> (reproduce)
>
> # cp /t/trace /path/to/non-volatile/media/trace.txt

Okay, I try to do that. Thanks.

--
Baolin.wang
Best Regards

2016-12-28 12:30:18

by Janusz Dziedzic

[permalink] [raw]
Subject: Re: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler

2016-12-27 13:16 GMT+01:00 Baolin Wang <[email protected]>:
> Hi,
>
> On 27 December 2016 at 19:11, Felipe Balbi <[email protected]> wrote:
>>
>> Hi,
>>
>> Baolin Wang <[email protected]> writes:
>>> Hi,
>>>
>>> On 27 December 2016 at 18:52, Janusz Dziedzic <[email protected]> wrote:
>>>> 2016-12-26 9:01 GMT+01:00 Baolin Wang <[email protected]>:
>>>>> On some platfroms(like x86 platform), when one core is running the USB gadget
>>>>> irq thread handler by dwc3_thread_interrupt(), meanwhile another core also can
>>>>> respond other interrupts from dwc3 controller and modify the event buffer by
>>>>> dwc3_interrupt() function, that will cause getting the wrong event count in
>>>>> irq thread handler to make the USB function abnormal.
>>>>>
>>>>> We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid this race.
>>>>>
>>>> Interesting, I always think we mask interrupt in dwc3_interrupt() by setting
>>>> DWC3_GEVNTSIZ_INTMASK
>>>> And unmask interrupt when we end dwc3_thread_interrupt().
>>>>
>>>> So, we shouldn't get any IRQ from HW during dwc3_thread_interrupt(),
>>>> or I miss something?
>>>> Do you have some traces that indicate this masking will not work correctly?
>>>
>>> Yes, but we just masked the interrupts described in DEVTEN register,
>>> and we did not mask all the interrupts, like the endpoint command
>>> complete event, transfer complete event and so on, so we can still get
>>> interrupts.
>>
>> not true, we masked interrupts for the entire event buffer:
>
> Yes, you are right and I missed that. I should reproduce this problem
> and analyse the real reason.
>
>>
>>> static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
>>> {
>>> struct dwc3 *dwc = evt->dwc;
>>> u32 count;
>>> u32 reg;
>>>
>>> if (pm_runtime_suspended(dwc->dev)) {
>>> pm_runtime_get(dwc->dev);
>>> disable_irq_nosync(dwc->irq_gadget);
>>> dwc->pending_events = true;
>>> return IRQ_HANDLED;
>>> }
>>>
>>> count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
>>> count &= DWC3_GEVNTCOUNT_MASK;
>>> if (!count)
>>> return IRQ_NONE;
>>>
>>> evt->count = count;
>>> evt->flags |= DWC3_EVENT_PENDING;
>>>
>>> /* Mask interrupt */
>>> reg = dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0));
>>> reg |= DWC3_GEVNTSIZ_INTMASK;
>>
>> See here ?!?
>>
>>> dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), reg);
>>>
>>> return IRQ_WAKE_THREAD;
>>> }
>>
>>>> BTW, what value you get when problem occured, 0xFFFC?
>>>
>>> Yes, something like this, the event count become huge.
>>
Probably you have little bit different code than current community
version (depends how your PM works).

This is possible when we write:
dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), 0);
And after that
dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), 4);

After that we will get 0xFFFC (-4).

Possible races:
1) dwc3_event_buffers_setup/dwc3_event_buffers_cleanup - write 0
2) dwc3_thread - write 4

While [1] could be called in PM work or UM context (init in Android
case) spin_lock_irqsave() will only disable local irqs and still we
could get IRQ on different core, next update evt->count and run
thread...

So, seems your patch will solve this.

I am not sure this problem could be also visible in community current version.

Anyway, thanks for handling this.

BR
Janusz
>> please send us tracepoint data. You probably need to compress
>> it. Something like 256k of trace data is probably enough, so:
>>
>> # mkdir -p /t
>> # mount -t tracefs none /t
>> # cd /t
>> # echo 256 > buffer_size_kb
>> # echo 1 > events/dwc3/enable
>> # echo 0 > events/dwc3/dwc3_readl/enable
>> # echo 0 > events/dwc3/dwc3_writel/enable
>>
>> (reproduce)
>>
>> # cp /t/trace /path/to/non-volatile/media/trace.txt
>
> Okay, I try to do that. Thanks.
>
> --
> Baolin.wang
> Best Regards



--
Janusz Dziedzic

2016-12-28 15:27:45

by Janusz Dziedzic

[permalink] [raw]
Subject: Re: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler

2016-12-27 12:05 GMT+01:00 Felipe Balbi <[email protected]>:
> Hi,
>
> Lu Baolu <[email protected]> writes:
>> On 12/26/2016 04:01 PM, Baolin Wang wrote:
>>> On some platfroms(like x86 platform), when one core is running the USB gadget
>>> irq thread handler by dwc3_thread_interrupt(), meanwhile another core also can
>>> respond other interrupts from dwc3 controller and modify the event buffer by
>>> dwc3_interrupt() function, that will cause getting the wrong event count in
>>> irq thread handler to make the USB function abnormal.
>>>
>>> We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid this race.
>>
>> Why not spin_lock_irq ones? This lock seems to be used in both
>> normal and interrupt threads. Or, I missed anything?
>
> this is top half handler. Interrupts are already disabled.
>
BTW,
We don't use spin_lock in top half handler.
Maybe we should/can switch all spin_lock_irqsave() to simple
spin_lock() in the thread/callbacks?
Or there is a reason to use irqsave() version?

BR
Janusz

> --
> balbi



--
Janusz Dziedzic

2016-12-28 16:20:48

by Felipe Balbi

[permalink] [raw]
Subject: Re: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler


Hi,

Janusz Dziedzic <[email protected]> writes:
>>>> On some platfroms(like x86 platform), when one core is running the USB gadget
>>>> irq thread handler by dwc3_thread_interrupt(), meanwhile another core also can
>>>> respond other interrupts from dwc3 controller and modify the event buffer by
>>>> dwc3_interrupt() function, that will cause getting the wrong event count in
>>>> irq thread handler to make the USB function abnormal.
>>>>
>>>> We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid this race.
>>>
>>> Why not spin_lock_irq ones? This lock seems to be used in both
>>> normal and interrupt threads. Or, I missed anything?
>>
>> this is top half handler. Interrupts are already disabled.
>>
> BTW,
> We don't use spin_lock in top half handler.
> Maybe we should/can switch all spin_lock_irqsave() to simple
> spin_lock() in the thread/callbacks?

in theory, yes we've masked all interrupts from this controller for the
duration of the thread handler. However this breaks networking
gadgets. I can only guess network stack has a hard requirement to run
with IRQs disabled.

> Or there is a reason to use irqsave() version?

see above :-)

--
balbi

2016-12-29 01:39:42

by John Youn

[permalink] [raw]
Subject: RE: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt handler and irq thread handler



> -----Original Message-----
> From: [email protected] [mailto:linux-usb-
> [email protected]] On Behalf Of Felipe Balbi
> Sent: Wednesday, December 28, 2016 8:19 AM
> To: Janusz Dziedzic <[email protected]>
> Cc: Lu Baolu <[email protected]>; Baolin Wang
> <[email protected]>; Greg KH <[email protected]>; USB
> <[email protected]>; LKML <[email protected]>; Linaro
> Kernel Mailman List <[email protected]>; Mark Brown
> <[email protected]>
> Subject: Re: [PATCH] usb: dwc3: gadget: Avoid race between dwc3 interrupt
> handler and irq thread handler
>
>
> Hi,
>
> Janusz Dziedzic <[email protected]> writes:
> >>>> On some platfroms(like x86 platform), when one core is running the
> USB gadget
> >>>> irq thread handler by dwc3_thread_interrupt(), meanwhile another
> core also can
> >>>> respond other interrupts from dwc3 controller and modify the event
> buffer by
> >>>> dwc3_interrupt() function, that will cause getting the wrong event
> count in
> >>>> irq thread handler to make the USB function abnormal.
> >>>>
> >>>> We should add spin_lock/unlock() in dwc3_check_event_buf() to avoid
> this race.
> >>>
> >>> Why not spin_lock_irq ones? This lock seems to be used in both
> >>> normal and interrupt threads. Or, I missed anything?
> >>
> >> this is top half handler. Interrupts are already disabled.
> >>
> > BTW,
> > We don't use spin_lock in top half handler.
> > Maybe we should/can switch all spin_lock_irqsave() to simple
> > spin_lock() in the thread/callbacks?
>
> in theory, yes we've masked all interrupts from this controller for the
> duration of the thread handler. However this breaks networking
> gadgets. I can only guess network stack has a hard requirement to run
> with IRQs disabled.
>

Hi,

Is this version 3.00a of the core?

That version has a STAR where the interrupts cannot be masked. That results in similar symptoms to what you're seeing here.

Regards,
John