2015-08-03 11:20:27

by fupan

Subject: [PATCH] usb: gadget: f_printer: fix the bug of deadlock caused by nested spinlock

From: fli <[email protected]>

printer_func_disable() takes the spinlock printer_dev->lock and then,
while still holding it, calls the following chain:

printer_reset_interface()
  |
  +---dwc3_gadget_ep_disable()
        |
        +---__dwc3_gadget_ep_disable()
              |
              +---dwc3_remove_requests()
                    |
                    +---dwc3_gadget_giveback()
                          |
                          +---rx_complete()

However, rx_complete() in f_printer.c takes printer_dev->lock again.
Spinlocks are not recursive, so the second acquisition spins forever and
the system hangs.

The following steps reproduce the hang:

1. Build the test program from Documentation/usb/gadget_printer.txt as g_printer.
2. Plug the USB device into a host (such as an Ubuntu machine).
3. On the USB device system, run:
# modprobe g_printer.ko
# ./g_printer -read_data

4. Unplug the USB device from the host.

The system then hangs.

To avoid this deadlock, move the spinlock from printer_func_disable() into
printer_reset_interface() and keep the calls that reach dwc3_gadget_ep_disable()
outside the locked region. On that path the shared state is still protected,
because rx_complete() takes the spinlock itself.

This commit fixes the system hang that produces the following call trace:

INFO: rcu_preempt detected stalls on CPUs/tasks: { 3} (detected by 0, t=21006 jiffies, g=524, c=523, q=2)
sending NMI to all CPUs:
NMI backtrace for cpu 3
CPU: 3 PID: 718 Comm: irq/22-dwc3 Not tainted 3.10.38-ltsi-WR6.0.0.11_standard #2
Hardware name: Intel Corp. VALLEYVIEW B3 PLATFORM/NOTEBOOK, BIOS BYTICRB1.86C.0092.R32.1410021707 10/02/2014
task: f44f4c20 ti: f40f6000 task.ti: f40f6000
EIP: 0060:[<c1824955>] EFLAGS: 00000097 CPU: 3
EIP is at _raw_spin_lock_irqsave+0x35/0x40
EAX: 00000076 EBX: f80fad00 ECX: 00000076 EDX: 00000075
ESI: 00000096 EDI: ffffff94 EBP: f40f7e20 ESP: f40f7e18
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: b77ac000 CR3: 01c30000 CR4: 001007f0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Stack:
f474a720 f80fad00 f40f7e3c f80f93cc c135d486 00000000 f474a720 f468fb00
f4bea894 f40f7e54 f7e35f19 ffffff00 f468fb00 f468fb24 00000086 f40f7e64
f7e36577 f468fb00 f4bea810 f40f7e74 f7e365a8 f468fb00 f4bea894 f40f7e9c
Call Trace:
[<f80f93cc>] rx_complete+0x1c/0xb0 [g_printer]
[<c135d486>] ? vsnprintf+0x166/0x390
[<f7e35f19>] dwc3_gadget_giveback+0xc9/0xf0 [dwc3]
[<f7e36577>] dwc3_remove_requests+0x57/0x70 [dwc3]
[<f7e365a8>] __dwc3_gadget_ep_disable+0x18/0x60 [dwc3]
[<f7e366e9>] dwc3_gadget_ep_disable+0x89/0xf0 [dwc3]
[<f80f9031>] printer_reset_interface+0x31/0x50 [g_printer]
[<f80f9270>] printer_func_disable+0x20/0x30 [g_printer]
[<f80e6d8b>] composite_disconnect+0x4b/0x90 [libcomposite]
[<f7e39a8b>] dwc3_disconnect_gadget+0x38/0x43 [dwc3]
[<f7e39ad4>] dwc3_gadget_disconnect_interrupt+0x3e/0x5a [dwc3]
[<f7e373b8>] dwc3_thread_interrupt+0x5c8/0x610 [dwc3]
[<c10ac518>] irq_thread_fn+0x18/0x30
[<c10ac800>] irq_thread+0x100/0x130
[<c10ac500>] ? irq_finalize_oneshot.part.29+0xb0/0xb0
[<c10ac650>] ? wake_threads_waitq+0x40/0x40
[<c10ac700>] ? irq_thread_dtor+0xb0/0xb0
[<c1057224>] kthread+0x94/0xa0
[<c182b337>] ret_from_kernel_thread+0x1b/0x28
[<c1057190>] ? kthread_create_on_node+0xc0/0xc0

Signed-off-by: fupan li <[email protected]>
---
drivers/usb/gadget/function/f_printer.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/gadget/function/f_printer.c b/drivers/usb/gadget/function/f_printer.c
index 44173df..a91c18a 100644
--- a/drivers/usb/gadget/function/f_printer.c
+++ b/drivers/usb/gadget/function/f_printer.c
@@ -804,6 +804,8 @@ done:

 static void printer_reset_interface(struct printer_dev *dev)
 {
+	unsigned long flags;
+
 	if (dev->interface < 0)
 		return;

@@ -815,9 +817,11 @@ static void printer_reset_interface(struct printer_dev *dev)
 	if (dev->out_ep->desc)
 		usb_ep_disable(dev->out_ep);

+	spin_lock_irqsave(&dev->lock, flags);
 	dev->in_ep->desc = NULL;
 	dev->out_ep->desc = NULL;
 	dev->interface = -1;
+	spin_unlock_irqrestore(&dev->lock, flags);
 }

 /* Change our operational Interface. */
@@ -1131,13 +1135,10 @@ static int printer_func_set_alt(struct usb_function *f,
 static void printer_func_disable(struct usb_function *f)
 {
 	struct printer_dev *dev = func_to_printer(f);
-	unsigned long flags;

 	DBG(dev, "%s\n", __func__);

-	spin_lock_irqsave(&dev->lock, flags);
 	printer_reset_interface(dev);
-	spin_unlock_irqrestore(&dev->lock, flags);
 }

 static inline struct f_printer_opts
--
1.9.1


2015-08-03 14:47:50

by Felipe Balbi

Subject: Re: [PATCH] usb: gadget: f_printer: fix the bug of deadlock caused by nested spinlock

Hi,

On Mon, Aug 03, 2015 at 07:19:43PM +0800, [email protected] wrote:
> Signed-off-by: fupan li <[email protected]>

Thanks, out of curiosity, do you plan on sending a glue layer for
Windriver's DWC3 ?

cheers

--
balbi


2015-08-04 02:20:34

by fupan

Subject: Re: [PATCH] usb: gadget: f_printer: fix the bug of deadlock caused by nested spinlock

On 08/03/2015 10:47 PM, Felipe Balbi wrote:
> Hi,
>
> On Mon, Aug 03, 2015 at 07:19:43PM +0800, [email protected] wrote:
>> Signed-off-by: fupan li <[email protected]>
> Thanks, out of curiosity, do you plan on sending a glue layer for
> Windriver's DWC3 ?
No, just this fix patch.

Fupan
>
> cheers
>