2000-12-07 09:46:34

by David Woodhouse

Subject: USB-related lockup in test12-pre5



Haven't tried test12-pre7 yet. Is enabling bus mastering likely to make
this magically go away? I doubt it.

This happened when trying to run excel under wine. Dual Celeron with
CONFIG_USB_UHCI.

NMI Watchdog detected LOCKUP on CPU1, registers:
CPU: 1
EIP: 0010:[<c0270c21>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00200086
eax: c12c9600 ebx: c7a383c0 ecx: c7a383c0 edx: c12c9600
esi: 00000001 edi: 00000001 ebp: c7f068a0 esp: c123dea8
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c123d000)
Stack: 00200006 00000001 c7a38398 c7a383c0 00000001 c7a38000 c01f488e c7a383c0
ca8578be c7a383c0 00200246 00000000 c7a383c0 00000000 c7f068a0 c01ffc3e
c7a383c0 00000000 00000000 c12c9600 00000000 c7a3846c c7f068a0 c7f068bc
Call Trace: [<c01f488e>] [<ca8578be>] [<c01ffc3e>] [<c01ffd49>] [<c010c800>] [<c010ca01>] [<c0108f50>]
[<c010af88>] [<c0108f50>] [<c0100018>] [<c0108f7c>] [<c0109002>] [<c011f47d>] [<c010ca47>]
Code: f3 90 7e f8 e9 42 dc f8 ff 80 7e 24 00 f3 90 7e f8 e9 89 e7

>>EIP; c0270c21 <stext_lock+4e6d/8f50> <=====
Trace; c01f488e <usb_submit_urb+1e/30>
Trace; ca8578be <[audio]usbout_completed+7e/c0>
Trace; c01ffc3e <process_urb+1de/230>
Trace; c01ffd49 <uhci_interrupt+b9/120>
Trace; c010c800 <handle_IRQ_event+60/90>
Trace; c010ca01 <do_IRQ+a1/100>
Trace; c0108f50 <default_idle+0/40>
Trace; c010af88 <ret_from_intr+0/20>
Trace; c0108f50 <default_idle+0/40>
Trace; c0100018 <startup_32+18/cb>
Trace; c0108f7c <default_idle+2c/40>
Trace; c0109002 <cpu_idle+52/70>
Trace; c011f47d <do_softirq+6d/a0>
Trace; c010ca47 <do_IRQ+e7/100>
Code; c0270c21 <stext_lock+4e6d/8f50>
00000000 <_EIP>:
Code; c0270c21 <stext_lock+4e6d/8f50> <=====
0: f3 90 repz nop <=====
Code; c0270c23 <stext_lock+4e6f/8f50>
2: 7e f8 jle fffffffc <_EIP+0xfffffffc> c0270c1d <stext_lock+4e69/8f50>
Code; c0270c25 <stext_lock+4e71/8f50>
4: e9 42 dc f8 ff jmp fff8dc4b <_EIP+0xfff8dc4b> c01fe86c <uhci_submit_urb+6c/2d0>
Code; c0270c2a <stext_lock+4e76/8f50>
9: 80 7e 24 00 cmpb $0x0,0x24(%esi)
Code; c0270c2e <stext_lock+4e7a/8f50>
d: f3 90 repz nop
Code; c0270c30 <stext_lock+4e7c/8f50>
f: 7e f8 jle 9 <_EIP+0x9> c0270c2a <stext_lock+4e76/8f50>
Code; c0270c32 <stext_lock+4e7e/8f50>
11: e9 89 e7 00 00 jmp e79f <_EIP+0xe79f> c027f3c0 <tvecs+1678/e4f8>



--
dwmw2



2000-12-07 19:11:09

by Johannes Erdfelt

Subject: Re: USB-related lockup in test12-pre5

On Thu, Dec 07, 2000, David Woodhouse <[email protected]> wrote:
> Haven't tried test12-pre7 yet. Is enabling bus mastering likely to make
> this magically go away? I doubt it.

Probably not. Enabling bus mastering is the difference between USB
working at all (transferring data to the device) and not working.

> This happened when trying to run excel under wine. Dual Celeron with
> CONFIG_USB_UHCI.

Could you try the alternate UHCI driver? You may need to disable the
UHCI driver you have configured for the option to become visible.

> >>EIP; c0270c21 <stext_lock+4e6d/8f50> <=====
> Trace; c01f488e <usb_submit_urb+1e/30>
> Trace; ca8578be <[audio]usbout_completed+7e/c0>
> Trace; c01ffc3e <process_urb+1de/230>
> Trace; c01ffd49 <uhci_interrupt+b9/120>

It looks like you were using USB audio? Could you explain what you were
doing when the oops happened?

JE

2000-12-08 23:54:31

by David Woodhouse

Subject: Re: USB-related lockup in test12-pre5

On Thu, 7 Dec 2000, Johannes Erdfelt wrote:

> Could you try the alternate UHCI driver? You may need to disable the
> UHCI driver you have configured for the option to become visible.

Differently broken:
uhci: host controller process error. something bad happened
uhci: host controller halted. very bad

... but at least the machine doesn't die. This was working in test11,
IIRC. Certainly in test10.


--
dwmw2


2000-12-09 01:25:30

by Johannes Erdfelt

Subject: Re: USB-related lockup in test12-pre5

On Fri, Dec 08, 2000, David Woodhouse <[email protected]> wrote:
> On Thu, 7 Dec 2000, Johannes Erdfelt wrote:
>
> > Could you try the alternate UHCI driver? You may need to disable the
> > UHCI driver you have configured for the option to become visible.
>
> Differently broken:
> uhci: host controller process error. something bad happened
> uhci: host controller halted. very bad
>
> ... but at least the machine doesn't die. This was working in test11,
> IIRC. Certainly in test10.

I'm sure you can guess from the messages that we have a problem.

However, I haven't seen that error in a long time. Lemme go back and
look in the docs to see if I can get some more information as to why
that would actually happen.

JE

2000-12-09 01:35:54

by Johannes Erdfelt

Subject: Re: USB-related lockup in test12-pre5

On Fri, Dec 08, 2000, Johannes Erdfelt <[email protected]> wrote:
> On Fri, Dec 08, 2000, David Woodhouse <[email protected]> wrote:
> > On Thu, 7 Dec 2000, Johannes Erdfelt wrote:
> >
> > > Could you try the alternate UHCI driver? You may need to disable the
> > > UHCI driver you have configured for the option to become visible.
> >
> > Differently broken:
> > uhci: host controller process error. something bad happened
> > uhci: host controller halted. very bad
> >
> > ... but at least the machine doesn't die. This was working in test11,
> > IIRC. Certainly in test10.
>
> I'm sure you can guess from the messages that we have a problem.
>
> However, I haven't seen that error in a long time. Lemme go back and
> look in the docs to see if I can get some more information as to why
> that would actually happen.

Actually, looking back at your previous email, enabling bus mastering
may make this error go away.

Could you give -pre7 a try? Or have you already?

JE

2000-12-09 09:05:51

by David Woodhouse

Subject: Re: USB-related lockup in test12-pre5

On Fri, 8 Dec 2000, Johannes Erdfelt wrote:

> Actually, looking back at your previous email, enabling bus mastering
> may make this error go away.
>
> Could you give -pre7 a try? Or have you already?

This is pre7

--
dwmw2


2000-12-09 14:31:12

by David Woodhouse

Subject: [FIXED] Re: USB-related lockup in test12-pre5


>>EIP; c0270c21 <stext_lock+4e6d/8f50> <=====
Trace; c01f488e <usb_submit_urb+1e/30>
Trace; ca8578be <[audio]usbout_completed+7e/c0>
Trace; c01ffc3e <process_urb+1de/230>
Trace; c01ffd49 <uhci_interrupt+b9/120>

1. process_urb() obtains the urb_list_lock.
2. Then calls urb->complete() which is audio.c::usbout_completed()
3. Which in turn calls usb_submit_urb()
4. Which calls uhci_submit_urb()
5. Which tries to obtain urb_list_lock --> BOOM!

process_urb() was already dropping the lock around the uhci_submit_urb()
resubmit without apparent ill effects, so I've made it drop the lock
around the urb->complete() call as well.

Index: drivers/usb/usb-uhci.c
===================================================================
RCS file: /net/passion/inst/cvs/linux/drivers/usb/usb-uhci.c,v
retrieving revision 1.1.2.21
diff -u -r1.1.2.21 usb-uhci.c
--- drivers/usb/usb-uhci.c 2000/12/07 09:36:19 1.1.2.21
+++ drivers/usb/usb-uhci.c 2000/12/09 13:49:50
@@ -2626,14 +2626,14 @@
 // Completion
 if (urb->complete) {
 	urb->dev = NULL;
+	spin_unlock(&s->urb_list_lock);
 	urb->complete ((struct urb *) urb);
 	// Re-submit the URB if ring-linked
 	if (is_ring && (urb->status != -ENOENT) && !contains_killed) {
 		urb->dev=usb_dev;
-		spin_unlock(&s->urb_list_lock);
 		uhci_submit_urb (urb);
-		spin_lock(&s->urb_list_lock);
 	}
+	spin_lock(&s->urb_list_lock);
 }
 
 usb_dec_dev_use (usb_dev);
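
For anyone who wants to poke at the pattern outside the kernel, here is a
rough userspace sketch of the same re-entry problem. It is only an analogy
and not the driver code: a pthread error-checking mutex stands in for
urb_list_lock, so the second acquisition returns EDEADLK instead of spinning
forever the way the kernel spinlock does. All names here (fake_urb,
fake_submit, resubmit_on_complete, process, init_lock) are made up for
illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's urb_list_lock. */
static pthread_mutex_t list_lock;

struct fake_urb {
	void (*complete)(struct fake_urb *);
	int submit_result;	/* what the resubmit from the callback saw */
};

/* Error-checking mutex: relocking from the owning thread fails with
 * EDEADLK, so the bug is observable without actually hanging. */
static void init_lock(void)
{
	pthread_mutexattr_t attr;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&list_lock, &attr);
	pthread_mutexattr_destroy(&attr);
}

/* Plays the role of uhci_submit_urb(): it needs the list lock. */
static int fake_submit(struct fake_urb *urb)
{
	int err = pthread_mutex_lock(&list_lock);

	(void) urb;
	if (err)
		return err;	/* EDEADLK if the caller already holds it */
	/* ... the URB would be queued here ... */
	pthread_mutex_unlock(&list_lock);
	return 0;
}

/* Plays the role of the audio completion handler, which resubmits. */
static void resubmit_on_complete(struct fake_urb *urb)
{
	urb->submit_result = fake_submit(urb);
}

/* Plays the role of process_urb().
 * drop_lock == 0: callback runs with the lock held (before the patch).
 * drop_lock == 1: lock released around the callback (after the patch). */
static int process(struct fake_urb *urb, int drop_lock)
{
	pthread_mutex_lock(&list_lock);
	if (urb->complete) {
		if (drop_lock)
			pthread_mutex_unlock(&list_lock);
		urb->complete(urb);
		if (drop_lock)
			pthread_mutex_lock(&list_lock);
	}
	pthread_mutex_unlock(&list_lock);
	return urb->submit_result;
}
```

Call init_lock() once, then process(&urb, 0) with urb.complete set to
resubmit_on_complete reports the self-deadlock, while process(&urb, 1) lets
the resubmit go through. Build with -pthread.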

--
dwmw2