2004-06-04 19:39:19

by Zephaniah E. Hull

[permalink] [raw]
Subject: USBDEVFS_RESET deadlocks USB bus.

Starting at 2.6.7-rc1 or so (that is when we first noticed it) the new
pilot-link libusb back end started deadlocking the entire USB bus that
the palm device was on.

I have finally tracked it down to happening when we make the
USBDEVFS_RESET ioctl, we never return from it and from that point on the
bus in question is completely dead, no processing is done, no
notifications of devices being plugged in or unplugged.

This is still happening in 2.6.7-rc2-mm1.

It seems to happen with both the UHCI and OHCI back ends, so it is
probably above that.

Given that there were heavy locking changes, I suspect that the case in
question got screwed up somehow.

--
1024D/E65A7801 Zephaniah E. Hull <[email protected]>
92ED 94E4 B1E6 3624 226D 5727 4453 008B E65A 7801
CCs of replies from mailing lists are requested.

Perl == Being
-- Descartes (paraphrased).


Attachments:
(No filename) (874.00 B)
signature.asc (189.00 B)
Digital signature
Download all attachments

2004-06-04 19:53:48

by Greg KH

[permalink] [raw]
Subject: Re: USBDEVFS_RESET deadlocks USB bus.

On Fri, Jun 04, 2004 at 03:39:11PM -0400, Zephaniah E. Hull wrote:
> Starting at 2.6.7-rc1 or so (that is when we first noticed it) the new
> pilot-link libusb back end started deadlocking the entire USB bus that
> the palm device was on.
>
> I have finally tracked it down to happening when we make the
> USBDEVFS_RESET ioctl, we never return from it and from that point on the
> bus in question is completely dead, no processing is done, no
> notifications of devices being plugged in or unplugged.
>
> This is still happening in 2.6.7-rc2-mm1.
>
> It seems to happen with both the UHCI and OHCI back ends, so it is
> probably above that.
>
> Given that there were heavy locking changes, I suspect that the case in
> question got screwed up somehow.

This needs to go to the linux-usb-devel list, now CC:

Anyway, can you look at the output of SysRq-T and see where the
processes are hung?

thanks,

greg k-h

2004-06-04 20:02:21

by Zephaniah E. Hull

[permalink] [raw]
Subject: Re: USBDEVFS_RESET deadlocks USB bus.

On Fri, Jun 04, 2004 at 12:52:47PM -0700, Greg KH wrote:
> On Fri, Jun 04, 2004 at 03:39:11PM -0400, Zephaniah E. Hull wrote:
> > Starting at 2.6.7-rc1 or so (that is when we first noticed it) the new
> > pilot-link libusb back end started deadlocking the entire USB bus that
> > the palm device was on.
> >
> > I have finally tracked it down to happening when we make the
> > USBDEVFS_RESET ioctl, we never return from it and from that point on the
> > bus in question is completely dead, no processing is done, no
> > notifications of devices being plugged in or unplugged.
> >
> > This is still happening in 2.6.7-rc2-mm1.
> >
> > It seems to happen with both the UHCI and OHCI back ends, so it is
> > probably above that.
> >
> > Given that there were heavy locking changes, I suspect that the case in
> > question got screwed up somehow.
>
> This needs to go to the linux-usb-devel list, now CC:

Whoops.
>
> Anyway, can you look at the output of SysRq-T and see where the
> processes are hung?

lt-pilot-xfer D 00000000 0 11415 2709 (NOTLB)
d2ad3eb0 00000086 c022391a 00000000 3231203a 000a2e35 00000001 d2ad3ea7
d4edd000 d7fc8a00 c8449790 00000000 abb31900 000f447a c4bae4b8 c977d824
00000246 d2ad2000 d2ad3eec c0336735 c977d82c c4bae310 00000001 c4bae310
Call Trace:
[<c0336735>] __down+0x85/0x120
[<c033692f>] __down_failed+0xb/0x14
[<c026af27>] .text.lock.hub+0x69/0x82
[<c0272b7f>] usbdev_ioctl+0x19f/0x710
[<c015a45d>] file_ioctl+0x5d/0x170
[<c015a686>] sys_ioctl+0x116/0x250
[<c0103f8f>] syscall_call+0x7/0xb

This is on 2.6.7-rc2-mm1.

Please keep me in the reply list if we are pulled off of linux-kernel,
as I am not currently on linux-usb-devel.

--
1024D/E65A7801 Zephaniah E. Hull <[email protected]>
92ED 94E4 B1E6 3624 226D 5727 4453 008B E65A 7801
CCs of replies from mailing lists are requested.

<cas> well there ya go. say something stupid in irc and have it
immortalised forever in someone's .sig file


Attachments:
(No filename) (1.96 kB)
signature.asc (189.00 B)
Digital signature
Download all attachments

2004-06-04 20:05:47

by Zephaniah E. Hull

[permalink] [raw]
Subject: Re: USBDEVFS_RESET deadlocks USB bus.

On Fri, Jun 04, 2004 at 04:02:11PM -0400, Zephaniah E. Hull wrote:
> On Fri, Jun 04, 2004 at 12:52:47PM -0700, Greg KH wrote:
> > On Fri, Jun 04, 2004 at 03:39:11PM -0400, Zephaniah E. Hull wrote:
> > > Starting at 2.6.7-rc1 or so (that is when we first noticed it) the new
> > > pilot-link libusb back end started deadlocking the entire USB bus that
> > > the palm device was on.
> > >
> > > I have finally tracked it down to happening when we make the
> > > USBDEVFS_RESET ioctl, we never return from it and from that point on the
> > > bus in question is completely dead, no processing is done, no
> > > notifications of devices being plugged in or unplugged.
> > >
> > > This is still happening in 2.6.7-rc2-mm1.
> > >
> > > It seems to happen with both the UHCI and OHCI back ends, so it is
> > > probably above that.
> > >
> > > Given that there were heavy locking changes, I suspect that the case in
> > > question got screwed up somehow.

Further details, existing links with devices remain functional (IE, the
HID layer continues to get data from already connected mice), however
all processing or new devices or removed devices seems to be just gone,
on all USB busses.

--
1024D/E65A7801 Zephaniah E. Hull <[email protected]>
92ED 94E4 B1E6 3624 226D 5727 4453 008B E65A 7801
CCs of replies from mailing lists are requested.

"And now, little kittens, we're going to run across red-hot
motherboards, with our bare feet." -- Buzh.


Attachments:
(No filename) (1.43 kB)
signature.asc (189.00 B)
Digital signature
Download all attachments

2004-06-04 20:09:00

by Greg KH

[permalink] [raw]
Subject: Re: USBDEVFS_RESET deadlocks USB bus.

On Fri, Jun 04, 2004 at 04:02:11PM -0400, Zephaniah E. Hull wrote:
>
> lt-pilot-xfer D 00000000 0 11415 2709 (NOTLB)
> d2ad3eb0 00000086 c022391a 00000000 3231203a 000a2e35 00000001 d2ad3ea7
> d4edd000 d7fc8a00 c8449790 00000000 abb31900 000f447a c4bae4b8 c977d824
> 00000246 d2ad2000 d2ad3eec c0336735 c977d82c c4bae310 00000001 c4bae310
> Call Trace:
> [<c0336735>] __down+0x85/0x120
> [<c033692f>] __down_failed+0xb/0x14
> [<c026af27>] .text.lock.hub+0x69/0x82
> [<c0272b7f>] usbdev_ioctl+0x19f/0x710
> [<c015a45d>] file_ioctl+0x5d/0x170
> [<c015a686>] sys_ioctl+0x116/0x250
> [<c0103f8f>] syscall_call+0x7/0xb
>
> This is on 2.6.7-rc2-mm1.

Ah, can you not try the -mm1 kernel? This problem should not be in the
mainline kernel. There was a locking issue in the last bk-usb patch
that made it into the -mm1 kernel that was fixed yesterday.

thanks,

greg k-h

2004-06-04 20:29:11

by David A. Desrosiers

[permalink] [raw]
Subject: Re: USBDEVFS_RESET deadlocks USB bus.


> Ah, can you not try the -mm1 kernel? This problem should not be in the
> mainline kernel. There was a locking issue in the last bk-usb patch that
> made it into the -mm1 kernel that was fixed yesterday.

I can confirm that the last kernel this worked with was 2.6.6. We
can't get libusb working at all in 2.4 for the device detection (as you
know). I've now tried everything from 2.6.6 up to 2.6.7-rc2-mm2, including
all "official" (non-akpm) kernels in-between. Same issue.

What I see in the logs, which may be of use, is:

May 31 21:38:08 wrath kernel: uhci_hcd 0000:00:1d.1: suspend_hc
May 31 21:38:08 wrath kernel: uhci_hcd 0000:00:1d.1: wakeup_hc
May 31 21:38:11 wrath kernel: uhci_hcd 0000:00:1d.1: suspend_hc
May 31 21:38:11 wrath kernel: uhci_hcd 0000:00:1d.1: wakeup_hc
May 31 21:38:13 wrath kernel: uhci_hcd 0000:00:1d.1: suspend_hc
May 31 21:38:13 wrath kernel: uhci_hcd 0000:00:1d.1: wakeup_hc

..over and over and over, taking the system load gradually up, until
it no longer responds, and has to be rebooted forcibly.

d.

2004-06-04 20:40:48

by Duncan Sands

[permalink] [raw]
Subject: Re: [linux-usb-devel] Re: USBDEVFS_RESET deadlocks USB bus.

> c4bae310 Call Trace:
> [<c0336735>] __down+0x85/0x120
> [<c033692f>] __down_failed+0xb/0x14
> [<c026af27>] .text.lock.hub+0x69/0x82
> [<c0272b7f>] usbdev_ioctl+0x19f/0x710
> [<c015a45d>] file_ioctl+0x5d/0x170
> [<c015a686>] sys_ioctl+0x116/0x250
> [<c0103f8f>] syscall_call+0x7/0xb

Does this help?

Ciao,

Duncan.

--- linux-2.5/drivers/usb/core/devio.c.orig 2004-06-04 22:29:28.000000000 +0200
+++ linux-2.5/drivers/usb/core/devio.c 2004-06-04 22:40:22.000000000 +0200
@@ -50,6 +50,8 @@
#include "hcd.h" /* for usbcore internals */
#include "usb.h"

+extern int __usb_reset_device(struct usb_device *udev);
+
struct async {
struct list_head asynclist;
struct dev_state *ps;
@@ -719,7 +721,7 @@

static int proc_resetdevice(struct dev_state *ps)
{
- return usb_reset_device(ps->dev);
+ return __usb_reset_device(ps->dev);

}


2004-06-04 21:07:31

by Zephaniah E. Hull

[permalink] [raw]
Subject: Re: USBDEVFS_RESET deadlocks USB bus.

On Fri, Jun 04, 2004 at 01:07:48PM -0700, Greg KH wrote:
> On Fri, Jun 04, 2004 at 04:02:11PM -0400, Zephaniah E. Hull wrote:
> >
> > lt-pilot-xfer D 00000000 0 11415 2709 (NOTLB)
> > d2ad3eb0 00000086 c022391a 00000000 3231203a 000a2e35 00000001 d2ad3ea7
> > d4edd000 d7fc8a00 c8449790 00000000 abb31900 000f447a c4bae4b8 c977d824
> > 00000246 d2ad2000 d2ad3eec c0336735 c977d82c c4bae310 00000001 c4bae310
> > Call Trace:
> > [<c0336735>] __down+0x85/0x120
> > [<c033692f>] __down_failed+0xb/0x14
> > [<c026af27>] .text.lock.hub+0x69/0x82
> > [<c0272b7f>] usbdev_ioctl+0x19f/0x710
> > [<c015a45d>] file_ioctl+0x5d/0x170
> > [<c015a686>] sys_ioctl+0x116/0x250
> > [<c0103f8f>] syscall_call+0x7/0xb
> >
> > This is on 2.6.7-rc2-mm1.
>
> Ah, can you not try the -mm1 kernel? This problem should not be in the
> mainline kernel. There was a locking issue in the last bk-usb patch
> that made it into the -mm1 kernel that was fixed yesterday.

This is from -rc2.

lt-pilot-xfer D 00000000 0 3034 2981 (NOTLB)
c3701eb0 00000082 c022039a 00000000 3231203a 300a2e35 00000001 c3701ea7
1a1e0e40 000f4264 c6e0cc30 00000000 1a1e0e40 000f4264 c50cf3b8 d6cc1824
00000246 c3700000 c3701eec c032eb35 d6cc182c c50cf210 00000001 c50cf210
Call Trace:
[<c032eb35>] __down+0x85/0x100
[<c032ece7>] __down_failed+0xb/0x14
[<c02663f7>] .text.lock.hub+0x69/0x72
[<c026dfff>] usbdev_ioctl+0x19f/0x710
[<c015951d>] file_ioctl+0x5d/0x170
[<c0159746>] sys_ioctl+0x116/0x250
[<c0103f0f>] syscall_call+0x7/0xb

I'll next try with the patch from Duncan Sands, but if things are as bad
as Alan Stern says, then we should at least make the ioctl fail for now
instead of killing USB.

--
1024D/E65A7801 Zephaniah E. Hull <[email protected]>
92ED 94E4 B1E6 3624 226D 5727 4453 008B E65A 7801
CCs of replies from mailing lists are requested.

"Delivery anywhere in the world within thirty minutes or the second one's
free." - "pizza box" art atop a Minuteman ICBM silo, Paul A. Suhler, RHF


Attachments:
(No filename) (2.02 kB)
signature.asc (189.00 B)
Digital signature
Download all attachments

2004-06-04 21:30:43

by Zephaniah E. Hull

[permalink] [raw]
Subject: Re: [linux-usb-devel] Re: USBDEVFS_RESET deadlocks USB bus.

On Fri, Jun 04, 2004 at 10:40:43PM +0200, Duncan Sands wrote:
> > c4bae310 Call Trace:
> > [<c0336735>] __down+0x85/0x120
> > [<c033692f>] __down_failed+0xb/0x14
> > [<c026af27>] .text.lock.hub+0x69/0x82
> > [<c0272b7f>] usbdev_ioctl+0x19f/0x710
> > [<c015a45d>] file_ioctl+0x5d/0x170
> > [<c015a686>] sys_ioctl+0x116/0x250
> > [<c0103f8f>] syscall_call+0x7/0xb
>
> Does this help?

I'm afraid not.

--
1024D/E65A7801 Zephaniah E. Hull <[email protected]>
92ED 94E4 B1E6 3624 226D 5727 4453 008B E65A 7801
CCs of replies from mailing lists are requested.

I still do not understand why manglement believes that cutting off the
oxygen flow to the brain will INCREASE productivity. The reason they
made that decision is probably because they couldn't think clearly due
to wearing neckties.
-- Paul Tomko on ASR.


Attachments:
(No filename) (836.00 B)
signature.asc (189.00 B)
Digital signature
Download all attachments

2004-06-05 07:55:11

by Duncan Sands

[permalink] [raw]
Subject: Re: [linux-usb-devel] Re: USBDEVFS_RESET deadlocks USB bus.

On Friday 04 June 2004 23:30, Zephaniah E. Hull wrote:
> On Fri, Jun 04, 2004 at 10:40:43PM +0200, Duncan Sands wrote:
> > > c4bae310 Call Trace:
> > > [<c0336735>] __down+0x85/0x120
> > > [<c033692f>] __down_failed+0xb/0x14
> > > [<c026af27>] .text.lock.hub+0x69/0x82
> > > [<c0272b7f>] usbdev_ioctl+0x19f/0x710
> > > [<c015a45d>] file_ioctl+0x5d/0x170
> > > [<c015a686>] sys_ioctl+0x116/0x250
> > > [<c0103f8f>] syscall_call+0x7/0xb
> >
> > Does this help?
>
> I'm afraid not.

Are you sure? That seems impossible to me! Can you
get a new call trace please.

Thanks,

Duncan.

2004-06-06 06:36:10

by Zephaniah E. Hull

[permalink] [raw]
Subject: Re: [linux-usb-devel] Re: USBDEVFS_RESET deadlocks USB bus.

On Sat, Jun 05, 2004 at 09:55:01AM +0200, Duncan Sands wrote:
> On Friday 04 June 2004 23:30, Zephaniah E. Hull wrote:
> > On Fri, Jun 04, 2004 at 10:40:43PM +0200, Duncan Sands wrote:
> > > > c4bae310 Call Trace:
> > > > [<c0336735>] __down+0x85/0x120
> > > > [<c033692f>] __down_failed+0xb/0x14
> > > > [<c026af27>] .text.lock.hub+0x69/0x82
> > > > [<c0272b7f>] usbdev_ioctl+0x19f/0x710
> > > > [<c015a45d>] file_ioctl+0x5d/0x170
> > > > [<c015a686>] sys_ioctl+0x116/0x250
> > > > [<c0103f8f>] syscall_call+0x7/0xb
> > >
> > > Does this help?
> >
> > I'm afraid not.
>
> Are you sure? That seems impossible to me! Can you
> get a new call trace please.

Hrm, I could have sworn that the kernel I tested with was rebuilt with
the patch, but now that I am trying it on rc2-mm1 with the patch, it
does in fact seem to be working, mostly.

Thanks a lot, and sorry for the previous report.

I seem to be seeing a locking related race condition on bulk reads and
writes as well, should I start a new thread for those?

--
1024D/E65A7801 Zephaniah E. Hull <[email protected]>
92ED 94E4 B1E6 3624 226D 5727 4453 008B E65A 7801
CCs of replies from mailing lists are requested.

* james would be more impressed if netgod's magic powers could stop the
splits in the first place...
* netgod notes debian developers are notoriously hard to impress
-- Seen on #Debian


Attachments:
(No filename) (1.36 kB)
signature.asc (189.00 B)
Digital signature
Download all attachments

2004-06-07 07:05:45

by Duncan Sands

[permalink] [raw]
Subject: Re: [linux-usb-devel] Re: USBDEVFS_RESET deadlocks USB bus.

> > Are you sure? That seems impossible to me! Can you
> > get a new call trace please.
>
> Hrm, I could have sworn that the kernel I tested with was rebuilt with
> the patch, but now that I am trying it on rc2-mm1 with the patch, it
> does in fact seem to be working, mostly.
>
> Thanks a lot, and sorry for the previous report.
>
> I seem to be seeing a locking related race condition on bulk reads and
> writes as well, should I start a new thread for those?

I don't think it matters much whether you start a new thread or not. What
problem are you seeing?

Ciao,

Duncan.

2004-06-07 15:43:46

by Zephaniah E. Hull

[permalink] [raw]
Subject: Re: [linux-usb-devel] Re: USBDEVFS_RESET deadlocks USB bus.

On Mon, Jun 07, 2004 at 09:05:41AM +0200, Duncan Sands wrote:
> > > Are you sure? That seems impossible to me! Can you
> > > get a new call trace please.
> >
> > Hrm, I could have sworn that the kernel I tested with was rebuilt with
> > the patch, but now that I am trying it on rc2-mm1 with the patch, it
> > does in fact seem to be working, mostly.
> >
> > Thanks a lot, and sorry for the previous report.
> >
> > I seem to be seeing a locking related race condition on bulk reads and
> > writes as well, should I start a new thread for those?
>
> I don't think it matters much whether you start a new thread or not. What
> problem are you seeing?

To give some background, the libusb backend of pilot-link is a slightly
odd design, we do the init work, reset the device, and then setup a read
thread, which basicly does a continuous loop of bulk no-timeout reads
from USB.

In the primary thread we do most of the work, including doing the USB
bulk writes to do things like ask the pilot for information.

Which, once I had things working again, sometimes resulted in the
following issue:

lt-pilot-xfer D 00000010 0 3351 3097 3398 (NOTLB)
ca93dec0 00000082 00000001 00000010 cdaa434c caac11e8 cdaa43ec caac11a0
ca93ded4 ce8b4318 c03a2fc0 00000000 3904f0c0 000f42ec cb7fe398 ca913c24
00000246 ca93c000 ca93defc c0336735 ca913c2c cb7fe1f0 00000001 cb7fe1f0
Call Trace:
[<c0336735>] __down+0x85/0x120
[<c033692f>] __down_failed+0xb/0x14
[<c0273245>] .text.lock.devio+0xe5/0x120
[<c015a45d>] file_ioctl+0x5d/0x170
[<c015a686>] sys_ioctl+0x116/0x250
[<c0103f8f>] syscall_call+0x7/0xb

lt-pilot-xfer D 00000001 0 3398 3351 (NOTLB)
ca8ebe1c 00000082 00000000 00000001 0ca97854 ce5bca08 d7d85000 00000001
ce5bca08 00000246 ca8ebe2c 00000000 38200f00 000f42ec cb7fe938 ca8ebeac
ca8ea000 ca8ea000 ca8ebe70 c0336eb8 00000000 cb7fe790 c01146a0 00000000
Call Trace:
[<c0336eb8>] wait_for_completion+0x78/0xf0
[<c026cc0a>] usb_start_wait_urb+0x7a/0xc0
[<c0271866>] proc_bulk+0x116/0x220
[<c0272f61>] usbdev_ioctl+0x581/0x710
[<c015a45d>] file_ioctl+0x5d/0x170
[<c015a686>] sys_ioctl+0x116/0x250
[<c0103f8f>] syscall_call+0x7/0xb

With the device not sending us any more data until it receives the
write, and the write not getting to send until the bulk read finishes or
the device goes away.

With the predictable annoyance this causes, of course.

--
1024D/E65A7801 Zephaniah E. Hull <[email protected]>
92ED 94E4 B1E6 3624 226D 5727 4453 008B E65A 7801
CCs of replies from mailing lists are requested.

Yes, Java is so bulletproofed that to a C programmer it feels like
being in a straightjacket, but it's a really comfy and warm
straightjacket, and the world would be a safer place if everyone was
straightjacketed most of the time. -- Overheard in the SDM.


Attachments:
(No filename) (2.81 kB)
signature.asc (189.00 B)
Digital signature
Download all attachments

2004-06-07 19:11:39

by Duncan Sands

[permalink] [raw]
Subject: Re: [linux-usb-devel] Re: USBDEVFS_RESET deadlocks USB bus.

> To give some background, the libusb backend of pilot-link is a slightly
> odd design, we do the init work, reset the device, and then setup a read
> thread, which basicly does a continuous loop of bulk no-timeout reads
> from USB.
>
> In the primary thread we do most of the work, including doing the USB
> bulk writes to do things like ask the pilot for information.
>
> Which, once I had things working again, sometimes resulted in the
> following issue:
>
> lt-pilot-xfer D 00000010 0 3351 3097 3398 (NOTLB)
> ca93dec0 00000082 00000001 00000010 cdaa434c caac11e8 cdaa43ec caac11a0
> ca93ded4 ce8b4318 c03a2fc0 00000000 3904f0c0 000f42ec cb7fe398 ca913c24
> 00000246 ca93c000 ca93defc c0336735 ca913c2c cb7fe1f0 00000001 cb7fe1f0
> Call Trace:
> [<c0336735>] __down+0x85/0x120
> [<c033692f>] __down_failed+0xb/0x14
> [<c0273245>] .text.lock.devio+0xe5/0x120
> [<c015a45d>] file_ioctl+0x5d/0x170
> [<c015a686>] sys_ioctl+0x116/0x250
> [<c0103f8f>] syscall_call+0x7/0xb
>
> lt-pilot-xfer D 00000001 0 3398 3351 (NOTLB)
> ca8ebe1c 00000082 00000000 00000001 0ca97854 ce5bca08 d7d85000 00000001
> ce5bca08 00000246 ca8ebe2c 00000000 38200f00 000f42ec cb7fe938 ca8ebeac
> ca8ea000 ca8ea000 ca8ebe70 c0336eb8 00000000 cb7fe790 c01146a0 00000000
> Call Trace:
> [<c0336eb8>] wait_for_completion+0x78/0xf0
> [<c026cc0a>] usb_start_wait_urb+0x7a/0xc0
> [<c0271866>] proc_bulk+0x116/0x220
> [<c0272f61>] usbdev_ioctl+0x581/0x710
> [<c015a45d>] file_ioctl+0x5d/0x170
> [<c015a686>] sys_ioctl+0x116/0x250
> [<c0103f8f>] syscall_call+0x7/0xb
>
> With the device not sending us any more data until it receives the
> write, and the write not getting to send until the bulk read finishes or
> the device goes away.
>
> With the predictable annoyance this causes, of course.

This is my fault. I will fix it.

Ciao,

Duncan.

2004-06-07 23:13:59

by Zephaniah E. Hull

[permalink] [raw]
Subject: Re: [linux-usb-devel] Re: USBDEVFS_RESET deadlocks USB bus.

On Mon, Jun 07, 2004 at 09:11:32PM +0200, Duncan Sands wrote:
> > With the device not sending us any more data until it receives the
> > write, and the write not getting to send until the bulk read finishes or
> > the device goes away.
> >
> > With the predictable annoyance this causes, of course.
>
> This is my fault. I will fix it.

Great, could you send me the patch? (So I have something usable until it
gets into mainline and a kernel is released with it.)
>
> Ciao,
>
> Duncan.
>

--
1024D/E65A7801 Zephaniah E. Hull <[email protected]>
92ED 94E4 B1E6 3624 226D 5727 4453 008B E65A 7801
CCs of replies from mailing lists are requested.

<nonlinear> .net is microsofts perverted version of a java networked
environment uglified for windows-specific crap
-- nonlinear in #opengl


Attachments:
(No filename) (821.00 B)
signature.asc (189.00 B)
Digital signature
Download all attachments

2004-06-08 22:33:40

by Duncan Sands

[permalink] [raw]
Subject: Re: [linux-usb-devel] Re: USBDEVFS_RESET deadlocks USB bus.

> Great, could you send me the patch? (So I have something usable until it
> gets into mainline and a kernel is released with it.)

Sure - I just have to write it first! It's a bit tricky to do right...

Duncan.

2004-06-18 20:11:21

by Ian E. Morgan

[permalink] [raw]
Subject: Re: [linux-usb-devel] Re: USBDEVFS_RESET deadlocks USB bus.

On Tue, 8 Jun 2004, Duncan Sands wrote:

>> Great, could you send me the patch? (So I have something usable until it
>> gets into mainline and a kernel is released with it.)
>
> Sure - I just have to write it first! It's a bit tricky to do right...

Any further status on this?

I don't know for sure that the problem I'm having is related, but it sure
sound like it:

I've tried these kernels:
2.6.6
2.6.7
2.6.7 + 2.6.7-rc3-mm2's big USB patch
all both with and witout kernel pre-emption enabled.

I have 3 Addonics 4-slot PCMCIA card reader attached via two 7-port D-Link
hub. All units are USB 2.0 (reported in /proc/bus/usb/devices as 480Mbps).
The problem occurs even with just one slot connected by one hub and
everything else disconncted. All readers and hubs are powered by extrernal
AC adapters, so power draw from the bus should not be an issue.

I can insert a PCMCIA card and, w/ usb-storage, see it as /dev/sda. But here
is the problem: I can only do this (roughly) twice. After ejecting it the
2nd time, and inserting it the 3rd time, the USB subsystem seems to hang,
and rather than detecting the card, goes into an infinite cycle of
attempting and failing to reset the bus. The 3 insertion cycles is almost
always constant, but it has once allowed me 6 cycles before it locked up.
During that one 6-cycle count, the same hang appeard after the 3rd
insertion, but the bus actually recovered after one reset attempt, but then
locked up after the 6th insertion never to recover again.

The kernel processes [usb-storage] and [scsi_eh_n] appear locked, with the
[scsi_eh_n] in D state:

root 4297 0.0 0.0 0 0 ? SW 15:21 0:00 [usb-storage]
root 4298 0.0 0.0 0 0 ? DW 15:21 0:00 [scsi_eh_6]

Here is a stack trace of usb-storage and scsi_eh_6:

Jun 18 15:27:05 light kernel: usb-storage S C03A1C40 0 4297 1 4298 2515 (L-TLB)
Jun 18 15:27:05 light kernel: d5b43f58 00000046 d737e0b0 c03a1c40 00000001 d5b43f48 c0117575 df892b30
Jun 18 15:27:05 light kernel: 00000001 00000000 d5d86030 00002274 32de0b8e 0000006f d737e258 c158dd04
Jun 18 15:27:05 light kernel: 00000286 d5b42000 d737e0b0 c02a9512 c158dd0c 00000000 00000001 d737e0b0
Jun 18 15:27:05 light kernel: Call Trace:
Jun 18 15:27:05 light kernel: [<c0117575>] __wake_up_common+0x38/0x57
Jun 18 15:27:05 light kernel: [<c02a9512>] __down_interruptible+0x9b/0x101
Jun 18 15:27:05 light kernel: [<c011752b>] default_wake_function+0x0/0x12
Jun 18 15:27:05 light kernel: [<c011a559>] printk+0xef/0x12c
Jun 18 15:27:05 light kernel: [<c02a958b>] __down_failed_interruptible+0x7/0xc
Jun 18 15:27:05 light kernel: [<e0ddd29e>] .text.lock.usb+0x5/0x5b [usb_storage]
Jun 18 15:27:05 light kernel: [<c0105d4e>] ret_from_fork+0x6/0x14
Jun 18 15:27:05 light kernel: [<e0ddc4b9>] usb_stor_control_thread+0x0/0x2b5 [usb_storage]
Jun 18 15:27:05 light kernel: [<e0ddc4b9>] usb_stor_control_thread+0x0/0x2b5 [usb_storage]
Jun 18 15:27:05 light kernel: [<c0104291>] kernel_thread_helper+0x5/0xb
Jun 18 15:27:05 light kernel:
Jun 18 15:27:05 light kernel: scsi_eh_6 D C03A20E8 0 4298 1 4297 (L-TLB)
Jun 18 15:27:05 light kernel: d5eabed0 00000046 df892b30 c03a20e8 00000073 00000000 c02f76b8 00000001
Jun 18 15:27:05 light kernel: d9c03e01 00000073 df892b30 000024bb d9c04dd3 00000073 d5d861d8 df8ff424
Jun 18 15:27:05 light kernel: 00000286 d5d86030 df8ff42c c02a942c 00000001 d5d86030 c011752b df8ff42c
Jun 18 15:27:05 light kernel: Call Trace:
Jun 18 15:27:05 light kernel: [<c02a942c>] __down+0x7c/0xc7
Jun 18 15:27:05 light kernel: [<c011752b>] default_wake_function+0x0/0x12
Jun 18 15:27:05 light kernel: [<c011a674>] release_console_sem+0x9a/0x9c
Jun 18 15:27:05 light kernel: [<c02a9580>] __down_failed+0x8/0xc
Jun 18 15:27:05 light kernel: [<e08dd6ff>] .text.lock.hub+0x87/0x98 [usbcore]
Jun 18 15:27:05 light kernel: [<e0dda35e>] bus_reset+0xc9/0xeb [usb_storage]
Jun 18 15:27:05 light kernel: [<e0dcb146>] scsi_try_bus_reset+0x54/0x8a [scsi_mod]
Jun 18 15:27:05 light kernel: [<e0dcb265>] scsi_eh_bus_reset+0x5f/0x101 [scsi_mod]
Jun 18 15:27:05 light kernel: [<e0dcb095>] scsi_eh_bus_device_reset+0x75/0xd2 [scsi_mod]
Jun 18 15:27:05 light kernel: [<e0dcb7a0>] scsi_eh_ready_devs+0x63/0x93 [scsi_mod]
Jun 18 15:27:05 light kernel: [<e0dcb8f4>] scsi_unjam_host+0xa1/0xa3 [scsi_mod]
Jun 18 15:27:05 light kernel: [<e0dcb994>] scsi_error_handler+0x9e/0xc6 [scsi_mod]
Jun 18 15:27:05 light kernel: [<e0dcb8f6>] scsi_error_handler+0x0/0xc6 [scsi_mod]
Jun 18 15:27:05 light kernel: [<c0104291>] kernel_thread_helper+0x5/0xb

And here are the debug logs leading up the the lockup:

Jun 18 15:21:17 light kernel: hub 1-1:1.0: port 2, status 0101, change 0001, 12 Mb/s
Jun 18 15:21:17 light kernel: hub 1-1:1.0: debounce: port 2: total 100ms stable 100ms status 0x101
Jun 18 15:21:17 light kernel: usb 1-1.2: new full speed USB device using address 17
Jun 18 15:21:17 light kernel: usb 1-1.2: new device strings: Mfr=73, Product=82, SerialNumber=99
Jun 18 15:21:17 light kernel: usb 1-1.2: default language 0x0409
Jun 18 15:21:17 light kernel: usb 1-1.2: Product: USB to IDE Cable
Jun 18 15:21:17 light kernel: usb 1-1.2: Manufacturer: Addonics
Jun 18 15:21:17 light kernel: usb 1-1.2: SerialNumber: 0123456789AB
Jun 18 15:21:17 light kernel: usb 1-1.2: hotplug
Jun 18 15:21:17 light kernel: usb 1-1.2: adding 1-1.2:2.0 (config #2, interface 0)
Jun 18 15:21:17 light kernel: usb 1-1.2:2.0: hotplug
Jun 18 15:21:17 light kernel: usb-storage 1-1.2:2.0: usb_probe_interface
Jun 18 15:21:17 light kernel: usb-storage 1-1.2:2.0: usb_probe_interface - got id
Jun 18 15:21:17 light kernel: usb-storage: USB Mass Storage device detected
Jun 18 15:21:17 light kernel: usb-storage: altsetting is 0, id_index is 92
Jun 18 15:21:17 light kernel: usb-storage: -- associate_dev
Jun 18 15:21:18 light kernel: usb-storage: Transport: Bulk
Jun 18 15:21:18 light kernel: usb-storage: Protocol: Transparent SCSI
Jun 18 15:21:18 light kernel: usb-storage: Endpoints: In: 0xd6c4ba54 Out: 0xd6c4ba68 Int: 0xd6c4ba7c (Period 32)
Jun 18 15:21:18 light kernel: usb-storage: usb_stor_control_msg: rq=fe rqtype=a1 value=0000 index=00 len=1
Jun 18 15:21:18 light kernel: usb-storage: GetMaxLUN command result is 1, data is 0
Jun 18 15:21:18 light kernel: usb-storage: *** thread sleeping.
Jun 18 15:21:18 light kernel: scsi6 : SCSI emulation for USB Mass Storage devices
Jun 18 15:21:18 light kernel: usb-storage: queuecommand called
Jun 18 15:21:18 light kernel: usb-storage: *** thread awakened.
Jun 18 15:21:18 light kernel: usb-storage: Command INQUIRY (6 bytes)
Jun 18 15:21:18 light kernel: usb-storage: 12 00 00 00 24 00
Jun 18 15:21:18 light kernel: usb-storage: Bulk Command S 0x43425355 T 0xd1 L 36 F 128 Trg 0 LUN 0 CL 6
Jun 18 15:21:18 light kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
Jun 18 15:21:18 light kernel: usb-storage: Status code 0; transferred 31/31
Jun 18 15:21:18 light kernel: usb-storage: -- transfer complete
Jun 18 15:21:18 light kernel: usb-storage: Bulk command transfer result=0
Jun 18 15:21:18 light kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 36 bytes
Jun 18 15:21:23 light kernel: usb-storage: command_abort called
Jun 18 15:21:23 light kernel: usb-storage: usb_stor_stop_transport called
Jun 18 15:21:23 light kernel: usb-storage: -- cancelling URB
Jun 18 15:21:23 light kernel: usb-storage: Status code -104; transferred 0/36
Jun 18 15:21:23 light kernel: usb-storage: -- transfer cancelled
Jun 18 15:21:23 light kernel: usb-storage: Bulk data transfer result 0x4
Jun 18 15:21:23 light kernel: usb-storage: -- command was aborted
Jun 18 15:21:23 light kernel: usb-storage: usb_stor_Bulk_reset called
Jun 18 15:21:23 light kernel: usb-storage: usb_stor_control_msg: rq=ff rqtype=21 value=0000 index=00 len=0
Jun 18 15:21:43 light kernel: usb-storage: Timeout -- cancelling URB
Jun 18 15:21:43 light kernel: usb-storage: Soft reset failed: -104
Jun 18 15:21:43 light kernel: usb-storage: scsi command aborted
Jun 18 15:21:43 light kernel: usb-storage: *** thread sleeping.
Jun 18 15:21:43 light kernel: usb-storage: queuecommand called
Jun 18 15:21:43 light kernel: usb-storage: *** thread awakened.
Jun 18 15:21:43 light kernel: usb-storage: Command TEST_UNIT_READY (6 bytes)
Jun 18 15:21:43 light kernel: usb-storage: 00 00 00 00 00 00
Jun 18 15:21:43 light kernel: usb-storage: Bulk Command S 0x43425355 T 0xd1 L 0 F 0 Trg 0 LUN 0 CL 6
Jun 18 15:21:43 light kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 31 bytes
Jun 18 15:21:43 light kernel: usb-storage: Status code 0; transferred 31/31
Jun 18 15:21:43 light kernel: usb-storage: -- transfer complete
Jun 18 15:21:43 light kernel: usb-storage: Bulk command transfer result=0
Jun 18 15:21:43 light kernel: usb-storage: Attempting to get CSW...
Jun 18 15:21:43 light kernel: usb-storage: usb_stor_bulk_transfer_buf: xfer 13 bytes
Jun 18 15:21:53 light kernel: usb-storage: command_abort called
Jun 18 15:21:53 light kernel: usb-storage: usb_stor_stop_transport called
Jun 18 15:21:53 light kernel: usb-storage: -- cancelling URB
Jun 18 15:21:53 light kernel: usb-storage: Status code -104; transferred 0/13
Jun 18 15:21:53 light kernel: usb-storage: -- transfer cancelled
Jun 18 15:21:53 light kernel: usb-storage: Bulk status result = 4
Jun 18 15:21:53 light kernel: usb-storage: -- command was aborted
Jun 18 15:21:53 light kernel: usb-storage: usb_stor_Bulk_reset called
Jun 18 15:21:53 light kernel: usb-storage: usb_stor_control_msg: rq=ff rqtype=21 value=0000 index=00 len=0
Jun 18 15:22:13 light kernel: usb-storage: Timeout -- cancelling URB
Jun 18 15:22:13 light kernel: usb-storage: Soft reset failed: -104
Jun 18 15:22:13 light kernel: usb-storage: scsi command aborted
Jun 18 15:22:13 light kernel: usb-storage: *** thread sleeping.
Jun 18 15:22:13 light kernel: usb-storage: device_reset called
Jun 18 15:22:13 light kernel: usb-storage: usb_stor_Bulk_reset called
Jun 18 15:22:13 light kernel: usb-storage: usb_stor_control_msg: rq=ff rqtype=21 value=0000 index=00 len=0
Jun 18 15:22:33 light kernel: usb-storage: Timeout -- cancelling URB
Jun 18 15:22:33 light kernel: usb-storage: Soft reset failed: -104
Jun 18 15:22:33 light kernel: usb-storage: bus_reset called
Jun 18 15:24:02 light kernel: hub 1-1:1.0: transfer --> -84
Jun 18 15:24:33 light last message repeated 79 times
Jun 18 15:25:00 light last message repeated 71 times

Notice that after several bus reset timeouts, the "-84" error repeats ad-
infinitum.

This particular test was with only uhci_hcd, but the same deadlock occurs
with ehci_hcd as well, though the end result is different only in the
reported error over and over:

Jun 18 14:42:04 light kernel: ehci_hcd 0000:00:1d.7: devpath 2 ep1in 3strikes
Jun 18 14:42:04 light kernel: hub 4-2:1.0: transfer --> -71

Can anyone help, or at least tell me that this is the same problem seen in the
earlier parts of this thread?

Regards,
Ian Morgan

--
-------------------------------------------------------------------
Ian E. Morgan Vice President & C.O.O. Webcon, Inc.
imorgan at webcon dot ca PGP: #2DA40D07 http://www.webcon.ca
* Customized Linux Network Solutions for your Business *
-------------------------------------------------------------------

2004-06-19 13:48:25

by Duncan Sands

[permalink] [raw]
Subject: Re: [linux-usb-devel] Re: USBDEVFS_RESET deadlocks USB bus.

> I don't know for sure that the problem I'm having is related, but it sure
> sound like it:

Hi Ian, I was talking about the problem in usbfs where only one bulk or
control message can be in flight at any one time. Your problem seems
quite different to this, sorry!

All the best,

Duncan.