2007-08-05 11:09:52

by Tino Keitel

Subject: Re: 2.6.23-rc1: USB hard disk broken

On Thu, Jul 26, 2007 at 10:06:40 +0200, Oliver Neukum wrote:
> On Wednesday, 25 July 2007, Tino Keitel wrote:
> > On Wed, Jul 25, 2007 at 10:24:36 +0200, Oliver Neukum wrote:
> > > On Wednesday, 25 July 2007, Tino Keitel wrote:
> > > > Hi,
> > > >
> > > > I just tried 2.6.23-rc1 and shortly after the boot my external USB hard
> > > > disk went mad.
> >
> > [...]
> >
> > > Please recompile with CONFIG_USB_DEBUG set.
> > >
> > > Regards
> > > Oliver
> >
> > I tried, but it didn't happen again.
>
> Did you try with the old kernel again? This bug may be related to timing.

I tried again -rc1 without USB_DEBUG, and was able to reproduce the
bug 2 times. At the second time, the kernel log shows this:

2007-08-05_10:30:27.75572 kern.err: ehci_hcd 0000:00:1d.7: dev 6 ep1in scatterlist error 0/-121
2007-08-05_10:30:27.86576 kern.info: usb 1-6: reset high speed USB device using ehci_hcd and address 5
2007-08-05_10:30:55.95293 kern.info: usb 1-6: USB disconnect, address 5
2007-08-05_10:30:55.95300 kern.err: ehci_hcd 0000:00:1d.7: dev 6 ep1in scatterlist error -108/-108
2007-08-05_10:30:55.95310 kern.info: sd 4:0:0:0: [sdb] Result: hostbyte=0x07 driverbyte=0x00
2007-08-05_10:30:55.95314 kern.warn: end_request: I/O error, dev sdb, sector 594818327
2007-08-05_10:30:55.95321 kern.info: sd 4:0:0:0: [sdb] Result: hostbyte=0x07 driverbyte=0x00
2007-08-05_10:30:55.95325 kern.warn: end_request: I/O error, dev sdb, sector 594818567
2007-08-05_10:30:55.95331 kern.info: sd 4:0:0:0: [sdb] Result: hostbyte=0x07 driverbyte=0x00
2007-08-05_10:30:55.95335 kern.warn: end_request: I/O error, dev sdb, sector 594818583
2007-08-05_10:30:55.95342 kern.info: sd 4:0:0:0: [sdb] Result: hostbyte=0x07 driverbyte=0x00
2007-08-05_10:30:55.95352 kern.warn: end_request: I/O error, dev sdb, sector 594818823
2007-08-05_10:30:55.95356 kern.info: sd 4:0:0:0: [sdb] Result: hostbyte=0x07 driverbyte=0x00
2007-08-05_10:30:55.95360 kern.warn: end_request: I/O error, dev sdb, sector 594818327
2007-08-05_10:30:55.95455 kern.info: sd 4:0:0:0: [sdb] Result: hostbyte=0x07 driverbyte=0x00
2007-08-05_10:30:55.95461 kern.warn: end_request: I/O error, dev sdb, sector 594818327
2007-08-05_10:30:55.96972 kern.err: scsi 4:0:0:0: rejecting I/O to dead device
2007-08-05_10:30:55.96977 kern.err: scsi 4:0:0:0: rejecting I/O to dead device

The "scatterlist" line wasn't there in the other cases. I'll try again
with the latest git.
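
For reference, the `hostbyte=0x07` in the log above is the SCSI mid-layer's host-byte code for a generic transport error. The sketch below maps the raw value to its symbolic name, using the `DID_*` numbering from the kernel's `include/scsi/scsi.h`; the function `scsi_hostbyte_name` is a hypothetical helper for illustration, not a kernel API.

```c
/* Hypothetical helper (not a kernel API): translate the raw hostbyte
 * value printed by the sd driver into its DID_* name. The table index
 * is the numeric code, following include/scsi/scsi.h. */
static const char *scsi_hostbyte_name(unsigned int hostbyte)
{
	static const char *const names[] = {
		"DID_OK", "DID_NO_CONNECT", "DID_BUS_BUSY", "DID_TIME_OUT",
		"DID_BAD_TARGET", "DID_ABORT", "DID_PARITY", "DID_ERROR",
		"DID_RESET", "DID_BAD_INTR", "DID_PASSTHROUGH", "DID_SOFT_ERROR",
	};

	if (hostbyte < sizeof names / sizeof names[0])
		return names[hostbyte];
	return "unknown";
}
```

Under that numbering, 0x07 is `DID_ERROR`, the same code that later logs in this thread print symbolically.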

Regards,
Tino


2007-08-05 11:41:44

by Tino Keitel

Subject: Re: 2.6.23-rc1: USB hard disk broken

On Sun, Aug 05, 2007 at 13:09:42 +0200, Tino Keitel wrote:
> On Thu, Jul 26, 2007 at 10:06:40 +0200, Oliver Neukum wrote:
> > On Wednesday, 25 July 2007, Tino Keitel wrote:
> > > On Wed, Jul 25, 2007 at 10:24:36 +0200, Oliver Neukum wrote:
> > > > On Wednesday, 25 July 2007, Tino Keitel wrote:
> > > > > Hi,
> > > > >
> > > > > I just tried 2.6.23-rc1 and shortly after the boot my external USB hard
> > > > > disk went mad.
> > >
> > > [...]
> > >
> > > > Please recompile with CONFIG_USB_DEBUG set.
> > > >
> > > > Regards
> > > > Oliver
> > >
> > > I tried, but it didn't happen again.
> >
> > Did you try with the old kernel again? This bug may be related to timing.
>
> I tried again -rc1 without USB_DEBUG, and was able to reproduce the
> bug 2 times. At the second time, the kernel log shows this:

Now I tried current git d4ac2477fad0f2680e84ec12e387ce67682c5c13 and I
can still reproduce it.

Regards,
Tino

2007-08-05 15:44:21

by Oliver Neukum

Subject: Re: 2.6.23-rc1: USB hard disk broken

On Sunday, 05 August 2007, Tino Keitel wrote:
> On Thu, Jul 26, 2007 at 10:06:40 +0200, Oliver Neukum wrote:
> > On Wednesday, 25 July 2007, Tino Keitel wrote:
> > > On Wed, Jul 25, 2007 at 10:24:36 +0200, Oliver Neukum wrote:
> > > > On Wednesday, 25 July 2007, Tino Keitel wrote:
> > > > > Hi,
> > > > >
> > > > > I just tried 2.6.23-rc1 and shortly after the boot my external USB hard
> > > > > disk went mad.
> > >
> > > [...]
> > >
> > > > Please recompile with CONFIG_USB_DEBUG set.
> > > >
> > > > Regards
> > > > Oliver
> > >
> > > I tried, but it didn't happen again.
> >
> > Did you try with the old kernel again? This bug may be related to timing.
>
> I tried again -rc1 without USB_DEBUG, and was able to reproduce the
> bug 2 times. At the second time, the kernel log shows this:
>
> 2007-08-05_10:30:27.75572 kern.err: ehci_hcd 0000:00:1d.7: dev 6 ep1in scatterlist error 0/-121
> 2007-08-05_10:30:27.86576 kern.info: usb 1-6: reset high speed USB device using ehci_hcd and address 5
> 2007-08-05_10:30:55.95293 kern.info: usb 1-6: USB disconnect, address 5
> 2007-08-05_10:30:55.95300 kern.err: ehci_hcd 0000:00:1d.7: dev 6 ep1in scatterlist error -108/-108

David, does this error say anything to you?

Regards
Oliver

2007-08-05 19:11:33

by David Brownell

Subject: Re: 2.6.23-rc1: USB hard disk broken

On Sunday 05 August 2007, Oliver Neukum wrote:
> >
> > 2007-08-05_10:30:27.75572 kern.err:
> > ehci_hcd 0000:00:1d.7: dev 6 ep1in scatterlist error 0/-121

That's rather strange since it means a *success* (urb->status 0) was
reported after a short read (scatterlist status -121, -EREMOTEIO).

The hardware should have stopped queue processing after the short
read, because of how qtd->hw_alt_next gets set up ... at least,
that's how I remember it, these many years after writing that code.

It might be that because of the issue noted below, it was wrongly
restarted by the software.


> > 2007-08-05_10:30:27.86576 kern.info: usb 1-6: reset high speed USB device using ehci_hcd and address 5
> > 2007-08-05_10:30:55.95293 kern.info: usb 1-6: USB disconnect, address 5
> > 2007-08-05_10:30:55.95300 kern.err:
> > ehci_hcd 0000:00:1d.7: dev 6 ep1in scatterlist error -108/-108

That one just means nobody updated that test to recognize that
the -ESHUTDOWN (-108) triggered after disconnect is a "clean"
failure like the ones triggered by unlinking.

However it also indicates that something changed in the unlink
code paths, since I see the *expected* code (-ECONNRESET) is no
longer being set by usbcore during unlinks ... it's not quite
clear to me what else that change will have broken. Including
whether that might not explain how the hardware queue got wrongly
restarted after the short read above.
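
To make the numbers easier to follow: the two values in a "scatterlist error a/b" line are negated errno codes, the first being urb->status and the second the scatterlist status, as described above. The following is a minimal sketch of a lookup for the codes that appear in this thread. The names `usb_errno_name` and `describe_sg_error` are hypothetical, and the numeric values are assumed to be the generic Linux ones (a few architectures number them differently).

```c
#include <stdio.h>

/* Hypothetical helper, not a kernel API: map the negated errno values
 * quoted in this thread to their symbolic names. Values are the generic
 * Linux ones; some architectures use different numbers. */
static const char *usb_errno_name(int status)
{
	switch (status) {
	case 0:    return "0 (success)";
	case -19:  return "-ENODEV";
	case -104: return "-ECONNRESET";
	case -108: return "-ESHUTDOWN";
	case -121: return "-EREMOTEIO";
	default:   return "unknown";
	}
}

/* Format one "scatterlist error a/b" pair into buf:
 * a is urb->status, b is the scatterlist status. */
static void describe_sg_error(int urb_status, int sg_status,
			      char *buf, size_t len)
{
	snprintf(buf, len, "urb->status=%s, scatterlist status=%s",
		 usb_errno_name(urb_status), usb_errno_name(sg_status));
}
```

Calling `describe_sg_error(0, -121, buf, sizeof buf)` describes the first log line (success reported alongside a short read), and `describe_sg_error(-108, -108, buf, sizeof buf)` the post-disconnect one.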

- Dave


> David, does this error say anything to you?


2007-08-09 17:04:21

by Dan Zwell

Subject: Re: 2.6.23-rc1: USB hard disk broken

Oliver Neukum wrote:
> On Sunday, 05 August 2007, Tino Keitel wrote:
>> On Thu, Jul 26, 2007 at 10:06:40 +0200, Oliver Neukum wrote:
>>> On Wednesday, 25 July 2007, Tino Keitel wrote:
>>>> On Wed, Jul 25, 2007 at 10:24:36 +0200, Oliver Neukum wrote:
>>>>> On Wednesday, 25 July 2007, Tino Keitel wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I just tried 2.6.23-rc1 and shortly after the boot my external USB hard
>>>>>> disk went mad.
>>>> [...]
>>>>
>>>>> Please recompile with CONFIG_USB_DEBUG set.
>>>>>
>> I tried again -rc1 without USB_DEBUG, and was able to reproduce the
>> bug 2 times. At the second time, the kernel log shows this:
>>
>> 2007-08-05_10:30:27.75572 kern.err: ehci_hcd 0000:00:1d.7: dev 6 ep1in scatterlist error 0/-121
>> 2007-08-05_10:30:27.86576 kern.info: usb 1-6: reset high speed USB device using ehci_hcd and address 5
>> 2007-08-05_10:30:55.95293 kern.info: usb 1-6: USB disconnect, address 5
>> 2007-08-05_10:30:55.95300 kern.err: ehci_hcd 0000:00:1d.7: dev 6 ep1in scatterlist error -108/-108
>
> David, does this error say anything to you?
>
> Regards
> Oliver

Hi,

I just completed a git bisect, and commit
8dfe4b14869fd185ca25ee88b02ada58a3005eaf was the commit that introduced
this problem. This is "usb-storage: implement autosuspend". I don't
think there have been many changes in drivers/usb since I last verified
this problem, so I'm pretty sure this is still happening in the latest
kernel.

The first thing I am going to do is pull the latest sources, attempt to
revert this patch, and see whether the kernel compiles and works
properly with my USB hard drive. Please let me know of anything I can do
to help.

Dan

2007-08-09 20:00:33

by Alan Stern

Subject: Re: 2.6.23-rc1: USB hard disk broken

On Thu, 9 Aug 2007, Dan Zwell wrote:

> Oliver Neukum wrote:
> > On Sunday, 05 August 2007, Tino Keitel wrote:
> >> On Thu, Jul 26, 2007 at 10:06:40 +0200, Oliver Neukum wrote:
> >>> On Wednesday, 25 July 2007, Tino Keitel wrote:
> >>>> On Wed, Jul 25, 2007 at 10:24:36 +0200, Oliver Neukum wrote:
> >>>>> On Wednesday, 25 July 2007, Tino Keitel wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> I just tried 2.6.23-rc1 and shortly after the boot my external USB hard
> >>>>>> disk went mad.
> >>>> [...]
> >>>>
> >>>>> Please recompile with CONFIG_USB_DEBUG set.
> >>>>>
> >> I tried again -rc1 without USB_DEBUG, and was able to reproduce the
> >> bug 2 times. At the second time, the kernel log shows this:
> >>
> >> 2007-08-05_10:30:27.75572 kern.err: ehci_hcd 0000:00:1d.7: dev 6 ep1in scatterlist error 0/-121
> >> 2007-08-05_10:30:27.86576 kern.info: usb 1-6: reset high speed USB device using ehci_hcd and address 5
> >> 2007-08-05_10:30:55.95293 kern.info: usb 1-6: USB disconnect, address 5
> >> 2007-08-05_10:30:55.95300 kern.err: ehci_hcd 0000:00:1d.7: dev 6 ep1in scatterlist error -108/-108
> >
> > David, does this error say anything to you?
> >
> > Regards
> > Oliver
>
> Hi,
>
> I just completed a git bisect, and commit
> 8dfe4b14869fd185ca25ee88b02ada58a3005eaf was the commit that introduced
> this problem. This is "usb-storage: implement autosuspend". I don't
> think there have been many changes in drivers/usb since I last verified
> this problem, so I'm pretty sure this is still happening in the latest
> kernel.
>
> The first thing I am going to do is pull the latest sources, attempt to
> revert this patch, and see whether the kernel compiles and works
> properly with my USB hard drive. Please let me know of anything I can do
> to help.

What makes you think the problem you see is the same as the one
described by Tino? Do you get the "scatterlist error 0/-121" line in
your log?

Please provide a dmesg log showing your problem with CONFIG_USB_DEBUG
enabled.

Alan Stern

2007-08-09 20:27:10

by Tino Keitel

Subject: Re: 2.6.23-rc1: USB hard disk broken

On Thu, Aug 09, 2007 at 16:00:13 -0400, Alan Stern wrote:

[...]

> What makes you think the problem you see is the same as the one
> described by Tino? Do you get the "scatterlist error 0/-121" line in
> your log?
>
> Please provide a dmesg log showing your problem with CONFIG_USB_DEBUG
> enabled.

I'll try to reproduce the problem with the commit reverted, but not
before Sunday evening.

Regards,
Tino

2007-08-09 22:21:13

by Dan Zwell

Subject: Re: 2.6.23-rc1: USB hard disk broken

Alan Stern wrote:
> On Thu, 9 Aug 2007, Dan Zwell wrote:
>
>> Oliver Neukum wrote:
>>> On Sunday, 05 August 2007, Tino Keitel wrote:
>>>> On Thu, Jul 26, 2007 at 10:06:40 +0200, Oliver Neukum wrote:
>>>>> On Wednesday, 25 July 2007, Tino Keitel wrote:
>>>>>> On Wed, Jul 25, 2007 at 10:24:36 +0200, Oliver Neukum wrote:
>>>>>>> On Wednesday, 25 July 2007, Tino Keitel wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I just tried 2.6.23-rc1 and shortly after the boot my external USB hard
>>>>>>>> disk went mad.
>>>>>> [...]
>>>>>>
>>>>>>> Please recompile with CONFIG_USB_DEBUG set.
>>>>>>>
>>>> I tried again -rc1 without USB_DEBUG, and was able to reproduce the
>>>> bug 2 times. At the second time, the kernel log shows this:
>>>>
>>>> 2007-08-05_10:30:27.75572 kern.err: ehci_hcd 0000:00:1d.7: dev 6 ep1in scatterlist error 0/-121
>>>> 2007-08-05_10:30:27.86576 kern.info: usb 1-6: reset high speed USB device using ehci_hcd and address 5
>>>> 2007-08-05_10:30:55.95293 kern.info: usb 1-6: USB disconnect, address 5
>>>> 2007-08-05_10:30:55.95300 kern.err: ehci_hcd 0000:00:1d.7: dev 6 ep1in scatterlist error -108/-108
>>> David, does this error say anything to you?
>>>
>>> Regards
>>> Oliver
>> Hi,
>>
>> I just completed a git bisect, and commit
>> 8dfe4b14869fd185ca25ee88b02ada58a3005eaf was the commit that introduced
>> this problem. This is "usb-storage: implement autosuspend". I don't
>> think there have been many changes in drivers/usb since I last verified
>> this problem, so I'm pretty sure this is still happening in the latest
>> kernel.
>>
> What makes you think the problem you see is the same as the one
> described by Tino? Do you get the "scatterlist error 0/-121" line in
> your log?
>
> Please provide a dmesg log showing your problem with CONFIG_USB_DEBUG
> enabled.
>
> Alan Stern
>
>

You're right, it probably isn't the same error, unfortunately. I jumped
to conclusions when I read Tino's description of the problem, "my
external USB hard disk went mad", which is a very good description of
what is happening to me. When any attempt is made to access the USB
disk, it (logically) disconnects and reconnects. When using hald's
automounting facilities, this leads to (endlessly repeated) madness.

I have attached my dmesg output (and a sanitized version that should
contain all the important information, but is easier to read). It looks
like the drive is auto suspended, but the resume process fails. Then the
process is repeated.

[ 126.512815] usb 1-1: usb auto-resume
[ 126.543447] uhci_hcd 0000:00:1f.2: port 1 portsc 00a5,01
[ 126.559426] usb 1-1: finish resume
[ 126.561435] usb 1-1: gone after usb resume? status -19
[ 126.561445] usb 1-1: can't resume, status -19
[ 126.561451] hub 1-0:1.0: logical disconnect on port 1
[ 126.562486] sd 5:0:0:0: [sdb] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK,SUGGEST_OK

Relevant info:
-obviously, I'm using uhci
-the drive is SATA, connected to USB with a SATA/IDE to USB adapter
-this problem does not occur with a USB flash drive
-reverting the commit that introduced auto-suspend prevents this error.

Thanks,
Dan


Attachments:
dmesg.bz2 (12.08 kB)
dmesg_sanitized.bz2 (7.59 kB)

2007-08-10 14:18:53

by Alan Stern

Subject: Re: 2.6.23-rc1: USB hard disk broken

On Thu, 9 Aug 2007, Dan Zwell wrote:

> > What makes you think the problem you see is the same as the one
> > described by Tino? Do you get the "scatterlist error 0/-121" line in
> > your log?
> >
> > Please provide a dmesg log showing your problem with CONFIG_USB_DEBUG
> > enabled.
> >
> > Alan Stern
> >
> >
>
> You're right, it probably isn't the same error, unfortunately. I jumped
> to conclusions when I read Tino's description of the problem, "my
> external USB hard disk went mad", which is a very good description of
> what is happening to me. When any attempt is made to access the USB
> disk, it (logically) disconnects and reconnects. When using hald's
> automounting facilities, this leads to (endlessly repeated) madness.
>
> I have attached my dmesg output (and a sanitized version that should
> contain all the important information, but is easier to read). It looks
> like the drive is auto suspended, but the resume process fails. Then the
> process is repeated.
>
> [ 126.512815] usb 1-1: usb auto-resume
> [ 126.543447] uhci_hcd 0000:00:1f.2: port 1 portsc 00a5,01
> [ 126.559426] usb 1-1: finish resume
> [ 126.561435] usb 1-1: gone after usb resume? status -19
> [ 126.561445] usb 1-1: can't resume, status -19
> [ 126.561451] hub 1-0:1.0: logical disconnect on port 1
> [ 126.562486] sd 5:0:0:0: [sdb] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK,SUGGEST_OK

This suggests a bug in the device's firmware, probably it sends a
1-byte Device-Status reply instead of a 2-byte reply as required by the
USB spec. You could find out for certain by using usbmon.

But if that is indeed the problem, the patch below should help. I've
seen it before; perhaps we should adopt this workaround permanently.

> Relevant info:
> -obviously, I'm using uhci
> -the drive is SATA, connected to USB with a SATA/IDE to USB adapter
> -this problem does not occur with a USB flash drive
> -reverting the commit that introduced auto-suspend prevents this error.

If necessary you could disable autosuspend for your drive. But first
test this patch.

Alan Stern



Index: 2.6.23-rc1/drivers/usb/core/hub.c
===================================================================
--- 2.6.23-rc1.orig/drivers/usb/core/hub.c
+++ 2.6.23-rc1/drivers/usb/core/hub.c
@@ -1644,9 +1644,10 @@ static int finish_port_resume(struct usb
 	 * and device drivers will know about any resume quirks.
 	 */
 	if (status == 0) {
+		devstatus = 0;
 		status = usb_get_status(udev, USB_RECIP_DEVICE, 0, &devstatus);
 		if (status >= 0)
-			status = (status == 2 ? 0 : -ENODEV);
+			status = (status > 0 ? 0 : -ENODEV);
 	}

 	if (status) {
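
The effect of the one-line change can be seen in isolation. The sketch below is not the kernel code: `old_check` and `new_check` are hypothetical stand-ins for the expression in finish_port_resume(). They assume that the value passed in is what usb_get_status() returned, namely the number of bytes received or a negative errno. `ENODEV` here comes from the userspace `errno.h`.

```c
#include <errno.h>

/* Pre-patch rule: only a full 2-byte Device-Status reply counts as
 * success; any other length is treated as "device gone". */
static int old_check(int status)
{
	if (status >= 0)
		status = (status == 2 ? 0 : -ENODEV);
	return status;
}

/* Post-patch rule: any non-empty reply counts as success. An empty
 * reply still maps to -ENODEV, and negative errors pass through. */
static int new_check(int status)
{
	if (status >= 0)
		status = (status > 0 ? 0 : -ENODEV);
	return status;
}
```

With a device that answers with a single byte, `old_check(1)` yields -ENODEV (the "status -19" in the log above), while `new_check(1)` yields 0. Both return 0 for a spec-compliant 2-byte reply.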

2007-08-10 20:31:05

by Dan Zwell

Subject: Re: 2.6.23-rc1: USB hard disk broken

Alan Stern wrote:
>> [ 126.512815] usb 1-1: usb auto-resume
>> [ 126.543447] uhci_hcd 0000:00:1f.2: port 1 portsc 00a5,01
>> [ 126.559426] usb 1-1: finish resume
>> [ 126.561435] usb 1-1: gone after usb resume? status -19
>> [ 126.561445] usb 1-1: can't resume, status -19
>> [ 126.561451] hub 1-0:1.0: logical disconnect on port 1
>> [ 126.562486] sd 5:0:0:0: [sdb] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK,SUGGEST_OK
>
> This suggests a bug in the device's firmware, probably it sends a
> 1-byte Device-Status reply instead of a 2-byte reply as required by the
> USB spec. You could find out for certain by using usbmon.
>
> But if that is indeed the problem, the patch below should help. I've
> seen it before; perhaps we should adopt this workaround permanently.
>
>> Relevant info:
>> -obviously, I'm using uhci
>> -the drive is SATA, connected to USB with a SATA/IDE to USB adapter
>> -this problem does not occur with a USB flash drive
>> -reverting the commit that introduced auto-suspend prevents this error.
>
> If necessary you could disable autosuspend for your drive. But first
> test this patch.
>
> Alan Stern
>
>
>
> Index: 2.6.23-rc1/drivers/usb/core/hub.c
> ===================================================================
> --- 2.6.23-rc1.orig/drivers/usb/core/hub.c
> +++ 2.6.23-rc1/drivers/usb/core/hub.c
> @@ -1644,9 +1644,10 @@ static int finish_port_resume(struct usb
> * and device drivers will know about any resume quirks.
> */
> if (status == 0) {
> + devstatus = 0;
> status = usb_get_status(udev, USB_RECIP_DEVICE, 0, &devstatus);
> if (status >= 0)
> - status = (status == 2 ? 0 : -ENODEV);
> + status = (status > 0 ? 0 : -ENODEV);
> }
>
> if (status) {
>
>

Alan,

Yes, that patch worked, and dmesg now shows the device auto-suspending
and resuming every few seconds. Thanks a lot. I hope you do merge this
patch or a workaround like it.

Dan

2007-08-10 20:43:34

by Alan Stern

Subject: Re: 2.6.23-rc1: USB hard disk broken

On Fri, 10 Aug 2007, Dan Zwell wrote:

> Alan,
>
> Yes, that patch worked, and dmesg now shows the device auto-suspending
> and resuming every few seconds. Thanks a lot. I hope you do merge this
> patch or a workaround like it.

I will submit it; we'll see whether anyone objects.

By the way, you can change the constant autosuspend and autoresume
behavior easily enough. All you have to do is:

echo N >/sys/bus/usb/devices/.../power/autosuspend

where "..." is the path for the disk's USB device and N is the number
of seconds the disk should be idle before it gets autosuspended. If
you use -1 for N then the disk will never autosuspend.
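
The same write can be done from a program. The following is a minimal sketch under stated assumptions: `set_autosuspend` is a hypothetical helper, the caller must supply the full path to the device's power/autosuspend attribute (the device part of the path depends on the system and is not filled in here), and writing to sysfs normally requires root.

```c
#include <stdio.h>

/* Hypothetical helper: write an idle delay in seconds (or -1 to
 * disable autosuspend) to a power/autosuspend attribute file.
 * Returns 0 on success, -1 if the file cannot be opened or written. */
static int set_autosuspend(const char *attr_path, int seconds)
{
	FILE *f = fopen(attr_path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", seconds) > 0;
	/* Buffered data is flushed at close, so check that as well. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Passing -1 as `seconds` corresponds to the "never autosuspend" setting described above.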

Alan Stern

2007-09-12 22:21:59

by Mark Lord

Subject: Re: 2.6.23-rc1: USB hard disk broken (REGRESSION)

Dan Zwell wrote:
> Alan Stern wrote:
>>> [ 126.512815] usb 1-1: usb auto-resume
>>> [ 126.543447] uhci_hcd 0000:00:1f.2: port 1 portsc 00a5,01
>>> [ 126.559426] usb 1-1: finish resume
>>> [ 126.561435] usb 1-1: gone after usb resume? status -19
>>> [ 126.561445] usb 1-1: can't resume, status -19
>>> [ 126.561451] hub 1-0:1.0: logical disconnect on port 1
>>> [ 126.562486] sd 5:0:0:0: [sdb] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK,SUGGEST_OK
>>
>> This suggests a bug in the device's firmware, probably it sends a
>> 1-byte Device-Status reply instead of a 2-byte reply as required by
>> the USB spec. You could find out for certain by using usbmon.
>>
>> But if that is indeed the problem, the patch below should help. I've
>> seen it before; perhaps we should adopt this workaround permanently.
>>
>>> Relevant info:
>>> -obviously, I'm using uhci
>>> -the drive is SATA, connected to USB with a SATA/IDE to USB adapter
>>> -this problem does not occur with a USB flash drive
>>> -reverting the commit that introduced auto-suspend prevents this error.
>>
>> If necessary you could disable autosuspend for your drive. But first
>> test this patch.
>>
>> Alan Stern
>>
>>
>>
>> Index: 2.6.23-rc1/drivers/usb/core/hub.c
>> ===================================================================
>> --- 2.6.23-rc1.orig/drivers/usb/core/hub.c
>> +++ 2.6.23-rc1/drivers/usb/core/hub.c
>> @@ -1644,9 +1644,10 @@ static int finish_port_resume(struct usb
>> * and device drivers will know about any resume quirks.
>> */
>> if (status == 0) {
>> + devstatus = 0;
>> status = usb_get_status(udev, USB_RECIP_DEVICE, 0, &devstatus);
>> if (status >= 0)
>> - status = (status == 2 ? 0 : -ENODEV);
>> + status = (status > 0 ? 0 : -ENODEV);
>> }
>>
>> if (status) {
>>
>>
>
> Alan,
>
> Yes, that patch worked, and dmesg now shows the device auto-suspending
> and resuming every few seconds. Thanks a lot. I hope you do merge this
> patch or a workaround like it.
>
> Dan

The same bug kills my Sandisk Cruzer Micro USB pen drives.
I plug them in, they work briefly, then the light goes out (abnormal),
and 30-second timeout/reset is needed for each subsequent access. Ugh.

They work fine in 2.6.22. I'll try the above patch here now and see if it fixes
this regression.

2007-09-12 22:40:32

by Mark Lord

Subject: Re: 2.6.23-rc1: USB hard disk broken (REGRESSION)

Mark Lord wrote:
> Dan Zwell wrote:
>> Alan Stern wrote:
>>>> [ 126.512815] usb 1-1: usb auto-resume
>>>> [ 126.543447] uhci_hcd 0000:00:1f.2: port 1 portsc 00a5,01
>>>> [ 126.559426] usb 1-1: finish resume
>>>> [ 126.561435] usb 1-1: gone after usb resume? status -19
>>>> [ 126.561445] usb 1-1: can't resume, status -19
>>>> [ 126.561451] hub 1-0:1.0: logical disconnect on port 1
>>>> [ 126.562486] sd 5:0:0:0: [sdb] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK,SUGGEST_OK
>>>
>>> This suggests a bug in the device's firmware, probably it sends a
>>> 1-byte Device-Status reply instead of a 2-byte reply as required by
>>> the USB spec. You could find out for certain by using usbmon.
>>>
>>> But if that is indeed the problem, the patch below should help. I've
>>> seen it before; perhaps we should adopt this workaround permanently.
>>>
>>>> Relevant info:
>>>> -obviously, I'm using uhci
>>>> -the drive is SATA, connected to USB with a SATA/IDE to USB adapter
>>>> -this problem does not occur with a USB flash drive
>>>> -reverting the commit that introduced auto-suspend prevents this error.
>>>
>>> If necessary you could disable autosuspend for your drive. But first
>>> test this patch.
>>>
>>> Alan Stern
>>>
>>>
>>>
>>> Index: 2.6.23-rc1/drivers/usb/core/hub.c
>>> ===================================================================
>>> --- 2.6.23-rc1.orig/drivers/usb/core/hub.c
>>> +++ 2.6.23-rc1/drivers/usb/core/hub.c
>>> @@ -1644,9 +1644,10 @@ static int finish_port_resume(struct usb
>>> * and device drivers will know about any resume quirks.
>>> */
>>> if (status == 0) {
>>> + devstatus = 0;
>>> status = usb_get_status(udev, USB_RECIP_DEVICE, 0, &devstatus);
>>> if (status >= 0)
>>> - status = (status == 2 ? 0 : -ENODEV);
>>> + status = (status > 0 ? 0 : -ENODEV);
>>> }
>>>
>>> if (status) {
>>>
>>>
>>
>> Alan,
>>
>> Yes, that patch worked, and dmesg now shows the device auto-suspending
>> and resuming every few seconds. Thanks a lot. I hope you do merge this
>> patch or a workaround like it.
>>
>> Dan
>
> The same bug kills my Sandisk Cruzer Micro USB pen drives.
> I plug them in, they work briefly, then the light goes out (abnormal),
> and 30-second timeout/reset is needed for each subsequent access. Ugh.
>
> They work fine in 2.6.22. I'll try the above patch here now and see if it fixes
> this regression.

Nope. Patch is already in -rc6 I see, so still NFG.
We can continue blacklisting the multitudes of b0rked devices one by one,
or we can revert this change or default it to "off" for usb-storage (at least).

This really kills a lot of everyday devices. Here's my Sandisk Cruzer(s),
after forcing autosuspend=0:

Bus 005 Device 014: ID 0781:5151 SanDisk Corp. Cruzer Micro 256/512MB Flash Drive
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0        64
  idVendor           0x0781 SanDisk Corp.
  idProduct          0x5151 Cruzer Micro 256/512MB Flash Drive
  bcdDevice            0.10
  iManufacturer           1 SanDisk Corporation
  iProduct                2 Cruzer Micro
  iSerial                 3 20060775000CF73334D3
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           32
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              200mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         8 Mass Storage
      bInterfaceSubClass      6 SCSI
      bInterfaceProtocol     80 Bulk (Zip)
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               1
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass            0 (Defined at Interface level)
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0000
  (Bus Powered)