2005-02-22 18:45:18

by Parag Warudkar

[permalink] [raw]
Subject: Re: [linux-usb-devel] 2.6: USB Storage hangs machine on bootup for ~2 minutes

On Monday 07 February 2005 11:18 am, Alan Stern wrote:
> > I am already running 2.6.11-rc3 and the problem has not gone away. Are
> > there any relevant fixes in -bk?
>
> No.
>
> > Attached is the bootup log after enabling CONFIG_USB_STORAGE_DEBUG.
>
> You said that the system hangs during bootup. ?Where in the log does that
> hang occur? ?The log itself looks perfectly normal. ?The Maxtor drive is
> scanned, the partitions detected, and then apparently one or two
> partitions are mounted. ?There's no indication of any problem.

I have tracked down the reason for this hang - it seems that kudzu gets stuck
in D state on usb_device_read - Below SysRQ+T from 2.6.11-rc4 - always
reproducible.

kudzu D 00000000ffffffff 0 4424 4472
(NOTLB)
ffff81002bebfd98 0000000000000086 ffff81002c538150 ffff81002f21d00e
000000078847ce40 ffff81002b5977c0 000000000000fd38 ffffffff803defc0
ffff81002bebfd88 ffffffff80219b32
Call Trace:
<ffffffff80219b32>{_atomic_dec_and_lock+290} <ffffffff80383835>{__down+421}
<ffffffff80133e30>{default_wake_function+0}
<ffffffff803868ae>{__down_failed+53}
<ffffffff802db9f1>{.text.lock.usb+5}
<ffffffff802edb35>{usb_device_read+229}
<ffffffff801998d1>{vfs_read+225} <ffffffff80199bd0>{sys_read+80}
<ffffffff8010ed1e>{system_call+126}

Thereafter, if I try to mount the USB drive, even mount gets stuck.

Parag


2005-02-22 20:41:32

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-usb-devel] 2.6: USB Storage hangs machine on bootup for ~2 minutes

On Tue, 22 Feb 2005, Parag Warudkar wrote:

> > You said that the system hangs during bootup. ?Where in the log does that
> > hang occur? ?The log itself looks perfectly normal. ?The Maxtor drive is
> > scanned, the partitions detected, and then apparently one or two
> > partitions are mounted. ?There's no indication of any problem.
>
> I have tracked down the reason for this hang - it seems that kudzu gets stuck
> in D state on usb_device_read - Below SysRQ+T from 2.6.11-rc4 - always
> reproducible.
>
> kudzu D 00000000ffffffff 0 4424 4472
> (NOTLB)
> ffff81002bebfd98 0000000000000086 ffff81002c538150 ffff81002f21d00e
> 000000078847ce40 ffff81002b5977c0 000000000000fd38 ffffffff803defc0
> ffff81002bebfd88 ffffffff80219b32
> Call Trace:
> <ffffffff80219b32>{_atomic_dec_and_lock+290} <ffffffff80383835>{__down+421}
> <ffffffff80133e30>{default_wake_function+0}
> <ffffffff803868ae>{__down_failed+53}
> <ffffffff802db9f1>{.text.lock.usb+5}
> <ffffffff802edb35>{usb_device_read+229}
> <ffffffff801998d1>{vfs_read+225} <ffffffff80199bd0>{sys_read+80}
> <ffffffff8010ed1e>{system_call+126}
>
> Thereafter, if I try to mount the USB drive, even mount gets stuck.

usb_device_read acquires a couple of locks, one for the USB bus list and
one for the root hub of the bus it's looking at. I don't know which one
occurs at offset 229 on your system -- maybe you can tell. Oddly enough,
neither of those locks is for a USB device like the Maxtor drive. So it's
not at all clear why plugging in the drive should mess up kudzu. Or why
the blockage should clear up after a couple of minutes.

Perhaps we can find out by looking at other entries in the stack trace.
Of particular interest are the khubd, usb-storage, and scsi_eh processes.

Alan Stern

2005-02-22 22:51:07

by Parag Warudkar

[permalink] [raw]
Subject: Re: [linux-usb-devel] 2.6: USB Storage hangs machine on bootup for ~2 minutes

On Tuesday 22 February 2005 03:41 pm, Alan Stern wrote:
> usb_device_read acquires a couple of locks, one for the USB bus list and
> one for the root hub of the bus it's looking at. ?I don't know which one
> occurs at offset 229 on your system -- maybe you can tell. ?Oddly enough,
> neither of those locks is for a USB device like the Maxtor drive. ?So it's
> not at all clear why plugging in the drive should mess up kudzu. ?Or why
> the blockage should clear up after a couple of minutes.
>
> Perhaps we can find out by looking at other entries in the stack trace. ?
> Of particular interest are the khubd, usb-storage, and scsi_eh processes.

Alan,
See below for stack traces and also note that the stack traces are after I
modified usb_device_read to do down_interruptible instead of down. (kudzu
gets stuck regardless though.) Let me know if you want me to revert the
down_interruptible change and repost the stack trace.

I wrongly related this to the 2 minute hang - this one is forever if I let
kudzu start during boot. If I run kudzu after boot is complete, it gets stuck
and everything else on that drive (mount, unmount ..) also gets stuck. Sorry
about the confusion.

Attached is the disassembly of usb_device_read from my machine.

Parag

SysRQ + T for relevant processes
======================
hald D 00000020e12be31a 0 2558 1 3272 2545
(NOTLB)
ffff81002c76fb48 0000000000000082 ffff81002c76fb28 ffffffff88515044
000000862c76fb08 ffff81002eb10800 0000000000000249 ffff810001c56030
ffff81002c76fc08 0000000000000002
Call Trace:
<ffffffff88515044>{:scsi_mod:scsi_request_fn+2356}
<ffffffff80385f0f>{io_schedule+15} <ffffffff801698d6>{sync_page+70}
<ffffffff80386295>{__wait_on_bit_lock+69}
<ffffffff80169890>{sync_page+0}
<ffffffff8016a1b7>{__lock_page+167}
<ffffffff8015c860>{wake_bit_function+0}
<ffffffff8015c860>{wake_bit_function+0}
<ffffffff8016aed4>{do_generic_mapping_read+660}
<ffffffff8016b1f0>{file_read_actor+0}
<ffffffff8016d1d4>{__generic_file_aio_read+420}
<ffffffff8016d39b>{generic_file_read+187}
<ffffffff8015c820>{autoremove_wake_function+0}
<ffffffff80186150>{do_brk+720} <ffffffff801998d1>{vfs_read+225}
<ffffffff80199bd0>{sys_read+80} <ffffffff8010ed1e>{system_call+126}

scsi_eh_2 D 00000000ffffffff 0 3581 1 3582 3277
(L-TLB)
ffff81002bdc1d88 0000000000000046 00000000000001e3 ffff81002bed8800
ffff81002bdc1d48 ffff81002c25a800 0000000000000812 ffffffff803df080
ffff81002bdc1ed0 ffff81002c25a800
Call Trace:
<ffffffff803846d5>{wait_for_completion+437}
<ffffffff80133e30>{default_wake_function+0}
<ffffffff80133e30>{default_wake_function+0}
<ffffffff88535be3>{:usb_storage:usb_stor_stop_transport+35}
<ffffffff88534250>{:usb_storage:command_abort+256}
<ffffffff88511f9c>{:scsi_mod:scsi_error_handler+2172}
<ffffffff8010f7ef>{child_rip+8}
<ffffffff88511720>{:scsi_mod:scsi_error_handler+0}
<ffffffff8010f7e7>{child_rip+0}
usb-storage D 00000000ffffffff 0 3582 1 3627 3581
(L-TLB)
ffff81002b8e1c08 0000000000000046 ffff81002b9e1000 0000000000000010
000000762b8e1c98 ffff81002bed8800 00000000000003dd ffff81002eb10800
00000000c0040280 000000000000001f
Call Trace:
<ffffffff803846d5>{wait_for_completion+437}
<ffffffff803843e1>{thread_return+253}
<ffffffff80133e30>{default_wake_function+0}
<ffffffff80133e30>{default_wake_function+0}
<ffffffff88535166>{:usb_storage:usb_stor_msg_common+550}
<ffffffff80120426>{dma_unmap_sg+134}
<ffffffff8853570f>{:usb_storage:usb_stor_bulk_transfer_buf+143}
<ffffffff88535f8b>{:usb_storage:usb_stor_Bulk_transport+203}
<ffffffff885358eb>{:usb_storage:usb_stor_invoke_transport+59}
<ffffffff88534dfb>{:usb_storage:usb_stor_transparent_scsi_command+27}
<ffffffff88536974>{:usb_storage:usb_stor_control_thread+756}
<ffffffff801337c3>{finish_task_switch+195}
<ffffffff8010f7ef>{child_rip+8}
<ffffffff88536680>{:usb_storage:usb_stor_control_thread+0}
<ffffffff8010f7e7>{child_rip+0}

scsi_eh_3 S 00000000ffffffff 0 3627 1 3634 3582
(L-TLB)
ffff81002bd47d68 0000000000000046 ffff81002bd47d28 ffffffff80219b32
000000742bc387c0 ffff81002bc387c0 0000000000000226 ffff81002b9fc030
ffff81002bd47d48 ffffffff80147ab1
Call Trace:
<ffffffff80219b32>{_atomic_dec_and_lock+290} <ffffffff80147ab1>{free_uid+33}
<ffffffff80383bd6>{__down_interruptible+486}
<ffffffff80133e30>{default_wake_function+0}
<ffffffff80386928>{__down_failed_interruptible+53}
<ffffffff88512a75>{:scsi_mod:.text.lock.scsi_error+45}
<ffffffff8010f7ef>{child_rip+8}
<ffffffff88511720>{:scsi_mod:scsi_error_handler+0}
<ffffffff8010f7e7>{child_rip+0}
usb-storage S 00000020f04e3d81 0 3634 1 3627
(L-TLB)
ffff81002bc53df8 0000000000000046 ffff81002bc53d80 0000000000001000
000000732b9eba7c ffff81002b9fc030 00000000000005ad ffff81002ebc9800
ffff81002bc53de8 ffffffff8853570f
Call Trace:
<ffffffff8853570f>{:usb_storage:usb_stor_bulk_transfer_buf+143}
<ffffffff80383bd6>{__down_interruptible+486}
<ffffffff80133e30>{default_wake_function+0}
<ffffffff80386928>{__down_failed_interruptible+53}
<ffffffff88537bb8>{:usb_storage:.text.lock.usb+5}
<ffffffff801337c3>{finish_task_switch+195}
<ffffffff8010f7ef>{child_rip+8}
<ffffffff88536680>{:usb_storage:usb_stor_control_thread+0}
<ffffffff8010f7e7>{child_rip+0}

khubd S 0000001ddd381da1 0 125 1 182 9
(L-TLB)
ffff810001ecbe18 0000000000000046 0000000000000246 ffff81002ba2c400
0000007401ecbdd8 ffff810001e9a800 0000000000003023 ffff81002bed8070
ffff810001ecbe18 ffffffff8015c4c9
Call Trace:
<ffffffff8015c4c9>{prepare_to_wait+345} <ffffffff802e0216>{hub_thread+4118}
<ffffffff8016f0cf>{free_pages_bulk+1007}
<ffffffff8015c820>{autoremove_wake_function+0}
<ffffffff8015c820>{autoremove_wake_function+0}
<ffffffff8010f7ef>{child_rip+8}
<ffffffff802df200>{hub_thread+0} <ffffffff8010f7e7>{child_rip+0}



Attachments:
(No filename) (6.09 kB)
usb_dv_rd.dis (5.74 kB)
Download all attachments

2005-02-23 15:51:54

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-usb-devel] 2.6: USB Storage hangs machine on bootup for ~2 minutes

On Tue, 22 Feb 2005, Parag Warudkar wrote:

> Alan,
> See below for stack traces and also note that the stack traces are after I
> modified usb_device_read to do down_interruptible instead of down. (kudzu
> gets stuck regardless though.) Let me know if you want me to revert the
> down_interruptible change and repost the stack trace.
>
> I wrongly related this to the 2 minute hang - this one is forever if I let
> kudzu start during boot. If I run kudzu after boot is complete, it gets stuck
> and everything else on that drive (mount, unmount ..) also gets stuck. Sorry
> about the confusion.

Well, this problem looks like something for which a patch was submitted
about a week ago. Try applying

http://marc.theaimsgroup.com/?l=linux-usb-devel&m=110859441830485&w=2

and see if it helps. This doesn't bring us any closer to understanding
the 2-minute delay, though.

> Attached is the disassembly of usb_device_read from my machine.

I hope it doesn't become necessary to decode that!

Alan Stern

> Parag
>
> SysRQ + T for relevant processes
> ======================
> hald D 00000020e12be31a 0 2558 1 3272 2545
> (NOTLB)
> ffff81002c76fb48 0000000000000082 ffff81002c76fb28 ffffffff88515044
> 000000862c76fb08 ffff81002eb10800 0000000000000249 ffff810001c56030
> ffff81002c76fc08 0000000000000002
> Call Trace:
> <ffffffff88515044>{:scsi_mod:scsi_request_fn+2356}
> <ffffffff80385f0f>{io_schedule+15} <ffffffff801698d6>{sync_page+70}
> <ffffffff80386295>{__wait_on_bit_lock+69}
> <ffffffff80169890>{sync_page+0}
> <ffffffff8016a1b7>{__lock_page+167}
> <ffffffff8015c860>{wake_bit_function+0}
> <ffffffff8015c860>{wake_bit_function+0}
> <ffffffff8016aed4>{do_generic_mapping_read+660}
> <ffffffff8016b1f0>{file_read_actor+0}
> <ffffffff8016d1d4>{__generic_file_aio_read+420}
> <ffffffff8016d39b>{generic_file_read+187}
> <ffffffff8015c820>{autoremove_wake_function+0}
> <ffffffff80186150>{do_brk+720} <ffffffff801998d1>{vfs_read+225}
> <ffffffff80199bd0>{sys_read+80} <ffffffff8010ed1e>{system_call+126}
>
> scsi_eh_2 D 00000000ffffffff 0 3581 1 3582 3277
> (L-TLB)
> ffff81002bdc1d88 0000000000000046 00000000000001e3 ffff81002bed8800
> ffff81002bdc1d48 ffff81002c25a800 0000000000000812 ffffffff803df080
> ffff81002bdc1ed0 ffff81002c25a800
> Call Trace:
> <ffffffff803846d5>{wait_for_completion+437}
> <ffffffff80133e30>{default_wake_function+0}
> <ffffffff80133e30>{default_wake_function+0}
> <ffffffff88535be3>{:usb_storage:usb_stor_stop_transport+35}
> <ffffffff88534250>{:usb_storage:command_abort+256}
> <ffffffff88511f9c>{:scsi_mod:scsi_error_handler+2172}
> <ffffffff8010f7ef>{child_rip+8}
> <ffffffff88511720>{:scsi_mod:scsi_error_handler+0}
> <ffffffff8010f7e7>{child_rip+0}
> usb-storage D 00000000ffffffff 0 3582 1 3627 3581
> (L-TLB)
> ffff81002b8e1c08 0000000000000046 ffff81002b9e1000 0000000000000010
> 000000762b8e1c98 ffff81002bed8800 00000000000003dd ffff81002eb10800
> 00000000c0040280 000000000000001f
> Call Trace:
> <ffffffff803846d5>{wait_for_completion+437}
> <ffffffff803843e1>{thread_return+253}
> <ffffffff80133e30>{default_wake_function+0}
> <ffffffff80133e30>{default_wake_function+0}
> <ffffffff88535166>{:usb_storage:usb_stor_msg_common+550}
> <ffffffff80120426>{dma_unmap_sg+134}
> <ffffffff8853570f>{:usb_storage:usb_stor_bulk_transfer_buf+143}
> <ffffffff88535f8b>{:usb_storage:usb_stor_Bulk_transport+203}
> <ffffffff885358eb>{:usb_storage:usb_stor_invoke_transport+59}
> <ffffffff88534dfb>{:usb_storage:usb_stor_transparent_scsi_command+27}
> <ffffffff88536974>{:usb_storage:usb_stor_control_thread+756}
> <ffffffff801337c3>{finish_task_switch+195}
> <ffffffff8010f7ef>{child_rip+8}
> <ffffffff88536680>{:usb_storage:usb_stor_control_thread+0}
> <ffffffff8010f7e7>{child_rip+0}
>
> scsi_eh_3 S 00000000ffffffff 0 3627 1 3634 3582
> (L-TLB)
> ffff81002bd47d68 0000000000000046 ffff81002bd47d28 ffffffff80219b32
> 000000742bc387c0 ffff81002bc387c0 0000000000000226 ffff81002b9fc030
> ffff81002bd47d48 ffffffff80147ab1
> Call Trace:
> <ffffffff80219b32>{_atomic_dec_and_lock+290} <ffffffff80147ab1>{free_uid+33}
> <ffffffff80383bd6>{__down_interruptible+486}
> <ffffffff80133e30>{default_wake_function+0}
> <ffffffff80386928>{__down_failed_interruptible+53}
> <ffffffff88512a75>{:scsi_mod:.text.lock.scsi_error+45}
> <ffffffff8010f7ef>{child_rip+8}
> <ffffffff88511720>{:scsi_mod:scsi_error_handler+0}
> <ffffffff8010f7e7>{child_rip+0}
> usb-storage S 00000020f04e3d81 0 3634 1 3627
> (L-TLB)
> ffff81002bc53df8 0000000000000046 ffff81002bc53d80 0000000000001000
> 000000732b9eba7c ffff81002b9fc030 00000000000005ad ffff81002ebc9800
> ffff81002bc53de8 ffffffff8853570f
> Call Trace:
> <ffffffff8853570f>{:usb_storage:usb_stor_bulk_transfer_buf+143}
> <ffffffff80383bd6>{__down_interruptible+486}
> <ffffffff80133e30>{default_wake_function+0}
> <ffffffff80386928>{__down_failed_interruptible+53}
> <ffffffff88537bb8>{:usb_storage:.text.lock.usb+5}
> <ffffffff801337c3>{finish_task_switch+195}
> <ffffffff8010f7ef>{child_rip+8}
> <ffffffff88536680>{:usb_storage:usb_stor_control_thread+0}
> <ffffffff8010f7e7>{child_rip+0}
>
> khubd S 0000001ddd381da1 0 125 1 182 9
> (L-TLB)
> ffff810001ecbe18 0000000000000046 0000000000000246 ffff81002ba2c400
> 0000007401ecbdd8 ffff810001e9a800 0000000000003023 ffff81002bed8070
> ffff810001ecbe18 ffffffff8015c4c9
> Call Trace:
> <ffffffff8015c4c9>{prepare_to_wait+345} <ffffffff802e0216>{hub_thread+4118}
> <ffffffff8016f0cf>{free_pages_bulk+1007}
> <ffffffff8015c820>{autoremove_wake_function+0}
> <ffffffff8015c820>{autoremove_wake_function+0}
> <ffffffff8010f7ef>{child_rip+8}
> <ffffffff802df200>{hub_thread+0} <ffffffff8010f7e7>{child_rip+0}