2004-04-16 22:12:35

by Konstantin V. Gavrilenko

[permalink] [raw]
Subject: 2.6.5-mm6 reiserfs and hpt366 problems

Hi guys,


first time writing to kernel mlist and with a bunch of problems :|

initial problem is regarding hpt366 controller that I have, the 2.6.5
kernel randomly crashes when I copy stuff bettween hdd's connected to
hpt366 and hdd's conencted to normal piix

2x 200gb (raid1) hdd conencted to hpt366
1x 180gb hdd connected to piix

it seems that quite a few people around have the similar problem with
2.6.x, since afaic remember I did not have this problem when I was using
2.4.24

Anyway, I applied 2.6.5-mm6 patch. The system does not hang anymore, but,
I get this weird errors filling up syslog:


Apr 16 22:45:06 filo kernel: Debug: sleeping function called from
invalid context at drivers/block/ll_rw_blk.c:1156
Apr 16 22:45:06 filo kernel: in_atomic():0, irqs_disabled():1
Apr 16 22:45:06 filo kernel: Call Trace:
Apr 16 22:45:06 filo kernel: [<c0117449>] __might_sleep+0x99/0xb0
Apr 16 22:45:06 filo kernel: [<c02950bf>] generic_unplug_device+0x1f/0x60
Apr 16 22:45:06 filo kernel: [<c02d9fc9>] unplug_slaves+0x69/0x70
Apr 16 22:45:06 filo kernel: [<c029511f>] blk_backing_dev_unplug+0x1f/0x30
Apr 16 22:45:06 filo kernel: [<c014dc99>] __wait_on_buffer+0xd9/0xe0
Apr 16 22:45:06 filo kernel: [<c0117770>] autoremove_wake_function+0x0/0x50
Apr 16 22:45:06 filo kernel: [<c013998c>] mark_page_accessed+0x2c/0x40
Apr 16 22:45:06 filo kernel: [<c0117770>] autoremove_wake_function+0x0/0x50
Apr 16 22:45:06 filo kernel: [<c01a91da>] flush_commit_list+0x29a/0x3a0
Apr 16 22:45:06 filo kernel: [<c01adf1a>] do_journal_end+0x85a/0xb60
Apr 16 22:45:06 filo kernel: [<c0136b30>] pdflush+0x0/0x30
Apr 16 22:45:06 filo kernel: [<c01acc8d>] journal_end_sync+0x4d/0x90
Apr 16 22:45:06 filo kernel: [<c019964d>] reiserfs_sync_fs+0x4d/0x60
Apr 16 22:45:06 filo kernel: [<c03776b7>] __down_failed_trylock+0x7/0xc
Apr 16 22:45:06 filo kernel: [<c0152b81>] sync_supers+0xb1/0xc0
Apr 16 22:45:06 filo kernel: [<c01361d6>] wb_kupdate+0x56/0x140
Apr 16 22:45:06 filo kernel: [<c0377974>] schedule+0x2a4/0x480
Apr 16 22:45:06 filo kernel: [<c0136a54>] __pdflush+0xa4/0x180
Apr 16 22:45:06 filo kernel: [<c0136b58>] pdflush+0x28/0x30
Apr 16 22:45:06 filo kernel: [<c0136180>] wb_kupdate+0x0/0x140
Apr 16 22:45:06 filo kernel: [<c0136b30>] pdflush+0x0/0x30
Apr 16 22:45:06 filo kernel: [<c012a8aa>] kthread+0xaa/0xb0
Apr 16 22:45:06 filo kernel: [<c012a800>] kthread+0x0/0xb0
Apr 16 22:45:06 filo kernel: [<c01022e5>] kernel_thread_helper+0x5/0x10



furthermore,

Sometimes I get shitloads of these in syslog:

Apr 16 21:57:48 filo kernel: is_tree_node: node level 59305 does not
match to the expected one 1
Apr 16 21:57:48 filo kernel: vs-5150: search_by_key: invalid format
found in block 691533. Fsck?
Apr 16 21:57:48 filo kernel: is_tree_node: node level 59305 does not
match to the expected one 1
Apr 16 21:57:48 filo kernel: vs-5150: search_by_key: invalid format
found in block 691533. Fsck?
Apr 16 21:57:48 filo kernel: is_tree_node: node level 59305 does not
match to the expected one 1
Apr 16 21:57:48 filo kernel: vs-5150: search_by_key: invalid format
found in block 691533. Fsck?
Apr 16 21:57:48 filo kernel: is_tree_node: node level 59305 does not
match to the expected one 1
Apr 16 21:57:48 filo kernel: vs-5150: search_by_key: invalid format
found in block 691533. Fsck?


I have unmounted the partition and run reiserfsck on it, that reported
that everything is fine, and nothing to worry about.

Can someone advise whether I am likely to run into problems later on
with these types of errors? SInce it is my first expirience with reiserfs.



--
Respectfully,
Konstantin V. Gavrilenko

Managing Director
Arhont Ltd - Information Security

web: http://www.arhont.com
http://www.wi-foo.com
e-mail: [email protected]

tel: +44 (0) 870 44 31337
fax: +44 (0) 117 969 0141

PGP: Key ID - 0x4F3608F7
PGP: Server - keyserver.pgp.com


2004-04-16 22:51:38

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.5-mm6 reiserfs and hpt366 problems

Konstantin Gavrilenko <[email protected]> wrote:
>
> Anyway, I applied 2.6.5-mm6 patch. The system does not hang anymore, but,
> I get this weird errors filling up syslog:
>
>
> Apr 16 22:45:06 filo kernel: Debug: sleeping function called from
> invalid context at drivers/block/ll_rw_blk.c:1156
> Apr 16 22:45:06 filo kernel: in_atomic():0, irqs_disabled():1
> Apr 16 22:45:06 filo kernel: Call Trace:
> Apr 16 22:45:06 filo kernel: [<c0117449>] __might_sleep+0x99/0xb0
> Apr 16 22:45:06 filo kernel: [<c02950bf>] generic_unplug_device+0x1f/0x60
> Apr 16 22:45:06 filo kernel: [<c02d9fc9>] unplug_slaves+0x69/0x70
> Apr 16 22:45:06 filo kernel: [<c029511f>] blk_backing_dev_unplug+0x1f/0x30
> Apr 16 22:45:06 filo kernel: [<c014dc99>] __wait_on_buffer+0xd9/0xe0
> Apr 16 22:45:06 filo kernel: [<c0117770>] autoremove_wake_function+0x0/0x50
> Apr 16 22:45:06 filo kernel: [<c013998c>] mark_page_accessed+0x2c/0x40
> Apr 16 22:45:06 filo kernel: [<c0117770>] autoremove_wake_function+0x0/0x50
> Apr 16 22:45:06 filo kernel: [<c01a91da>] flush_commit_list+0x29a/0x3a0
> Apr 16 22:45:06 filo kernel: [<c01adf1a>] do_journal_end+0x85a/0xb60
> Apr 16 22:45:06 filo kernel: [<c0136b30>] pdflush+0x0/0x30
> Apr 16 22:45:06 filo kernel: [<c01acc8d>] journal_end_sync+0x4d/0x90
> Apr 16 22:45:06 filo kernel: [<c019964d>] reiserfs_sync_fs+0x4d/0x60
> Apr 16 22:45:06 filo kernel: [<c03776b7>] __down_failed_trylock+0x7/0xc
> Apr 16 22:45:06 filo kernel: [<c0152b81>] sync_supers+0xb1/0xc0
> Apr 16 22:45:06 filo kernel: [<c01361d6>] wb_kupdate+0x56/0x140
> Apr 16 22:45:06 filo kernel: [<c0377974>] schedule+0x2a4/0x480
> Apr 16 22:45:06 filo kernel: [<c0136a54>] __pdflush+0xa4/0x180
> Apr 16 22:45:06 filo kernel: [<c0136b58>] pdflush+0x28/0x30
> Apr 16 22:45:06 filo kernel: [<c0136180>] wb_kupdate+0x0/0x140
> Apr 16 22:45:06 filo kernel: [<c0136b30>] pdflush+0x0/0x30

Someone else saw this, but they had an EVMS patch applied. Have you
applied any additional patches here?

2004-04-16 23:37:11

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.5-mm6 reiserfs and hpt366 problems

Konstantin Gavrilenko <[email protected]> wrote:
>
> Apr 16 22:45:06 filo kernel: Debug: sleeping function called from
> invalid context at drivers/block/ll_rw_blk.c:1156
> Apr 16 22:45:06 filo kernel: in_atomic():0, irqs_disabled():1
> Apr 16 22:45:06 filo kernel: Call Trace:
> Apr 16 22:45:06 filo kernel: [<c0117449>] __might_sleep+0x99/0xb0
> Apr 16 22:45:06 filo kernel: [<c02950bf>] generic_unplug_device+0x1f/0x60
> Apr 16 22:45:06 filo kernel: [<c02d9fc9>] unplug_slaves+0x69/0x70

Ah, I see it. There are several instances of unplug_slaves(). The one in
raid1.c is incorrectly calling ->unplug_fn() under spin_lock_irq().