2005-09-04 00:15:35

by Bret Towe

Subject: nfs4 client bug

I encountered the following error while using NFSv4. I think I've hit
it twice now, but I'm not sure yet what causes it. This time the only
I/O-related activity was emerge sync running in the background (which
shouldn't have touched NFS at all) and xmms playing some music.

Another problem I had was that iowait showed near 100% while nothing
over NFS was working, yet no errors showed up in dmesg. If I encounter
this again, is there a way to find out where it locked up so I can
report what the problem is?

My config is attached; the box is an Athlon64. If any further
information is needed, let me know.

Unable to handle kernel paging request at 0000000000100108 RIP:
<ffffffff80189538>{generic_drop_inode+56}
PGD 27bec067 PUD 27be9067 PMD 0
Oops: 0002 [1]
CPU 0
Modules linked in: fglrx agpgart snd_seq_midi snd_emu10k1_synth
snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_pcm_oss
snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_emu10k1
snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer
snd_page_alloc snd_util_mem snd_hwdep snd w83627hf i2c_sensor i2c_isa
i2c_core usb_storage r8169 ehci_hcd uhci_hcd dm_mirror dm_snapshot
dm_mod
Pid: 149, comm: kswapd0 Tainted: P M 2.6.13
RIP: 0010:[<ffffffff80189538>] <ffffffff80189538>{generic_drop_inode+56}
RSP: 0018:ffff81002f9d9b78 EFLAGS: 00010246
RAX: 0000000000100100 RBX: ffff810022d4d950 RCX: 0000000000200200
RDX: ffff810022d4d960 RSI: ffff81002ea21000 RDI: ffff810022d4d950
RBP: ffff810022d4d950 R08: 00000000fffffffa R09: ffff810022d4da68
R10: 0000000000000001 R11: ffffffff80189500 R12: 0000000000000000
R13: ffff810022d4d7d0 R14: ffff810022d4d860 R15: ffff81002c3cec00
FS: 00002aaaabc64f80(0000) GS:ffffffff805bc800(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000100108 CR3: 0000000027468000 CR4: 00000000000006e0
Process kswapd0 (pid: 149, threadinfo ffff81002f9d8000, task ffff81002f9d2760)
Stack: ffff81002a65fa00 ffffffff801e2925 000000012f9d9bf8 ffff81002f9d9c28
ffffffffffffffff ffff81002f9d9c28 ffff81002f9d9c10 ffff810022d4d938
0000000000000001 0000000000000000
Call Trace:<ffffffff801e2925>{__nfs_revalidate_inode+261}
<ffffffff80151dcf>{find_get_pages_tag+31}
<ffffffff8015af2a>{pagevec_lookup_tag+26}
<ffffffff801517fe>{wait_on_page_writeback_range+206}
<ffffffff801fc0ca>{nfs_do_return_delegation+42}
<ffffffff801fc1f5>{nfs_inode_return_delegation+197}
<ffffffff801e3910>{nfs4_clear_inode+32}
<ffffffff8018839e>{clear_inode+158}
<ffffffff80188fee>{dispose_list+94}
<ffffffff80189222>{shrink_icache_memory+434}
<ffffffff8015b77b>{shrink_slab+219} <ffffffff8015cb99>{balance_pgdat+617}
<ffffffff8015c932>{balance_pgdat+2} <ffffffff8015ce17>{kswapd+295}
<ffffffff80144730>{autoremove_wake_function+0}
<ffffffff80144730>{autoremove_wake_function+0}
<ffffffff8010f3e6>{child_rip+8} <ffffffff8015ccf0>{kswapd+0}
<ffffffff8010f3de>{child_rip+0}

Code: 48 89 48 08 48 89 01 48 8b 05 8a 4e 30 00 48 89 50 08 48 89
RIP <ffffffff80189538>{generic_drop_inode+56} RSP <ffff81002f9d9b78>
CR2: 0000000000100108
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at "fs/inode.c":1142
invalid operand: 0000 [2]
CPU 0
Modules linked in: fglrx agpgart snd_seq_midi snd_emu10k1_synth
snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_pcm_oss
snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_emu10k1
snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer
snd_page_alloc snd_util_mem snd_hwdep snd w83627hf i2c_sensor i2c_isa
i2c_core usb_storage r8169 ehci_hcd uhci_hcd dm_mirror dm_snapshot
dm_mod
Pid: 9539, comm: rpciod/0 Tainted: P M 2.6.13
RIP: 0010:[<ffffffff8018853e>] <ffffffff8018853e>{iput+30}
RSP: 0018:ffff81002a181e08 EFLAGS: 00010246
RAX: ffffffff80494a00 RBX: ffff810023da8d40 RCX: ffff81002311f490
RDX: ffff810022919680 RSI: ffff81002614da40 RDI: ffff810023da8d40
RBP: ffff8100286af780 R08: ffff8100230f64e0 R09: 0000000000000000
R10: 0000000000000001 R11: ffffffff801f8c90 R12: ffff81002311f480
R13: ffff81002614da00 R14: ffff81002c3cec00 R15: ffffffff805c3fb0
FS: 00002aaaaade6b00(0000) GS:ffffffff805bc800(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002aaaaade0680 CR3: 0000000022d19000 CR4: 00000000000006e0
Process rpciod/0 (pid: 9539, threadinfo ffff81002a180000, task ffff81002a4a2330)
Stack: ffff81002311f480 ffffffff801fb0fd 00000000fffffff3 ffff81002fb0f1c0
ffff8100286af780 ffffffff801efa51 ffff81002fb0f1c0 ffff81002d9be310
ffff81002fb0f2a8 ffff81002d9be320
Call Trace:<ffffffff801fb0fd>{nfs4_put_open_state+109}
<ffffffff801efa51>{nfs4_close_done+225}
<ffffffff803940a5>{__rpc_execute+165}
<ffffffff8014046a>{worker_thread+442}
<ffffffff8012e060>{default_wake_function+0}
<ffffffff8012e060>{default_wake_function+0}
<ffffffff801402b0>{worker_thread+0} <ffffffff8014424d>{kthread+205}
<ffffffff8010f3e6>{child_rip+8}
<ffffffff80144290>{keventd_create_kthread+0}
<ffffffff80144180>{kthread+0} <ffffffff8010f3de>{child_rip+0}


Code: 0f 0b a3 e0 61 3c 80 ff ff ff ff c2 76 04 66 66 66 90 48 85
RIP <ffffffff8018853e>{iput+30} RSP <ffff81002a181e08>


Attachments:
(No filename) (5.11 kB)
ghoststar-config (31.47 kB)
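
A note on reading the first oops above: the faulting address 0x100108 and the RAX/RCX values look like the list-poison constants that 2.6-era kernels write into a `list_head` after `list_del()`. The sketch below only does the arithmetic; the helper names are hypothetical, and the interpretation (a second removal of an inode that was already taken off its list) is one reading of the register dump, not a confirmed diagnosis.

```python
# Hypothetical helpers for reading the oops; not part of any kernel tool.
# The constants are the list-poison values from include/linux/list.h
# in 2.6-era kernels.
LIST_POISON1 = 0x00100100  # stored in entry->next by list_del()
LIST_POISON2 = 0x00200200  # stored in entry->prev by list_del()


def decode_fault_error(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 0x1),  # 0 means the page was not mapped
        "write": bool(code & 0x2),         # 1 means the access was a write
        "user_mode": bool(code & 0x4),     # 0 means the fault came from kernel mode
    }


def poison_offset(addr, pointer_size=8):
    """Return (poison name, list_head field) if addr lies in a poisoned list_head."""
    for name, base in (("LIST_POISON1", LIST_POISON1),
                       ("LIST_POISON2", LIST_POISON2)):
        offset = addr - base
        if offset == 0:
            return name, "next"
        if offset == pointer_size:
            return name, "prev"
    return None


if __name__ == "__main__":
    # "Oops: 0002" and "CR2: 0000000000100108" from the first trace.
    print(decode_fault_error(0x0002))
    print(poison_offset(0x0000000000100108))
```

With the values from the trace this reports a kernel-mode write to an unmapped page at LIST_POISON1 plus the offset of `prev`, which is what a `next->prev = prev` store would touch if `next` already held the poison value.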

2005-09-04 03:05:11

by Bret Towe

Subject: Re: nfs4 client bug

On 9/3/05, Bret Towe <[email protected]> wrote:
> I encountered the following error while using NFSv4. I think I've hit
> it twice now, but I'm not sure yet what causes it. This time the only
> I/O-related activity was emerge sync running in the background (which
> shouldn't have touched NFS at all) and xmms playing some music.
>
> Another problem I had was that iowait showed near 100% while nothing
> over NFS was working, yet no errors showed up in dmesg. If I encounter
> this again, is there a way to find out where it locked up so I can
> report what the problem is?
>
> My config is attached; the box is an Athlon64. If any further
> information is needed, let me know.

After moving some files on the server to a new location and then trying
to add them to the xmms playlist, I found the following in dmesg after
xmms froze. I wonder how many more of these I can find...

nfs_update_inode: inode number mismatch
expected (0:11/0x27c27), got (0:11/0xe4c7d3)
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at "fs/inode.c":1142
invalid operand: 0000 [1]
CPU 0
Modules linked in: fglrx agpgart snd_seq_midi snd_emu10k1_synth
snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_pcm_oss
snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_emu10k1
snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer
snd_page_alloc snd_util_mem snd_hwdep snd w83627hf i2c_sensor i2c_isa
i2c_core usb_storage r8169 ehci_hcd uhci_hcd dm_mirror dm_snapshot
dm_mod
Pid: 11028, comm: gvim Tainted: P M 2.6.13
RIP: 0010:[<ffffffff8018853e>] <ffffffff8018853e>{iput+30}
RSP: 0018:ffff81000e999c98 EFLAGS: 00010246
RAX: ffffffff80494a00 RBX: ffff810012235d80 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff810012235d80 RDI: ffff810012235d80
RBP: ffff810012235d80 R08: ffff81002ccb3d40 R09: ffff81002ccb3cf8
R10: ffff81002ccb3d40 R11: ffffffff80154a40 R12: ffff81000e999d28
R13: ffff81000e999d38 R14: ffff81002cb08cc0 R15: 00007ffffff53b30
FS: 00002aaaafc47b80(0000) GS:ffffffff805bc800(0000) knlGS:00000000563ca620
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000511338 CR3: 0000000011d53000 CR4: 00000000000006e0
Process gvim (pid: 11028, threadinfo ffff81000e998000, task ffff81001a732a60)
Stack: ffff81002ccb3cf8 ffffffff801867c6 ffff81000e999d38 ffff81002ccb3cf8
ffff81000e999e68 ffffffff8017d2ee 68706f7200000007 00000000fffffff5
ffff81002fd24001 0000000000000000
Call Trace:<ffffffff801867c6>{dput+390} <ffffffff8017d2ee>{do_lookup+414}
<ffffffff8017d801>{__link_path_walk+849}
<ffffffff8017e2aa>{link_path_walk+186}
<ffffffff80229f7a>{strncpy_from_user+74}
<ffffffff8017e521>{path_lookup+385}
<ffffffff8017e7ee>{__user_walk+62} <ffffffff80178d89>{vfs_stat+41}
<ffffffff801791df>{sys_newstat+31} <ffffffff8010e8b6>{system_call+126}


Code: 0f 0b a3 e0 61 3c 80 ff ff ff ff c2 76 04 66 66 66 90 48 85
RIP <ffffffff8018853e>{iput+30} RSP <ffff81000e999c98>
<0>general protection fault: 0000 [2]
CPU 0
Modules linked in: fglrx agpgart snd_seq_midi snd_emu10k1_synth
snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_pcm_oss
snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_emu10k1
snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer
snd_page_alloc snd_util_mem snd_hwdep snd w83627hf i2c_sensor i2c_isa
i2c_core usb_storage r8169 ehci_hcd uhci_hcd dm_mirror dm_snapshot
dm_mod
Pid: 11074, comm: klauncher Tainted: P M 2.6.13
RIP: 0010:[<ffffffff801de262>] <ffffffff801de262>{nfs_lookup_revalidate+386}
RSP: 0018:ffff81001a341b18 EFLAGS: 00010282
RAX: ed05fe0eed52fe10 RBX: ffff81002195c638 RCX: 425f676209090945
RDX: 0000000000000001 RSI: ffff81001ba065a0 RDI: ffff81001ba065a0
RBP: ffff81001ba065a0 R08: 0000000000000000 R09: 000000000000000b
R10: ffff81002a4e58d8 R11: 0000000000000246 R12: ffff81001bd2d2d8
R13: ffff810012235270 R14: ffff81001a341e68 R15: 00007ffffff4b400
FS: 00002aaaaea99a00(0000) GS:ffffffff805bc800(0000) knlGS:00000000563ca620
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaab1c4520 CR3: 0000000018294000 CR4: 00000000000006e0
Process klauncher (pid: 11074, threadinfo ffff81001a340000, task ffff81001ab2f440)
Stack: ffff81001a341cd8 00008001000002c7 30ba174331000000 0000000000000000
0000000000000000 0000000000000000 ffff810013cf38d8 ffff81002e2a0a00
0000000000000000 ffff810028b56700
Call Trace:<ffffffff801e2f85>{nfs_revalidate_inode+37}
<ffffffff801de34c>{nfs_lookup_revalidate+620}
<ffffffff80394c90>{rpcauth_lookup_credcache+320}
<ffffffff80394c4b>{rpcauth_lookup_credcache+251}
<ffffffff801de95d>{nfs_open_revalidate+269}
<ffffffff8017d2cb>{do_lookup+379}
<ffffffff8017dd6d>{__link_path_walk+2237}
<ffffffff8017e2aa>{link_path_walk+186}
<ffffffff80229f7a>{strncpy_from_user+74}
<ffffffff8017e521>{path_lookup+385}
<ffffffff8017e7ee>{__user_walk+62} <ffffffff80178df6>{vfs_lstat+38}
<ffffffff8017922f>{sys_newlstat+31} <ffffffff8010e8b6>{system_call+126}


Code: 48 8b b8 50 02 00 00 74 17 41 8b 46 20 a8 40 75 19 a8 14 75
RIP <ffffffff801de262>{nfs_lookup_revalidate+386} RSP <ffff81001a341b18>
<0>general protection fault: 0000 [3]
CPU 0
Modules linked in: fglrx agpgart snd_seq_midi snd_emu10k1_synth
snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_pcm_oss
snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_emu10k1
snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer
snd_page_alloc snd_util_mem snd_hwdep snd w83627hf i2c_sensor i2c_isa
i2c_core usb_storage r8169 ehci_hcd uhci_hcd dm_mirror dm_snapshot
dm_mod
Pid: 11067, comm: konsole Tainted: P M 2.6.13
RIP: 0010:[<ffffffff801de262>] <ffffffff801de262>{nfs_lookup_revalidate+386}
RSP: 0018:ffff81000f4f1b18 EFLAGS: 00010282
RAX: ed05fe0eed52fe10 RBX: ffff81002195c638 RCX: 425f676209090945
RDX: 0000000000000001 RSI: ffff81001ba065a0 RDI: ffff81001ba065a0
RBP: ffff81001ba065a0 R08: 0000000000000000 R09: 000000000000000b
R10: ffff81002a4e58d8 R11: 0000000000000246 R12: ffff81001bd2d2d8
R13: ffff810012235270 R14: ffff81000f4f1e68 R15: 00007fffffc71c60
FS: 00002aaaaef4c0a0(0000) GS:ffffffff805bc800(0000) knlGS:00000000563ca620
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fffffc70c48 CR3: 000000001cab2000 CR4: 00000000000006e0
Process konsole (pid: 11067, threadinfo ffff81000f4f0000, task ffff810018467900)
Stack: ffff81000f4f1cd8 0000000000000001 ffff81002e347240 ffff810018467900
ffff810018467938 0000000000000073 0000023459aab717 ffffffff803a4ee0
ffff81000f4f1bd8 0000000000000082
Call Trace:<ffffffff803a4ee0>{thread_return+0}
<ffffffff801e2f85>{nfs_revalidate_inode+37}
<ffffffff801de34c>{nfs_lookup_revalidate+620}
<ffffffff803a580e>{schedule_timeout+30}
<ffffffff80394c4b>{rpcauth_lookup_credcache+251}
<ffffffff801de95d>{nfs_open_revalidate+269}
<ffffffff8017d2cb>{do_lookup+379}
<ffffffff8017dd6d>{__link_path_walk+2237}
<ffffffff8017e2aa>{link_path_walk+186}
<ffffffff80229f7a>{strncpy_from_user+74}
<ffffffff8017e521>{path_lookup+385} <ffffffff8017e7ee>{__user_walk+62}
<ffffffff80178df6>{vfs_lstat+38} <ffffffff8017922f>{sys_newlstat+31}
<ffffffff8010e8b6>{system_call+126}

Code: 48 8b b8 50 02 00 00 74 17 41 8b 46 20 a8 40 75 19 a8 14 75
RIP <ffffffff801de262>{nfs_lookup_revalidate+386} RSP <ffff81000f4f1b18>

2005-09-04 10:38:51

by Francois Romieu

Subject: Re: nfs4 client bug

Bret Towe <[email protected]> :
[...]
> After moving some files on the server to a new location and then trying
> to add them to the xmms playlist, I found the following in dmesg after
> xmms froze. I wonder how many more of these I can find...

The system includes some binary only stuff. Please contact your vendor
or provide the traces for a configuration wherein the relevant module
was not loaded after boot. It may make sense to get in touch with
[email protected] then.

--
Ueimor
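
For context on the "binary only" remark: the traces carry "Tainted: P M". The sketch below maps the flag letters to their meanings as I understand the 2.6-era documentation; the table and the `decode_taint` helper are assumptions for illustration, and the exact set of letters printed by 2.6.13 should be checked against that kernel's source.

```python
# Flag letters as described in 2.6-era oops documentation. Treat this
# table as an assumption for 2.6.13 specifically.
TAINT_FLAGS = {
    "P": "a proprietary (non-GPL) module has been loaded",
    "G": "only GPL-compatible modules have been loaded",
    "F": "a module was force-loaded",
    "S": "SMP kernel on hardware not certified as SMP-safe",
    "R": "a module was force-unloaded",
    "M": "a machine check exception has been reported",
    "B": "a bad page was encountered",
}


def decode_taint(line):
    """Return [(letter, meaning), ...] for the flags after 'Tainted:' in an oops line."""
    marker = "Tainted:"
    if marker not in line:
        return []  # e.g. "Not tainted" kernels print no flag letters
    flags = []
    for token in line.split(marker, 1)[1].split():
        # Stop at the first token that is not made of flag letters,
        # such as the kernel version that follows the flags.
        if not all(ch in TAINT_FLAGS for ch in token):
            break
        flags.extend(token)
    return [(flag, TAINT_FLAGS[flag]) for flag in flags]


if __name__ == "__main__":
    for flag, meaning in decode_taint("Pid: 149, comm: kswapd0 Tainted: P M 2.6.13"):
        print(flag, "=", meaning)
```

If that reading is right, the P comes from the binary module, while the M is unrelated to it and would be worth mentioning separately in a report.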

2005-09-04 19:44:09

by Bret Towe

Subject: Re: nfs4 client bug

On 9/4/05, Francois Romieu <[email protected]> wrote:
> Bret Towe <[email protected]> :
> [...]
> > After moving some files on the server to a new location and then trying
> > to add them to the xmms playlist, I found the following in dmesg after
> > xmms froze. I wonder how many more of these I can find...
>
> The system includes some binary only stuff. Please contact your vendor
> or provide the traces for a configuration wherein the relevant module
> was not loaded after boot. It may make sense to get in touch with
> [email protected] then.

The 'binary only stuff' is the ati-drivers kernel module, and it crashes
with or without it. I'll provide an 'untainted' trace as soon as I can
reproduce the bug.

> --
> Ueimor
>

2005-09-04 20:51:14

by Bret Towe

Subject: Re: nfs4 client bug

On 9/4/05, Bret Towe <[email protected]> wrote:
> On 9/4/05, Francois Romieu <[email protected]> wrote:
> > Bret Towe <[email protected]> :
> > [...]
> > > After moving some files on the server to a new location and then trying
> > > to add them to the xmms playlist, I found the following in dmesg after
> > > xmms froze. I wonder how many more of these I can find...
> >
> > The system includes some binary only stuff. Please contact your vendor
> > or provide the traces for a configuration wherein the relevant module
> > was not loaded after boot. It may make sense to get in touch with
> > [email protected] then.
>
> The 'binary only stuff' is the ati-drivers kernel module, and it crashes
> with or without it. I'll provide an 'untainted' trace as soon as I can
> reproduce the bug.

OK, without the ati-drivers kernel module loaded, the computer basically
just hard locks when some bug hits. I don't know if it's the same one.

To reproduce it, one needs laptop-mode enabled and xmms playing music
(FLAC in my case) that resides on NFS. Then put the computer under some
local load for a little while. My guess is that at some point it needs
to free some memory, and then it hits this hard lock, or the errors I
mailed previously when ati-drivers is loaded.

2005-09-04 21:52:22

by J. Bruce Fields

Subject: Re: nfs4 client bug

On Sun, Sep 04, 2005 at 01:51:08PM -0700, Bret Towe wrote:
> On 9/4/05, Bret Towe <[email protected]> wrote:
> > On 9/4/05, Francois Romieu <[email protected]> wrote:
> > > Bret Towe <[email protected]> :
> > > [...]
> > > > After moving some files on the server to a new location and then trying
> > > > to add them to the xmms playlist, I found the following in dmesg after
> > > > xmms froze. I wonder how many more of these I can find...
> > >
> > > The system includes some binary only stuff. Please contact your vendor
> > > or provide the traces for a configuration wherein the relevant module
> > > was not loaded after boot. It may make sense to get in touch with
> > > [email protected] then.
> >
> > The 'binary only stuff' is the ati-drivers kernel module, and it crashes
> > with or without it. I'll provide an 'untainted' trace as soon as I can
> > reproduce the bug.
>
> OK, without the ati-drivers kernel module loaded, the computer basically
> just hard locks when some bug hits. I don't know if it's the same one.

Do you get anything from alt-sysrq-T?

> To reproduce it, one needs laptop-mode enabled and xmms playing music
> (FLAC in my case) that resides on NFS. Then put the computer under some
> local load for a little while. My guess is that at some point it needs
> to free some memory, and then it hits this hard lock, or the errors I
> mailed previously when ati-drivers is loaded.

What kernel version is this?

--b.

2005-09-05 03:08:28

by Bret Towe

Subject: Re: nfs4 client bug

On 9/4/05, J. Bruce Fields <[email protected]> wrote:
> On Sun, Sep 04, 2005 at 01:51:08PM -0700, Bret Towe wrote:
> > On 9/4/05, Bret Towe <[email protected]> wrote:
> > > On 9/4/05, Francois Romieu <[email protected]> wrote:
> > > > Bret Towe <[email protected]> :
> > > > [...]
> > > > > After moving some files on the server to a new location and then trying
> > > > > to add them to the xmms playlist, I found the following in dmesg after
> > > > > xmms froze. I wonder how many more of these I can find...
> > > >
> > > > The system includes some binary only stuff. Please contact your vendor
> > > > or provide the traces for a configuration wherein the relevant module
> > > > was not loaded after boot. It may make sense to get in touch with
> > > > [email protected] then.
> > >
> > > The 'binary only stuff' is the ati-drivers kernel module, and it crashes
> > > with or without it. I'll provide an 'untainted' trace as soon as I can
> > > reproduce the bug.
> >
> > OK, without the ati-drivers kernel module loaded, the computer basically
> > just hard locks when some bug hits. I don't know if it's the same one.
>
> Do you get anything from alt-sysrq-T?

No, I haven't tried that. I'm usually in X when it freezes, and X won't
even switch to a console. Would it still give me anything then?

> > To reproduce it, one needs laptop-mode enabled and xmms playing music
> > (FLAC in my case) that resides on NFS. Then put the computer under some
> > local load for a little while. My guess is that at some point it needs
> > to free some memory, and then it hits this hard lock, or the errors I
> > mailed previously when ati-drivers is loaded.
>
> What kernel version is this?

2.6.13


> --b.
>

2005-09-05 03:18:29

by J. Bruce Fields

Subject: Re: nfs4 client bug

On Sun, Sep 04, 2005 at 08:08:22PM -0700, Bret Towe wrote:
> On 9/4/05, J. Bruce Fields <[email protected]> wrote:
> > Do you get anything from alt-sysrq-T?
>
> No, I haven't tried that. I'm usually in X when it freezes, and X won't
> even switch to a console. Would it still give me anything then?

Well, you can try something like:
alt-sysrq-T
wait a couple seconds, then
alt-sysrq-S
alt-sysrq-U
alt-sysrq-B
with maybe a second between each to give stuff a chance to get to disk.

Then if you're lucky you may find the stack dumps in your log after you
reboot.

--b.
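
If the box is only wedged in X but still reachable some other way (over ssh, say), the same sequence can in principle be issued by writing the key letters to /proc/sysrq-trigger as root, assuming the kernel was built with magic SysRq support. The sketch below is a hypothetical wrapper around that: the pauses are guesses based on the advice above, it defaults to a dry run, and on a true hard lock only the keyboard combination has a chance of working.

```python
import time

SYSRQ_TRIGGER = "/proc/sysrq-trigger"

# (key, meaning, pause afterwards in seconds). The pauses are guesses
# that follow the "couple of seconds" / "maybe a second" advice above.
SEQUENCE = [
    ("t", "dump task stacks", 2.0),
    ("s", "emergency sync", 1.0),
    ("u", "remount filesystems read-only", 1.0),
    ("b", "reboot immediately", 0.0),
]


def run_sequence(sequence=SEQUENCE, dry_run=True, trigger=SYSRQ_TRIGGER):
    """Issue the SysRq keys in order and return the list of keys handled.

    With dry_run=True (the default) nothing is written; the planned
    writes are only printed. With dry_run=False this needs root and
    WILL reboot the machine at the final step.
    """
    handled = []
    for key, meaning, pause in sequence:
        if dry_run:
            print("would write %r to %s (%s)" % (key, trigger, meaning))
        else:
            with open(trigger, "w") as fh:
                fh.write(key)
            time.sleep(pause)  # give the logs a chance to reach disk
        handled.append(key)
    return handled


if __name__ == "__main__":
    run_sequence(dry_run=True)
```

Run as-is it only prints the plan; passing `dry_run=False` performs the writes, ending in an immediate reboot.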

2005-09-05 16:24:52

by Andreas Sundstrom

Subject: Re: nfs4 client bug

[Bret Towe]
> I encountered the following error while using NFSv4. I think I've hit
> it twice now, but I'm not sure yet what causes it. This time the only
> I/O-related activity was emerge sync running in the background (which
> shouldn't have touched NFS at all) and xmms playing some music.
>
> Another problem I had was that iowait showed near 100% while nothing
> over NFS was working, yet no errors showed up in dmesg. If I encounter
> this again, is there a way to find out where it locked up so I can
> report what the problem is?
>
> My config is attached; the box is an Athlon64. If any further
> information is needed, let me know.
>

I have had some issues with nfsv4 in 2.6.13 too. I've had three Oopses
now, and the last one is included in this e-mail.

I'm also running on Athlon64, but I'm not running in 64-bit mode at the
moment. I've reverted to 2.6.12.5 for now, but if it would help I could
run 2.6.13 to gather more info if it's needed.

I'm not on the list so CC me for quicker response.

/Andreas Sundstrom

Kernel Oops:

kernel: Unable to handle kernel paging request at virtual address 00100104
kernel: printing eip:
kernel: c01803b9
kernel: *pde = 00000000
kernel: Oops: 0002 [#1]
kernel: PREEMPT
kernel: Modules linked in: evdev snd_emu10k1_synth snd_emux_synth
snd_seq_virmidi snd_seq_midi_emul snd_seq_midi snd_seq_midi_event
snd_seq snd_emu10k1 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm
snd_timer snd_page_alloc snd_util_mem snd_hwdep snd dm_mod st sbp2
kernel: CPU: 0
kernel: EIP: 0060:[generic_forget_inode+89/416] Not tainted VLI
kernel: EFLAGS: 00010246 (2.6.13)
kernel: EIP is at generic_forget_inode+0x59/0x1a0
kernel: eax: 00200200 ebx: f39761c0 ecx: 00100100 edx: f39761c8
kernel: esi: f6609200 edi: f6770800 ebp: 00000000 esp: c1b5dd50
kernel: ds: 007b es: 007b ss: 0068
kernel: Process kswapd0 (pid: 165, threadinfo=c1b5d000 task=c1b5b570)
kernel: Stack: f39761c0 c0180561 f39761e4 f39761c0 f3976130 c01f12ad
f39761c0 c1b5dd70
kernel: ffffffff ffffffff f39761c0 f39761c0 f39760a4 f39761b4
c01f17e2 f39761c0
kernel: c01429e2 f3976264 c1b5ddf4 00000000 0000000e 00000001
c1b5ddec 00000000
kernel: Call Trace:
kernel: [iput+65/128] iput+0x41/0x80
kernel: [nfs_wait_on_inode+125/160] nfs_wait_on_inode+0x7d/0xa0
kernel: [__nfs_revalidate_inode+146/704] __nfs_revalidate_inode+0x92/0x2c0
kernel: [find_get_pages_tag+66/144] find_get_pages_tag+0x42/0x90
kernel: [pagevec_lookup_tag+51/64] pagevec_lookup_tag+0x33/0x40
kernel: [wait_on_page_writeback_range+109/288] wait_on_page_writeback_range+0x6d/0x120
kernel: [nfs_commit_inode+69/160] nfs_commit_inode+0x45/0xa0
kernel: [nfs_sync_inode+104/128] nfs_sync_inode+0x68/0x80
kernel: [nfs_do_return_delegation+43/96] nfs_do_return_delegation+0x2b/0x60
kernel: [nfs_inode_return_delegation+236/272] nfs_inode_return_delegation+0xec/0x110
kernel: [nfs4_clear_inode+35/176] nfs4_clear_inode+0x23/0xb0
kernel: [clear_inode+106/192] clear_inode+0x6a/0xc0
kernel: [dispose_list+47/288] dispose_list+0x2f/0x120
kernel: [prune_icache+142/528] prune_icache+0x8e/0x210
kernel: [get_writeback_state+64/80] get_writeback_state+0x40/0x50
kernel: [shrink_icache_memory+69/80] shrink_icache_memory+0x45/0x50
kernel: [shrink_slab+308/416] shrink_slab+0x134/0x1a0
kernel: [balance_pgdat+571/1024] balance_pgdat+0x23b/0x400
kernel: [kswapd+214/288] kswapd+0xd6/0x120
kernel: [autoremove_wake_function+0/96] autoremove_wake_function+0x0/0x60
kernel: [kswapd+0/288] kswapd+0x0/0x120
kernel: [kernel_thread_helper+5/12] kernel_thread_helper+0x5/0xc
kernel: Code: 46 37 40 74 45 b8 00 f0 ff ff 21 e0 ff 48 14 8b 40 08 a8
08 0f 85 2c 01 00 00 83 c4 0c 5b 5e c3 89 f6 8d 53 08 8b 4b 08 8b 42 04
<89> 41 04 89 08 a1 ec 0c 49 c0 89 50 04 89 43 08 c7 42 04 ec 0c
kernel: <6>note: kswapd0[165] exited with preempt_count 1
kernel: SysRq : Emergency Sync
kernel: Emergency Sync complete
kernel: SysRq : Emergency Sync


Relevant info from the config

sunkan@sunkan:~/kernel$ egrep -v "(^( |\t)*#|^$)" linux-2.6.13/.config
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_SYSCTL=y
CONFIG_HOTPLUG=y
CONFIG_KOBJECT_UEVENT=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_KALLSYMS=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_OBSOLETE_MODPARM=y
CONFIG_KMOD=y
CONFIG_X86_PC=y
CONFIG_MK8=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_HPET_TIMER=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
CONFIG_HIGHMEM4G=y
CONFIG_HIGHMEM=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_MTRR=y
CONFIG_HAVE_DEC_LOCK=y
CONFIG_SECCOMP=y
CONFIG_HZ_250=y
CONFIG_HZ=250
CONFIG_PHYSICAL_START=0x100000
CONFIG_PM=y
CONFIG_ACPI=y
CONFIG_ACPI_BOOT=y
CONFIG_ACPI_INTERPRETER=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_SLEEP_PROC_FS=y
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=m
CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_BLACKLIST_YEAR=0
CONFIG_ACPI_BUS=y
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_PCI=y
CONFIG_ACPI_SYSTEM=y
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=y
CONFIG_CPU_FREQ_STAT=y
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m
CONFIG_X86_POWERNOW_K8=y
CONFIG_X86_POWERNOW_K8_ACPI=y
CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_ISA_DMA_API=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_MISC=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
CONFIG_XFRM=y
CONFIG_XFRM_USER=y
CONFIG_NET_KEY=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_FIB_HASH=y
CONFIG_INET_AH=y
CONFIG_INET_ESP=y
CONFIG_INET_IPCOMP=y
CONFIG_INET_TUNNEL=y
CONFIG_IP_TCPDIAG=y
CONFIG_IP_TCPDIAG_IPV6=y
CONFIG_TCP_CONG_BIC=y
CONFIG_IPV6=y
CONFIG_INET6_AH=y
CONFIG_INET6_ESP=y
CONFIG_INET6_IPCOMP=y
CONFIG_INET6_TUNNEL=y
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_PNP=y
CONFIG_PNPACPI=y
CONFIG_BLK_DEV_FD=y
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=16384
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CDROM_PKTCDVD=m
CONFIG_CDROM_PKTCDVD_BUFFERS=8
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_BLK_DEV_IDECD=y
CONFIG_IDE_GENERIC=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_BLK_DEV_GENERIC=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
CONFIG_IDEDMA_PCI_AUTO=y
CONFIG_BLK_DEV_VIA82CXXX=y
CONFIG_BLK_DEV_IDEDMA=y
CONFIG_IDEDMA_AUTO=y
CONFIG_SCSI=y
CONFIG_SCSI_PROC_FS=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=m
CONFIG_CHR_DEV_SG=m
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_SPI_ATTRS=m
CONFIG_SCSI_AIC7XXX=m
CONFIG_AIC7XXX_CMDS_PER_DEVICE=32
CONFIG_AIC7XXX_RESET_DELAY_MS=5000
CONFIG_AIC7XXX_DEBUG_ENABLE=y
CONFIG_AIC7XXX_DEBUG_MASK=0
CONFIG_AIC7XXX_REG_PRETTY_PRINT=y
CONFIG_SCSI_QLA2XXX=y
CONFIG_MD=y
CONFIG_BLK_DEV_DM=m
CONFIG_DM_SNAPSHOT=m
CONFIG_IEEE1394=y
CONFIG_IEEE1394_OHCI1394=y
CONFIG_IEEE1394_SBP2=m
CONFIG_IEEE1394_DV1394=m
CONFIG_IEEE1394_RAWIO=m
CONFIG_NETDEVICES=y
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
CONFIG_NET_VENDOR_3COM=y
CONFIG_VORTEX=y
CONFIG_R8169=y
CONFIG_R8169_NAPI=y
CONFIG_INPUT=y
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1600
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=1200
CONFIG_INPUT_EVDEV=m
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_LIBPS2=y
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_NR_UARTS=2
CONFIG_SERIAL_CORE=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
CONFIG_AGP=y
CONFIG_AGP_AMD64=y
CONFIG_AGP_VIA=y
CONFIG_I2C=m
CONFIG_I2C_CHARDEV=m
CONFIG_I2C_ALGOBIT=m
CONFIG_I2C_ALGOPCF=m
CONFIG_I2C_ALGOPCA=m
CONFIG_I2C_ALI1535=m
CONFIG_I2C_ALI1563=m
CONFIG_I2C_ALI15X3=m
CONFIG_I2C_AMD756=m
CONFIG_I2C_AMD8111=m
CONFIG_I2C_I801=m
CONFIG_I2C_I810=m
CONFIG_I2C_PIIX4=m
CONFIG_I2C_ISA=m
CONFIG_I2C_NFORCE2=m
CONFIG_I2C_PARPORT_LIGHT=m
CONFIG_I2C_PROSAVAGE=m
CONFIG_I2C_SAVAGE4=m
CONFIG_SCx200_ACB=m
CONFIG_I2C_SIS5595=m
CONFIG_I2C_SIS630=m
CONFIG_I2C_SIS96X=m
CONFIG_I2C_VIA=m
CONFIG_I2C_VIAPRO=m
CONFIG_I2C_VOODOO3=m
CONFIG_I2C_PCA_ISA=m
CONFIG_I2C_SENSOR=m
CONFIG_SENSORS_EEPROM=m
CONFIG_SENSORS_PCF8574=m
CONFIG_SENSORS_PCA9539=m
CONFIG_SENSORS_PCF8591=m
CONFIG_SENSORS_RTC8564=m
CONFIG_SENSORS_MAX6875=m
CONFIG_HWMON=y
CONFIG_SENSORS_ADM1021=m
CONFIG_SENSORS_ADM1025=m
CONFIG_SENSORS_ADM1026=m
CONFIG_SENSORS_ADM1031=m
CONFIG_SENSORS_ADM9240=m
CONFIG_SENSORS_ASB100=m
CONFIG_SENSORS_ATXP1=m
CONFIG_SENSORS_DS1621=m
CONFIG_SENSORS_FSCHER=m
CONFIG_SENSORS_GL518SM=m
CONFIG_SENSORS_IT87=m
CONFIG_SENSORS_LM63=m
CONFIG_SENSORS_LM75=m
CONFIG_SENSORS_LM77=m
CONFIG_SENSORS_LM78=m
CONFIG_SENSORS_LM80=m
CONFIG_SENSORS_LM83=m
CONFIG_SENSORS_LM85=m
CONFIG_SENSORS_LM87=m
CONFIG_SENSORS_LM90=m
CONFIG_SENSORS_MAX1619=m
CONFIG_SENSORS_PC87360=m
CONFIG_SENSORS_SMSC47M1=m
CONFIG_SENSORS_VIA686A=m
CONFIG_SENSORS_W83781D=m
CONFIG_SENSORS_W83L785TS=m
CONFIG_SENSORS_W83627HF=m
CONFIG_SENSORS_W83627EHF=m
CONFIG_FB=y
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
CONFIG_FB_SOFT_CURSOR=y
CONFIG_FB_VESA=y
CONFIG_VIDEO_SELECT=y
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
CONFIG_SOUND=y
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_HWDEP=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_SEQUENCER=m
CONFIG_SND_MPU401_UART=m
CONFIG_SND_AC97_CODEC=m
CONFIG_SND_EMU10K1=m
CONFIG_SND_VIA82XX=m
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB=y
CONFIG_USB_DEVICEFS=y
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_UHCI_HCD=y
CONFIG_USB_PRINTER=y
CONFIG_USB_STORAGE=y
CONFIG_USB_HID=y
CONFIG_USB_HIDINPUT=y
CONFIG_USB_MON=y
CONFIG_EXT2_FS=y
CONFIG_REISERFS_FS=y
CONFIG_FS_POSIX_ACL=y
CONFIG_MINIX_FS=y
CONFIG_ROMFS_FS=y
CONFIG_INOTIFY=y
CONFIG_DNOTIFY=y
CONFIG_AUTOFS4_FS=y
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
CONFIG_NTFS_FS=m
CONFIG_NTFS_RW=y
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_RAMFS=y
CONFIG_CRAMFS=y
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
CONFIG_NFSD=y
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
CONFIG_NFSD_TCP=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=y
CONFIG_NFS_ACL_SUPPORT=y
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=y
CONFIG_SUNRPC_GSS=y
CONFIG_RPCSEC_GSS_KRB5=y
CONFIG_SMB_FS=m
CONFIG_CIFS=m
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
CONFIG_NLS_ISO8859_1=y
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_UTF8=m
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_LOG_BUF_SHIFT=14
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_EARLY_PRINTK=y
CONFIG_4KSTACKS=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
CONFIG_CRYPTO=y
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_NULL=y
CONFIG_CRYPTO_MD4=y
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA256=y
CONFIG_CRYPTO_SHA512=y
CONFIG_CRYPTO_TGR192=m
CONFIG_CRYPTO_DES=y
CONFIG_CRYPTO_AES_586=y
CONFIG_CRYPTO_DEFLATE=y
CONFIG_CRYPTO_CRC32C=m
CONFIG_CRC_CCITT=m
CONFIG_CRC32=y
CONFIG_LIBCRC32C=m
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_PC=y
sunkan@sunkan:~/kernel$



2005-09-05 20:44:11

by Bret Towe

Subject: Re: nfs4 client bug

On 9/4/05, J. Bruce Fields <[email protected]> wrote:
> On Sun, Sep 04, 2005 at 08:08:22PM -0700, Bret Towe wrote:
> > On 9/4/05, J. Bruce Fields <[email protected]> wrote:
> > > Do you get anything from alt-sysrq-T?
> >
> > no, I haven't used that; I'm usually in X when it's freezing.
> > X won't even switch to console. Would it still give me anything then?
>
> Well, you can try something like:
> alt-sysrq-T
> wait a couple seconds, then
> alt-sysrq-S
> alt-sysrq-U
> alt-sysrq-B
> with maybe a second between each to give stuff a chance to get to disk.
>
> Then if you're lucky you may find the stack dumps in your log after you
> reboot.

Tried it and so far no luck. I'll keep trying a few more times and see
if I can get it to spit something out to disk, but I don't think I'll be
that lucky, as that would probably make life too easy, wouldn't it?
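
For the case where the box is not fully dead (for example the hang where
iowait sits near 100% but a shell or ssh session still responds), the same
T, S, U, B sequence can be sent without the keyboard by writing the key
letters to /proc/sysrq-trigger. This is only a rough sketch: it assumes
root, a kernel built with CONFIG_MAGIC_SYSRQ=y, and the helper name
send_sysrq is made up for illustration. It cannot help with a hard lockup.

```python
import time

# Keys from the sequence suggested in this thread:
# t = dump task states, s = sync, u = remount read-only, b = reboot.
SYSRQ_SEQUENCE = ["t", "s", "u", "b"]


def send_sysrq(keys, trigger="/proc/sysrq-trigger", delay=1.0):
    """Write each key to the trigger file, pausing between writes.

    The pause mirrors the advice to leave about a second between keys
    so that log output has a chance to reach the disk.
    """
    sent = []
    for key in keys:
        # The trigger file takes one command character per write.
        with open(trigger, "w") as f:
            f.write(key)
        sent.append(key)
        time.sleep(delay)
    return sent
```

Calling send_sysrq(SYSRQ_SEQUENCE) as root ends with 'b', which reboots
immediately, so drop that key if only the task dump is wanted.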

2005-09-05 20:48:35

by Jesper Juhl

Subject: Re: nfs4 client bug

On 9/5/05, Bret Towe <[email protected]> wrote:
> On 9/4/05, J. Bruce Fields <[email protected]> wrote:
> > On Sun, Sep 04, 2005 at 08:08:22PM -0700, Bret Towe wrote:
> > > On 9/4/05, J. Bruce Fields <[email protected]> wrote:
> > > > Do you get anything from alt-sysrq-T?
> > >
> > > no, I haven't used that; I'm usually in X when it's freezing.
> > > X won't even switch to console. Would it still give me anything then?
> >
> > Well, you can try something like:
> > alt-sysrq-T
> > wait a couple seconds, then
> > alt-sysrq-S
> > alt-sysrq-U
> > alt-sysrq-B
> > with maybe a second between each to give stuff a chance to get to disk.
> >
> > Then if you're lucky you may find the stack dumps in your log after you
> > reboot.
>
> Tried it and so far no luck. I'll keep trying a few more times and see
> if I can get it to spit something out to disk, but I don't think I'll be
> that lucky, as that would probably make life too easy, wouldn't it?

How about

serial console over a cross-over cable to a second box.
netconsole will let you put the console on a different box over the network.
console on line printer will let you have a permanent record of the
console output on paper.

See
Documentation/serial-console.txt
Documentation/networking/netconsole.txt
the help entry for "config LP_CONSOLE" (in drivers/char/Kconfig)

Would any of those perhaps help you in capturing anything ?
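
For the netconsole option, the receiving box only has to listen for the
UDP packets the crashing machine sends. A minimal sketch of such a
listener follows, assuming the sender targets UDP port 6666 on the second
box; the function name capture_netconsole, the log file name and the
max_packets parameter are made up for illustration, and the sender-side
setup is described in Documentation/networking/netconsole.txt.

```python
import socket


def capture_netconsole(port=6666, logfile="netconsole.log",
                       host="0.0.0.0", max_packets=None):
    """Append every UDP datagram received on `port` to `logfile`.

    Runs until interrupted, or until `max_packets` datagrams have been
    written when that limit is given. Returns the number written.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    received = 0
    try:
        with open(logfile, "ab") as log:
            while max_packets is None or received < max_packets:
                data, _addr = sock.recvfrom(65535)
                log.write(data)
                # Flush after each packet so the last lines before a
                # lockup are already on disk.
                log.flush()
                received += 1
    finally:
        sock.close()
    return received
```

Started on the second box before reproducing the hang, this leaves
whatever the kernel managed to print in netconsole.log.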

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2005-09-06 03:41:05

by Bret Towe

Subject: Re: nfs4 client bug

On 9/5/05, Jesper Juhl <[email protected]> wrote:
> On 9/5/05, Bret Towe <[email protected]> wrote:
> > On 9/4/05, J. Bruce Fields <[email protected]> wrote:
> > > On Sun, Sep 04, 2005 at 08:08:22PM -0700, Bret Towe wrote:
> > > > On 9/4/05, J. Bruce Fields <[email protected]> wrote:
> > > > > Do you get anything from alt-sysrq-T?
> > > >
> > > > no, I haven't used that; I'm usually in X when it's freezing.
> > > > X won't even switch to console. Would it still give me anything then?
> > >
> > > Well, you can try something like:
> > > alt-sysrq-T
> > > wait a couple seconds, then
> > > alt-sysrq-S
> > > alt-sysrq-U
> > > alt-sysrq-B
> > > with maybe a second between each to give stuff a chance to get to disk.
> > >
> > > Then if you're lucky you may find the stack dumps in your log after you
> > > reboot.
> >
> > Tried it and so far no luck. I'll keep trying a few more times and see
> > if I can get it to spit something out to disk, but I don't think I'll be
> > that lucky, as that would probably make life too easy, wouldn't it?
>
> How about
>
> serial console over a cross-over cable to a second box.
> netconsole will let you put the console on a different box over the network.
> console on line printer will let you have a permanent record of the
> console output on paper.
>
> See
> Documentation/serial-console.txt
> Documentation/networking/netconsole.txt
> the help entry for "config LP_CONSOLE" (in drivers/char/Kconfig)
>
> Would any of those perhaps help you in capturing anything ?

netconsole++

Got the following, and I'll be watching for anything that looks different
as I continue to push my luck:

NMI Watchdog detected LOCKUP on CPU0CPU 0
Modules linked in: netconsole snd_seq_midi snd_emu10k1_synth
snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_pcm_oss
snd_mixer_oss snd_seq_oss snd_seq_midi_event snd_seq snd_emu10k1
snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer
snd_page_alloc snd_util_mem snd_hwdep snd w83627hf i2c_sensor i2c_isa
i2c_core usb_storage r8169 ehci_hcd uhci_hcd dm_mirror dm_snapshot
dm_mod
Pid: 14169, comm: xmms Tainted: G M 2.6.13
RIP: 0010:[<ffffffff80158d3b>] <ffffffff80158d3b>{cache_alloc_refill+347}
RSP: 0018:ffff81001306fa48 EFLAGS: 00000017
RAX: ffff81002f995c90 RBX: ffff81002f994800 RCX: ffff81000a8a9000
RDX: ffff81002f995c90 RSI: 000000000000000e RDI: ffff810005cfc028
RBP: ffff81002f995c80 R08: ffff81002f994810 R09: ffff81002f995ca0
R10: ffff81002f995cb0 R11: ffffffff80154a40 R12: ffff81002f995c90
R13: ffff81002f98d5c0 R14: 00000000000000d0 R15: ffffffff801e1550
FS: 0000000041802960(0063) GS:ffffffff805bc800(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaab351000 CR3: 00000000297c8000 CR4: 00000000000006e0
Process xmms (pid: 14169, threadinfo ffff81001306e000, task ffff81001beae0b0)
Stack: ffff81002fb0f1c0 0000000000000000 ffff81001306faa8 00000000000000d0
ffff81002f995c80 ffff81002d450400 ffff81001306fb18 ffff810001bb2078
ffffffff801e1550 ffffffff80158a7d
Call Trace:<ffffffff801e1550>{nfs_find_actor+0}
<ffffffff80158a7d>{kmem_cache_alloc+77}
<ffffffff801e4305>{nfs_alloc_inode+21} <ffffffff80187fc2>{alloc_inode+18}
<ffffffff80188e48>{iget5_locked+200}
<ffffffff801e15c0>{nfs_init_locked+0}
<ffffffff801e191e>{nfs_fhget+110} <ffffffff801de5f5>{nfs_lookup+309}
<ffffffff801eede7>{_nfs4_proc_access+215}
<ffffffff8017d235>{do_lookup+229}
<ffffffff8017d801>{__link_path_walk+849}
<ffffffff8017e2aa>{link_path_walk+186}
<ffffffff8016f668>{get_unused_fd+88} <ffffffff8017e521>{path_lookup+385}
<ffffffff8017ed6f>{open_namei+175} <ffffffff8012d6e4>{deactivate_task+20}
<ffffffff8016f5e7>{filp_open+39} <ffffffff8016f668>{get_unused_fd+88}
<ffffffff8016f7b4>{sys_open+84} <ffffffff8010e8b6>{system_call+126}


Code: 4c 89 61 08 49 89 0c 24 85 f6 0f 8f 3c ff ff ff 8b 3b 89 f8
console shuts up ...
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!
<6>SysRq : Show State

sibling
task PC pid father child younger older
init S 000000010001d013 0 1 0 2 (NOTLB)
ffff810001c61d88 0000000000000082 0000000000000000 ffffffff8048c4b8
ffff81001beaee10 ffff810001c5f440 ffff81001beaee10 ffff810001c5f658
ffff810001c5f440 0000001001c61e68
Call Trace:<ffffffff803a5744>{schedule_timeout+148}
<ffffffff80139a30>{process_timeout+0}
<ffffffff8017c052>{pipe_poll+66} <ffffffff80182d77>{do_select+967}
<ffffffff801828c0>{__pollwait+0} <ffffffff801830bc>{sys_select+748}
<ffffffff8010e8b6>{system_call+126}
ksoftirqd/0 S 0000000000000000 0 2 1 3 (L-TLB)
ffff810001c65f08 0000000000000046 ffff810001c65f18 ffff81002c8102b0
ffff810026a85110 ffff810001c5ed90 ffff810026a85110 ffff810001c5efa8
ffff810001c5f440 ffff810001c65f08
Call Trace:<ffffffff80135e60>{ksoftirqd+0} <ffffffff80135ea5>{ksoftirqd+69}
<ffffffff80135e60>{ksoftirqd+0} <ffffffff8014424d>{kthread+205}
<ffffffff8010f3e6>{child_rip+8} <ffffffff80144180>{kthread+0}
<ffffffff8010f3de>{child_rip+0}
events/0 R running task 0 3 1 4 2 (L-TLB)
khelper S ffff810001c3b300 0 4 1 5 3 (L-TLB)
ffff810001ef9e78 0000000000000046 0000000000000000 ffffffff80140040
ffff81002fbd0e90 ffff810001c5e030 ffff81002fbd0e90 ffff810001c5e248
ffff810001c3b330 ffff810001c3b320
Call Trace:<ffffffff80140040>{__call_usermodehelper+0}
<ffffffff801403c6>{worker_thread+278}
<ffffffff8012e060>{default_wake_function+0}
<ffffffff8012e060>{default_wake_function+0}
<ffffffff801402b0>{worker_thread+0} <ffffffff8014424d>{kthread+205}
<ffffffff8010f3e6>{child_rip+8} <ffffffff80144180>{kthread+0}
<ffffffff8010f3de>{child_rip+0}
kthread S ffff810001ec3300 0 5 1 7 147 4 (L-TLB)
ffff810001f1de78 0000000000000046 0000000000000000 ffff81002f9d5d28
ffff81002f9d34c0 ffff810001f1b480 ffff81002f9d34c0 ffff810001f1b698
ffff810001ec3330 ffff810001ec3320
Call Trace:<ffffffff801403c6>{worker_thread+278}
<ffffffff8012e060>{default_wake_function+0}
<ffffffff8012e060>{default_wake_function+0}
<ffffffff801402b0>{worker_thread+0}
<ffffffff8014424d>{kthread+205} <ffffffff8010f3e6>{child_rip+8}
<ffffffff80144180>{kthread+0} <ffffffff8010f3de>{child_rip+0}

kacpid S ffff810001ec30c0 0 7 5 90 (L-TLB)
ffff81002f911e78 0000000000000046 403b0520522700f8 604000005101c501
ffff810001c5f440 ffff810001f1add0 ffff810001c5f440 ffff810001f1afe8
c408018023918081 0000000000010000

2005-09-06 05:11:17

by Andreas Sundstrom

Subject: Re: nfs4 client bug

[J. Bruce Fields]
> On Sun, Sep 04, 2005 at 01:51:08PM -0700, Bret Towe wrote:
> > On 9/4/05, Bret Towe <magnade@xxxxxxxxx> wrote:
> > > On 9/4/05, Francois Romieu <romieu@xxxxxxxxxxxxx> wrote:
> > > > Bret Towe <magnade@xxxxxxxxx> :
> > > > [...]
> > > > > after moving some files on the server to a new location then
> > > > > trying to add the files to xmms playlist i found the following
> > > > > in dmesg after xmms froze
> > > > > wonder how many more items i can find...
> > > >
> > > > The system includes some binary only stuff. Please contact your
> > > > vendor or provide the traces for a configuration wherein the
> > > > relevant module was not loaded after boot. It may make sense to
> > > > get in touch with nfs@xxxxxxxxxxxxxxxxxxxxx then.
> > >
> > > the 'binary only stuff' is ati-drivers kernel module and it crashes
> > > with or without it
> > > I'll provide an 'untainted' trace as soon as i can repeat the bug
> > > again
> >
> > ok without ati-drivers kernel module loaded the computer basically
> > just hard locks when some bug hits dunno if it's the same item
>
> Do you get anything from alt-sysrq-T?

I managed to get the following after another crash; I hope it helps
someone figure out what is going on.

kernel: Unable to handle kernel paging request at virtual address 00100104
kernel: printing eip:
kernel: c01803b9
kernel: *pde = 00000000
kernel: Oops: 0002 [#1]
kernel: PREEMPT
kernel: Modules linked in: evdev snd_emu10k1_synth snd_emux_synth
snd_seq_virmidi snd_seq_midi_emul snd_seq_midi snd_seq_midi_event
snd_seq snd_emu10k1 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm
snd_timer snd_page_alloc snd_util_mem snd_hwdep snd dm_mod st sbp2
kernel: CPU: 0
kernel: EIP: 0060:[generic_forget_inode+89/416] Not tainted VLI
kernel: EFLAGS: 00010246 (2.6.13)
kernel: EIP is at generic_forget_inode+0x59/0x1a0
kernel: eax: 00200200 ebx: cb367888 ecx: 00100100 edx: cb367890
kernel: esi: f6732c00 edi: f6732a00 ebp: 00000000 esp: c1b5dd50
kernel: ds: 007b es: 007b ss: 0068
kernel: Process kswapd0 (pid: 165, threadinfo=c1b5d000 task=c1b5b570)
kernel: Stack: cb367888 c0180561 cb3678ac cb367888 cb3677f8 c01f12ad
cb367888 c1b5dd70
kernel: ffffffff ffffffff cb367888 cb367888 cb36776c cb36787c
c01f17e2 cb367888
kernel: c01429e2 cb36792c c1b5ddf4 00000000 0000000e 00000001
c1b5ddec 00000000
kernel: Call Trace:
kernel: [iput+65/128] iput+0x41/0x80
kernel: [nfs_wait_on_inode+125/160] nfs_wait_on_inode+0x7d/0xa0
kernel: [__nfs_revalidate_inode+146/704] __nfs_revalidate_inode+0x92/0x2c0
kernel: [find_get_pages_tag+66/144] find_get_pages_tag+0x42/0x90
kernel: [pagevec_lookup_tag+51/64] pagevec_lookup_tag+0x33/0x40
kernel: [wait_on_page_writeback_range+109/288] wait_on_page_writeback_range+0x6d/0x120
kernel: [nfs_commit_inode+69/160] nfs_commit_inode+0x45/0xa0
kernel: [nfs_sync_inode+104/128] nfs_sync_inode+0x68/0x80
kernel: [nfs_do_return_delegation+43/96] nfs_do_return_delegation+0x2b/0x60
kernel: [nfs_inode_return_delegation+236/272] nfs_inode_return_delegation+0xec/0x110
kernel: [nfs4_clear_inode+35/176] nfs4_clear_inode+0x23/0xb0
kernel: [clear_inode+106/192] clear_inode+0x6a/0xc0
kernel: [dispose_list+47/288] dispose_list+0x2f/0x120
kernel: [prune_icache+142/528] prune_icache+0x8e/0x210
kernel: [get_writeback_state+64/80] get_writeback_state+0x40/0x50
kernel: [shrink_icache_memory+69/80] shrink_icache_memory+0x45/0x50
kernel: [shrink_slab+308/416] shrink_slab+0x134/0x1a0
kernel: [balance_pgdat+571/1024] balance_pgdat+0x23b/0x400
kernel: [kswapd+214/288] kswapd+0xd6/0x120
kernel: [autoremove_wake_function+0/96] autoremove_wake_function+0x0/0x60
kernel: [kswapd+0/288] kswapd+0x0/0x120
kernel: [kernel_thread_helper+5/12] kernel_thread_helper+0x5/0xc
kernel: Code: 46 37 40 74 45 b8 00 f0 ff ff 21 e0 ff 48 14 8b 40 08 a8
08 0f 85 2c 01 00 00 83 c4 0c 5b 5e c3 89 f6 8d 53 08 8b 4b 08 8b 42 04
<89> 41 04 89 08 a1 ec 0c 49 c0 89 50 04 89 43 08 c7 42 04 ec 0c
kernel: <6>note: kswapd0[165] exited with preempt_count 1
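
One way to read the faulting address above, offered as a hedged sketch
rather than a diagnosis: in 2.6 kernels list_del() overwrites the removed
entry's pointers with LIST_POISON1 (0x00100100) for next and LIST_POISON2
(0x00200200) for prev. This oops shows ecx=00100100 and eax=00200200 and a
write to 00100104, which is LIST_POISON1 plus the offset of prev with
4-byte pointers. The x86_64 report at the top of the thread faults at
0x100108, the same field with 8-byte pointers. That pattern would be
consistent with an inode list entry being unlinked a second time, though
the trace alone does not prove it. The helper name classify_fault is made
up for illustration.

```python
# Poison values that list_del() stores in a removed entry's pointers
# (include/linux/list.h in 2.6-era kernels).
LIST_POISON1 = 0x00100100  # stored in entry->next
LIST_POISON2 = 0x00200200  # stored in entry->prev


def classify_fault(addr, pointer_size):
    """Guess whether a faulting address is a list poison plus a small offset.

    struct list_head holds `next` at offset 0 and `prev` at offset
    `pointer_size`, so a write through a poisoned pointer lands on one
    of those two offsets.
    """
    for name, poison in (("LIST_POISON1", LIST_POISON1),
                         ("LIST_POISON2", LIST_POISON2)):
        offset = addr - poison
        if 0 <= offset <= 2 * pointer_size:
            field = {0: "next", pointer_size: "prev"}.get(offset, "unknown field")
            return name, offset, field
    return None


print(classify_fault(0x00100104, 4))   # the i386 oops in this message
print(classify_fault(0x00100108, 8))   # the x86_64 oops from the first report
```

Both calls report LIST_POISON1 with the prev field, i.e. the faulting
store looks like the next->prev = prev half of a list unlink.
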
kernel: nfs_statfs: statfs error = 13
Sep 6 06:27:26 sunkan last message repeated 4 times
kernel: 8ab69c e28ab570 e28ab698 e28ab570 5561246f 000001a4 c7290000
kernel: 00017ade 00000000 e28ab570 c7290000 c7290efc 00000000
c7290000 c0421196


kernel: Call Trace:
kernel: [schedule_timeout+118/192] schedule_timeout+0x76/0xc0
kernel: [rwsem_wake+301/320] rwsem_wake+0x12d/0x140
kernel: [futex_wait+491/544] futex_wait+0x1eb/0x220
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [do_futex+116/208] do_futex+0x74/0xd0
kernel: [sys_futex+102/320] sys_futex+0x66/0x140
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java D C053EAA0 0 17843 17769 17844 17842
(NOTLB)
kernel: ca834dac 00000086 e28ab060 c053eaa0 c11aee80 c11aeea0 ca834000
ca834000
kernel: 000016b5 c11468e0 c136db80 e28ab188 e28ab060 8903e7a5
00002690 ca834000
kernel: 8fc2276b ffffd185 c014eaf0 ca834000 c0490cf4 00000286
c0490cfc c0420065
kernel: Call Trace:
kernel: [shrink_cache+352/752] shrink_cache+0x160/0x2f0
kernel: [__down+213/288] __down+0xd5/0x120
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [__down_failed+7/12] __down_failed+0x7/0xc
kernel: [.text.lock.inode+43/68] .text.lock.inode+0x2b/0x44
kernel: [throttle_vm_writeout+76/128] throttle_vm_writeout+0x4c/0x80
kernel: [shrink_icache_memory+69/80] shrink_icache_memory+0x45/0x50
kernel: [shrink_slab+308/416] shrink_slab+0x134/0x1a0
kernel: [try_to_free_pages+246/448] try_to_free_pages+0xf6/0x1c0
kernel: [__alloc_pages+516/1088] __alloc_pages+0x204/0x440
kernel: [__get_free_pages+27/64] __get_free_pages+0x1b/0x40
kernel: [__pollwait+140/208] __pollwait+0x8c/0xd0
kernel: [unix_poll+161/192] unix_poll+0xa1/0xc0
kernel: [sock_poll+38/48] sock_poll+0x26/0x30
kernel: [do_pollfd+77/192] do_pollfd+0x4d/0xc0
kernel: [do_poll+93/208] do_poll+0x5d/0xd0
kernel: [sys_poll+137/592] sys_poll+0x89/0x250
kernel: [__pollwait+0/208] __pollwait+0x0/0xd0
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java S C053EF48 0 17844 17769 17845 17843
(NOTLB)
kernel: ca82aeb8 00200086 f3473a40 c053ef48 000001a4 6aa3fb3f ca82a000
ca82a000
kernel: 00000e82 6aa3fb3f 000001a4 e7abfb68 e7abfa40 6aa47595
000001a4 ca82a000
kernel: 0001b530 00000000 ca82aecc 029e23a8 ca82aecc 00000000
ca82a000 c0421170
kernel: Call Trace:
kernel: [schedule_timeout+80/192] schedule_timeout+0x50/0xc0
kernel: [process_timeout+0/16] process_timeout+0x0/0x10
kernel: [futex_wait+491/544] futex_wait+0x1eb/0x220
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [do_futex+116/208] do_futex+0x74/0xd0
kernel: [sys_futex+102/320] sys_futex+0x66/0x140
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java S C053EAA0 0 17845 17769 17846 17844
(NOTLB)
kernel: e49aeeb8 00200086 cbfe7570 c053eaa0 000027a2 c2cdc1be e49ae000
e49ae000
kernel: 000003cc c68e3acb 000027a2 cbfe7698 cbfe7570 c9ac049d
000027a2 e49ae000
kernel: cdaac8f1 00000000 081b3000 e49ae000 e49aeefc 00000000
e49ae000 c0421196
kernel: Call Trace:
kernel: [schedule_timeout+118/192] schedule_timeout+0x76/0xc0
kernel: [find_extend_vma+41/144] find_extend_vma+0x29/0x90
kernel: [futex_wait+491/544] futex_wait+0x1eb/0x220
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [do_futex+116/208] do_futex+0x74/0xd0
kernel: [sys_futex+102/320] sys_futex+0x66/0x140
kernel: [sys_gettimeofday+59/144] sys_gettimeofday+0x3b/0x90
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java S C053EAA0 0 17846 17769 17850 17845
(NOTLB)
kernel: e37b7eb8 00200086 e7abf530 c053eaa0 000024d5 c014b8d5 e37b7000
e37b7000
kernel: 0000088e 253d1904 000024d5 e7abf658 e7abf530 1c6a2166
0000278f e37b7000
kernel: 00ddccb3 00000000 e37b7ecc 00a6a5df e37b7ecc 00000000
e37b7000 c0421170
kernel: Call Trace:
kernel: [alloc_slabmgmt+85/96] alloc_slabmgmt+0x55/0x60
kernel: [schedule_timeout+80/192] schedule_timeout+0x50/0xc0
kernel: [process_timeout+0/16] process_timeout+0x0/0x10
kernel: [futex_wait+491/544] futex_wait+0x1eb/0x220
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [do_futex+116/208] do_futex+0x74/0xd0
kernel: [do_fork+250/501] do_fork+0xfa/0x1f5
kernel: [sys_futex+102/320] sys_futex+0x66/0x140
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java S C053EAD0 0 17850 17769 17862 17846
(NOTLB)
kernel: e3880eb8 00200086 e8253020 c053ead0 000027a4 c68d7864 e3880000
e3880000
kernel: 000002ba 658f8bbd 000027a4 c4cfa188 c4cfa060 658fa947
000027a4 e3880000
kernel: d305e654 fffffc03 082d6000 e3880000 e3880efc 00000000
e3880000 c0421196
kernel: Call Trace:
kernel: [schedule_timeout+118/192] schedule_timeout+0x76/0xc0
kernel: [find_extend_vma+41/144] find_extend_vma+0x29/0x90
kernel: [futex_wait+491/544] futex_wait+0x1eb/0x220
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [do_futex+116/208] do_futex+0x74/0xd0
kernel: [sys_futex+102/320] sys_futex+0x66/0x140
kernel: [sys_gettimeofday+59/144] sys_gettimeofday+0x3b/0x90
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java S C053EAA0 0 17862 17769 17864 17850
(NOTLB)
kernel: d37fceb8 00200086 f1f50530 c053eaa0 00000000 c014b8d5 d37fc000
d37fc000
kernel: 0000037c d37fc000 c014bb05 f1f50658 f1f50530 558dd2f8
00002791 d37fc000
kernel: 01aee0b8 00000000 086b5000 d37fc000 d37fcefc 00000000
d37fc000 c0421196
kernel: Call Trace:
kernel: [alloc_slabmgmt+85/96] alloc_slabmgmt+0x55/0x60
kernel: [cache_grow+261/448] cache_grow+0x105/0x1c0
kernel: [schedule_timeout+118/192] schedule_timeout+0x76/0xc0
kernel: [find_extend_vma+41/144] find_extend_vma+0x29/0x90
kernel: [futex_wait+491/544] futex_wait+0x1eb/0x220
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [do_futex+116/208] do_futex+0x74/0xd0
kernel: [do_fork+250/501] do_fork+0xfa/0x1f5
kernel: [sys_futex+102/320] sys_futex+0x66/0x140
kernel: [do_sched_setscheduler+122/208] do_sched_setscheduler+0x7a/0xd0
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java S C053EAA0 0 17864 17769 17865 17862
(NOTLB)
kernel: d35ceeb8 00200086 f0edf060 c053eaa0 00000000 c014b8d5 d35ce000
d35ce000
kernel: 000004ed d35ce000 c014bb05 f0edf188 f0edf060 1e4c64b8
00002791 d35ce000
kernel: 03c7609d 00000000 d35ceecc 00a4d995 d35ceecc 00000000
d35ce000 c0421170
kernel: Call Trace:
kernel: [alloc_slabmgmt+85/96] alloc_slabmgmt+0x55/0x60
kernel: [cache_grow+261/448] cache_grow+0x105/0x1c0
kernel: [schedule_timeout+80/192] schedule_timeout+0x50/0xc0
kernel: [process_timeout+0/16] process_timeout+0x0/0x10
kernel: [futex_wait+491/544] futex_wait+0x1eb/0x220
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [do_futex+116/208] do_futex+0x74/0xd0
kernel: [do_fork+250/501] do_fork+0xfa/0x1f5
kernel: [sys_futex+102/320] sys_futex+0x66/0x140
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java S C053EAD0 0 17865 17769 17871 17864
(NOTLB)
kernel: edf23eb8 00200086 f0edf570 c053ead0 000002f6 b6aa94ac edf23000
edf23000
kernel: 00000fc4 b6aa94ac 000002f6 f0edfba8 f0edfa80 b6aaad1c
000002f6 edf23000
kernel: 0019ad5c 00000000 086ac000 edf23000 edf23efc 00000000
edf23000 c0421196
kernel: Call Trace:
kernel: [schedule_timeout+118/192] schedule_timeout+0x76/0xc0
kernel: [__wake_up_common+67/112] __wake_up_common+0x43/0x70
kernel: [futex_wait+491/544] futex_wait+0x1eb/0x220
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [do_futex+116/208] do_futex+0x74/0xd0
kernel: [do_fork+250/501] do_fork+0xfa/0x1f5
kernel: [sys_futex+102/320] sys_futex+0x66/0x140
kernel: [do_sched_setscheduler+122/208] do_sched_setscheduler+0x7a/0xd0
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java S C053EAD0 0 17871 17769 17873 17865
(NOTLB)
kernel: e445eeb8 00200086 f2385060 c053ead0 000027a4 c3f730d9 e445e000
e445e000
kernel: 0000149c c60a98be 000027a4 ee430ba8 ee430a80 c60b6b60
000027a4 e445e000
kernel: c091cef1 fffffe14 e445eecc 00a4b701 e445eecc 00000000
e445e000 c0421170
kernel: Call Trace:
kernel: [schedule_timeout+80/192] schedule_timeout+0x50/0xc0
kernel: [process_timeout+0/16] process_timeout+0x0/0x10
kernel: [futex_wait+491/544] futex_wait+0x1eb/0x220
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [do_futex+116/208] do_futex+0x74/0xd0
kernel: [sys_futex+102/320] sys_futex+0x66/0x140
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java S C053EAA0 0 17873 17769 17901 17871
(NOTLB)
kernel: ceca9eb8 00200086 ee430060 c053eaa0 00002311 e57eb544 ceca9000
ceca9000
kernel: 00000855 bda4e8ff 00002311 ee430188 ee430060 4fb54e99
00002785 ceca9000
kernel: 00cfda47 00000000 08362000 ceca9000 ceca9efc 00000000
ceca9000 c0421196
kernel: Call Trace:
kernel: [schedule_timeout+118/192] schedule_timeout+0x76/0xc0
kernel: [find_extend_vma+41/144] find_extend_vma+0x29/0x90
kernel: [futex_wait+491/544] futex_wait+0x1eb/0x220
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [do_futex+116/208] do_futex+0x74/0xd0
kernel: [sys_futex+102/320] sys_futex+0x66/0x140
kernel: [sys_gettimeofday+59/144] sys_gettimeofday+0x3b/0x90
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java S C053EAA0 0 17901 17769 425 17873
(NOTLB)
kernel: e04ddeb8 00200086 f032a530 c053eaa0 00002777 47fee18b e04dd000
e04dd000
kernel: 0000050c 47fee18b 00002777 f032a658 f032a530 a8866baa
000027a4 e04dd000
kernel: 1dfc5a7a 00000000 e04ddecc 00a4c512 e04ddecc 00000000
e04dd000 c0421170
kernel: Call Trace:
kernel: [schedule_timeout+80/192] schedule_timeout+0x50/0xc0
kernel: [process_timeout+0/16] process_timeout+0x0/0x10
kernel: [futex_wait+491/544] futex_wait+0x1eb/0x220
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [do_futex+116/208] do_futex+0x74/0xd0
kernel: [do_fork+250/501] do_fork+0xfa/0x1f5
kernel: [sys_futex+102/320] sys_futex+0x66/0x140
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java S C053EAD0 0 425 17769 1569 17901
(NOTLB)
kernel: efb14eb8 00200086 e3354a40 c053ead0 000027a4 082e50f8 efb14000
efb14000
kernel: 0000127e 082e50f8 000027a4 f6244658 f6244530 082e7190
000027a4 efb14000
kernel: 08817d2f 00000000 efb14ecc 00a4bd8f efb14ecc 00000000
efb14000 c0421170
kernel: Call Trace:
kernel: [schedule_timeout+80/192] schedule_timeout+0x50/0xc0
kernel: [process_timeout+0/16] process_timeout+0x0/0x10
kernel: [futex_wait+491/544] futex_wait+0x1eb/0x220
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [do_futex+116/208] do_futex+0x74/0xd0
kernel: [sys_futex+102/320] sys_futex+0x66/0x140
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java S C053EAA0 0 1569 17769 1596 425
(NOTLB)
kernel: d066bf14 00200086 f0edf570 c053eaa0 c048fbc8 d066bef0 d066b000
d066b000
kernel: 000085df d066bf9c c0147bf7 f0edf698 f0edf570 55d6dac4
00002791 d066b000
kernel: 001cedf8 00000000 d066bf28 00a4da7f d066bf28 d066bf64
00007531 c0421170
kernel: Call Trace:
kernel: [__get_free_pages+39/64] __get_free_pages+0x27/0x40
kernel: [schedule_timeout+80/192] schedule_timeout+0x50/0xc0
kernel: [process_timeout+0/16] process_timeout+0x0/0x10
kernel: [do_poll+166/208] do_poll+0xa6/0xd0
kernel: [sys_poll+137/592] sys_poll+0x89/0x250
kernel: [__pollwait+0/208] __pollwait+0x0/0xd0
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java S C053EAA0 0 1596 17769 1599 1569
(NOTLB)
kernel: e448feb8 00200086 e7abf020 c053eaa0 eb79b700 e83052f5 e448f000
e448f000
kernel: 00000540 c011ce67 cfc7a020 e7abf148 e7abf020 e830a774
000027a2 e448f000
kernel: 000433ec 00000000 e448fecc 00a4b8d6 e448fecc 00000000
e448f000 c0421170
kernel: Call Trace:
kernel: [activate_task+103/128] activate_task+0x67/0x80
kernel: [schedule_timeout+80/192] schedule_timeout+0x50/0xc0
kernel: [process_timeout+0/16] process_timeout+0x0/0x10
kernel: [futex_wait+491/544] futex_wait+0x1eb/0x220
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [do_futex+116/208] do_futex+0x74/0xd0
kernel: [sys_futex+102/320] sys_futex+0x66/0x140
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: java S C053EAA0 0 1599 17769 1596
(NOTLB)
kernel: f6008eb8 00200086 d06c5570 c053eaa0 eb79b700 a8863d52 f6008000
f6008000
kernel: 0000055e c011ce67 f032a530 d06c5698 d06c5570 a886913d
000027a4 f6008000
kernel: 0001b8c6 00000000 f6008ecc 00a4c030 f6008ecc 00000000
f6008000 c0421170
kernel: Call Trace:
kernel: [activate_task+103/128] activate_task+0x67/0x80
kernel: [schedule_timeout+80/192] schedule_timeout+0x50/0xc0
kernel: [process_timeout+0/16] process_timeout+0x0/0x10
kernel: [futex_wait+491/544] futex_wait+0x1eb/0x220
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [do_futex+116/208] do_futex+0x74/0xd0
kernel: [sys_futex+102/320] sys_futex+0x66/0x140
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: getty S C053EAA0 0 23442 1 1070 18624
(NOTLB)
kernel: f699feac 00000086 f7951a40 c053eaa0 00000000 00000020 f699f000
f699f000
kernel: 0000628a c02c5842 c1813000 f7951b68 f7951a40 e6395180
00000e87 f699f000
kernel: 0014ed77 00000000 c027eab3 00000000 00000019 f53dc000
f699f000 c0421196
kernel: Call Trace:
kernel: [do_con_write+754/1504] do_con_write+0x2f2/0x5e0
kernel: [vgacon_cursor+419/560] vgacon_cursor+0x1a3/0x230
kernel: [schedule_timeout+118/192] schedule_timeout+0x76/0xc0
kernel: [set_cursor+90/128] set_cursor+0x5a/0x80
kernel: [read_chan+974/1856] read_chan+0x3ce/0x740
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [tty_read+155/224] tty_read+0x9b/0xe0
kernel: [vfs_read+181/400] vfs_read+0xb5/0x190
kernel: [sys_read+75/128] sys_read+0x4b/0x80
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: cron S C053EAA0 0 1035 3487 1036
(NOTLB)
kernel: ddd69ee4 00000086 c2b2c060 c053eaa0 01000001 000000d0 ddd69000
ddd69000
kernel: 00000966 ddd69000 000000d0 c2b2c188 c2b2c060 3c9c9732
0000266f ddd69000
kernel: 0053dc04 00000000 00000246 f7ab53e4 00001000 00000000
00000000 c0170d41
kernel: Call Trace:
kernel: [pipe_wait+113/144] pipe_wait+0x71/0x90
kernel: [autoremove_wake_function+0/96] autoremove_wake_function
+0x0/0x60
kernel: [pipe_readv+504/864] pipe_readv+0x1f8/0x360
kernel: [pipe_read+55/64] pipe_read+0x37/0x40
kernel: [vfs_read+181/400] vfs_read+0xb5/0x190
kernel: [sys_read+75/128] sys_read+0x4b/0x80
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: sh S C053EAA0 0 1036 1035 1039
(NOTLB)
kernel: df73df34 00000082 d2065a80 c053eaa0 ca7a94ec c0118c07 df73d000
df73d000
kernel: 00004583 00000001 00000007 d2065ba8 d2065a80 3ce2ef31
0000266f df73d000
kernel: 0022d813 00000000 0000266f d2065b2c 00000001 d2065a80
00000004 c0124daf
kernel: Call Trace:
kernel: [do_page_fault+359/1742] do_page_fault+0x167/0x6ce
kernel: [do_wait+655/1008] do_wait+0x28f/0x3f0
kernel: [preempt_schedule+77/96] preempt_schedule+0x4d/0x60
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [sys_wait4+67/80] sys_wait4+0x43/0x50
kernel: [sys_waitpid+39/43] sys_waitpid+0x27/0x2b
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: run-parts S C053EAA0 0 1039 1036 1103
(NOTLB)
kernel: e6f26ebc 00000086 f55cf570 c053eaa0 000000d0 00000001 e6f26000
e6f26000
kernel: 00000dc7 c048fbc4 00000010 f55cf698 f55cf570 718a7b3d
00002672 e6f26000
kernel: 0022a4c1 00000000 00000000 d1fbd640 00000040 00000006
0000001a c0421196
kernel: Call Trace:
kernel: [schedule_timeout+118/192] schedule_timeout+0x76/0xc0
kernel: [pipe_poll+205/224] pipe_poll+0xcd/0xe0
kernel: [do_select+698/880] do_select+0x2ba/0x370
kernel: [__pollwait+0/208] __pollwait+0x0/0xd0
kernel: [sys_select+574/976] sys_select+0x23e/0x3d0
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: cupsd S C053EAA0 0 1070 1 1598 23442
(NOTLB)
kernel: f65e2ebc 00000082 f78f1020 c053eaa0 00000001 c048fbc4 f65e2000
f65e2000
kernel: 000008e2 0c3cd223 000027a4 f78f1148 f78f1020 47c8af9c
000027a4 f65e2000
kernel: 18a2bacd 00000000 f65e2ed0 00a55fbb f65e2ed0 00000006
0000001a c0421170
kernel: Call Trace:
kernel: [schedule_timeout+80/192] schedule_timeout+0x50/0xc0
kernel: [process_timeout+0/16] process_timeout+0x0/0x10
kernel: [do_select+698/880] do_select+0x2ba/0x370
kernel: [__pollwait+0/208] __pollwait+0x0/0xd0
kernel: [sys_select+574/976] sys_select+0x23e/0x3d0
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: slocate S C053EAA0 0 1103 1039 1104
(NOTLB)
kernel: f537cf34 00000086 f032aa40 c053eaa0 f515e90c c0118c07 f537c000
f537c000
kernel: 0000366b 00000001 00000007 f032ab68 f032aa40 7216f663
00002672 f537c000
kernel: 00222bf0 00000000 00002672 f032aaec 00000001 f032aa40
00000004 c0124daf
kernel: Call Trace:
kernel: [do_page_fault+359/1742] do_page_fault+0x167/0x6ce
kernel: [do_wait+655/1008] do_wait+0x28f/0x3f0
kernel: [preempt_schedule+77/96] preempt_schedule+0x4d/0x60
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [sys_wait4+67/80] sys_wait4+0x43/0x50
kernel: [sys_waitpid+39/43] sys_waitpid+0x27/0x2b
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: updatedb D C053EAA0 0 1104 1103
(NOTLB)
kernel: f5ff8adc 00000082 f6495a40 c053eaa0 f7953040 f6495a40 f5ff8000
f5ff8000
kernel: 000039ee f5ff8000 c0420665 f6495b68 f6495a40 10a2561b
00002691 f5ff8000
kernel: d3ed0679 00000000 e3ca1cc4 f5ff8000 c0490cf4 00000286
c0490cfc c0420065
kernel: Call Trace:
kernel: [schedule+1093/1584] schedule+0x445/0x630
kernel: [__down+213/288] __down+0xd5/0x120
kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
kernel: [__down_failed+7/12] __down_failed+0x7/0xc
kernel: [.text.lock.inode+43/68] .text.lock.inode+0x2b/0x44
kernel: [shrink_icache_memory+69/80] shrink_icache_memory+0x45/0x50
kernel: [shrink_slab+308/416] shrink_slab+0x134/0x1a0
kernel: [try_to_free_pages+246/448] try_to_free_pages+0xf6/0x1c0
kernel: [__alloc_pages+516/1088] __alloc_pages+0x204/0x440
kernel: [kmem_getpages+57/192] kmem_getpages+0x39/0xc0
kernel: [pathrelse+48/80] pathrelse+0x30/0x50
kernel: [cache_grow+181/448] cache_grow+0xb5/0x1c0
kernel: [cache_alloc_refill+530/592] cache_alloc_refill+0x212/0x250
kernel: [kmem_cache_alloc+56/64] kmem_cache_alloc+0x38/0x40
kernel: [reiserfs_alloc_inode+24/48] reiserfs_alloc_inode+0x18/0x30
kernel: [alloc_inode+27/336] alloc_inode+0x1b/0x150
kernel: [iget5_locked+138/352] iget5_locked+0x8a/0x160
kernel: [get_new_inode+31/400] get_new_inode+0x1f/0x190
kernel: [reiserfs_init_locked_inode+0/32] reiserfs_init_locked_inode+0x0/0x20
kernel: [reiserfs_find_actor+0/48] reiserfs_find_actor+0x0/0x30
kernel: [reiserfs_iget+72/192] reiserfs_iget+0x48/0xc0
kernel: [reiserfs_find_actor+0/48] reiserfs_find_actor+0x0/0x30
kernel: [reiserfs_init_locked_inode+0/32] reiserfs_init_locked_inode+0x0/0x20
kernel: [reiserfs_lookup+256/400] reiserfs_lookup+0x100/0x190
kernel: [real_lookup+222/272] real_lookup+0xde/0x110
kernel: [do_lookup+157/176] do_lookup+0x9d/0xb0
kernel: [__link_path_walk+2133/4016] __link_path_walk+0x855/0xfb0
kernel: [dput+307/704] dput+0x133/0x2c0
kernel: [link_path_walk+87/288] link_path_walk+0x57/0x120
kernel: [path_lookup+134/352] path_lookup+0x86/0x160
kernel: [__user_walk+51/96] __user_walk+0x33/0x60
kernel: [vfs_lstat+28/96] vfs_lstat+0x1c/0x60
kernel: [sys_lstat64+24/64] sys_lstat64+0x18/0x40
kernel: [syscall_call+7/11] syscall_call+0x7/0xb
kernel: SysRq : HELP : loglevel0-8 reBoot tErm Full kIll saK showMem
Nice powerOff showPc unRaw Sync showTasks Unmount
kernel: SysRq : Emergency Sync
kernel: Emergency Sync complete
kernel: SysRq : Emergency Sync


/Andreas Sundstrom
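
In the SysRq showTasks dump above, the second column of each task header is the task state. The only task shown in state D (uninterruptible sleep) is updatedb, which is waiting in __down underneath shrink_icache_memory. The following is a minimal sketch for picking such tasks out of a saved log. The regular expression and the reading of the columns are assumptions rather than a documented format: the fifth column is taken to be the PID, inferred from the parent/child chain visible in the log (cron 1035, sh 1036 with parent 1035, run-parts 1039 with parent 1036).

```python
import re

# Matches a showTasks header line such as
#   "kernel: updatedb D C053EAA0 0 1104 1103"
# Assumed columns: command, state, saved PC, free stack, pid, relatives.
TASK_RE = re.compile(
    r"^kernel:\s+(\S+)\s+([RSDTZ])\s+[0-9A-Fa-f]{8}\s+\d+\s+(\d+)\b"
)


def blocked_tasks(log_lines):
    """Return (command, pid) for every task header listed in state D."""
    found = []
    for line in log_lines:
        match = TASK_RE.match(line)
        if match and match.group(2) == "D":
            found.append((match.group(1), int(match.group(3))))
    return found


if __name__ == "__main__":
    sample = [
        "kernel: cron S C053EAA0 0 1035 3487 1036",
        "kernel: Call Trace:",
        "kernel: updatedb D C053EAA0 0 1104 1103",
    ]
    for comm, pid in blocked_tasks(sample):
        print(f"{comm} (pid {pid}) is in uninterruptible sleep")
```

Stack-dump and call-trace lines do not match the pattern, so only task headers are considered.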

2005-09-06 18:13:30

by J. Bruce Fields

Subject: Re: nfs4 client bug

On Mon, Sep 05, 2005 at 08:40:53PM -0700, Bret Towe wrote:
> Pid: 14169, comm: xmms Tainted: G M 2.6.13

Hm, can someone explain what that means? A proprietary module was
loaded then unloaded, maybe?

You may also want to retest with

http://www.citi.umich.edu/projects/nfsv4/linux/kernel-patches/2.6.13-1/linux-2.6.13-001-NFS_ALL_MODIFIED.dif

applied, to make sure there isn't a patch in Trond's series that already
fixes the bug.

--b.

2005-09-06 18:21:14

by Randy Dunlap

Subject: Re: nfs4 client bug

On Tue, 6 Sep 2005, J. Bruce Fields wrote:

> On Mon, Sep 05, 2005 at 08:40:53PM -0700, Bret Towe wrote:
> > Pid: 14169, comm: xmms Tainted: G M 2.6.13
>
> Hm, can someone explain what that means? A proprietary module was
> loaded then unloaded, maybe?

'M' means Machine Check, which sets the Tainted flag.
So the processor thought that there was some kind of problem.

(/we needs to update Documentation/oops-tracing.txt)

> You may also want to retest with
>
> http://www.citi.umich.edu/projects/nfsv4/linux/kernel-patches/2.6.13-1/linux-2.6.13-001-NFS_ALL_MODIFIED.dif
>
> applied, to make sure there isn't a patch in Trond's series that already
> fixes the bug.

--
~Randy

2005-09-06 18:30:13

by J. Bruce Fields

Subject: Re: nfs4 client bug

On Tue, Sep 06, 2005 at 11:21:09AM -0700, Randy.Dunlap wrote:
> On Tue, 6 Sep 2005, J. Bruce Fields wrote:
>
> > On Mon, Sep 05, 2005 at 08:40:53PM -0700, Bret Towe wrote:
> > > Pid: 14169, comm: xmms Tainted: G M 2.6.13
> >
> > Hm, can someone explain what that means? A proprietary module was
> > loaded then unloaded, maybe?
>
> 'M' means Machine Check, which sets the Tainted flag.
> So the processor thought that there was some kind of problem.

Does this NMI watchdog event ("NMI Watchdog detected LOCKUP on CPU0CPU
0") set that flag?

> (/we needs to update Documentation/oops-tracing.txt)

Oops, thanks, I overlooked that!

--b.

2005-09-06 18:40:10

by Randy Dunlap

Subject: Re: nfs4 client bug

On Tue, 6 Sep 2005, J. Bruce Fields wrote:

> On Tue, Sep 06, 2005 at 11:21:09AM -0700, Randy.Dunlap wrote:
> > On Tue, 6 Sep 2005, J. Bruce Fields wrote:
> >
> > > On Mon, Sep 05, 2005 at 08:40:53PM -0700, Bret Towe wrote:
> > > > Pid: 14169, comm: xmms Tainted: G M 2.6.13
> > >
> > > Hm, can someone explain what that means? A proprietary module was
> > > loaded then unloaded, maybe?
> >
> > 'M' means Machine Check, which sets the Tainted flag.
> > So the processor thought that there was some kind of problem.
>
> Does this NMI watchdog event ("NMI Watchdog detected LOCKUP on CPU0CPU
> 0") set that flag?

Not that I can see.

There should be a logged MCE (machine check exception) in there
somewhere AFAICT.

> > (/we needs to update Documentation/oops-tracing.txt)
>
> Oops, thanks, I overlooked that!
>
> --b.

--
~Randy

2005-09-06 18:53:26

by Valdis Klētnieks

Subject: Re: nfs4 client bug

On Tue, 06 Sep 2005 14:30:08 EDT, "J. Bruce Fields" said:
> On Tue, Sep 06, 2005 at 11:21:09AM -0700, Randy.Dunlap wrote:
> > On Tue, 6 Sep 2005, J. Bruce Fields wrote:
> >
> > > On Mon, Sep 05, 2005 at 08:40:53PM -0700, Bret Towe wrote:
> > > > Pid: 14169, comm: xmms Tainted: G M 2.6.13
> > >
> > > Hm, can someone explain what that means? A proprietary module was
> > > loaded then unloaded, maybe?
> >
> > 'M' means Machine Check, which sets the Tainted flag.
> > So the processor thought that there was some kind of problem.
>
> Does this NMI watchdog event ("NMI Watchdog detected LOCKUP on CPU0CPU
> 0") set that flag?

Not directly - but if the MCE wedged a processor, that could cause the NMI
to fire complaining about a lockup. You should have a MCE logged someplace.



2005-09-09 04:57:12

by Randy Dunlap

Subject: [PATCH] Doc: update oops-tracing.txt (Tainted flags)


From: Randy Dunlap <[email protected]>

Update Documentation/oops-tracing.txt:
- add descriptions of 3 more "Tainted" flags;
- fix some typos;

Signed-off-by: Randy Dunlap <[email protected]>
---

Documentation/oops-tracing.txt | 25 +++++++++++++++++--------
1 files changed, 17 insertions(+), 8 deletions(-)

diff -Naurp linux-2613-work/Documentation/oops-tracing.txt~doc_taint_update linux-2613-work/Documentation/oops-tracing.txt
--- linux-2613-work/Documentation/oops-tracing.txt~doc_taint_update 2005-08-28 16:41:01.000000000 -0700
+++ linux-2613-work/Documentation/oops-tracing.txt 2005-09-08 21:43:02.000000000 -0700
@@ -205,8 +205,8 @@ Phone: 701-234-7556
Tainted kernels:

Some oops reports contain the string 'Tainted: ' after the program
-counter, this indicates that the kernel has been tainted by some
-mechanism. The string is followed by a series of position sensitive
+counter. This indicates that the kernel has been tainted by some
+mechanism. The string is followed by a series of position-sensitive
characters, each representing a particular tainted value.

1: 'G' if all modules loaded have a GPL or compatible license, 'P' if
@@ -214,16 +214,25 @@ characters, each representing a particul
MODULE_LICENSE or with a MODULE_LICENSE that is not recognised by
insmod as GPL compatible are assumed to be proprietary.

- 2: 'F' if any module was force loaded by insmod -f, ' ' if all
+ 2: 'F' if any module was force loaded by "insmod -f", ' ' if all
modules were loaded normally.

3: 'S' if the oops occurred on an SMP kernel running on hardware that
- hasn't been certified as safe to run multiprocessor.
- Currently this occurs only on various Athlons that are not
- SMP capable.
+ hasn't been certified as safe to run multiprocessor.
+ Currently this occurs only on various Athlons that are not
+ SMP capable.
+
+ 4: 'R' if a module was force unloaded by "rmmod -f", ' ' if all
+ modules were unloaded normally.
+
+ 5: 'M' if any processor has reported a Machine Check Exception,
+ ' ' if no Machine Check Exceptions have occurred.
+
+ 6: 'B' if a page-release function has found a bad page reference or
+ some unexpected page flags.

The primary reason for the 'Tainted: ' string is to tell kernel
debuggers if this is a clean kernel or if anything unusual has
-occurred. Tainting is permanent, even if an offending module is
-unloading the tainted value remains to indicate that the kernel is not
+occurred. Tainting is permanent: even if an offending module is
+unloaded, the tainted value remains to indicate that the kernel is not
trustworthy.


---
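
As a reading aid for the flag list in the patch above, here is a minimal sketch that turns a "Tainted:" field such as the "G M" seen earlier in this thread into descriptions. The function and table names are hypothetical. It decodes by letter rather than by column, which is an assumption made because mail archives tend to collapse the padding spaces that make the field position-sensitive; it works only because each letter in the list is unique.

```python
# Flag letters as described in the patch to Documentation/oops-tracing.txt.
TAINT_FLAGS = {
    "G": "all loaded modules have a GPL or compatible license",
    "P": "a proprietary module has been loaded",
    "F": "a module was force loaded by insmod -f",
    "S": "SMP kernel running on hardware not certified as SMP safe",
    "R": "a module was force unloaded by rmmod -f",
    "M": "a processor has reported a Machine Check Exception",
    "B": "a page-release function found a bad page reference or flags",
}


def decode_tainted(field):
    """Return a description for each flag letter in a 'Tainted:' field.

    Whitespace is ignored, so both the padded form and a form with
    collapsed spaces decode the same way.
    """
    unknown = [c for c in field if not c.isspace() and c not in TAINT_FLAGS]
    if unknown:
        raise ValueError(f"unrecognised taint flags: {unknown}")
    return [TAINT_FLAGS[c] for c in field if c in TAINT_FLAGS]


if __name__ == "__main__":
    for description in decode_tainted("G M"):
        print(description)
```

For the oops quoted in this thread, the result is the GPL-compatible entry followed by the Machine Check Exception entry, which matches Randy's explanation of the 'M' flag.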

2005-09-09 06:55:28

by Bret Towe

Subject: Re: nfs4 client bug

On 9/6/05, J. Bruce Fields <[email protected]> wrote:
> On Mon, Sep 05, 2005 at 08:40:53PM -0700, Bret Towe wrote:
> > Pid: 14169, comm: xmms Tainted: G M 2.6.13
>
> Hm, can someone explain what that means? A proprietary module was
> loaded then unloaded, maybe?
>
> You may also want to retest with
>
> http://www.citi.umich.edu/projects/nfsv4/linux/kernel-patches/2.6.13-1/linux-2.6.13-001-NFS_ALL_MODIFIED.dif
>
> applied, to make sure there isn't a patch in Trond's series that already
> fixes the bug.
>
> --b.
>

I've been running this since I got the URL, and so far I haven't hit it.
I've also been a bit busy, so I haven't been able to make sure it's good.
This weekend I should be able to test it and make sure it's solved.

2005-09-12 18:27:45

by Bret Towe

Subject: Re: nfs4 client bug

On 9/8/05, Bret Towe <[email protected]> wrote:
> On 9/6/05, J. Bruce Fields <[email protected]> wrote:
> > On Mon, Sep 05, 2005 at 08:40:53PM -0700, Bret Towe wrote:
> > > Pid: 14169, comm: xmms Tainted: G M 2.6.13
> >
> > Hm, can someone explain what that means? A proprietary module was
> > loaded then unloaded, maybe?
> >
> > You may also want to retest with
> >
> > http://www.citi.umich.edu/projects/nfsv4/linux/kernel-patches/2.6.13-1/linux-2.6.13-001-NFS_ALL_MODIFIED.dif
> >
> > applied, to make sure there isn't a patch in Trond's series that already
> > fixes the bug.
> >
> > --b.
> >
>
> I've been running this since I got the URL, and so far I haven't hit it.
> I've also been a bit busy, so I haven't been able to make sure it's good.
> This weekend I should be able to test it and make sure it's solved.
>
I ran it pretty hard over the weekend and had no crashes at all,
so I think it's safe to say this patch fixes the issues I was seeing.

2005-09-12 18:29:13

by J. Bruce Fields

Subject: Re: nfs4 client bug

On Mon, Sep 12, 2005 at 11:27:37AM -0700, Bret Towe wrote:
> I ran it pretty hard over the weekend and had no crashes at all,
> so I think it's safe to say this patch fixes the issues I was seeing.

Good, thanks for the testing.

--b.