2003-07-19 20:08:40

by Alex Riesen

[permalink] [raw]
Subject: 2.6.0-test1+bk: NFS client-side freezes

Hi, all

I experienced BK freezing in D state while updating a repository on an
NFS volume.
I could see something being constantly transferred from the computer,
around 1Mb/s, by gkrellm judgement.
Also, BK reported corruptions in directories: it couldn't find some files.

I do not remember seeing them freezes before, but I have to admit I
never tried really hard (like running bk over nfs). There were certainly
none of them in 2.4, as I did run bk there over nfs.

The BK process unstuck after I pressed Alt-SysRq-T. That's what I got:

bk D CD245BD8 1346 1344 1345 (NOTLB)
cd245b74 00200086 d9fa7068 cd245bd8 00000000 00000000 00200202 cd245bd8
00200246 cd244000 cd245bd8 cd245c50 cd245bb8 dc8d0998 cd245bd8 00000064
00000004 cd244000 00000000 cd2478e0 c011a470 cd245c50 cd245c50 dbac34a0
Call Trace:
[_end+475503432/1070231984] __rpc_execute+0x118/0x340 [sunrpc]
[<dc8d0998>] __rpc_execute+0x118/0x340 [sunrpc]
[default_wake_function+0/48] default_wake_function+0x0/0x30
[<c011a470>] default_wake_function+0x0/0x30
[_end+475486030/1070231984] rpc_call_sync+0x8e/0xd0 [sunrpc]
[<dc8cc59e>] rpc_call_sync+0x8e/0xd0 [sunrpc]
[_end+475486768/1070231984] call_reserveresult+0x0/0xf0 [sunrpc]
[<dc8cc880>] call_reserveresult+0x0/0xf0 [sunrpc]
[_end+475500272/1070231984] rpc_run_timer+0x0/0x80 [sunrpc]
[<dc8cfd40>] rpc_run_timer+0x0/0x80 [sunrpc]
[_end+475662022/1070231984] nfs3_rpc_wrapper+0x46/0x90 [nfs]
[<dc8f7516>] nfs3_rpc_wrapper+0x46/0x90 [nfs]
[_end+475662575/1070231984] nfs3_proc_getattr+0x5f/0xa0 [nfs]
[<dc8f773f>] nfs3_proc_getattr+0x5f/0xa0 [nfs]
[_end+475632436/1070231984] __nfs_revalidate_inode+0x154/0x320 [nfs]
[<dc8f0184>] __nfs_revalidate_inode+0x154/0x320 [nfs]
[_end+475656538/1070231984] nfs_updatepage+0x27a/0x300 [nfs]
[<dc8f5faa>] nfs_updatepage+0x27a/0x300 [nfs]
[dput+48/544] dput+0x30/0x220
[<c0169810>] dput+0x30/0x220
[_end+475614753/1070231984] nfs_lookup_revalidate+0x231/0x650 [nfs]
[<dc8ebc71>] nfs_lookup_revalidate+0x231/0x650 [nfs]
[_end+475615139/1070231984] nfs_lookup_revalidate+0x3b3/0x650 [nfs]
[<dc8ebdf3>] nfs_lookup_revalidate+0x3b3/0x650 [nfs]
[update_wall_time+22/64] update_wall_time+0x16/0x40
[<c0125746>] update_wall_time+0x16/0x40
[do_timer+224/240] do_timer+0xe0/0xf0
[<c0125bb0>] do_timer+0xe0/0xf0
[_end+475510727/1070231984] rpcauth_lookup_credcache+0x217/0x260 [sunrpc]
[<dc8d2617>] rpcauth_lookup_credcache+0x217/0x260 [sunrpc]
[_end+475510911/1070231984] rpcauth_lookupcred+0x6f/0xa0 [sunrpc]
[<dc8d26cf>] rpcauth_lookupcred+0x6f/0xa0 [sunrpc]
[_end+475510727/1070231984] rpcauth_lookup_credcache+0x217/0x260 [sunrpc]
[<dc8d2617>] rpcauth_lookup_credcache+0x217/0x260 [sunrpc]
[_end+475510911/1070231984] rpcauth_lookupcred+0x6f/0xa0 [sunrpc]
[<dc8d26cf>] rpcauth_lookupcred+0x6f/0xa0 [sunrpc]
[_end+475511279/1070231984] put_rpccred+0x1f/0xe0 [sunrpc]
[<dc8d283f>] put_rpccred+0x1f/0xe0 [sunrpc]
[cached_lookup+79/128] cached_lookup+0x4f/0x80
[<c01601ef>] cached_lookup+0x4f/0x80
[__lookup_hash+101/208] __lookup_hash+0x65/0xd0
[<c0161235>] __lookup_hash+0x65/0xd0
[open_namei+307/1072] open_namei+0x133/0x430
[<c0161923>] open_namei+0x133/0x430
[filp_open+65/112] filp_open+0x41/0x70
[<c0151a41>] filp_open+0x41/0x70
[sys_open+85/144] sys_open+0x55/0x90
[<c0151f15>] sys_open+0x55/0x90
[syscall_call+7/11] syscall_call+0x7/0xb
[<c01094fb>] syscall_call+0x7/0xb

NFS: readdir reply truncated!

There was also frozen mutt, as I found later:

mutt D D8931B70 949 948 (NOTLB)
d8931b0c 00000082 d9fa7068 d8931b70 00000000 00000000 00000202 d8931b70
00000246 d8930000 d8931b70 d8931be8 d8931b50 dc8d0998 d8931b70 00000064
00000004 d8930000 00000000 db16e100 c011a470 d8931be8 d8931be8 dbac34a0
Call Trace:
[_end+475503432/1070231984] __rpc_execute+0x118/0x340 [sunrpc]
[<dc8d0998>] __rpc_execute+0x118/0x340 [sunrpc]
[default_wake_function+0/48] default_wake_function+0x0/0x30
[<c011a470>] default_wake_function+0x0/0x30
[_end+475486030/1070231984] rpc_call_sync+0x8e/0xd0 [sunrpc]
[<dc8cc59e>] rpc_call_sync+0x8e/0xd0 [sunrpc]
[_end+475486768/1070231984] call_reserveresult+0x0/0xf0 [sunrpc]
[<dc8cc880>] call_reserveresult+0x0/0xf0 [sunrpc]
[_end+475500272/1070231984] rpc_run_timer+0x0/0x80 [sunrpc]
[<dc8cfd40>] rpc_run_timer+0x0/0x80 [sunrpc]
[_end+475662022/1070231984] nfs3_rpc_wrapper+0x46/0x90 [nfs]
[<dc8f7516>] nfs3_rpc_wrapper+0x46/0x90 [nfs]
[_end+475662575/1070231984] nfs3_proc_getattr+0x5f/0xa0 [nfs]
[<dc8f773f>] nfs3_proc_getattr+0x5f/0xa0 [nfs]
[_end+475632436/1070231984] __nfs_revalidate_inode+0x154/0x320 [nfs]
[<dc8f0184>] __nfs_revalidate_inode+0x154/0x320 [nfs]
[do_anonymous_page+280/512] do_anonymous_page+0x118/0x200
[<c0144be8>] do_anonymous_page+0x118/0x200
[handle_mm_fault+226/384] handle_mm_fault+0xe2/0x180
[<c0145192>] handle_mm_fault+0xe2/0x180
[apic_timer_interrupt+26/32] apic_timer_interrupt+0x1a/0x20
[<c0109e8a>] apic_timer_interrupt+0x1a/0x20
[csum_partial_copy_generic+154/252] csum_partial_copy_generic+0x9a/0xfc
[<c01b008e>] csum_partial_copy_generic+0x9a/0xfc
[skb_copy_and_csum_bits+102/672] skb_copy_and_csum_bits+0x66/0x2a0
[<c0221276>] skb_copy_and_csum_bits+0x66/0x2a0
[dput+48/544] dput+0x30/0x220
[<c0169810>] dput+0x30/0x220
[_end+475615076/1070231984] nfs_lookup_revalidate+0x374/0x650 [nfs]
[<dc8ebdb4>] nfs_lookup_revalidate+0x374/0x650 [nfs]
[default_wake_function+42/48] default_wake_function+0x2a/0x30
[<c011a49a>] default_wake_function+0x2a/0x30
[__wake_up_common+58/96] __wake_up_common+0x3a/0x60
[<c011a4da>] __wake_up_common+0x3a/0x60
[kill_fasync+46/80] kill_fasync+0x2e/0x50
[<c0164a1e>] kill_fasync+0x2e/0x50
[bh_lru_install+203/272] bh_lru_install+0xcb/0x110
[<c01554db>] bh_lru_install+0xcb/0x110
[_end+475510727/1070231984] rpcauth_lookup_credcache+0x217/0x260 [sunrpc]
[<dc8d2617>] rpcauth_lookup_credcache+0x217/0x260 [sunrpc]
[_end+475510911/1070231984] rpcauth_lookupcred+0x6f/0xa0 [sunrpc]
[<dc8d26cf>] rpcauth_lookupcred+0x6f/0xa0 [sunrpc]
[do_lookup+89/160] do_lookup+0x59/0xa0
[<c0160559>] do_lookup+0x59/0xa0
[link_path_walk+308/2320] link_path_walk+0x134/0x910
[<c01606d4>] link_path_walk+0x134/0x910
[kmem_cache_alloc+326/400] kmem_cache_alloc+0x146/0x190
[<c013e136>] kmem_cache_alloc+0x146/0x190
[__user_walk+64/96] __user_walk+0x40/0x60
[<c0161380>] __user_walk+0x40/0x60
[vfs_stat+30/96] vfs_stat+0x1e/0x60
[<c015c32e>] vfs_stat+0x1e/0x60
[sys_stat64+27/64] sys_stat64+0x1b/0x40
[<c015c9cb>] sys_stat64+0x1b/0x40
[sys_swapon+1536/2064] sys_swapon+0x600/0x810
[<c0150000>] sys_swapon+0x600/0x810
[syscall_call+7/11] syscall_call+0x7/0xb
[<c01094fb>] syscall_call+0x7/0xb


The system is UP P3-700, preempt is on, and compiled with gcc-3.2.2 from
Gentoo. +bk patches are from July 19, and there was also some patches
from -mm tree (Con's scheduler work). I reverted the ones in the nfs
area, trying to find the problem, but no luck.
The server is plain 2.4.20.

$ lsmod
Module Size Used by
nfs 92972 1
nfsd 93264 2 [unsafe]
exportfs 5440 1 nfsd
lockd 58768 3 nfs,nfsd,[unsafe]
sunrpc 120004 7 nfs,nfsd,lockd
rtc 10724 0
ide_cd 37984 0
cdrom 33248 1 ide_cd
mousedev 6964 1
hid 23328 0
tulip 54272 0
crc32 3840 1 tulip
uhci_hcd 29704 0
usbcore 97940 4 hid,uhci_hcd

There were also some mentions about old mount:
nfs warning: mount version older than kernel
Mine is not quite that old: mount-2.11z.

.config attached.

-alex


Attachments:
(No filename) (7.56 kB)
.config (23.69 kB)
Download all attachments

2003-07-20 22:40:14

by Trond Myklebust

[permalink] [raw]
Subject: Re: 2.6.0-test1+bk: NFS client-side freezes


Please duplicate with the standard kernel.

Cheers,
Trond

2003-07-22 06:21:10

by Alex Riesen

[permalink] [raw]
Subject: Re: 2.6.0-test1+bk: NFS client-side freezes

[[email protected] removed from cc, useless without subscription]

Trond Myklebust, Mon, Jul 21, 2003 00:54:54 +0200:
>
> Please duplicate with the standard kernel.
>

Well, seem the assorted collection of -mm and -ac patches I did was the
problem. Please do not ask me, what patches these were :-/

I could not reproduce the freeze with plain 2.6.0-test1 whatever I
tried. Also interactive behaviour feels better, so I stick to it for a
while.

-alex