2011-12-05 16:50:23

by Frank van Maarseveen

Subject: 3.1.4: NFSv3 RPC scheduling issue?

After upgrading 50+ NFSv3 (over UDP) client machines from 3.0.x to
3.1.4, I occasionally noticed a machine with lots of processes hanging
in __rpc_execute() for a specific mount point, making no progress at all.
Stack:

[<c17fe7e0>] schedule+0x30/0x50
[<c177e259>] rpc_wait_bit_killable+0x19/0x30
[<c17feeb5>] __wait_on_bit+0x45/0x70
[<c177e240>] ? rpc_release_task+0x110/0x110
[<c17fef3d>] out_of_line_wait_on_bit+0x5d/0x70
[<c177e240>] ? rpc_release_task+0x110/0x110
[<c108aed0>] ? autoremove_wake_function+0x40/0x40
[<c177e89b>] __rpc_execute+0xdb/0x1a0
...

Every reference to the specific mount point on the client machine hangs
and the server does not receive any related network traffic. The server
works fine for other identical client machines with the same export mounted.
Other mounts on the (now) broken client still work. Killing the hanging
client processes repairs the situation.

This has happened a couple of times on client machines under heavy (NFS)
load. The mount point was originally mounted by the automounter.

--
Frank


2011-12-06 08:11:17

by Frank van Maarseveen

Subject: Re: 3.1.4: NFSv3 RPC scheduling issue?

On Mon, Dec 05, 2011 at 06:39:36PM -0500, Trond Myklebust wrote:
> On Mon, 2011-12-05 at 17:50 +0100, Frank van Maarseveen wrote:
> > After upgrading 50+ NFSv3 (over UDP) client machines from 3.0.x to
> > 3.1.4 I occasionally noticed a machine with lots of processes hanging
> > in __rpc_execute() for a specific mount point with no progress at all.
> > Stack:
> >
> > [<c17fe7e0>] schedule+0x30/0x50
> > [<c177e259>] rpc_wait_bit_killable+0x19/0x30
> > [<c17feeb5>] __wait_on_bit+0x45/0x70
> > [<c177e240>] ? rpc_release_task+0x110/0x110
> > [<c17fef3d>] out_of_line_wait_on_bit+0x5d/0x70
> > [<c177e240>] ? rpc_release_task+0x110/0x110
> > [<c108aed0>] ? autoremove_wake_function+0x40/0x40
> > [<c177e89b>] __rpc_execute+0xdb/0x1a0
> > ...
> >
> > Every reference to the specific mount point on the client machine hangs
> > and the server does not receive any related network traffic. The server
> > works fine for other identical client machines with the same export mounted.
> > Other mounts on the (now) broken client still work. Killing the hanging
> > client processes repairs the situation.
> >
> > This has happened a couple of times on client machines with heavy (NFS)
> > load. The mount-point has originally been mounted by the automounter.
>
> A command of 'echo 0 > /proc/sys/sunrpc/rpc_debug' should display a

36477 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:none
36479 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
36484 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
36485 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
36486 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36487 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
36488 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36489 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
36490 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36491 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36492 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36493 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36494 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36495 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36496 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 GETATTR a:call_reserveresult q:xprt_sending
36497 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36498 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
36499 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36500 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36501 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36502 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36503 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
36504 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36505 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36506 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36507 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36508 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36509 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36510 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36511 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36512 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36513 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36514 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36515 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36516 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36517 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36518 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36519 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36523 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36560 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36561 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36562 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36563 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36564 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36565 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36566 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36576 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 GETATTR a:call_reserveresult q:xprt_sending
36577 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36578 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36579 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36580 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36581 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36582 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36583 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
36592 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 GETATTR a:call_reserveresult q:xprt_sending
36618 0001 -11 ffff88008dc9db60 (null) 0 ffffffff8193ba60 nfsv3 WRITE a:call_reserveresult q:xprt_sending
21609 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending

--
Frank

2011-12-11 12:40:11

by Frank van Maarseveen

Subject: Re: 3.1.4: NFSv3 RPC scheduling issue?

On Fri, Dec 09, 2011 at 10:10:01PM -0500, Trond Myklebust wrote:
> [...]
> I'm still mystified as to what is going on here...
>
> Would it be possible to upgrade some of your clients to 3.1.5 (which
> contains a fix for a sunrpc socket buffer problem) and then to add the
> following patch?

Did so; the mount locked up and is still stuck, ready for some more
experimentation. I don't see any difference, however. I did an
'echo 0 >/proc/sys/sunrpc/rpc_debug' afterwards (see below).

A recipe which seems to trigger the issue (at least occasionally) is:

cd /mount-point
ssh server echo 3 \>/proc/sys/vm/drop_caches
echo 3 >/proc/sys/vm/drop_caches
for i in `seq 100`
do
du >/dev/null 2>&1 &
done

I'll try it on a pristine kernel to rule out my local kernel patches
(unlikely to be the cause or trigger, but just to be sure).

-pid- flgs status -client- --rqstp- -timeout ---ops--
29785 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:none
29902 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29903 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29907 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29920 0080 -11 f519b6b0 f2606980 0 c191c4ac nfsv3 GETATTR a:call_status q:xprt_sending
29921 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29922 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29923 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29924 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29925 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29926 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29927 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29928 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29929 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29930 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29931 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29932 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29933 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29934 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29935 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29936 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29937 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29938 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29939 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29940 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29941 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29942 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29943 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29944 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29945 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29946 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29947 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29948 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29949 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29950 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29951 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29952 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29953 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29954 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29955 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29956 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29957 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29958 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29959 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29960 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29961 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29962 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29963 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29964 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29965 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29966 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29967 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29968 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29969 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29970 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29971 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29972 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29973 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29974 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29975 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29976 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29977 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29978 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29979 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29980 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29981 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29982 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29983 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29984 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29985 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29986 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29987 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29988 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29989 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29990 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29991 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29992 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29993 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29994 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29995 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29996 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29997 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29998 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
29999 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30000 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30001 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30002 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30003 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30004 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30005 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30006 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30007 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30008 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30009 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30010 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30011 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30012 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30013 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30014 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30015 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
30025 0080 -11 f519b6b0 (null) 0 c191c4ac nfsv3 ACCESS a:call_reserveresult q:xprt_sending

--
Frank

2011-12-05 23:39:37

by Myklebust, Trond

Subject: Re: 3.1.4: NFSv3 RPC scheduling issue?

On Mon, 2011-12-05 at 17:50 +0100, Frank van Maarseveen wrote:
> After upgrading 50+ NFSv3 (over UDP) client machines from 3.0.x to
> 3.1.4 I occasionally noticed a machine with lots of processes hanging
> in __rpc_execute() for a specific mount point with no progress at all.
> Stack:
>
> [<c17fe7e0>] schedule+0x30/0x50
> [<c177e259>] rpc_wait_bit_killable+0x19/0x30
> [<c17feeb5>] __wait_on_bit+0x45/0x70
> [<c177e240>] ? rpc_release_task+0x110/0x110
> [<c17fef3d>] out_of_line_wait_on_bit+0x5d/0x70
> [<c177e240>] ? rpc_release_task+0x110/0x110
> [<c108aed0>] ? autoremove_wake_function+0x40/0x40
> [<c177e89b>] __rpc_execute+0xdb/0x1a0
> ...
>
> Every reference to the specific mount point on the client machine hangs
> and the server does not receive any related network traffic. The server
> works fine for other identical client machines with the same export mounted.
> Other mounts on the (now) broken client still work. Killing the hanging
> client processes repairs the situation.
>
> This has happened a couple of times on client machines with heavy (NFS)
> load. The mount-point has originally been mounted by the automounter.

A command of 'echo 0 > /proc/sys/sunrpc/rpc_debug' should display a
list of pending rpc_tasks as well as information on where they are
sleeping.
Can you please try this on one of the hanging clients and post the
resulting dump?
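
For instance (a minimal sketch; the task list is printed to the kernel
log, so reading it back with dmesg is an assumption about your setup):

echo 0 > /proc/sys/sunrpc/rpc_debug   # dump all pending rpc_tasks
dmesg | tail -n 200                   # the dump lands in the kernel log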

Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com


2011-12-07 13:44:03

by Frank van Maarseveen

Subject: Re: 3.1.4: NFSv3 RPC scheduling issue?

On Tue, Dec 06, 2011 at 02:57:43PM -0500, Trond Myklebust wrote:
> On Tue, 2011-12-06 at 09:11 +0100, Frank van Maarseveen wrote:
> > On Mon, Dec 05, 2011 at 06:39:36PM -0500, Trond Myklebust wrote:
> > > On Mon, 2011-12-05 at 17:50 +0100, Frank van Maarseveen wrote:
> > > > After upgrading 50+ NFSv3 (over UDP) client machines from 3.0.x to
> > > > 3.1.4 I occasionally noticed a machine with lots of processes hanging
> > > > in __rpc_execute() for a specific mount point with no progress at all.
> > > > Stack:
> > > >
> > > > [<c17fe7e0>] schedule+0x30/0x50
> > > > [<c177e259>] rpc_wait_bit_killable+0x19/0x30
> > > > [<c17feeb5>] __wait_on_bit+0x45/0x70
> > > > [<c177e240>] ? rpc_release_task+0x110/0x110
> > > > [<c17fef3d>] out_of_line_wait_on_bit+0x5d/0x70
> > > > [<c177e240>] ? rpc_release_task+0x110/0x110
> > > > [<c108aed0>] ? autoremove_wake_function+0x40/0x40
> > > > [<c177e89b>] __rpc_execute+0xdb/0x1a0
> > > > ...
> > > >
> > > > Every reference to the specific mount point on the client machine hangs
> > > > and the server does not receive any related network traffic. The server
> > > > works fine for other identical client machines with the same export mounted.
> > > > Other mounts on the (now) broken client still work. Killing the hanging
> > > > client processes repairs the situation.
> > > >
> > > > This has happened a couple of times on client machines with heavy (NFS)
> > > > load. The mount-point has originally been mounted by the automounter.
> > >
> > > A command of 'echo 0 > /proc/sys/sunrpc/rpc_debug' should display a
> >
> > 36477 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:none
> > 36479 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
> > 36484 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
> > 36485 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
> > 36486 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36487 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
> > 36488 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36489 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
> > 36490 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36491 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36492 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36493 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36494 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36495 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36496 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 GETATTR a:call_reserveresult q:xprt_sending
> > 36497 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36498 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
> > 36499 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36500 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36501 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36502 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36503 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
> > 36504 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36505 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36506 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36507 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36508 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36509 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36510 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36511 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36512 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36513 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36514 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36515 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36516 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36517 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36518 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36519 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36523 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36560 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36561 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36562 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36563 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36564 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36565 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36566 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36576 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 GETATTR a:call_reserveresult q:xprt_sending
> > 36577 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36578 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36579 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36580 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36581 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36582 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36583 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> > 36592 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 GETATTR a:call_reserveresult q:xprt_sending
> > 36618 0001 -11 ffff88008dc9db60 (null) 0 ffffffff8193ba60 nfsv3 WRITE a:call_reserveresult q:xprt_sending
> > 21609 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
>
> Hmm... Is this the full dump from that client? I have a theory about
> what is going on with the second dump that you showed, but I really do
> not understand this one... If the above trace is complete, then it would
> indicate that the value of xprt->snd_task has been corrupted somehow.

I did three more 'echo 0 > /proc/sys/sunrpc/rpc_debug' dumps before
repairing the mount, and the outputs differ slightly (file '1' contains
the above dump; 2, 3 and 4 are the others):

diff 1 2:
| 58d57
| < 21609 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
diff 1 3:
| 47d46
| < 36566 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
| 58c57
| < 21609 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
| ---
| > 33046 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
diff 1 4:
| 16d15
| < 36496 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 GETATTR a:call_reserveresult q:xprt_sending
| 47d45
| < 36566 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
| 58c56,57
| < 21609 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
| ---
| > 33046 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
| > 34798 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending

--
Frank

2011-12-10 03:10:23

by Myklebust, Trond

Subject: Re: 3.1.4: NFSv3 RPC scheduling issue?

On Wed, 2011-12-07 at 14:43 +0100, Frank van Maarseveen wrote:
> I did three more 'echo 0 > /proc/sys/sunrpc/rpc_debug' before repairing
> the mount and outputs differ slightly (file '1' contains above dump, 2,
> 3 and 4 are the others):
>
> diff 1 2:
> | 58d57
> | < 21609 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> diff 1 3:
> | 47d46
> | < 36566 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> | 58c57
> | < 21609 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> | ---
> | > 33046 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> diff 1 4:
> | 16d15
> | < 36496 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 GETATTR a:call_reserveresult q:xprt_sending
> | 47d45
> | < 36566 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> | 58c56,57
> | < 21609 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> | ---
> | > 33046 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> | > 34798 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending

I'm still mystified as to what is going on here...

Would it be possible to upgrade some of your clients to 3.1.5 (which
contains a fix for a sunrpc socket buffer problem) and then to add the
following patch?
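
(It should apply from the top of the source tree with plain 'patch -p1';
the directory and file names below are just placeholders:)

cd linux-3.1.5
patch -p1 < ../rpc-xprt-debug.patch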

Cheers
Trond

8<-------------------------------------------------------------------------------
From 72a29422b5d4e00c5aae6466b415cb5b075bb032 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Fri, 9 Dec 2011 13:40:57 -0500
Subject: [PATCH 1/2] debugging

- Ensure that we clean up in case the call to queue xprt->task_cleanup
fails (although it is hard to see how it could).
- Ensure that we don't hold onto a UDP congestion slot if we're queued up
waiting for the socket lock.
- Add debugging printks to check for the above two conditions.

- Add further debugging printks to track slot allocation
---
net/sunrpc/xprt.c | 56 +++++++++++++++++++++++++++++++++++++++-------------
1 files changed, 42 insertions(+), 14 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index c64c0ef..284eb42 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -44,6 +44,8 @@
 #include <linux/workqueue.h>
 #include <linux/net.h>
 #include <linux/ktime.h>
+#include <linux/ratelimit.h>
+#include <linux/printk.h>
 
 #include <linux/sunrpc/clnt.h>
 #include <linux/sunrpc/metrics.h>
@@ -66,6 +68,7 @@ static void xprt_init(struct rpc_xprt *xprt, struct net *net);
 static void xprt_request_init(struct rpc_task *, struct rpc_xprt *);
 static void xprt_connect_status(struct rpc_task *task);
 static int __xprt_get_cong(struct rpc_xprt *, struct rpc_task *);
+static void __xprt_put_cong(struct rpc_xprt *, struct rpc_rqst *);
 
 static DEFINE_SPINLOCK(xprt_list_lock);
 static LIST_HEAD(xprt_list);
@@ -227,15 +230,28 @@ out_sleep:
 }
 EXPORT_SYMBOL_GPL(xprt_reserve_xprt);
 
+static void __xprt_clear_locked(struct rpc_xprt *xprt)
+{
+        smp_mb__before_clear_bit();
+        clear_bit(XPRT_LOCKED, &xprt->state);
+        smp_mb__after_clear_bit();
+}
+
+static void xprt_run_task_cleanup(struct rpc_xprt *xprt)
+{
+        if (queue_work(rpciod_workqueue, &xprt->task_cleanup) == 0) {
+                __xprt_clear_locked(xprt);
+                pr_err("%s: RPC task cleanup failed to start!\n", __func__);
+        }
+}
+
 static void xprt_clear_locked(struct rpc_xprt *xprt)
 {
         xprt->snd_task = NULL;
-        if (!test_bit(XPRT_CLOSE_WAIT, &xprt->state) || xprt->shutdown) {
-                smp_mb__before_clear_bit();
-                clear_bit(XPRT_LOCKED, &xprt->state);
-                smp_mb__after_clear_bit();
-        } else
-                queue_work(rpciod_workqueue, &xprt->task_cleanup);
+        if (test_bit(XPRT_CLOSE_WAIT, &xprt->state) && !xprt->shutdown)
+                xprt_run_task_cleanup(xprt);
+        else
+                __xprt_clear_locked(xprt);
 }
 
 /*
@@ -271,12 +287,18 @@ out_sleep:
         dprintk("RPC: %5u failed to lock transport %p\n", task->tk_pid, xprt);
         task->tk_timeout = 0;
         task->tk_status = -EAGAIN;
-        if (req == NULL)
-                priority = RPC_PRIORITY_LOW;
-        else if (!req->rq_ntrans)
+        priority = RPC_PRIORITY_LOW;
+        if (req != NULL) {
                 priority = RPC_PRIORITY_NORMAL;
-        else
-                priority = RPC_PRIORITY_HIGH;
+                if (req->rq_ntrans)
+                        priority = RPC_PRIORITY_HIGH;
+                if (req->rq_cong) {
+                        __xprt_put_cong(xprt, req);
+                        pr_err("%s: don't wait for a transport write lock "
+                               "while holding a congestion slot.\n",
+                               __func__);
+                }
+        }
         rpc_sleep_on_priority(&xprt->sending, task, NULL, priority);
         return 0;
 }
@@ -636,7 +658,7 @@ void xprt_force_disconnect(struct rpc_xprt *xprt)
         set_bit(XPRT_CLOSE_WAIT, &xprt->state);
         /* Try to schedule an autoclose RPC call */
         if (test_and_set_bit(XPRT_LOCKED, &xprt->state) == 0)
-                queue_work(rpciod_workqueue, &xprt->task_cleanup);
+                xprt_run_task_cleanup(xprt);
         xprt_wake_pending_tasks(xprt, -EAGAIN);
         spin_unlock_bh(&xprt->transport_lock);
 }
@@ -663,7 +685,7 @@ void xprt_conditional_disconnect(struct rpc_xprt *xprt, unsigned int cookie)
         set_bit(XPRT_CLOSE_WAIT, &xprt->state);
         /* Try to schedule an autoclose RPC call */
         if (test_and_set_bit(XPRT_LOCKED, &xprt->state) == 0)
-                queue_work(rpciod_workqueue, &xprt->task_cleanup);
+                xprt_run_task_cleanup(xprt);
         xprt_wake_pending_tasks(xprt, -EAGAIN);
 out:
         spin_unlock_bh(&xprt->transport_lock);
@@ -681,7 +703,7 @@ xprt_init_autodisconnect(unsigned long data)
                 goto out_abort;
         spin_unlock(&xprt->transport_lock);
         set_bit(XPRT_CONNECTION_CLOSE, &xprt->state);
-        queue_work(rpciod_workqueue, &xprt->task_cleanup);
+        xprt_run_task_cleanup(xprt);
         return;
 out_abort:
         spin_unlock(&xprt->transport_lock);
@@ -942,6 +964,8 @@ static struct rpc_rqst *xprt_dynamic_alloc_slot(struct rpc_xprt *xprt, gfp_t gfp
 {
         struct rpc_rqst *req = ERR_PTR(-EAGAIN);
 
+        if (atomic_read(&xprt->num_reqs) > xprt->max_reqs)
+                pr_err_ratelimited("%s: xprt->num_reqs is out of range!\n", __func__);
         if (!atomic_add_unless(&xprt->num_reqs, 1, xprt->max_reqs))
                 goto out;
         req = kzalloc(sizeof(struct rpc_rqst), gfp_flags);
@@ -955,6 +979,8 @@ out:
 
 static bool xprt_dynamic_free_slot(struct rpc_xprt *xprt, struct rpc_rqst *req)
 {
+        if (atomic_read(&xprt->num_reqs) < xprt->min_reqs)
+                pr_err_ratelimited("%s: xprt->num_reqs is out of range!\n", __func__);
         if (atomic_add_unless(&xprt->num_reqs, -1, xprt->min_reqs)) {
                 kfree(req);
                 return true;
@@ -1042,6 +1068,8 @@ struct rpc_xprt *xprt_alloc(struct net *net, size_t size,
         xprt->max_reqs = num_prealloc;
         xprt->min_reqs = num_prealloc;
         atomic_set(&xprt->num_reqs, num_prealloc);
+        if (num_prealloc < 1)
+                pr_err("%s: too few preallocated slots!\n", __func__);
 
         return xprt;
 
--
1.7.7.3



--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com


2011-12-06 19:57:59

by Myklebust, Trond

Subject: Re: 3.1.4: NFSv3 RPC scheduling issue?

On Tue, 2011-12-06 at 09:11 +0100, Frank van Maarseveen wrote:
> On Mon, Dec 05, 2011 at 06:39:36PM -0500, Trond Myklebust wrote:
> > On Mon, 2011-12-05 at 17:50 +0100, Frank van Maarseveen wrote:
> > > After upgrading 50+ NFSv3 (over UDP) client machines from 3.0.x to
> > > 3.1.4 I occasionally noticed a machine with lots of processes hanging
> > > in __rpc_execute() for a specific mount point with no progress at all.
> > > Stack:
> > >
> > > [<c17fe7e0>] schedule+0x30/0x50
> > > [<c177e259>] rpc_wait_bit_killable+0x19/0x30
> > > [<c17feeb5>] __wait_on_bit+0x45/0x70
> > > [<c177e240>] ? rpc_release_task+0x110/0x110
> > > [<c17fef3d>] out_of_line_wait_on_bit+0x5d/0x70
> > > [<c177e240>] ? rpc_release_task+0x110/0x110
> > > [<c108aed0>] ? autoremove_wake_function+0x40/0x40
> > > [<c177e89b>] __rpc_execute+0xdb/0x1a0
> > > ...
> > >
> > > Every reference to the specific mount point on the client machine hangs
> > > and the server does not receive any related network traffic. The server
> > > works fine for other identical client machines with the same export mounted.
> > > Other mounts on the (now) broken client still work. Killing the hanging
> > > client processes repairs the situation.
> > >
> > > This has happened a couple of times on client machines with heavy (NFS)
> > > load. The mount-point has originally been mounted by the automounter.
> >
> > A command of 'echo 0 > /proc/sys/sunrpc/rpc_debug' should display a
>
> 36477 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:none
> 36479 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
> 36484 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
> 36485 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
> 36486 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36487 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
> 36488 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36489 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
> 36490 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36491 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36492 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36493 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36494 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36495 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36496 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 GETATTR a:call_reserveresult q:xprt_sending
> 36497 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36498 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
> 36499 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36500 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36501 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36502 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36503 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
> 36504 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36505 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36506 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36507 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36508 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36509 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36510 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36511 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36512 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36513 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36514 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36515 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36516 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36517 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36518 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36519 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36523 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36560 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36561 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36562 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36563 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36564 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36565 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36566 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36576 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 GETATTR a:call_reserveresult q:xprt_sending
> 36577 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36578 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36579 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36580 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36581 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36582 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36583 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending
> 36592 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 GETATTR a:call_reserveresult q:xprt_sending
> 36618 0001 -11 ffff88008dc9db60 (null) 0 ffffffff8193ba60 nfsv3 WRITE a:call_reserveresult q:xprt_sending
> 21609 0080 -11 ffff88008dc9db60 (null) 0 ffffffff81a68860 nfsv3 ACCESS a:call_reserveresult q:xprt_sending

Hmm... Is this the full dump from that client? I have a theory about
what is going on with the second dump that you showed, but I really do
not understand this one... If the above trace is complete, then it would
indicate that the value of xprt->snd_task has been corrupted somehow.
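
To spell it out: all of those tasks are asleep on the xprt->sending queue,
which should only happen while some other task owns the transport write
lock. Roughly (a simplified sketch paraphrasing net/sunrpc/xprt.c, not a
verbatim excerpt):

static int reserve_xprt_sketch(struct rpc_xprt *xprt, struct rpc_task *task)
{
        if (test_and_set_bit(XPRT_LOCKED, &xprt->state)) {
                if (task == xprt->snd_task)
                        return 1;       /* we already own the transport */
                /* someone else owns it: sleep, shows up as q:xprt_sending */
                rpc_sleep_on(&xprt->sending, task, NULL);
                return 0;
        }
        xprt->snd_task = task;          /* we now own the transport */
        return 1;
}

If none of the tasks in the dump is the owner, then XPRT_LOCKED/snd_task
must be pointing at something that no longer exists.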

Trond

--
Trond Myklebust
Linux NFS client maintainer

NetApp
[email protected]
http://www.netapp.com


2011-12-11 18:10:45

by Frank van Maarseveen

Subject: Re: 3.1.4: NFSv3 RPC scheduling issue?

On Sun, Dec 11, 2011 at 01:40:08PM +0100, Frank van Maarseveen wrote:
> On Fri, Dec 09, 2011 at 10:10:01PM -0500, Trond Myklebust wrote:
> > [...]
> > I'm still mystified as to what is going on here...
> >
> > Would it be possible to upgrade some of your clients to 3.1.5 (which
> > contains a fix for a sunrpc socket buffer problem) and then to add the
> > following patch?
>
> Did so, the mount locked up and still is, ready for some more
> experimentation. I don't see any difference however. Did a
> echo 0 >/proc/sys/sunrpc/rpc_debug afterwards (see below).
>
> A recipe which seems to trigger the issue (at least occasionally) is
>
> cd /mount-point
> ssh server echo 3 \>/proc/sys/vm/drop_caches
> echo 3 >/proc/sys/vm/drop_caches
> for i in `seq 100`
> do
> du >/dev/null 2>&1 &
> done
>
> I'll try it on a pristine kernel to rule out some kernel patches (unlikely to
> be the cause or trigger but just to be sure).

Tried, same result: my own NFS client patches seem not to make any
difference, as I expected. The ICMP port-unreachable messages (see my
other mail) go away when I stop ypbind, and they are triggered by
"ypwhich" commands too, so I consider them no longer relevant.

Not much output this time after "echo 0 >/proc/sys/sunrpc/rpc_debug". I
tried twice:

-pid- flgs status -client- --rqstp- -timeout ---ops--
16020 0080 -11 f4778230 f325d0a0 0 c191b4ac nfsv3 GETATTR a:call_status q:xprt_sending
16038 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:none
16041 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16045 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16048 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 READDIRPLUS a:call_reserveresult q:xprt_sending
16060 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 ACCESS a:call_reserveresult q:xprt_sending
16062 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16069 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
-pid- flgs status -client- --rqstp- -timeout ---ops--
16020 0080 -11 f4778230 f325d0a0 0 c191b4ac nfsv3 GETATTR a:call_status q:xprt_sending
16038 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:none
16041 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16045 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16048 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 READDIRPLUS a:call_reserveresult q:xprt_sending
16060 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 ACCESS a:call_reserveresult q:xprt_sending
16062 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
16069 0080 -11 f4778230 (null) 0 c191b4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending


The NFS client mounts from a machine holding many virtual NFS servers,
using a separate IP address for every export. When access on the client
hangs, the same export is still mountable on this NFS client using a
different server IP address (one NIC on both sides, by the way). The
dead virtual server IP address seems dead only for NFS RPC, and only
from the client in question: there is no traffic going out. Ping,
rpcinfo et al. just work. Mounting on the troubled client using the dead
IP address but specifying a different virtual server export produces
some traffic and then gets stuck too, I guess at the point where the
kernel needs to do NFS RPC.

So, kernel NFS RPC from the client drops dead for a specific server IP
address.
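
(Concretely, the checks behind "Ping, rpcinfo et al just work" look
roughly like this; the address is only a placeholder for whichever
virtual server IP is wedged:)

DEAD_IP=172.17.2.250            # placeholder, not the real address
ping -c 3 "$DEAD_IP"            # answers fine
rpcinfo -u "$DEAD_IP" nfs 3     # NFS v3 over UDP answers fine
rpcinfo -p "$DEAD_IP"           # portmapper answers fine
grep xprt: /proc/self/mountstats   # yet the mount's transport counters show nothing going out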

--
Frank

2011-12-11 14:09:27

by Frank van Maarseveen

Subject: Re: 3.1.4: NFSv3 RPC scheduling issue?

On Fri, Dec 09, 2011 at 10:10:01PM -0500, Trond Myklebust wrote:
> [...]
> I'm still mystified as to what is going on here...

There is some network traffic I don't understand, but it might be relevant.

Client machine: 172.17.1.49
server hardware: 172.17.2.241

The server has an additional IP address for every exported mount point.
The one which is mounted and blocked on the client doesn't show up in
the traffic below. The server itself serves NIS too, but the client is
bound to a different physical machine for this. The following was
captured (I added some tshark -V output to clarify a bit):

Frame Time Data
1 0.000000 172.17.1.49 -> 172.17.255.255 Portmap V2 CALLIT Call
--------------------------------------------------------------------
User Datagram Protocol, Src Port: 38242 (38242), Dst Port: sunrpc (111)
Remote Procedure Call, Type:Call XID:0x42d68775
Program: Portmap (100000)
Program Version: 2
Procedure: CALLIT (5)
Credentials
Flavor: AUTH_UNIX (1)
Length: 40
Verifier
Flavor: AUTH_NULL (0)
Length: 0
Portmap DOMAIN_NONACK call
[Program Version: 2]
[V2 Procedure: CALLIT (5)]
Program: YPSERV (100004)
Version: 2
Procedure: DOMAIN_NONACK (2)
Argument length: 8
Domain: TSKU
length: 4
contents: TSKU

2 0.000544 172.17.2.241 -> 172.17.1.49 UDP Source port: rda Destination port: 38242
-----------------------------------------------------------------------------------------
User Datagram Protocol, Src Port: rda (630), Dst Port: 38242 (38242)
Data (36 bytes) // no clue from tshark
0000 42 d6 87 75 00 00 00 01 00 00 00 00 00 00 00 00 B..u............
0010 00 00 00 00 00 00 00 00 00 00 03 5b 00 00 00 04 ...........[....
0020 00 00 00 01 ....


3 0.000740 172.17.1.49 -> 172.17.2.241 ICMP Destination unreachable (Port unreachable)
-------------------------------------------------------------------------------------------
Internet Control Message Protocol
Type: 3 (Destination unreachable)
Code: 3 (Port unreachable)
User Datagram Protocol, Src Port: rda (630), Dst Port: 38242 (38242)

4 5.016300 Dell_e7:1f:f0 -> Dell_38:31:b8 ARP Who has 172.17.1.49? Tell 172.17.2.241
5 5.016443 Dell_38:31:b8 -> Dell_e7:1f:f0 ARP 172.17.1.49 is at 00:14:22:38:31:b8
6 20.001647 172.17.1.49 -> 172.17.255.255 Portmap V2 CALLIT Call
7 20.002132 172.17.2.241 -> 172.17.1.49 UDP Source port: rda Destination port: 49408
8 20.002375 172.17.1.49 -> 172.17.2.241 ICMP Destination unreachable (Port unreachable)
9 25.013384 Dell_38:31:b8 -> Dell_e7:1f:f0 ARP Who has 172.17.2.241? Tell 172.17.1.49
10 25.013397 Dell_e7:1f:f0 -> Dell_38:31:b8 ARP 172.17.2.241 is at 00:18:8b:e7:1f:f0
11 40.003525 172.17.1.49 -> 172.17.255.255 Portmap V2 CALLIT Call
12 40.004003 172.17.2.241 -> 172.17.1.49 UDP Source port: rda Destination port: 51755
13 40.004249 172.17.1.49 -> 172.17.2.241 ICMP Destination unreachable (Port unreachable)

The UDP port seems random and doesn't show up in rpcinfo -p for the
client or the server. What bothers me is the ICMP Destination
unreachable. It applies to a port (code=3), but it might have some
adverse effect on other traffic from the same hardware to the client.
And I don't understand why the ICMP error occurs. But maybe this is all
irrelevant.
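
(For anyone wanting to reproduce the capture, something along these lines
should do; the interface name and file name are placeholders:)

tshark -i eth0 -f "host 172.17.2.241 or udp port 111 or icmp" -w nfs-hang.pcap
tshark -r nfs-hang.pcap -V      # decode the interesting frames in detail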

--
Frank

2011-12-06 09:04:53

by Frank van Maarseveen

Subject: Re: 3.1.4: NFSv3 RPC scheduling issue?

On Mon, Dec 05, 2011 at 06:39:36PM -0500, Trond Myklebust wrote:
> On Mon, 2011-12-05 at 17:50 +0100, Frank van Maarseveen wrote:
> > After upgrading 50+ NFSv3 (over UDP) client machines from 3.0.x to
> > 3.1.4 I occasionally noticed a machine with lots of processes hanging
> > in __rpc_execute() for a specific mount point with no progress at all.
> > Stack:
> >
> > [<c17fe7e0>] schedule+0x30/0x50
> > [<c177e259>] rpc_wait_bit_killable+0x19/0x30
> > [<c17feeb5>] __wait_on_bit+0x45/0x70
> > [<c177e240>] ? rpc_release_task+0x110/0x110
> > [<c17fef3d>] out_of_line_wait_on_bit+0x5d/0x70
> > [<c177e240>] ? rpc_release_task+0x110/0x110
> > [<c108aed0>] ? autoremove_wake_function+0x40/0x40
> > [<c177e89b>] __rpc_execute+0xdb/0x1a0
> > ...
> >
> > Every reference to the specific mount point on the client machine hangs
> > and the server does not receive any related network traffic. The server
> > works fine for other identical client machines with the same export mounted.
> > Other mounts on the (now) broken client still work. Killing the hanging
> > client processes repairs the situation.
> >
> > This has happened a couple of times on client machines with heavy (NFS)
> > load. The mount-point has originally been mounted by the automounter.
>
> A command of 'echo 0 > /proc/sys/sunrpc/rpc_debug' should display a
> list of pending rpc_tasks as well as information on where they are
> sleeping.
> Can you please try this on one of the hanging clients and post the
> resulting dump?

Here's another one:
-pid- flgs status -client- --rqstp- -timeout ---ops--
28050 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 ACCESS a:call_reserveresult q:none
28074 0080 -11 c2d3c460 c8e82000 0 c191c4ac nfsv3 LOOKUP a:call_status q:xprt_sending
28078 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
28080 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 ACCESS a:call_reserveresult q:xprt_sending
28085 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28086 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28087 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28089 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28090 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28091 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28092 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28093 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28094 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28095 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28096 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28097 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28098 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28099 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28100 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28106 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28107 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28108 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28109 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28111 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28112 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28113 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28114 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28115 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28116 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28117 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28118 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28119 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28120 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28121 0001 -11 c2d3c460 (null) 0 c182e34c nfsv3 READ a:call_reserveresult q:xprt_sending
28131 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
28144 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
28145 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 ACCESS a:call_reserveresult q:xprt_sending
28169 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
28170 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 ACCESS a:call_reserveresult q:xprt_sending
28207 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
28210 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
28228 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
28237 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
28297 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
28306 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
28311 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 LOOKUP a:call_reserveresult q:xprt_sending
28385 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 GETATTR a:call_reserveresult q:xprt_sending
28401 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 ACCESS a:call_reserveresult q:xprt_sending
28915 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 ACCESS a:call_reserveresult q:xprt_sending
29279 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 ACCESS a:call_reserveresult q:xprt_sending
29393 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 ACCESS a:call_reserveresult q:xprt_sending
29469 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 ACCESS a:call_reserveresult q:xprt_sending
37587 0080 -11 c2d3c460 (null) 0 c191c4ac nfsv3 FSSTAT a:call_reserveresult q:xprt_sending


--
Frank